Skip to content

[ws-gc] WS soft deletion improvements #20271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Oct 15, 2024
Merged

Conversation

filiptronicek
Copy link
Member

@filiptronicek filiptronicek commented Oct 4, 2024

Description

Makes some improvements on our workspace garbage collection we periodically run:

  1. Adds more logging, such as logging the workspaceMinAgeDays and prebuildMinAgeDays when the job starts, as well as logging every soft- and hard-deleted entry along with its workspace ID and deletionEligibilityTime.
  2. Changes the way we decide the deletionEligibilityTime of a workspace from looking at the latest instance's stoppingTime, startedTime or creationTime or workspace's creationTime to actually using the current time if we're calling the update as part of a stop, start, create or git status update.
  3. Prevents workspaces with a currently running instance from being GC'd

Related Issue(s)

Fixes ENT-854

How to test

See test updates

@roboquat roboquat added size/L and removed size/M labels Oct 7, 2024
@filiptronicek filiptronicek changed the title [ws-gc] Additional logging [ws-gc] WS soft deletion improvements Oct 7, 2024
@filiptronicek filiptronicek self-assigned this Oct 7, 2024
// we do not allow to run this with a future date
if (cutOffDate > new Date()) {
throw new Error("cutOffDate must not be in the future, was: " + cutOffDate.toISOString());
}
const workspaceRepo = await this.getWorkspaceRepo();
const dbResults = await workspaceRepo.query(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to align this with findWorkspacesForPurging by converting from a raw SQL statement, but maybe it's too out of scope here. I'm curious if anyone has thoughts about it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the "async predicate" which looks way too much like a join in disguise, let' snot do this right now. 👍 🙃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about the readability of this with 13e1a76 (#20271)? I tried putting everything into a single query and think that it makes a bit more sense.

The benefit of using this join is not too big though so I'm happy to revert if you think it's not that helpful.

this.db.store(this.ws),
this.db.storeInstance(this.wsi1),
this.db.storeInstance(this.wsi2),
this.db.storeInstance(this.wsi3),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧡

@@ -561,6 +561,12 @@ export class WorkspaceService {
daysToLive = daysToLive * 2;
}
deletionEligibilityTime.setDate(deletionEligibilityTime.getDate() + daysToLive);
if (new Date() > deletionEligibilityTime) {
log.warn({ userId, workspaceId }, "Prevented setting deletion eligibility time in the past", {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🫧 I wonder whether we are also interested in the "reason" for the update. But I guess we can see that from looking at the context in the logs later on. 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more details in 39065c7 (#20271), WDYT?

@filiptronicek
Copy link
Member Author

Here's a sample log event when adding git changes and then discarding them:

{
  "@type": "type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent",
  "serviceContext": {
    "service": "server",
    "version": "ft-additioda303f614e-dev-2024-10-08_T18-33-52"
  },
  "component": "server",
  "severity": "WARNING",
  "time": "2024-10-08T18:55:11.397Z",
  "context": {
    "requestId": "bf218162-f8d9-4fb2-aaab-095129de31ed",
    "requestKind": "jsonrpc",
    "requestMethod": "updateGitStatus",
    "traceId": "",
    "subjectId": "user_dd98d787-76ed-4114-abee-9acbdda79786",
    "userId": "dd98d787-76ed-4114-abee-9acbdda79786",
    "contextTimeMs": 27.451227009296417,
    "workspaceId": "[redacted:md5:0e98a0a31bc6576effd2fcd40f668b24]"
  },
  "message": "Prevented moving deletion eligibility time backwards",
  "payload": {
    "hasGitChanges": false,
    "timestamps": {
      "wouldBeDeletionEligibilityTime": "2024-10-22T18:55:11.397Z",
      "currentDeletionEligibilityTime": "2024-11-05T18:53:26.402Z",
      "instanceStartedTime": "2024-10-08T18:52:32.735Z",
      "instanceCreationTime": "2024-10-08T18:52:20.175Z",
      "workspaceCreationtime": "2024-10-08T18:52:20.112Z",
      "lastActive": "2024-10-08T18:52:32.735Z"
    }
  }
}

cc @geropl

@filiptronicek filiptronicek requested a review from geropl October 10, 2024 07:11
Copy link
Member

@geropl geropl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, thanks for jumping through the hoops together @filiptronicek ! 🧡

@roboquat roboquat merged commit 33942e7 into main Oct 15, 2024
18 checks passed
@roboquat roboquat deleted the ft/additional-ws-gc-logging branch October 15, 2024 12:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants