-
Notifications
You must be signed in to change notification settings - Fork 648
Remove the id
column from version_downloads
#1694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
bors
merged 2 commits into
rust-lang:master
from
sgrif:sg-remove-pk-from-version-downloads
Apr 1, 2019
Merged
Remove the id
column from version_downloads
#1694
bors
merged 2 commits into
rust-lang:master
from
sgrif:sg-remove-pk-from-version-downloads
Apr 1, 2019
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
☔ The latest upstream changes (presumably #1699) made this pull request unmergeable. Please resolve the merge conflicts. |
jtgeibel
approved these changes
Apr 1, 2019
The `id` column will be removed in the next commit
This column is no longer being used, and can be safely removed. Rolling back this migration in production requires creating indexes concurrently, which Diesel CLI doesn't support, so rolling back this migration in production will require queries to be run manually. A blocking version is included for development and testing
@bors r=jtgeibel |
📌 Commit fbfb1b0 has been approved by |
fbfb1b0
to
86ba333
Compare
@bors r=jtgeibel |
📌 Commit 86ba333 has been approved by |
bors
added a commit
that referenced
this pull request
Apr 1, 2019
…geibel Remove the `id` column from `version_downloads` :warning: This PR contains destructive migrations. Each commit must be deployed separately :warning: The `version_downloads` table and its indexes account for a whopping 69% of our total database size. The primary key alone is nearly 1GB, or 25% of our total database. This is never really used for anything besides fetching in batches in `update-downloads`. We could just do batching based on crate_id and date, but I've opted to remove the batching entirely. We usually process somewhere between 2k and 3k rows every time it runs -- we'd have to process over 50k rows to even consume 1MB of memory, so we can just load them all at once. With our code changed to never remove the column, we can just completely remove it. This brings the size of the table down to 1600MB total, or 1256MB after a full vacuum. Since Diesel always runs migrations in a transaction, and we can't add an index concurrently inside a transaction, the `down.sql` included is only for development/tests. If we need to roll this migration back in production, the column will need to be added/populated and the indexes created manually in a way that doesn't lock the table. If we were to run the blocking version included in down.sql against production, the table would be locked for multiple minutes, which would take some pages down and make the download endpoint painfully slow. I don't anticipate needing to roll this migration back.
☀️ Test successful - checks-travis |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The
version_downloads
table and its indexes account for a whopping 69% of our total database size. The primary key alone is nearly 1GB, or 25% of our total database. This is never really used for anything besides fetching in batches inupdate-downloads
. We could just do batching based on crate_id and date, but I've opted to remove the batching entirely. We usually process somewhere between 2k and 3k rows every time it runs -- we'd have to process over 50k rows to even consume 1MB of memory, so we can just load them all at once.With our code changed to never remove the column, we can just completely remove it. This brings the size of the table down to 1600MB total, or 1256MB after a full vacuum.
Since Diesel always runs migrations in a transaction, and we can't add an index concurrently inside a transaction, the
down.sql
included is only for development/tests. If we need to roll this migration back in production, the column will need to be added/populated and the indexes created manually in a way that doesn't lock the table. If we were to run the blocking version included in down.sql against production, the table would be locked for multiple minutes, which would take some pages down and make the download endpoint painfully slow.I don't anticipate needing to roll this migration back.