Skip to content

Create a smaller index to be used on the summary page #1328

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

sgrif
Copy link
Contributor

@sgrif sgrif commented Apr 5, 2018

While #1312 improved that query a lot, we can shave off even more time
by giving it an index containing only the data it's going to need. This
index would need to be refreshed daily by a heroku scheduler task at
00:01 UTC.

I don't have a large enough dataset locally to get a great perf
comparison, so it'd be great to have a test comparison on prod before we
merge this.

The funkiness in how we go about creating/dropping the index is due to
the fact that we can't use bind parameters to create indices, nor can we
use CURRENT_DATE directly. If we want to create the index concurrently
(I don't think we need to), then we will have to do string concatenation
in Rust.

While rust-lang#1312 improved that query a lot, we can shave off even more time
by giving it an index containing only the data it's going to need. This
index would need to be refreshed daily by a heroku scheduler task at
00:01 UTC.

I don't have a large enough dataset locally to get a great perf
comparison, so it'd be great to have a test comparison on prod before we
merge this.

The funkiness in how we go about creating/dropping the index is due to
the fact that we can't use bind parameters to create indices, nor can we
use `CURRENT_DATE` directly. If we want to create the index concurrently
(I don't think we need to), then we will have to do string concatenation
in Rust.
DROP INDEX IF EXISTS todays_recent_crate_downloads;
EXECUTE 'CREATE INDEX todays_recent_crate_downloads
ON crate_downloads (date)
WHERE date > date(''' || CURRENT_DATE::text || '''::date - INTERVAL ''90 days'')';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the WHERE clause in the query need to match to ensure the index is used? Especially regarding CURRENT_TIMESTAMP vs CURRENT_DATE. The query is currently:

WHERE crate_downloads.date > date(CURRENT_TIMESTAMP - INTERVAL '90 days')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that the query as it is written today will use this index before opening the PR

@sgrif sgrif closed this Apr 24, 2018
@sgrif
Copy link
Contributor Author

sgrif commented Apr 24, 2018

#1363 is a better approach

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants