Switch most recently downloaded summary query to be faster #1312

Merged
31 changes: 18 additions & 13 deletions src/controllers/krate/metadata.rs
@@ -16,8 +16,7 @@ use models::krate::ALL_COLUMNS;
 
 /// Handles the `GET /summary` route.
 pub fn summary(req: &mut Request) -> CargoResult<Response> {
-    use diesel::dsl::*;
-    use diesel::sql_types::{BigInt, Nullable};
+    use diesel::sql_query;
     use schema::crates::dsl::*;
 
     let conn = req.db_conn()?;
@@ -57,17 +56,23 @@ pub fn summary(req: &mut Request) -> CargoResult<Response> {
         .limit(10)
         .load(&*conn)?;
 
-    let recent_downloads = sql::<Nullable<BigInt>>("SUM(crate_downloads.downloads)");
-    let most_recently_downloaded = crates
-        .left_join(
-            crate_downloads::table.on(id.eq(crate_downloads::crate_id)
-                .and(crate_downloads::date.gt(date(now - 90.days())))),
-        )
-        .group_by(id)
-        .order(recent_downloads.desc().nulls_last())
-        .limit(10)
-        .select(ALL_COLUMNS)
-        .load::<Crate>(&*conn)?;
+    // This query needs to be structured in this way to have the LIMIT
+    // happen before the joining/sorting for performance reasons.
+    // It needs to use sql_query because Diesel doesn't have a great way
+    // to join on subselects right now :(
+    let most_recently_downloaded = sql_query(
+        "SELECT crates.* \
+         FROM crates \
+         JOIN ( \
+             SELECT crate_downloads.crate_id, SUM(crate_downloads.downloads) \
+             FROM crate_downloads \
+             WHERE crate_downloads.date > date(CURRENT_TIMESTAMP - INTERVAL '90 days') \
+             GROUP BY crate_downloads.crate_id \
+             ORDER BY SUM(crate_downloads.downloads) DESC NULLS LAST \
+             LIMIT 10 \
+         ) cd ON crates.id = cd.crate_id \
+         ORDER BY cd.sum DESC NULLS LAST",
Review comment (Member):

I believe NULLS LAST can be dropped because this is an INNER JOIN, which should not include any null sums. Doesn't hurt to be explicit though.

+    ).load::<Crate>(&*conn)?;
 
     let popular_keywords = keywords::table
         .order(keywords::crates_cnt.desc())
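
As a rough illustration of why the restructured query is cheaper, the sketch below models the same shape in plain Rust over in-memory data: aggregate and truncate to the top N first, then join only those few rows back to the crates. The `Crate` and `CrateDownload` structs are simplified stand-ins, `days_ago` is a hypothetical substitute for the `date` column, and the tie-break on `id` is added only to make the output deterministic. None of this is the project's actual code.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for rows of `crates` and `crate_downloads`.
#[derive(Debug, Clone, PartialEq)]
struct Crate {
    id: i32,
    name: String,
}

struct CrateDownload {
    crate_id: i32,
    downloads: i64,
    days_ago: u32, // stand-in for the `date` column
}

/// Mirrors the shape of the new query: aggregate, sort, and LIMIT inside the
/// "subselect", then join the few surviving rows back to `crates`.
fn most_recently_downloaded(
    crates: &[Crate],
    downloads: &[CrateDownload],
    limit: usize,
) -> Vec<(Crate, i64)> {
    // WHERE date > (today - 90 days) GROUP BY crate_id, SUM(downloads)
    let mut sums: HashMap<i32, i64> = HashMap::new();
    for d in downloads.iter().filter(|d| d.days_ago < 90) {
        *sums.entry(d.crate_id).or_insert(0) += d.downloads;
    }

    // ORDER BY SUM(downloads) DESC LIMIT n (ties broken by id, an addition here)
    let mut top: Vec<(i32, i64)> = sums.into_iter().collect();
    top.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    top.truncate(limit);

    // JOIN crates ON crates.id = cd.crate_id, keeping the subselect's order.
    // Only `limit` lookups happen here, instead of joining every crate.
    let by_id: HashMap<i32, &Crate> = crates.iter().map(|c| (c.id, c)).collect();
    top.into_iter()
        .filter_map(|(id, sum)| by_id.get(&id).map(|c| ((*c).clone(), sum)))
        .collect()
}
```

A crate with no download rows inside the 90-day window never enters `sums`, so it cannot appear in the result. In the old LEFT JOIN form every crate had to be joined and sorted before the LIMIT applied.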
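
The review comment about NULLS LAST can be illustrated with a similar sketch, assuming `crate_downloads.downloads` is never NULL. With the old LEFT JOIN, a crate without recent downloads gets a missing sum (modeled as `None`), and because PostgreSQL sorts NULLs first under DESC by default, NULLS LAST was needed. With the INNER JOIN against the aggregated subselect, such rows never appear. The function names below are hypothetical.

```rust
use std::collections::HashMap;

/// LEFT JOIN shape (old query): every crate appears, and a crate with no
/// matching download rows gets a NULL sum, modeled here as `None`.
fn left_join_sums(crate_ids: &[i32], sums: &HashMap<i32, i64>) -> Vec<(i32, Option<i64>)> {
    let mut rows: Vec<(i32, Option<i64>)> = crate_ids
        .iter()
        .map(|id| (*id, sums.get(id).copied()))
        .collect();
    // ORDER BY sum DESC NULLS LAST: `Option` orders `None` below `Some`,
    // so a reversed comparison puts the `None` rows at the end.
    rows.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    rows
}

/// INNER JOIN shape (new query): only crates present in the aggregated
/// subselect survive, so every row carries a plain `i64` sum.
fn inner_join_sums(crate_ids: &[i32], sums: &HashMap<i32, i64>) -> Vec<(i32, i64)> {
    let mut rows: Vec<(i32, i64)> = crate_ids
        .iter()
        .filter_map(|id| sums.get(id).map(|s| (*id, *s)))
        .collect();
    // ORDER BY sum DESC: there is no missing value left to place.
    rows.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    rows
}
```

The return types carry the point: the inner-join result has no `Option` to order, which is why the explicit NULLS LAST is harmless but redundant under the stated assumption.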
3 changes: 2 additions & 1 deletion src/models/krate.rs
@@ -30,7 +30,8 @@ pub struct CrateDownload {
     pub date: NaiveDate,
 }
 
-#[derive(Debug, Clone, Queryable, Identifiable, Associations, AsChangeset)]
+#[derive(Debug, Clone, Queryable, Identifiable, Associations, AsChangeset, QueryableByName)]
+#[table_name = "crates"]
 pub struct Crate {
     pub id: i32,
     pub name: String,