Skip to content

Allow database timeouts to be configured #1344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 13, 2018

Conversation

sgrif
Copy link
Contributor

@sgrif sgrif commented Apr 13, 2018

We're consistently seeing the endpoint for crate downloads H12. The logs
are showing that we are eventually finishing without error, it's just
taking >30s, so Heroku is killing the connection.

Unfortunately, this means that we don't have visibility into why it's
timing out. It's either waiting for a long time for a DB connection,
updating the version_downloads table (my guess), or talking to S3. If
it's database related, this will turn those requests into actual errors.

This setting will increase our error rates, as we're seeing the 99th
percentile push up to 5s at some points, and we do have requests that
take 15s to complete (but are completing). Eventually I think we should
be able to consider 10s response times unacceptable, but for now I'm
only going to leave this at 10s for long enough for me to see what is
failing, at which point I will set it back to 30s until we no longer see
requests pushing 10s regularly.

We're consistently seeing the endpoint for crate downloads H12. The logs
are showing that we are *eventually* finishing without error, it's just
taking >30s, so Heroku is killing the connection.

Unfortunately, this means that we don't have visibility into *why* it's
timing out. It's either waiting for a long time for a DB connection,
updating the `version_downloads` table (my guess), or talking to S3. If
it's database related, this will turn those requests into actual errors.

This setting will increase our error rates, as we're seeing the 99th
percentile push up to 5s at some points, and we do have requests that
take 15s to complete (but are completing). Eventually I think we should
be able to consider 10s response times unacceptable, but for now I'm
only going to leave this at 10s for long enough for me to see *what* is
failing, at which point I will set it back to 30s until we no longer see
requests pushing 10s regularly.
@sgrif sgrif requested a review from ashleygwilliams April 13, 2018 14:51
Copy link
Member

@ashleygwilliams ashleygwilliams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 🐑

@@ -71,11 +72,19 @@ impl App {
_ => 1,
};

let db_connection_timeout = match (env::var("DB_TIMEOUT"), config.env) {
(Ok(num), _) => num.parse().expect("couldn't parse DB_TIMEOUT"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the long run we might want to make this error message a little better but it's fine for now

@ashleygwilliams
Copy link
Member

bors: r+

bors-voyager bot added a commit that referenced this pull request Apr 13, 2018
1344: Allow database timeouts to be configured r=ashleygwilliams

We're consistently seeing the endpoint for crate downloads H12. The logs
are showing that we are *eventually* finishing without error, it's just
taking >30s, so Heroku is killing the connection.

Unfortunately, this means that we don't have visibility into *why* it's
timing out. It's either waiting for a long time for a DB connection,
updating the `version_downloads` table (my guess), or talking to S3. If
it's database related, this will turn those requests into actual errors.

This setting will increase our error rates, as we're seeing the 99th
percentile push up to 5s at some points, and we do have requests that
take 15s to complete (but are completing). Eventually I think we should
be able to consider 10s response times unacceptable, but for now I'm
only going to leave this at 10s for long enough for me to see *what* is
failing, at which point I will set it back to 30s until we no longer see
requests pushing 10s regularly.
@bors-voyager
Copy link
Contributor

bors-voyager bot commented Apr 13, 2018

Build succeeded

@bors-voyager bors-voyager bot merged commit c8c39a9 into rust-lang:master Apr 13, 2018
@sgrif sgrif deleted the sg-configurable-timeouts branch April 13, 2018 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants