Allow blocking arbitrary traffic without redeploying #1806

sgrif · 2019-08-18T14:10:06Z

This generalizes the block_ips module to allow checking against any
header instead of just the X-Real-Ip header. This does not enable more
complex matching such as regexes, but it does let us apply the simple
string matching we have today to any header.

Additionally, which headers we check against is now configured by an
environment variable, which is a comma separated list in the form
Header=VALUE_ENV_VAR. So to keep the behavior we had today, one would
set BLOCKED_TRAFFIC="X-Real-Ip=BLOCKED_IPS".

This means that when we see traffic patterns that we need to block which
can be handled by the existing logic, we can now begin applying it to
patterns we didn't previosuly anticipate without requiring a redeploy

This generalizes the `block_ips` module to allow checking against any header instead of just the `X-Real-Ip` header. This does not enable more complex matching such as regexes, but it does let us apply the simple string matching we have today to any header. Additionally, which headers we check against is now configured by an environment variable, which is a comma separated list in the form `Header=VALUE_ENV_VAR`. So to keep the behavior we had today, one would set `BLOCKED_TRAFFIC="X-Real-Ip=BLOCKED_IPS"`. This means that when we see traffic patterns that we need to block which can be handled by the existing logic, we can now begin applying it to patterns we didn't previosuly anticipate without requiring a redeploy

rust-highfive · 2019-08-18T14:10:16Z

r? @carols10cents

(rust_highfive has picked a reviewer for you, use r? to override)

smarnach · 2019-08-18T20:31:50Z

I'm wondering – why do we implement IP blocking and rate limiting in the backend server? Wouldn't it be easier to include this in the nginx configuration?

sgrif · 2019-08-18T21:04:11Z

Rate limiting is performed at the NGINX level unless it can't be. The only case that isn't handled in nginx is for publishing new crates. Determining whether PUT /api/v1/crates/new is a new crate or a new version of an existing crate requires connecting to the database, so it must be in app code. For this case we also want to limit by user instead of IP, which requires connecting to the database (this also has the side effect of making this consistent regardless of how many servers we run -- which is generally not that important, but is for this case).

We have another rate limit for publishes, which is less strict (and mostly to ensure we don't get rate limited by GH) which is handled in nginx, and I'm planning on opening a PR to add a few more at the nginx level soon. In general that's where we prefer to handle rate limiting, and I don't think it's correct to say we aren't handling it there.

As for why blocking is done in app code, that's mostly a function of familiarity amongst the team. Blocking traffic is something that primarily happens during incident response, so having it happen where folks who are on call are most familiar with things is important to make sure we can effectively respond to these incidents quickly. My understanding is that nginx isn't really meant for that use case, and while it can be used for IP blocking (it couldn't when we first introduced the app level since $remote_addr is not correct by default, but I believe we do have the appropriate extensions configured for it to handle it now) -- It still isn't really set up to do the sort of arbitrary blocking that I'm referring to here, especially not without a redeploy. That said, I'm open to being proven wrong and moving this to nginx if everyone handling ops is up to date

smarnach · 2019-08-19T18:17:33Z

@sgrif Thanks for the detailed explanation. The rate-limiting setup completely makes sense to me now – if we need to connect to the DB, we can only do this in the backend, and the common case is already handled by nginx.

IP blocking can be configured in nginx using deny, but of course this would require to modify the configuration file, so we would need to redeploy on Heroku. It is possible to set up dynamic reconfiguration in nginx, but this would require further work.

So overall I understand the reasoning for implementing this ourselves, and it seems like the best approach for now. However, if we go ahead and implement more general HTTP filtering rules, we may get to the point where switching to a different reverse proxy like HAProxy would be worthwhile. We don't have a ton of logic in the nginx configuration, so the migration should be rather straightforward.

Somewhat related, do we have any ops documentation for crates.io? If you want to get more people on the on-call rota, we should have documentation for common tasks.

sgrif · 2019-08-19T18:27:44Z

However, if we go ahead and implement more general HTTP filtering rules, we may get to the point where switching to a different reverse proxy like HAProxy would be worthwhile.

Yes, I agree. This is very much a "simplest thing that works for our needs right now".

Somewhat related, do we have any ops documentation for crates.io? If you want to get more people on the on-call rota, we should have documentation for common tasks.

A little bit, but it's sparse and scattered. Once we start bringing more folks on I'm going to make sure we have a playbook that they can reference

smarnach · 2019-08-19T19:55:06Z

I think you closed this by accident.

sgrif · 2019-08-19T20:16:20Z

Thank you

kzys · 2019-08-27T05:56:59Z

Looks good to me.

carols10cents · 2019-08-29T22:03:40Z

@sgrif I pushed 2 commits to this branch; one adding a bit of docs and one adding a happy path test. If my new commits look good to you, feel free to merge this.

sgrif · 2019-08-29T22:11:05Z

Looks great. Thanks!

@bors: r+

bors · 2019-08-29T22:11:06Z

📌 Commit 8b7611b has been approved by sgrif

bors · 2019-08-29T22:11:13Z

⌛ Testing commit 8b7611b with merge 123cf65...

Allow blocking arbitrary traffic without redeploying This generalizes the `block_ips` module to allow checking against any header instead of just the `X-Real-Ip` header. This does not enable more complex matching such as regexes, but it does let us apply the simple string matching we have today to any header. Additionally, which headers we check against is now configured by an environment variable, which is a comma separated list in the form `Header=VALUE_ENV_VAR`. So to keep the behavior we had today, one would set `BLOCKED_TRAFFIC="X-Real-Ip=BLOCKED_IPS"`. This means that when we see traffic patterns that we need to block which can be handled by the existing logic, we can now begin applying it to patterns we didn't previosuly anticipate without requiring a redeploy

bors · 2019-08-29T22:35:11Z

☀️ Test successful - checks-travis
Approved by: sgrif
Pushing 123cf65 to master...

rust-highfive assigned carols10cents Aug 18, 2019

rust-highfive added the S-waiting-on-review label Aug 18, 2019

rustfmt

4d2963e

sgrif added 2 commits August 19, 2019 10:13

Don't panic if the header we're blocking isn't set

e62f021

Add a test to ensure we don't panic when the header is not given

b4de5e5

sgrif closed this Aug 19, 2019

sgrif reopened this Aug 19, 2019

carols10cents added 2 commits August 29, 2019 17:36

Add some usage documentation to the block traffic module

b4b39dc

Add a happy path test for the block_traffic module

09fc93e

carols10cents assigned sgrif and unassigned carols10cents Aug 29, 2019

carols10cents approved these changes Aug 29, 2019

View reviewed changes

Fix doc formatting

8b7611b

bors merged commit 8b7611b into rust-lang:master Aug 29, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Allow blocking arbitrary traffic without redeploying #1806

Allow blocking arbitrary traffic without redeploying #1806

Uh oh!

sgrif commented Aug 18, 2019

Uh oh!

rust-highfive commented Aug 18, 2019

Uh oh!

smarnach commented Aug 18, 2019

Uh oh!

sgrif commented Aug 18, 2019 •

edited

Loading

Uh oh!

smarnach commented Aug 19, 2019

Uh oh!

sgrif commented Aug 19, 2019

Uh oh!

smarnach commented Aug 19, 2019

Uh oh!

sgrif commented Aug 19, 2019

Uh oh!

kzys commented Aug 27, 2019

Uh oh!

carols10cents commented Aug 29, 2019

Uh oh!

sgrif commented Aug 29, 2019

Uh oh!

bors commented Aug 29, 2019

Uh oh!

bors commented Aug 29, 2019

Uh oh!

bors commented Aug 29, 2019

Uh oh!

Uh oh!

Allow blocking arbitrary traffic without redeploying #1806

Allow blocking arbitrary traffic without redeploying #1806

Uh oh!

Conversation

sgrif commented Aug 18, 2019

Uh oh!

rust-highfive commented Aug 18, 2019

Uh oh!

smarnach commented Aug 18, 2019

Uh oh!

sgrif commented Aug 18, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smarnach commented Aug 19, 2019

Uh oh!

sgrif commented Aug 19, 2019

Uh oh!

smarnach commented Aug 19, 2019

Uh oh!

sgrif commented Aug 19, 2019

Uh oh!

kzys commented Aug 27, 2019

Uh oh!

carols10cents commented Aug 29, 2019

Uh oh!

sgrif commented Aug 29, 2019

Uh oh!

bors commented Aug 29, 2019

Uh oh!

bors commented Aug 29, 2019

Uh oh!

bors commented Aug 29, 2019

Uh oh!

Uh oh!

sgrif commented Aug 18, 2019 •

edited

Loading