
Require all traffic to provide a User-Agent header. #1534


Merged
sgrif merged 2 commits into rust-lang:master from sgrif:sg-require-user-agent on Oct 24, 2018

Conversation

@sgrif (Contributor) commented Oct 23, 2018

We want to be able to differentiate crawlers from each other, so I've
nudged them towards using a unique user agent (we probably won't ever
block generic UAs, since folks sometimes do use curl/wget from the
command line).

Additionally, I've had a lot of cases lately where a crawler was
operating outside of what we allow but wasn't actually causing a service
impact. If I could contact those people without having to block their
traffic, I would. So I've also worded the message to try to nudge folks
towards including contact info, which most commercial bots already do.
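The policy itself is simple in spirit. Here is a minimal, self-contained Rust sketch of the idea; the function, error message, and header map are hypothetical stand-ins, not the actual code in crates.io's middleware stack:

```rust
use std::collections::HashMap;

/// Sketch of the check this PR introduces: reject any request that
/// arrives without a User-Agent header. Hypothetical types; the real
/// check lives in crates.io's middleware stack.
fn check_user_agent(headers: &HashMap<String, String>) -> Result<(), (u16, &'static str)> {
    match headers.get("User-Agent") {
        Some(ua) if !ua.is_empty() => Ok(()),
        _ => Err((
            403,
            "Please include a unique User-Agent header with contact \
             information (e.g. `my-crawler/1.0 (help@my-crawler.example)`) \
             so we can reach out instead of blocking your traffic.",
        )),
    }
}

fn main() {
    // A request with no User-Agent is rejected...
    let no_ua = HashMap::new();
    assert!(check_user_agent(&no_ua).is_err());

    // ...while any non-empty User-Agent passes.
    let mut with_ua = HashMap::new();
    with_ua.insert(
        "User-Agent".to_string(),
        "my-crawler/1.0 (help@my-crawler.example)".to_string(),
    );
    assert!(check_user_agent(&with_ua).is_ok());
}
```

The `my-crawler/1.0 (help@my-crawler.example)` string is an invented example of the kind of unique, contactable user agent the message nudges crawlers towards.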

@sgrif requested a review from @jtgeibel on October 23, 2018 at 14:39
It was necessary to check the response status manually, because the
bad_resp! macro expects to be able to decode a JSON response.
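For illustration, a hedged sketch of what checking the status manually can look like; the `Response` type and helper below are hypothetical stand-ins for the crates.io test harness, not its real API:

```rust
/// Hypothetical stand-in for the test harness's response type.
struct Response {
    status: u16,
    body: String,
}

/// Stub for a request sent without a User-Agent header; in the real
/// test this goes through the application under test.
fn request_without_user_agent() -> Response {
    Response {
        status: 403,
        body: "Please include a User-Agent header".to_string(),
    }
}

#[test]
fn blocks_requests_without_a_user_agent() {
    let resp = request_without_user_agent();
    // Assert on the status code directly rather than through a
    // JSON-decoding helper like bad_resp!: the rejection body here is
    // plain text, so decoding it as JSON would fail before the assertion.
    assert_eq!(resp.status, 403);
    assert!(resp.body.contains("User-Agent"));
}
```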
@jtgeibel (Member) left a comment


I pushed a new commit with a few minor fixes to the test. Should be good to go.

@sgrif (Contributor, Author) commented Oct 24, 2018

bors: r+

bors-voyager bot added a commit that referenced this pull request Oct 24, 2018
1534: Require all traffic to provide a `User-Agent` header. r=sgrif a=sgrif

Co-authored-by: Sean Griffin <[email protected]>
Co-authored-by: Justin Geibel <[email protected]>
@bors-voyager bot commented Oct 24, 2018

Build succeeded

bors-voyager bot merged commit 5a27609 into rust-lang:master on Oct 24, 2018
@cuviper (Member) commented Oct 26, 2018

AIUI, cargo didn't set any User-Agent until rust-lang/cargo#3485, which is in cargo 0.17.0 -> rust 1.16. Does this PR mean that earlier versions of cargo won't be able to fetch from crates.io at all?

@sgrif (Contributor, Author) commented Oct 26, 2018

Yes, we learned that yesterday afternoon. This change has since been reverted and we're going to be discussing it during the next team meeting.

@cuviper (Member) commented Oct 26, 2018

OK, thanks for the info!

@sgrif deleted the sg-require-user-agent branch on March 9, 2019.
@est31 mentioned this pull request on Apr 15, 2020.