Skip to content

Commit c8eb565

Browse files
Merge #1499
1499: Replace `curl` with `reqwest` r=sgrif a=sgrif **The diff stats may look scary, but the vast majority of the added lines is because `reqwest` depends on `tokio`, which caused 448 lines to be added to `Cargo.lock`. The actual code changes are not particularly large** Why should we do this? --- `curl::Easy` is anything but. Its API is verbose, and consistently misunderstood in our codebase. These commits replace the `curl` crate with `reqwest`, which is designed to be a higher level HTTP client. It doesn't have every random nicety we might want (like setting `Date` on `POST`/`PUT` requests), but does have a ton of things we want built in like helpers for JSON request/response bodies that assumes the use of serde, and "turn 4xx/5xx response codes into an error" helper functions. At best the curl crate required some very verbose code, or confusing APIs to make it possible to work with stack data, but at worst it led to some code that was confusing or even dangerous. A few things that were weird or gave me pause while writing this were: - Setting the `Host` header manually, which is at best pointless and at worst a major latent bug waiting to strike. I suspect that the first instance of this was in the S3 code, which was done because `Host` is mentioned in the S3 docs for common headers for some reason, and the other places this was done were due to that code being copied from the S3 code - Specifically calling a CURL function that opted into `Expect: 100-Continue`, and then manually overriding the `Expect` header with a comment that it's not clear what it's purpose is - Setting the `Date` header on requests for no reason (which on requests with no body, is actually a violation of the HTTP/1.1 spec) - Choosing "hello!" as a User-Agent string for endpoints which require it What changes occur in our HTTP requests? ---- The `http-data` files are basically impossible to review, so here are the changes that were made: All requests have the following headers added: - `User-Agent: reqwest/0.9.1` - `Accept-Encoding: gzip` All requests had the following header removed: - `Proxy-Connection: Keep-Alive` - Uncyclopedia describes this header as "Implemented as a misunderstanding of the HTTP specifications. Common because of mistakes in implementations of early HTTP versions." It has been removed from Firefox and more recent versions of curl. Some requests had the following header removed: - `User-Agent: hello!` - The HTTP specifications states that all user agents SHOULD send this header. All requests that we make are a direct result of user interaction, making us a user agent (and it's pretty standard to expect this header even from non-user-agents like crawlers). Arguably we should be setting this to `crates.io`, but this is at least marginally useful unlike "hello!" which is not appropriate here. What notable changes happened in the code? --- For the most part, the code is just replacing manual stuff with built-in stuff. I've removed anything that set a `Date` or `User-Agent` header manually (unless `Date` was required, which is the case for S3, since the date is used as part of the auth signature). I've removed all cases that were manually setting a `Host` header, since if this ever differed from the URL we used to lookup the domain, it would be very bad. While making this change, a few refactorings occurred. The S3 code was taking a `file_length` parameter as a `u64`. This was expected to be the length of the bytes passed in, but... we have a slice so we can just ask it the length. Due to a combination of the API reqwest wants to expose, and the fact that its backed by tokio, it can only work with static data. This was only relevant for S3 uploads, but changing the relevant functions from taking `&[u8]` to `Vec<u8>` was a trivial change. The functions for interacting with the GitHub API were split into two pieces, one which made the request, and one which processed it. The only reason for this split was so that one consumer could intercept 404 responses and return `Ok(false)`. I've opted to merge all of this into a single function, and return `NotFound` for 404, which we can check for in that one place. I was expecting to use `Any::is` there, but due to the way its written, it required duplicating its implementation (see the commit message for that commit for more details) The only other point of note is that the changes to `GhUser::init` are not tested, since they only run if some files in this repo are not present, and for that code to run we'd need the passwords for two GitHub users that we don't have access to. I suspect we should probably just delete that code and panic if we don't have the file, but I haven't done so here. Co-authored-by: Sean Griffin <[email protected]>
2 parents 74d533a + 0d5ed7a commit c8eb565

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

49 files changed

+630
-461
lines changed

Cargo.lock

Lines changed: 448 additions & 151 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,6 @@ url = "1.2.1"
3636
tar = "0.4.13"
3737

3838
openssl = "0.9.14"
39-
curl = "0.4"
4039
oauth2 = "0.3"
4140
log = "0.3"
4241
env_logger = "0.5"
@@ -58,6 +57,7 @@ docopt = "0.8.1"
5857
itertools = "0.6.0"
5958
scheduled-thread-pool = "0.2.0"
6059
derive_deref = "1.0.0"
60+
reqwest = "0.9.1"
6161

6262
lettre = "0.8"
6363
lettre_email = "0.8"

src/app.rs

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -5,12 +5,13 @@ use std::path::PathBuf;
55
use std::sync::{Arc, Mutex};
66
use std::time::Duration;
77

8-
use curl::easy::Easy;
98
use diesel::r2d2;
109
use git2;
1110
use oauth2;
11+
use reqwest;
1212
use scheduled_thread_pool::ScheduledThreadPool;
1313

14+
use util::CargoResult;
1415
use {db, Config, Env};
1516

1617
/// The `App` struct holds the main components of the application like
@@ -99,16 +100,16 @@ impl App {
99100
}
100101
}
101102

102-
/// Returns a handle for making HTTP requests to upload crate files.
103+
/// Returns a client for making HTTP requests to upload crate files.
103104
///
104105
/// The handle will go through a proxy if the uploader being used has specified one, which
105106
/// is only done in test mode in order to be able to record and inspect the HTTP requests
106107
/// that tests make.
107-
pub fn handle(&self) -> Easy {
108-
let mut handle = Easy::new();
108+
pub fn http_client(&self) -> CargoResult<reqwest::Client> {
109+
let mut builder = reqwest::Client::builder();
109110
if let Some(proxy) = self.config.uploader.proxy() {
110-
handle.proxy(proxy).unwrap();
111+
builder = builder.proxy(reqwest::Proxy::all(proxy)?);
111112
}
112-
handle
113+
Ok(builder.build()?)
113114
}
114115
}

src/bin/render-readmes.rs

Lines changed: 17 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -10,27 +10,24 @@ extern crate serde_derive;
1010

1111
extern crate cargo_registry;
1212
extern crate chrono;
13-
extern crate curl;
1413
extern crate diesel;
1514
extern crate docopt;
1615
extern crate flate2;
1716
extern crate itertools;
17+
extern crate reqwest;
1818
extern crate tar;
1919
extern crate toml;
20-
extern crate url;
2120

2221
use chrono::{TimeZone, Utc};
23-
use curl::easy::{Easy, List};
2422
use diesel::dsl::any;
2523
use diesel::prelude::*;
2624
use docopt::Docopt;
2725
use flate2::read::GzDecoder;
2826
use itertools::Itertools;
29-
use std::io::{Cursor, Read};
27+
use std::io::Read;
3028
use std::path::Path;
3129
use std::thread;
3230
use tar::Archive;
33-
use url::Url;
3431

3532
use cargo_registry::render::readme_to_html;
3633
use cargo_registry::Config;
@@ -146,15 +143,13 @@ fn main() {
146143
}
147144
let readme = readme.unwrap();
148145
let readme_path = format!("readmes/{0}/{0}-{1}.html", krate_name, version.num);
149-
let readme_len = readme.len();
150146
config
151147
.uploader
152148
.upload(
153-
Easy::new(),
149+
&reqwest::Client::new(),
154150
&readme_path,
155-
readme.as_bytes(),
151+
readme.into_bytes(),
156152
"text/html",
157-
readme_len as u64,
158153
).unwrap_or_else(|_| {
159154
panic!(
160155
"[{}-{}] Couldn't upload file to S3",
@@ -174,53 +169,32 @@ fn main() {
174169

175170
/// Renders the readme of an uploaded crate version.
176171
fn get_readme(config: &Config, version: &Version, krate_name: &str) -> Option<String> {
177-
let mut handle = Easy::new();
178-
let location = match config
172+
let location = config
179173
.uploader
180-
.crate_location(krate_name, &version.num.to_string())
181-
{
182-
Some(l) => l,
183-
None => return None,
184-
};
185-
let date = Utc::now().to_rfc2822();
186-
let url = Url::parse(&location)
187-
.unwrap_or_else(|_| panic!("[{}-{}] Couldn't parse crate URL", krate_name, version.num));
188-
189-
let mut headers = List::new();
190-
headers
191-
.append(&format!("Host: {}", url.host().unwrap()))
192-
.unwrap();
193-
headers.append(&format!("Date: {}", date)).unwrap();
174+
.crate_location(krate_name, &version.num.to_string())?;
194175

195-
handle.url(url.as_str()).unwrap();
196-
handle.get(true).unwrap();
197-
handle.http_headers(headers).unwrap();
198-
199-
let mut response = Vec::new();
200-
{
201-
let mut req = handle.transfer();
202-
req.write_function(|data| {
203-
response.extend(data);
204-
Ok(data.len())
205-
}).unwrap();
206-
if let Err(err) = req.perform() {
176+
let mut response = match reqwest::get(&location) {
177+
Ok(r) => r,
178+
Err(err) => {
207179
println!(
208180
"[{}-{}] Unable to fetch crate: {}",
209181
krate_name, version.num, err
210182
);
211183
return None;
212184
}
213-
}
214-
if handle.response_code().unwrap() != 200 {
215-
let response = String::from_utf8_lossy(&response);
185+
};
186+
187+
if !response.status().is_success() {
216188
println!(
217189
"[{}-{}] Failed to get a 200 response: {}",
218-
krate_name, version.num, response
190+
krate_name,
191+
version.num,
192+
response.text().unwrap()
219193
);
220194
return None;
221195
}
222-
let reader = Cursor::new(response);
223-
let reader = GzDecoder::new(reader);
196+
197+
let reader = GzDecoder::new(response);
224198
let mut archive = Archive::new(reader);
225199
let mut entries = archive.entries().unwrap_or_else(|_| {
226200
panic!(

src/controllers/krate/publish.rs

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -166,10 +166,10 @@ pub fn publish(req: &mut dyn Request) -> CargoResult<Response> {
166166
// Upload the crate, return way to delete the crate from the server
167167
// If the git commands fail below, we shouldn't keep the crate on the
168168
// server.
169-
let (cksum, mut crate_bomb, mut readme_bomb) =
170-
app.config
171-
.uploader
172-
.upload_crate(req, &krate, readme, file_length, maximums, vers)?;
169+
let (cksum, mut crate_bomb, mut readme_bomb) = app
170+
.config
171+
.uploader
172+
.upload_crate(req, &krate, readme, maximums, vers)?;
173173
version.record_readme_rendering(&conn)?;
174174

175175
let mut hex_cksum = String::new();

src/controllers/user/session.rs

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -100,8 +100,7 @@ pub fn github_access_token(req: &mut dyn Request) -> CargoResult<Response> {
100100
.exchange(code.clone())
101101
.map_err(|s| human(&s))?;
102102

103-
let (handle, resp) = github::github(req.app(), "/user", &token)?;
104-
let ghuser: GithubUser = github::parse_github_response(handle, &resp)?;
103+
let ghuser = github::github::<GithubUser>(req.app(), "/user", &token)?;
105104

106105
let user = NewUser::new(
107106
ghuser.id,

src/github.rs

Lines changed: 37 additions & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -1,88 +1,58 @@
11
//! This module implements functionality for interacting with GitHub.
22
3-
use curl;
4-
use curl::easy::{Easy, List};
5-
63
use oauth2::*;
4+
use reqwest::{self, header};
75

8-
use serde::Deserialize;
9-
use serde_json;
6+
use serde::de::DeserializeOwned;
107

118
use std::str;
129

1310
use app::App;
14-
use util::{human, internal, CargoResult, ChainError};
11+
use util::{errors::NotFound, human, internal, CargoError, CargoResult};
1512

1613
/// Does all the nonsense for sending a GET to Github. Doesn't handle parsing
1714
/// because custom error-code handling may be desirable. Use
1815
/// `parse_github_response` to handle the "common" processing of responses.
19-
pub fn github(app: &App, url: &str, auth: &Token) -> Result<(Easy, Vec<u8>), curl::Error> {
16+
pub fn github<T>(app: &App, url: &str, auth: &Token) -> CargoResult<T>
17+
where
18+
T: DeserializeOwned,
19+
{
2020
let url = format!("{}://api.github.com{}", app.config.api_protocol, url);
2121
info!("GITHUB HTTP: {}", url);
2222

23-
let mut headers = List::new();
24-
headers
25-
.append("Accept: application/vnd.github.v3+json")
26-
.unwrap();
27-
headers.append("User-Agent: hello!").unwrap();
28-
headers
29-
.append(&format!("Authorization: token {}", auth.access_token))
30-
.unwrap();
31-
32-
let mut handle = app.handle();
33-
handle.url(&url).unwrap();
34-
handle.get(true).unwrap();
35-
handle.http_headers(headers).unwrap();
36-
37-
let mut data = Vec::new();
38-
{
39-
let mut transfer = handle.transfer();
40-
transfer
41-
.write_function(|buf| {
42-
data.extend_from_slice(buf);
43-
Ok(buf.len())
44-
}).unwrap();
45-
transfer.perform()?;
46-
}
47-
Ok((handle, data))
23+
let client = app.http_client()?;
24+
client
25+
.get(&url)
26+
.header(header::ACCEPT, "application/vnd.github.v3+json")
27+
.header(
28+
header::AUTHORIZATION,
29+
format!("token {}", auth.access_token),
30+
).send()?
31+
.error_for_status()
32+
.map_err(|e| handle_error_response(&e))?
33+
.json()
34+
.map_err(Into::into)
4835
}
4936

50-
/// Checks for normal responses
51-
pub fn parse_github_response<'de, 'a: 'de, T: Deserialize<'de>>(
52-
mut resp: Easy,
53-
data: &'a [u8],
54-
) -> CargoResult<T> {
55-
match resp.response_code().unwrap() {
56-
// Ok!
57-
200 => {}
58-
// Unauthorized or Forbidden
59-
401 | 403 => {
60-
return Err(human(
61-
"It looks like you don't have permission \
62-
to query a necessary property from Github \
63-
to complete this request. \
64-
You may need to re-authenticate on \
65-
crates.io to grant permission to read \
66-
github org memberships. Just go to \
67-
https://crates.io/login",
68-
));
69-
}
70-
// Something else
71-
n => {
72-
let resp = String::from_utf8_lossy(data);
73-
return Err(internal(&format_args!(
74-
"didn't get a 200 result from \
75-
github, got {} with: {}",
76-
n, resp
77-
)));
78-
}
79-
}
80-
81-
let json = str::from_utf8(data)
82-
.ok()
83-
.chain_error(|| internal("github didn't send a utf8-response"))?;
37+
fn handle_error_response(error: &reqwest::Error) -> Box<dyn CargoError> {
38+
use reqwest::StatusCode as Status;
8439

85-
serde_json::from_str(json).chain_error(|| internal("github didn't send a valid json response"))
40+
match error.status() {
41+
Some(Status::UNAUTHORIZED) | Some(Status::FORBIDDEN) => human(
42+
"It looks like you don't have permission \
43+
to query a necessary property from Github \
44+
to complete this request. \
45+
You may need to re-authenticate on \
46+
crates.io to grant permission to read \
47+
github org memberships. Just go to \
48+
https://crates.io/login",
49+
),
50+
Some(Status::NOT_FOUND) => Box::new(NotFound),
51+
_ => internal(&format_args!(
52+
"didn't get a 200 result from github: {}",
53+
error
54+
)),
55+
}
8656
}
8757

8858
/// Gets a token with the given string as the access token, but all

src/lib.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,6 @@
1212
extern crate ammonia;
1313
extern crate chrono;
1414
extern crate comrak;
15-
extern crate curl;
1615
#[macro_use]
1716
extern crate derive_deref;
1817
#[macro_use]
@@ -32,6 +31,7 @@ extern crate log;
3231
extern crate oauth2;
3332
extern crate openssl;
3433
extern crate rand;
34+
extern crate reqwest;
3535
extern crate s3;
3636
extern crate scheduled_thread_pool;
3737
extern crate semver;

src/models/team.rs

Lines changed: 8 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@ use diesel::prelude::*;
22

33
use app::App;
44
use github;
5-
use util::{human, CargoResult};
5+
use util::{errors::NotFound, human, CargoResult};
66

77
use models::{Crate, CrateOwner, Owner, OwnerKind, User};
88
use schema::{crate_owners, teams};
@@ -142,8 +142,7 @@ impl Team {
142142
// links. A hundred teams should be enough for any org, right?
143143
let url = format!("/orgs/{}/teams?per_page=100", org_name);
144144
let token = github::token(req_user.gh_access_token.clone());
145-
let (handle, data) = github::github(app, &url, &token)?;
146-
let teams: Vec<GithubTeam> = github::parse_github_response(handle, &data)?;
145+
let teams = github::github::<Vec<GithubTeam>>(app, &url, &token)?;
147146

148147
let team = teams
149148
.into_iter()
@@ -165,8 +164,7 @@ impl Team {
165164
}
166165

167166
let url = format!("/orgs/{}", org_name);
168-
let (handle, resp) = github::github(app, &url, &token)?;
169-
let org: Org = github::parse_github_response(handle, &resp)?;
167+
let org = github::github::<Org>(app, &url, &token)?;
170168

171169
NewTeam::new(&login.to_lowercase(), team.id, team.name, org.avatar_url)
172170
.create_or_update(conn)
@@ -225,14 +223,11 @@ fn team_with_gh_id_contains_user(app: &App, github_id: i32, user: &User) -> Carg
225223

226224
let url = format!("/teams/{}/memberships/{}", &github_id, &user.gh_login);
227225
let token = github::token(user.gh_access_token.clone());
228-
let (mut handle, resp) = github::github(app, &url, &token)?;
229-
230-
// Officially how `false` is returned
231-
if handle.response_code().unwrap() == 404 {
232-
return Ok(false);
233-
}
234-
235-
let membership: Membership = github::parse_github_response(handle, &resp)?;
226+
let membership = match github::github::<Membership>(app, &url, &token) {
227+
// Officially how `false` is returned
228+
Err(ref e) if e.is::<NotFound>() => return Ok(false),
229+
x => x?,
230+
};
236231

237232
// There is also `state: pending` for which we could possibly give
238233
// some feedback, but it's not obvious how that should work.

src/router.rs

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -213,7 +213,6 @@ mod tests {
213213

214214
// All other error types are propogated up the middleware, eventually becoming status 500
215215
assert!(C(|_| Err(internal(""))).call(&mut req).is_err());
216-
assert!(C(|_| err(::curl::Error::new(0))).call(&mut req).is_err());
217216
assert!(
218217
C(|_| err(::serde_json::Error::syntax(
219218
::serde_json::error::ErrorCode::ExpectedColon,

0 commit comments

Comments
 (0)