Skip to content

Incorporate feedback from 2nd review of SE-0292 #1319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 27, 2021

Conversation

mattt
Copy link
Contributor

@mattt mattt commented Apr 5, 2021

This PR makes the following changes based on feedback from the 2nd review of SE-0292:

  • Package names are compared using NFKC + CaseFolding
  • The default, unscoped registry is denoted by [default]
  • Mention .netrc file as mitigation for hardcoded credentials

@mattt mattt requested a review from tomerd April 5, 2021 11:39
@benrimmington

This comment has been minimized.

@tomerd tomerd self-assigned this Apr 13, 2021
"registries": {
"default": {
"registries": {
"[default]": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does square brackets have precedence in other such system? in Swift we usually use underbar for making "special" things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose [default], because it was visually distinctive. However, the only requirement is for it to not be a valid scope name, so _default could work just as well. At that point, it's a subjective choice.

As far as precedence, Swift uses square brackets in closures to annotate capture semantics ([weak self]), whereas underscore-prefixed variables are an unenforced convention. Another option to consider might be _. I don't feel too strongly either which way.

Copy link
Contributor

@tomerd tomerd May 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer we go with underbar _ or dot ., tho the latter may be too subtle. @neonichu @abertelrud @yim-lee @weissi @Lukasa opinions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on _

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think _ is usually associated with private or "not to be used" things in the Apple developer community, so seems like we should avoid that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @neonichu , _ means "don't use, the ACL on this decl is more permissive than it should ideally be" in most of Swift.

Happy with [default], <default>, (default), .default, ... Don't think I have a strong preference. We could also go with "" to mean default but that's probably a bit opaque.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with any of @weissi's options there, I don't object to the _ but I understand the desire not to use it.

@tomerd
Copy link
Contributor

tomerd commented May 3, 2021

thanks for putting together @mattt, some comments inline and a couple of additional points that were brought up in the review and I think need to be addressed:

@mattt
Copy link
Contributor Author

mattt commented May 3, 2021

thanks for putting together @mattt, some comments inline and a couple of additional points that were brought up in the review and I think need to be addressed:

  • The OpenAPI spec that goes with this proposal is extremely useful for the reviewers and includes some interesting details that are not found elsewhere (afaict), for example getting information about additional versions of Package.swift is part of /Package.swift API. Could it be added to the proposal directly?

It'd be great to link to that from the proposal. The only blocker there was having a durable link target. Once the server specification is merged in swiftlang/swift-package-manager#3364, we can certainly add a link to it from this document.

From that conversation between me and @mmarston (emphasis added):

For now, Swift Package Manager will support only Zip source archives for the package registry. If or when support is added for binary XCFramework distributions, the specification could be amended to add an endpoint for GET /{scope}/{name}/{version}.xcframework and amend GET /{scope}/{name}/{version} to do the kind of content negotiation I described. That would facilitate server-driven content negotiation.

For the kind of agent-driven content negotiation you describe, the response for GET /{scope}/{name}/{version} could include Link header fields to list each of the formats supported for that release.

Until we support other formats in addition to Zip source archives, I don't think there's anything we need to do for this proposal. The current design is flexible to allow for such changes to be made over time, as needed.

As far as the "enterprise / behind a firewall mirroring use case" is concerned, this is discussed in the proposal itself under "Registry configuration subcommands", and in the specification under § 4.4.2. Download locations.

@tomerd

This comment has been minimized.

@tomerd
Copy link
Contributor

tomerd commented May 3, 2021

As far as the "enterprise / behind a firewall mirroring use case" is concerned, this is discussed in the proposal itself under "Registry configuration subcommands", and in the specification under § 4.4.2. Download locations.

iiu @mmarston feedback correctly, he was looking for a way to query a package content via the API to build mirror capabilities for enterprise / behind a firewall type registries. on the surface such API requirement makes sense for a robust ecosystem that includes such registries. @mmarston did I understand that correctly? if so, can you maybe suggest how this kind of API looks / fits into the current proposed set of APIs?

@mattt

This comment has been minimized.

@mmarston
Copy link

mmarston commented May 6, 2021

Until we support other formats in addition to Zip source archives, I don't think there's anything we need to do for this proposal. The current design is flexible to allow for such changes to be made over time, as needed.

The point of defining a way for the API to advertise additional content is so that a mirroring implementation can be written now that would automatically handle such additions in the future.

@mmarston did I understand that correctly? if so, can you maybe suggest how this kind of API looks / fits into the current proposed set of APIs?

Yes, you understand correctly. One suggestion for how this would fit with the current proposed APIs would be to add it to the GET /{scope}/{name} releases response. Ideally the response would include all the available "files" that can be downloaded and a hash (I'd suggest returning the hash in the integrity metadata format defined in the Subresource Integrity W3C recommendation).

This would allow a client to use /{scope}/{name} to not only detect new versions of a package but also detect the available files and detect if a version has been replaced with different content (in a well-behaving public registry, replacing the contents of an existing version is a big no-no, but there are situations where this happens, particularly in private registries).

One suggestion for what this could look like:

{
    "releases": {
        "1.1.1": {
            "url": "https://packages.example.com/mona/LinkedList/1.1.1",
            "contents": {
                "1.1.1.json": "sha256-tbudgBSg+bHWHiHnlteNzN8TUvI80ygS9IULh4rklEw=",
                "1.1.1.zip": "sha256-BN1NhfXL9LfTS/9ESilvie/C0wwz05b7bCV1fkuH2bs=",
                "1.1.1/Package.swift": "sha256-xzmRiiJ2TIeshJ5cWrSmgcptGcJ0jkNOGBffQ2tnH4E=",
                "1.1.1/Package.swift?swift-version=4.2": "sha256-n3Z7CweNJasBt+iHw6vgPfYJQ9K39Kb7IY6ZaF2zswo="
            }
        },
        "1.1.0": {
            "url": "https://packages.example.com/mona/LinkedList/1.1.0",
            "contents": {
                "1.1.0.json": "sha256-4bNGc/CTdLs7bHCl5MpMe9kVGgNkotyfwRxYqFEuk5w=",
                "1.1.0.xcframework": "sha256-Xk+24GBTha7lgwNa4NtzLkhXFcjSaIjSo1caJikftY4=",
                "1.1.0/Package.swift": "sha256-6yTF0IuSJZdvhUw0ooRRR4W6/oXcwSZE8CmcVB2g/o0="
            }
        }
    }
}

Here I've used the field name "contents" but I'd welcome alternative suggestions. I've also shown each entry as a map of file name (a path relative to the package name) to integrity hash. To allow for more extensibility (e.g. if someday we diced to add a size field) the value for each map entry could by an object instead of a string. For example:

 "contents": {
     "1.1.1.json": {"integrity":"sha256-tbudgBSg+bHWHiHnlteNzN8TUvI80ygS9IULh4rklEw="},
     "1.1.1.zip": {"integrity":"sha256-BN1NhfXL9LfTS/9ESilvie/C0wwz05b7bCV1fkuH2bs="),
     ...
 }

@mmarston
Copy link

mmarston commented May 6, 2021

For the kind of agent-driven content negotiation you describe, the response for GET /{scope}/{name}/{version} could include Link header fields to list each of the formats supported for that release.

I'd really prefer to minimize reliance on Link headers and content-negotiation. That is actually one piece of feedback I have about the current RegistryDraft.md. While relying on Link and Accept headers is standard HTTP it makes the implementation more complicated. A quick way to measure the simplicity of the registry API is to ask how much of it could be handled by a simple web server from a file system folder or from an S3 bucket. Obviously that wouldn't be a full-featured registry server (e.g. how would packages be published). But in my mind it would be extremely desirable if that were sufficient enough for someone to quickly set up a meager registry.

I currently see 3 areas in the registry API that fail this measure of simplicity:

  1. The use of Link headers. In most uses of the Link header I saw in the API it didn't seem like there would be any degradation in the swift package manager client's ability to resolve and fetch dependencies. In other words, if someone put their registry contents on an S3 bucket the client would work just fine even though the Link headers would be missing. But the part of the API that would not work without functioning Link headers is the ability for the server to notify the client of swift-version specific Package.swift files (e.g. Package.swift?swift-version=4). In general it seems to me that every use of Link headers in the API proposal could be changed to be returned as a field value in a JSON response.
  2. The use of absolute URLs. (Is the server required to return absolute URLs? I didn't see it stated, but all examples were absolute.) In particular, the package release JSON contains an absolute URL for each version. The use of absolute URLs makes the data less portable (i.e. the content would need to be modified to move it to a different S3 bucket or serve it from a simple web server with a different DNS name). It also affects the ability to put an http proxy (such as nginx or varnish) or CDN in front of the content (again with a different DNS name or base URL for the registry).
  3. The use of query string parameters. Most of the API doesn't depend on query string parameters, except the identity lookup by URL (/identifiers{?url}) and the swift-version specific Package.swift files. I suspect a simple web server or S3 bucket registry could do without identity lookup. But an alternative would be to use a path segment instead of query parameter (i.e. /identity/{url}). In either case the value of url would need to be URL encoded. Similarly, maybe it is OK that such a simple registry would not support swift-version specific Package.swift files. But they would be supported if URL for those were to match the file name in source code (e.g. use a URL of /{scope}/{package}/{version}/Package@swift-{swift-version}.swift where @ could be URL encoded).

@mmarston
Copy link

mmarston commented May 7, 2021

Something else I found odd about the API specification is that the service is supposed to respond with a different content type than the client requested. As a specific example, for listing package releases the client sends an Accept: application/vnd.swift.registry.v1+json header while the server responds with a Content-Type: application/json. Is it common for a server to send a content type that isn't actually a match for any of the mime types in the Accept header?

For comparison, npm has a application/vnd.npm.install-v1+json content type and when the client asks for this content type then the server responds with that content type (not application/json).

curl -H "Accept: application/vnd.npm.install-v1+json" -I https://registry.npmjs.org/lodash
HTTP/2 200
date: Fri, 07 May 2021 00:20:36 GMT
content-type: application/vnd.npm.install-v1+json
...

@mattt
Copy link
Contributor Author

mattt commented May 7, 2021

@mmarston Thanks for sharing your feedback. Responses below:

#1319 (comment)

Until we support other formats in addition to Zip source archives, I don't think there's anything we need to do for this proposal. The current design is flexible to allow for such changes to be made over time, as needed.

The point of defining a way for the API to advertise additional content is so that a mirroring implementation can be written now that would automatically handle such additions in the future.

On the other hand, any decisions we make to anticipate hypothetical use-cases limits what we'll be able to do in the future. I'd much rather an additive change in the future that solves a problem exactly, once we know what problem we're trying to solve.

I think there's a good case for enumerating the contents of a package release, if only for subresource integrity guarantees. As with publishing / unpublishing, this feature is an additive change that I'd love to look into for a future iteration, but isn't needed for this first version.


#1319 (comment)

A quick way to measure the simplicity of the registry API is to ask how much of it could be handled by a simple web server from a file system folder or from an S3 bucket. Obviously that wouldn't be a full-featured registry server (e.g. how would packages be published). But in my mind it would be extremely desirable if that were sufficient enough for someone to quickly set up a meager registry.

This was something I specifically designed for in writing the proposal. You can create a functioning package registry from a static file server, including one served from an S3 bucket.

I currently see 3 areas in the registry API that fail this measure of simplicity:

  1. The use of Link headers.

Link headers aren't strictly necessary. Rather, they provide a way to progressively enhance the capabilities of the server and client. A bare-bones registry served from an S3 bucket could work with Swift Package Manager, or you could layer on EC2 or Lambda to make it more capable.

  1. The use of absolute URLs. (Is the server required to return absolute URLs? I didn't see it stated, but all examples were absolute.)

If the use of absolute URLs in the GET /{package} response are prohibitive, what do you think of the following change to the specification?

+ If a `url` is provided for a release, the client SHOULD use that value to locate the corresponding release.
+ Otherwise, the client SHOULD locate the release by appending the version number as a path component to the response URL. 
  1. The use of query string parameters. Most of the API doesn't depend on query string parameters, except the identity lookup by URL (/identifiers{?url}) and the swift-version specific Package.swift files.

Query parameters, like Link Headers, aren't necessary for a registry to function. Swift Package Manager can resolve all dependencies against a static file server that responds with 404 for those endpoints.

For GET /identifiers{?url}, the server should normalize / canonicalize the URL (for example, to handle the optional .git extension). To handle this correctly, a static file server would need to precompute all of the alternative forms a URL takes.


#1319 (comment)

Something else I found odd about the API specification is that the service is supposed to respond with a different content type than the client requested. As a specific example, for listing package releases the client sends an Accept: application/vnd.swift.registry.v1+json header while the server responds with a Content-Type: application/json. Is it common for a server to send a content type that isn't actually a match for any of the mime types in the Accept header?

For comparison, npm has a application/vnd.npm.install-v1+json content type and when the client asks for this content type then the server responds with that content type (not application/json).

curl -H "Accept: application/vnd.npm.install-v1+json" -I https://registry.npmjs.org/lodash
HTTP/2 200
date: Fri, 07 May 2021 00:20:36 GMT
content-type: application/vnd.npm.install-v1+json
...

For what it's worth, the proposed behavior is consistent with how the GitHub API works:

$ curl -I -H "Accept: application/vnd.github.full+json" https://api.github.com/users/mattt
HTTP/2 200 
server: GitHub.com
content-type: application/json; charset=utf-8

I've seen examples of APIs that return the original vnd.-prefixed content type and the canonical equivalent, and don't think there's a strong consensus either way. But subjectively, having worked with both types, I found the other way to be less convenient, because it requires the client to re-map the response type back to a real content type.

@mmarston
Copy link

mmarston commented May 7, 2021

On the other hand, any decisions we make to anticipate hypothetical use-cases limits what we'll be able to do in the future.

I don't see mirroring support as a hypothetical use-case, not do I see my suggest as limiting what we can do in the future.

If the use of absolute URLs in the GET /{package} response are prohibitive, what do you think of the following change to the specification?

I would not go so far as to say that absolute URLs are prohibitive, but I do like your suggested change.

For GET /identifiers{?url}, the server should normalize / canonicalize the URL (for example, to handle the optional .git extension).

Is the server required to normalize / canonicalize the URL or is that something the client would do?

For what it's worth, the proposed behavior is consistent with how the GitHub API works

I'm OK with that. The Accept/Content-Type difference was something I noticed when reviewing the API so I wanted to call it out, but I'm OK with your response.

@mattt
Copy link
Contributor Author

mattt commented May 18, 2021

On the other hand, any decisions we make to anticipate hypothetical use-cases limits what we'll be able to do in the future.

I don't see mirroring support as a hypothetical use-case, not do I see my suggest as limiting what we can do in the future.

To be clear, I think mirroring — that is, the ability to serve package resources from alternative locations — is an important feature, which is why I made sure to mention it in the proposal.

What I'm saying is that the addition of subresource integrity checksums isn't necessary for mirroring, and shouldn't be a blocker for this initial proposal. SRI has security and ergonomic considerations that I think would really benefit from a separate discussion. For example:

  • Which files do we enumerate and how are they represented?
    • Do we include the leading version number (e.g. 1.1.1) in their key if that's redundant?
    • Does it make sense to include {version}.json if it's not part of the package itself?
    • Do we key version-specific manifests according to their filename ([email protected]) or do we use its URI equivalent (Package.swift?swift-version=5.4)
  • What cryptographic algorithms do we support?
    • Do we adhere to the W3C recommendations for SHA-384 as a baseline? Or is SHA-256 better for consistency across the specification?
  • Do we implement CORS in any way?
  • Should the client send Cache-Control: no-transform when requesting proxied / mirrored resources?
  • Does Swift Package Manager include options to disable SRI checks?

The current proposal doesn't require this functionality, and we can add it later without inconvenience. Is there any reason we couldn't discuss this as a separate follow-up?

For GET /identifiers{?url}, the server should normalize / canonicalize the URL (for example, to handle the optional .git extension).

Is the server required to normalize / canonicalize the URL or is that something the client would do?

The client sends the URL as it's declared by the user in the package manifest. The server is responsible for determining whether that URL corresponds to any existing package in the registry.

tomerd pushed a commit to swiftlang/swift-package-manager that referenced this pull request May 24, 2021
Swift Package Manager commands like
`swift package resolve` and `swift package update`
fail with an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa @mmarston ptal ^^

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This UX, the ability to pass --login and --password, is what I was looking for, so this looks good to me. I have a few questions about the behavior.

  1. Does this support using different credentials for different registries served from the same host name (at different paths)? I'm inferring that the answer is yes since the login field is added to registries.json, which means that the .netrc could have more than one entry for the same machine with different login and the Swift Package Manager would select the one with the matching login.
  2. Is the user allowed to omit --password when passing --login? (For example, to specify which login to use if the user already has credentials in their user-level .netrc).
  3. What credentials will the Swift Package Manager use if the registry redirects to a different host? For example, the a private registry may require credentials and then it may redirect the client to a pre-signed S3 URL. From a security perspective the client shouldn't include the credentials when following a redirect to a different host. Will the client attempt to look up the host of the redirect location in a .netrc file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Does this support using different credentials for different registries served from the same host name (at different paths)?

Yes. A different login / password combination can be specified to add a separate entry in the .netrc file.

  1. Is the user allowed to omit --password when passing --login

Yes.

  1. What credentials will the Swift Package Manager use if the registry redirects to a different host?

I'll defer to @tomerd about this specific implementation detail, but my expectation as a user and registry operator would be for any provided credentials (like in a pre-signed S3 URL) to be respected. If that request fails with a 401, I think there's a good case for Swift Package Manager to retry with any available .netrc credentials.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this all makes sense to me

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach seems reasonable, provided that SwiftPM's default .gitignore file ignores the project .netrc file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa I just submitted a PR with that change here: swiftlang/swift-package-manager#3511

@tomerd
Copy link
Contributor

tomerd commented May 26, 2021

@mattt is this good to merge?

@mattt
Copy link
Contributor Author

mattt commented May 27, 2021

@tomerd Yep! Feel free to merge this in at your convenience.

@tomerd tomerd merged commit 02292fe into swiftlang:main May 27, 2021
@mattt mattt deleted the se-0292-r2-updates branch May 27, 2021 17:24
@mattt mattt mentioned this pull request Jul 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants