Skip to content

#700 & #701: HTTPClient.shared a globally shared singleton #705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 3, 2024

Conversation

weissi
Copy link
Contributor

@weissi weissi commented Aug 12, 2023

Offer HTTPClient.shared, a global singleton HTTPClient that doesn't need shutdown (because it lives as long as your application). This globally shared HTTPClient does not accept any special configuration and uses HTTPClient.Configuration.singletonConfiguration (also introduced in this PR) which tries to mimic the platform's default browser as closely as possible.

Copy link
Collaborator

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Some small doc nits and I think we should highlight the new .shared singleton a bit in docs.

@weissi weissi force-pushed the jw-http-singleton branch 2 times, most recently from 2848911 to ecf4bca Compare August 12, 2023 18:53
@weissi weissi requested a review from FranzBusch August 12, 2023 18:53
/// - Linux (non-Android): Google Chrome
///
/// - note: Don't rely on the values in here never changing, they will be adapted to better match the platform's default/prevalent browser.
public static var browserLike: HTTPClient.Configuration {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced about this naming. Browser's have very strict rules about when to reuse a connection concerning cookies and auth headers. AHC does not implement any of those rules. It is questionable if AHC should implement those rules at all, since the use-case is very different.

Copy link
Contributor Author

@weissi weissi Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this Configuration aims to match this as closely as possible. It doesn't say that it's a perfect match and if we can't/won't implement something then that's okay.

The reason I think "browser like" is good is that it tells the users that we won't make arbitrary changes that diverge further from what browsers do. And also that it's a property that evolves (if anything we converge with what browsers do). For example, we can't change the meaning of Configuration() because the user has certain expectations of it and may only change the properties they wish to change. With .browserLike it's pretty clear that over time we will converge with what browsers do. That should make it clear that if you need a specific setting then don't use browserLike and configure your own HTTPClient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this Configuration aims to match this as closely as possible. It doesn't say that it's a perfect match and if we can't/won't implement something then that's okay.

I can see your argument here, but I don't think that it is the right one, given that we throw our arms in the air when it comes to connection trust and reuse. All the security guarantees that users might associate with browser are completely non existent in AHC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do implement them by not sending cookies at all, that's even more secure.

Copy link
Contributor Author

@weissi weissi Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in: browsers try (by implementing elaborate policies) to not send cookies to the wrong place. AHC also doesn't send cookies to the wrong place because I don't think it sends cookies to anything unless the user manually attaches right cookie headers manually, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabianfett are you okay with keeping the browserLike name here?

@ktoso ktoso self-requested a review August 14, 2023 14:42
@weissi weissi requested a review from adam-fowler August 14, 2023 14:59
Copy link
Member

@adam-fowler adam-fowler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah generally looks good. I think docs could still be clearer. There is still a httpClient.execute in the first code example. Is the idea that people should use the shared instance by default? Maybe you need to have a section indicating the difference between using the shared instance and instantiating your own instance.

An an aside this is great timing for me. Soto is in alpha for a major version so I can ditch all the code that was managing my http client and just assume it is being managed outside of Soto.

@ktoso
Copy link
Contributor

ktoso commented Aug 15, 2023

I'm somewhat worried about this tbh and am wordering if rephrasing these APIs a bit could help?

What if we documented "do a let http = HTTPClient.makeShared() when you need one" and at least then if necessary moving to a configured one is less painful than the pain from moving of a HTTPClient.shared sprinkled allover the codebase...?

The docs could literarily say that the simplest way is:

let http = HTTPClient.makeShared()

and you can just exit the process if you want to -- you may lose in flight requests or whatever. And then we explain, well, if you do care -- please hook into service lifecycle, (and here's how ...). If folks don't care they can literarily plop a global let http into their codebase and call it a day.

The "make" could function the same as this ".shared" but I wonder how we should be recommending this to be used, such that moving to "oh no, actually I do need to configure it" isn't a world of pain... Otherwise we'll end up allowing "configure the global shared one" which is not something we'd want I feel...

Note also that a global will become main actor isolated in Swift 6 most likely, so there will be a hop like that. The SDK will have the same issue, so yeah we'd be on equal footing, but I am wondering if we can't do better and recommend things like:

actor INeedHTTP {
  let http: HTTPClient = .shared // .makeShared() ?

}

to avoid un-necessary hops to the global actor...?


Either way, at least tracing in applications "will work" but Fabian's concern that parallel testing with asserting on traces will be impacted and not usable if shared instances are used.


So... I guess this is fine, but I'm not very happy about it but I see the value provided of the simplicity here -- I'd like to consider if we can at least recommend "to the global variable yourself" and change how we phrase it in order to make "move away from static global func everywhere" less painful when people will need to do so.

@MartinLau7
Copy link

Hi, Has there been any progress on this PR?

@Joannis
Copy link
Member

Joannis commented Jan 24, 2024

I'm eager for this change. It resolves the main issues I see people (including myself at times) run into with libraries that depend on AHC, and I feel that 99% of people will want the default configuration for AHC anyways. This change makes it much easier to set up libraries based on AHC for end-users, in addition to making it more accessible to send HTTP requests.

One issue I tend to run into during (Lambda) application setup is that I need to fetch secrets before booting the remainder of services. If anything errors during this phase, and AHC is not correctly shut down under failing circumstances, the deinit will trigger a crash before the error is caught and displayed to the user. This leads to very confusing scenarios to debug, and is very hard to understand for the less-than-expert Swift developer.

@MahdiBM
Copy link
Contributor

MahdiBM commented Jan 26, 2024

I'm also looking forward to be able to use this, but I do still want the configurability.

Perhaps we can use some mechanism like swift-log's LoggingSystem bootstrap to let the shared HTTPClient be configurable?

Some libraries like the "aws-lambda" can provide APIs to users to be able to do a "pre-setup" so users can setup stuff like the LoggingSystem and HTTPClient as soon as they should.
Otherwise if the user is in charge of the main function of the entrypoint they should be able to setup the HTTPClient soon enough.

@gregcotten
Copy link
Contributor

After reading a recommendation to switch over from URLSession to this package, I too would love to see this land!

@akbashev
Copy link

akbashev commented Mar 27, 2024

Just today was working with AHC and wondering—what is the best way to work with the client? It's not very clear from description.
My initial assumption was shutdown is when an application stops working. But apparently from description one should use shutdown to prevent memory leaks? It's also somewhat not very clear what Shuts down the client and event loop gracefully. does really mean...

The thing is, it became very manual to check all creations/shutdowns of clients, even for our small service, wondering how it's for bigger apps... So this PR is very welcoming.

Though I have some questions:

  1. Won't there be any memory leak issue then? 🤔 Maybe missing something, not very familiar with the code tbh.
  2. What is advantage of using let httpClient = HTTPClient.shared rather than defining let httpClient = HTTPClient(eventLoopGroupProvider: .singleton) in your app?
  3. Reading through comments, think maybe having a function makeShared() as @ktoso mentioned is actually a better approach? Otherwise it's a global instance and can be misused or lead to unexpected errors (e.g. when you want shared per some system but end with shared between them).

Anyway, looking forward for PR to be merged! 🙂

@Lukasa
Copy link
Collaborator

Lukasa commented Mar 28, 2024

If you are writing an app, then using a singleton like this is almost certainly the best approach. The main reason to use non-singleton clients is to isolate state. This is useful if you have separate privacy domains (e.g. different credentials, TLS identities, cookies, etc.) or if you want to be sure you're starting from a clean slate when you're writing tests.

Otherwise, you should have long-lived clients and except in rare cases you should never shut them down.

@weissi weissi force-pushed the jw-http-singleton branch from 4d09635 to ebbd85f Compare April 3, 2024 12:11
@weissi weissi changed the title #700 & #701: HTTPClient.shared a globally shared singleton & .browserLike configuration #700 & #701: HTTPClient.shared a globally shared singleton Apr 3, 2024
@weissi weissi force-pushed the jw-http-singleton branch from ebbd85f to b52a50a Compare April 3, 2024 12:12
Copy link
Member

@fabianfett fabianfett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it!

@weissi weissi enabled auto-merge (squash) April 3, 2024 12:16
@weissi weissi force-pushed the jw-http-singleton branch from b52a50a to f2aef5f Compare April 3, 2024 12:26
@weissi
Copy link
Contributor Author

weissi commented Apr 3, 2024

build timeout:

[519/522] Testing AsyncHTTPClientTests.StressGetHttpsTests/testStressGetHttps
[520/522] Testing AsyncHTTPClientTests.TestIdleTimeoutNoReuse/testIdleTimeoutNoReuse
[521/522] Testing AsyncHTTPClientTests.NoBytesSentOverBodyLimitTests/testNoBytesSentOverBodyLimit
Build timed out (after 10 minutes). Marking the build as failed.
$ ssh-agent -k

@weissi
Copy link
Contributor Author

weissi commented Apr 3, 2024

@swift-server-bot test failed

@weissi
Copy link
Contributor Author

weissi commented Apr 3, 2024

@swift-server-bot test this please

@weissi weissi merged commit e0977cf into swift-server:main Apr 3, 2024
@weissi weissi deleted the jw-http-singleton branch April 3, 2024 13:01
@adam-fowler
Copy link
Member

Would love to see a release with this in. I'm just about to release the beta for Soto and this allows me to get rid of the requirement to shutdown the Soto client, as I don't have to manage an HTTPClient

@weissi
Copy link
Contributor Author

weissi commented Apr 4, 2024

I'm just about to release the beta for Soto and this allows me to get rid of the requirement to shutdown the Soto client, as I don't have to manage an HTTPClient

I would caution against Soto relying solely on HTTPClient.shared. It's almost inevitable that you'll need to set up your own HTTP client with special configuration (maybe number of connections to host, connection pool size, ...).

If you design Soto to not have a shutdown, then Soto essentially becomes a singleton-only library which I don't think is a good and scalable design.

What I would recommend is to design Soto to have a correct lifecycle and make shutdown calls always required. But additionally you may offer your own SotoS3Client.shared or similar for the users that are happy with losing control over resources and losing configurability.

Personally I think it's really important that on the server side all important libraries have correct lifecycles and here are some of the reasons why:

  • Resource limit control (maybe you want to allow 100k connections at the same time, maybe only 1)
  • Resource control (by running shutdown you can be sure that all resources have been terminated, with singletons that's impossible)
  • Resource isolation (Often servers have more than one component in one address space. But frequently these components have a differing level of importance. For example you may have your service & and admin HTTP endpoint. There may come a time when you discover that your admin endpoint is eating up resources that your service needs. With correctly managed lifecycles and no singletons you can now configure your admin endpoint to get its own EventLoopGroup (just 1 thread), its own instances of HTTPClient (configured with smaller limits), ...)
  • other configuration like Network.framework vs. BSD sockets that may differ in different parts of a program

@adam-fowler
Copy link
Member

I would caution against Soto relying solely on HTTPClient.shared. It's almost inevitable that you'll need to set up your own HTTP client with special configuration (maybe number of connections to host, connection pool size, ...).

Soto would not rely on the shared instance. You will still be able to provide your own HTTPClient instance. But I was mistaken I thought the only resources Soto relied on that needed a shutdown were controlled by HTTPClient that is not the case.

But I would like to be in the position where Soto is not controlling the lifecycle of of a HTTPClient. Currently the options for Soto are provide your own HTTPClient or Soto will create one for you. I would prefer it to be provide your own HTTPClient or Soto will use the shared instance.

@weissi
Copy link
Contributor Author

weissi commented Apr 4, 2024

I would caution against Soto relying solely on HTTPClient.shared. It's almost inevitable that you'll need to set up your own HTTP client with special configuration (maybe number of connections to host, connection pool size, ...).

Soto would not rely on the shared instance. You will still be able to provide your own HTTPClient instance. But I was mistaken I thought the only resources Soto relied on that needed a shutdown were controlled by HTTPClient that is not the case.

But I would like to be in the position where Soto is not controlling the lifecycle of of a HTTPClient. Currently the options for Soto are provide your own HTTPClient or Soto will create one for you. I would prefer it to be provide your own HTTPClient or Soto will use the shared instance.

Yes, all are options. I'd just make sure that the API contracts are spelled with an always/never. Like in HTTPClient (I hope we removed exceptions):

  • If you call HTTPClient.init() you always have to call client.shutdown()
  • If you use HTTPClient.shared you never have/should to call shutdown()

In the past we tried this misguided EventLoopGroupProvider thing which was just not a good idea. The problem is that you end up with objects that you sometimes need to shut down, depending on what value you passed into the init, that just doesn't seem right.

I'd suggest a similar design in Soto: Either lifecycle or singleton, but not a combination thereof.


Aside: I would imagine that over time Soto will have to manage other resources that aren't just a HTTPClient too. If you already have a shutdown mechanism: easy: add it there. If not: screwed :|. And libraries of the size/shape of Soto will always need other resources in my experience. The only way out is to lean into singletons all the way but then we lose resource control which is fine for certain smaller applications but certainly not always true (+ it's a pain to test).

@FranzBusch FranzBusch added the 🆕 semver/minor Adds new public API. label Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🆕 semver/minor Adds new public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add HTTPClient.Configuration.browserLike HTTPClient.shared (like URLSession.shared)?