Skip to content

Use postgres-native-tls to connect to the database with TLS #9192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 31, 2024

Conversation

Turbo87
Copy link
Member

@Turbo87 Turbo87 commented Jul 30, 2024

diesel-async by default does not support TLS directly. Instead it offers the custom_setup option which allows its users to choose their TLS stack between native-tls, openssl and rustls.

This change implements a custom_setup hook based on postgres-native-tls, whose dependencies are mostly already in our lockfile anyway. The code was mostly adapted from the official example code in diesel-async.

Before diesel-async we relied on the regular diesel connection implementation, which uses pg-sys under the hood, which uses libpq, which uses OpenSSL for TLS. In other words: this slightly changes how TLS is handled for the database connections. Specifically, on OSX the server certificate is rejected as "not trusted" due to a long validity period. On production this should not be relevant, but during local development against the staging database this can cause issues, thus I've included a code comment with a temporary workaround for these cases.

@Turbo87 Turbo87 added C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear A-backend ⚙️ labels Jul 30, 2024
@Turbo87
Copy link
Member Author

Turbo87 commented Jul 30, 2024

I've deployed and tested this on staging and everything appears to be working with this PR

@Turbo87 Turbo87 requested a review from a team July 30, 2024 08:40
Turbo87 added 3 commits July 30, 2024 12:38
`diesel-async` by default does not support TLS directly. Instead it offers the `custom_setup` option which allows its users to choose their TLS stack between `native-tls`, `openssl` and `rustls`.

This change implements a `custom_setup` hook based on `postgres-native-tls`, whose dependencies are mostly already in our lockfile anyway. The code was mostly adapted from the official example code in `diesel-async`.

Before `diesel-async` we relied on the regular `diesel` connection implementation, which uses `pg-sys` under the hood, which uses `libpq`, which uses OpenSSL for TLS. In other words: this slightly changes how TLS is handled for the database connections. Specifically, on OSX the server certificate is rejected as "not trusted" due to a long validity period. On production this should not be relevant, but during local development against the staging database this can cause issues, thus I've included a code comment with a temporary workaround for these cases.
… because otherwise the native TLS stack on our server complains about the self-signed certificate in the cert chain.
@Turbo87
Copy link
Member Author

Turbo87 commented Jul 30, 2024

obviously CI is now stalling... this fix is driving me mad 🫣

Copy link

codecov bot commented Jul 30, 2024

Codecov Report

Attention: Patch coverage is 95.34884% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.17%. Comparing base (e0bb004) to head (692f85e).

Files Patch % Lines
src/bin/background-worker.rs 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9192      +/-   ##
==========================================
- Coverage   89.19%   89.17%   -0.02%     
==========================================
  Files         282      282              
  Lines       28583    28622      +39     
==========================================
+ Hits        25494    25525      +31     
- Misses       3089     3097       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

... otherwise on CI our tests start failing due to the TLS stack complaining about a self-signed certificate in the chain.
Copy link
Contributor

@LawnGnome LawnGnome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the reasoning behind native-tls instead of, say, rustls is to minimise how much change occurs relative to the previous behaviour?

I don't love the workarounds here — essentially disabling TLS validity checks to handle an iffy macOS behaviour and CI that really doesn't have a need for TLS at all isn't ideal — but I'm happy enough to merge this for now and iterate later.

// certificate being self-signed. On CI we are connecting to a locally
// running database, so we also don't need to enforce the validity of
// the certificate either.
.danger_accept_invalid_certs(!enforce_tls)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I promise I stared at this line for a while to make sure it was correct.

@Turbo87
Copy link
Member Author

Turbo87 commented Jul 31, 2024

I assume the reasoning behind native-tls instead of, say, rustls is to minimise how much change occurs relative to the previous behaviour?

I tried native-tls, rustls and openssl and they all didn't work out-of-the-box without any issues. native-tls was already in our lockfile, so it felt like a good candidate to use here too. rustls would cause us to pull in quite a few more transitive dependencies and I wasn't sure if that's really worth it.

I don't love the workarounds here — essentially disabling TLS validity checks to handle an iffy macOS behaviour and CI that really doesn't have a need for TLS at all isn't ideal — but I'm happy enough to merge this for now and iterate later.

yeah, I don't love them either...

though to be fair, as @weiznich pointed out on Zulip, the default sslmode in libpq is prefer and we set it to required on production environments. if you look at https://www.postgresql.org/docs/current/libpq-ssl.html you will realize that even required does not verify the validity of the server cert. in other words: the implementation here is now apparently safer than before because we are now verifying the TLS certificate, even if sslmode is only set to required, due to no longer using libpq for the connection.

@Turbo87 Turbo87 merged commit 17f44a4 into rust-lang:main Jul 31, 2024
10 checks passed
@Turbo87 Turbo87 deleted the fix-tls branch July 31, 2024 06:41
@bors bors mentioned this pull request Jul 31, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-backend ⚙️ C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants