-
Notifications
You must be signed in to change notification settings - Fork 412
offers: avoid panic when truncating payer_note in UTF-8 code point #3747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 I see @wpaulino was un-assigned. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3747 +/- ##
==========================================
- Coverage 89.13% 89.08% -0.06%
==========================================
Files 157 157
Lines 123851 123884 +33
Branches 123851 123884 +33
==========================================
- Hits 110398 110365 -33
- Misses 10779 10832 +53
- Partials 2674 2687 +13 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, thanks.
Added a fixup! commit that:
I fuzzed this impl independently, but I'm not super familiar with LDK's fuzz suite so maybe someone wants to translate this proptest. It checks that (1) no fn string_truncate_safe(mut s: String, new_len: usize) -> String {
#[inline]
const fn u8_is_utf8_char_boundary(b: u8) -> bool {
// This is bit magic equivalent to: b < 128 || b >= 192
(b as i8) >= -0x40
}
#[inline]
fn str_floor_char_boundary(s: &str, index: usize) -> usize {
if index >= s.len() {
s.len()
} else {
let lower_bound = index.saturating_sub(3);
let new_index = s.as_bytes()[lower_bound..=index]
.iter()
.rposition(|b| u8_is_utf8_char_boundary(*b))
.unwrap_or(0);
lower_bound + new_index
}
}
let truncated_len = str_floor_char_boundary(s.as_str(), new_len);
s.truncate(truncated_len);
s
}
fn string_truncate_safe_2(mut s: String, new_len: usize) -> String {
#[inline]
fn str_floor_char_boundary(s: &str, index: usize) -> usize {
if index >= s.len() {
s.len()
} else {
let lower_bound = index.saturating_sub(3);
(lower_bound..=index)
.rev()
.find(|idx| s.is_char_boundary(*idx))
.unwrap_or(lower_bound)
}
}
let truncated_len = str_floor_char_boundary(s.as_str(), new_len);
s.truncate(truncated_len);
s
}
#[test]
fn fuzz_string_truncate_safe() {
let config = proptest::test_runner::Config::with_cases(1000);
proptest!(config, |(s: String)| {
for new_len in 0..=s.len() {
let s2 = string_truncate_safe(s.clone(), new_len);
prop_assert!(s2.len() <= s.len());
prop_assert!(s2.len() <= new_len);
let s3 = string_truncate_safe_2(s.clone(), new_len);
prop_assert_eq!(s2, s3);
}
})
}
#[test]
fn fuzz_ascii_string_truncate_safe() {
let config = proptest::test_runner::Config::with_cases(1000);
proptest!(config, |(s in arbitrary::any_ascii_string())| {
for new_len in 0..=s.len() {
let s2 = string_truncate_safe(s.clone(), new_len);
prop_assert!(s2.len() <= s.len());
// ASCII strings truncate exactly
prop_assert!(s2.len() == new_len);
let s3 = string_truncate_safe_2(s.clone(), new_len);
prop_assert_eq!(s2, s3);
}
})
} |
This CI test failure in
|
Hmm, hadn't seen that one fail before, its possible github just did their usual "sorry, our CI lost internet access" nonsense. I kicked it. |
fd123d3
to
627ee3e
Compare
aight fixed, rebased n ready to go! also looks like that dns test magically passed this time so probably just a typical github flake as usual |
Looks like a random fuzzer failure?
full logs: https://gist.github.com/phlip9/25dd83daa7101a381926bc5e36601662#file-full_stack_fail-out |
🔔 1st Reminder Hey @wpaulino! This PR has been waiting for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, opened #3750 with the minor code simplification suggested and also fuzzer updates that reach this issue.
The above fuzzer issue CI found is known (an issue because the fuzzer is able to create two transactions that are different but have the same txid) but sadly I haven't gotten around to upstreaming a fix (kinda surprised the CI fuzzer found its way into it, tbh). Will leave it cause its unrelated.
// UTF-8 code points are 1-4 bytes long, so we can limit our search | ||
// to this range: [new_len - 3, new_len] | ||
let lower_bound_index = new_len.saturating_sub(3); | ||
(lower_bound_index..=new_len) | ||
.rev() | ||
.find(|idx| s.is_char_boundary(*idx)) | ||
.unwrap_or(lower_bound_index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, but also
// UTF-8 code points are 1-4 bytes long, so we can limit our search | |
// to this range: [new_len - 3, new_len] | |
let lower_bound_index = new_len.saturating_sub(3); | |
(lower_bound_index..=new_len) | |
.rev() | |
.find(|idx| s.is_char_boundary(*idx)) | |
.unwrap_or(lower_bound_index) | |
(0..=new_len).rev().find(|idx| s.is_char_boundary(*idx)).unwrap_or(0) |
Closing in favor of #3750 (for the additional context in the commit message and fuzz tests). |
v0.1.3 - Apr 30, 2025 - "Routing Unicode in 2025" Bug Fixes ========= * `Event::InvoiceReceived` is now only generated once for each `Bolt12Invoice` received matching a pending outbound payment. Previously it would be provided each time we received an invoice, which may happen many times if the sender sends redundant messages to improve success rates (lightningdevkit#3658). * LDK's router now more fully saturates paths which are subject to HTLC maximum restrictions after the first hop. In some rare cases this can result in finding paths when it would previously spuriously decide it cannot find enough diverse paths (lightningdevkit#3707, lightningdevkit#3755). Security ======== 0.1.3 fixes a denial-of-service vulnerability which cause a crash of an LDK-based node if an attacker has access to a valid `Bolt12Offer` which the LDK-based node created. * A malicious payer which requests a BOLT 12 Invoice from an LDK-based node (via the `Bolt12InvoiceRequest` message) can cause the panic of the LDK-based node due to the way `String::truncate` handles UTF-8 codepoints. The codepath can only be reached once the received `Botlt12InvoiceRequest` has been authenticated to be based on a valid `Bolt12Offer` which the same LDK-based node issued (lightningdevkit#3747, lightningdevkit#3750).
When building a
VerifiedInvoiceRequest
, we truncate an untrustedpayer_note: String
. This can panic if the new length (PAYER_NOTE_LIMIT = 512 B
) splits the string inside a multi-byte UTF-8 code point.For example, if the payer requests an invoice with
payer_note
set to"❤️".repeat(86)
(516 B), we'll panic when we try to truncate to 512 B as the nearest code point boundaries are at 510 B and 513 B.You can try the test
copies_verified_invoice_request_fields
without the fix to see it panic withassertion failed: self.is_char_boundary(new_len)
.This diff fixes the issue by truncating to the target byte-length, but rounding down to the nearest UTF-8 code point boundary if we're trying to split inside a multi-byte code point. Rust std actually has
std::str::floor_char_boundary
that can do this, but it's not currently stable so I've vendored it for now.Technically, this approach can still split a grapheme cluster in the string. For example, a string truncated inside "🧑🔬" can end up with just "🧑" at the end. Personally I think this is OK. Otherwise we'd have to pull in
unicode-segmentation
and all its associated unicode table bloat just to properly parse grapheme clusters in this one specific edge case.