Skip to content

[SampleProfileLoader] Fix integer overflow in generateMDProfMetadata #90217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions llvm/lib/Transforms/IPO/SampleProfile.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1713,13 +1713,15 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
// if needed. Sample counts in profiles are 64-bit unsigned values,
// but internally branch weights are expressed as 32-bit values.
if (Weight > std::numeric_limits<uint32_t>::max()) {
LLVM_DEBUG(dbgs() << " (saturated due to uint32_t overflow)");
LLVM_DEBUG(dbgs() << " (saturated due to uint32_t overflow)\n");
Weight = std::numeric_limits<uint32_t>::max();
Comment on lines 1715 to 1717
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this particular fix, but I'm also wondering if we would benefit from proper scaling/normalization, instead of just saturate uint32_t in the case of overflow. We can do some sanity check to see how much we actually hit such saturation.

Copy link
Member Author

@omern1 omern1 May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @WenleiHe, we probably do need some form of scaling here. In one of the codebases we (Sony) have we're seeing potential losses due to weights being saturated which will be helped scaling the values.

I'll try and address this in a future patch.

}
if (!SampleProfileUseProfi) {
// Weight is added by one to avoid propagation errors introduced by
// 0 weights.
Weights.push_back(static_cast<uint32_t>(Weight + 1));
Weights.push_back(static_cast<uint32_t>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if this path is still beneficial for some use cases? Can we trust profi results and drop this hack altogether? @omern1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean delete the non-profi inference path completely? some people might still want a fall back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm curious to learn about such use cases; are there some known examples where fallback is preferred?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @spupyrev, I can't speak for all users of the non-profi inference path but from our perspective that would require analysis to understand the impact of profi on our codebases.

I think this discussion can be had separately from this patch.

If you / @WenleiHe / @MatzeB are happy with the current state of this patch can I please get an LGTM?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, I didn't plan to delay the patch, will stamp it.
(Though if you care about perf and the experiments with profi aren't too time-consuming, please try it out and share the results with us. Ultimately it would be great to keep only one codepath here, but the change should be done carefully)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll look into this soon and share the results.

Weight == std::numeric_limits<uint32_t>::max() ? Weight
: Weight + 1));
} else {
// Profi creates proper weights that do not require "+1" adjustments but
// we evenly split the weight among branches with the same destination.
Expand Down
2 changes: 2 additions & 0 deletions llvm/test/Transforms/SampleProfile/Inputs/overflow.proftext
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
_Z3testi:29600000000:29600000000
5: 29600000000
77 changes: 77 additions & 0 deletions llvm/test/Transforms/SampleProfile/overflow.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 4

; Checks that we are able to handle overflowing counters correctly.

; RUN: opt < %s -passes='sample-profile,print<branch-prob>' -sample-profile-file=%S/Inputs/overflow.proftext -disable-output 2>&1 | FileCheck %s

; Original Source:
; int sqrt(int);
; int test(int i) {
; if (i == 5) {
; return 42;
; }
; else {
; return sqrt(i);
; }
;}

define dso_local noundef i32 @_Z3testi(i32 noundef %i) local_unnamed_addr #0 !dbg !10 {
; CHECK-LABEL: '_Z3testi'
; CHECK-NEXT: ---- Branch Probabilities ----
; CHECK-NEXT: edge %entry -> %return probability is 0x00000000 / 0x80000000 = 0.00%
; CHECK-NEXT: edge %entry -> %if.else probability is 0x80000000 / 0x80000000 = 100.00% [HOT edge]
; CHECK-NEXT: edge %if.else -> %return probability is 0x80000000 / 0x80000000 = 100.00% [HOT edge]
;
entry:
tail call void @llvm.dbg.value(metadata i32 %i, metadata !16, metadata !DIExpression()), !dbg !17
%cmp = icmp eq i32 %i, 5, !dbg !18
br i1 %cmp, label %return, label %if.else, !dbg !20

if.else: ; preds = %entry
%call = tail call noundef i32 @_Z4sqrti(i32 noundef %i), !dbg !21
br label %return, !dbg !23

return: ; preds = %entry, %if.else
%retval.0 = phi i32 [ %call, %if.else ], [ 42, %entry ], !dbg !24
ret i32 %retval.0, !dbg !25
}

declare !dbg !26 noundef i32 @_Z4sqrti(i32 noundef)

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare void @llvm.dbg.value(metadata, metadata, metadata)

attributes #0 = { "use-sample-profile" }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
!llvm.ident = !{!9}

!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "test.cpp", directory: "/", checksumkind: CSK_MD5, checksum: "cb38d90153a7ebdd6ecf3058eb0524c7")
!2 = !{i32 7, !"Dwarf Version", i32 5}
!3 = !{i32 2, !"Debug Info Version", i32 3}
!4 = !{i32 1, !"wchar_size", i32 4}
!5 = !{i32 8, !"PIC Level", i32 2}
!6 = !{i32 7, !"PIE Level", i32 2}
!7 = !{i32 7, !"uwtable", i32 2}
!8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
!9 = !{!"clang"}
!10 = distinct !DISubprogram(name: "test", linkageName: "_Z3loli", scope: !11, file: !11, line: 3, type: !12, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !15)
!11 = !DIFile(filename: "./test.cpp", directory: "/", checksumkind: CSK_MD5, checksum: "cb38d90153a7ebdd6ecf3058eb0524c7")
!12 = !DISubroutineType(types: !13)
!13 = !{!14, !14}
!14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!15 = !{!16}
!16 = !DILocalVariable(name: "i", arg: 1, scope: !10, file: !11, line: 3, type: !14)
!17 = !DILocation(line: 0, scope: !10)
!18 = !DILocation(line: 4, column: 11, scope: !19)
!19 = distinct !DILexicalBlock(scope: !10, file: !11, line: 4, column: 9)
!20 = !DILocation(line: 4, column: 9, scope: !10)
!21 = !DILocation(line: 8, column: 16, scope: !22)
!22 = distinct !DILexicalBlock(scope: !19, file: !11, line: 7, column: 10)
!23 = !DILocation(line: 8, column: 9, scope: !22)
!24 = !DILocation(line: 0, scope: !19)
!25 = !DILocation(line: 10, column: 1, scope: !10)
!26 = !DISubprogram(name: "sqrt", linkageName: "_Z4sqrti", scope: !11, file: !11, line: 1, type: !12, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)