Skip to content

[ARM] Avoid reference into modified vector #93965

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 3, 2024

Conversation

nikic
Copy link
Contributor

@nikic nikic commented May 31, 2024

FirstCand is a reference to RepeatedSequenceLocs[0]. However, that vector is being modified a lot throughout the function, including one place that reassigns the whole vector. I'm not sure whether this can really happen in practice, but it doesn't seem unlikely that this could lead to a use-after-free.

Avoid this by directly using RepeatedSequenceLocs[0] at the start of the function (as a lot of other places already do) and only creating FirstCand at the end where no more modifications take place.

FirstCand is a reference to RepeatedSequenceLocs[0]. However, that
vector is being modified a lot throughout the function, including
one place that reassigns the whole vector. I'm not sure whether
this can really happen in practice, but it doesn't seem unlikely
that this could lead to a use-after-free.

Avoid this by directly using RepeatedSequenceLocs[0] at the
start of the function (as a lot of other places already do) and
only creating FirstCand at the end where no more modifications
take place.
@nikic nikic requested review from yroux and atrosinenko May 31, 2024 14:02
@llvmbot
Copy link
Member

llvmbot commented May 31, 2024

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-arm

Author: Nikita Popov (nikic)

Changes

FirstCand is a reference to RepeatedSequenceLocs[0]. However, that vector is being modified a lot throughout the function, including one place that reassigns the whole vector. I'm not sure whether this can really happen in practice, but it doesn't seem unlikely that this could lead to a use-after-free.

Avoid this by directly using RepeatedSequenceLocs[0] at the start of the function (as a lot of other places already do) and only creating FirstCand at the end where no more modifications take place.


Full diff: https://github.com/llvm/llvm-project/pull/93965.diff

1 Files Affected:

  • (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+2-3)
diff --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 8f873bee484ac..627148b73c4f5 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -5873,10 +5873,8 @@ static bool isLRAvailable(const TargetRegisterInfo &TRI,
 std::optional<outliner::OutlinedFunction>
 ARMBaseInstrInfo::getOutliningCandidateInfo(
     std::vector<outliner::Candidate> &RepeatedSequenceLocs) const {
-  outliner::Candidate &FirstCand = RepeatedSequenceLocs[0];
-
   unsigned SequenceSize = 0;
-  for (auto &MI : FirstCand)
+  for (auto &MI : RepeatedSequenceLocs[0])
     SequenceSize += getInstSizeInBytes(MI);
 
   // Properties about candidate MBBs that hold for all of them.
@@ -6071,6 +6069,7 @@ ARMBaseInstrInfo::getOutliningCandidateInfo(
   if (FlagsSetInAll & MachineOutlinerMBBFlags::HasCalls) {
     // check if the range contains a call.  These require a save + restore of
     // the link register.
+    outliner::Candidate &FirstCand = RepeatedSequenceLocs[0];
     if (std::any_of(FirstCand.begin(), std::prev(FirstCand.end()),
                     [](const MachineInstr &MI) { return MI.isCall(); }))
       NumBytesToCreateFrame += Costs.SaveRestoreLROnStack;

Copy link

github-actions bot commented May 31, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@yroux yroux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@nikic nikic merged commit 1c9f4d4 into llvm:main Jun 3, 2024
7 checks passed
@nikic nikic deleted the arm-outlining-candidate branch June 3, 2024 15:10
@rgwott
Copy link
Contributor

rgwott commented Jun 6, 2024

Hi. This PR broke some tests downstream for us, using the ARM backend.
From my debugging, it seems that the new &FirstCand in ARMBaseInstrInfo.cpp:6072 may refer to an empty container, thus breaking the subsequent access in line 6073. In the broken test case, this happens because CandidatesWithoutStackFixups in line 6061 can be empty.
I do not have a reproducer right now but will try to get one.
We did not observe problems with AArch64 yet, but we also did not investigate.
Thanks in advance for looking into this.

@nikic
Copy link
Contributor Author

nikic commented Jun 10, 2024

@rgwott I'd love to have a reproducer for this -- it actually sounds like exactly the case I was concerned about here: RepeatedSequenceLocs gets reassigned to CandidatesWithoutStackFixups, while FirstCand still points into the old vector memory. If you run your test case under valgrind (or address sanitizer), do you maybe get a use-after-free report? I'd expect that to happen, if I'm not misunderstanding the code.

In any case, I'd be happy to revert this change in the meantime if it's causing issues for you.

@rgwott
Copy link
Contributor

rgwott commented Jun 10, 2024

@nikic here is a reproducer:

void a();
void b(int c, int d, int, void *, struct e *f) { a(2, c, d, f); }
void g(int c, int d, int, void *, struct e *f) { a(4, c, d, f); }

clang --target=arm -mcpu=cortex-m55 -Oz repro.c

I did not see a use-after-free report in valgrind, but we do see this STL assertion:
/usr/include/c++/11/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = llvm::outliner::Candidate; _Alloc = std::allocator<llvm::outliner::Candidate>; std::vector<_Tp, _Alloc>::reference = llvm::outliner::Candidate&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed.

No rush to revert this change, as I reverted it downstream to keep on working and I can reapply it once a fix comes from upstream.
Thanks!

nikic added a commit to nikic/llvm-project that referenced this pull request Jun 13, 2024
The following code assumes that RepeatedSequenceLocs is non-empty.
Bail out if there are less than 2 candidates left, as no outlining
is possible in that case. The same check is already present in
all the other places where elements from RepeatedSequenceLocs may
be dropped.

This fixes the issue reported at:
llvm#93965 (comment)
@nikic
Copy link
Contributor Author

nikic commented Jun 13, 2024

@rgwott Thanks for the reproducer! I believe that #95410 should fix this.

nikic added a commit that referenced this pull request Jun 14, 2024
)

The following code assumes that RepeatedSequenceLocs is non-empty. Bail
out if there are less than 2 candidates left, as no outlining is
possible in that case. The same check is already present in all the
other places where elements from RepeatedSequenceLocs may be dropped.

This fixes the issue reported at:
#93965 (comment)
EthanLuisMcDonough pushed a commit to EthanLuisMcDonough/llvm-project that referenced this pull request Aug 13, 2024
…m#95410)

The following code assumes that RepeatedSequenceLocs is non-empty. Bail
out if there are less than 2 candidates left, as no outlining is
possible in that case. The same check is already present in all the
other places where elements from RepeatedSequenceLocs may be dropped.

This fixes the issue reported at:
llvm#93965 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants