Skip to content

[MachinePipeliner] Add an abstract layer to manipulate Data Dependenc… #109918

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 24, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
159 changes: 129 additions & 30 deletions llvm/include/llvm/CodeGen/MachinePipeliner.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@
#ifndef LLVM_CODEGEN_MACHINEPIPELINER_H
#define LLVM_CODEGEN_MACHINEPIPELINER_H

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/CodeGen/DFAPacketizer.h"
#include "llvm/CodeGen/MachineDominators.h"
Expand Down Expand Up @@ -114,10 +115,123 @@ class MachinePipeliner : public MachineFunctionPass {
bool useWindowScheduler(bool Changed);
};

/// Represents a dependence between two instruction.
class SwingSchedulerDDGEdge {
SUnit *Dst = nullptr;
SDep Pred;
unsigned Distance = 0;

public:
/// Creates an edge corresponding to an edge represented by \p PredOrSucc and
/// \p Dep in the original DAG. This pair has no information about the
/// direction of the edge, so we need to pass an additional argument \p
/// IsSucc.
SwingSchedulerDDGEdge(SUnit *PredOrSucc, const SDep &Dep, bool IsSucc)
: Dst(PredOrSucc), Pred(Dep), Distance(0u) {
SUnit *Src = Dep.getSUnit();

if (IsSucc) {
std::swap(Src, Dst);
Pred.setSUnit(Src);
}

// An anti-dependence to PHI means loop-carried dependence.
if (Pred.getKind() == SDep::Anti && Src->getInstr()->isPHI()) {
Distance = 1;
std::swap(Src, Dst);
auto Reg = Pred.getReg();
Pred = SDep(Src, SDep::Kind::Data, Reg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a bug here, this "Pred" changing caused the querying of isAntiDep() below to be incorrect.
Can you fix it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate the problem?
Here the dependency type is changed from Anti to Data only when the source of the edge is PHI. IIUIC, the anti dependency from a PHI node is used to represent a back-edge. In this patch, the code that handles the back-edges has been completely changed, and the anti-dependencies for physical registers are left in place. So I don't think this part doesn't is a problem. As far as I have tested on my local with llvm-test-suite, there are no differences between generated binaries built with and without this patch. Do you have any cases where it causes problems?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the issue I think.
I had a bug (compiler crash on assertion) on an offline target (not in LLVM community).
I investigated and got to a conclusion that this is the issue.

You changed the phi anti-dep handling.
So when originally having,

some_instruction -> phi; anti dep

You have in your new DAG:
some_instruction -> phi, data dep

In the original consideration, some_instruction is a successor of the phi with anit-dep.
and it was missed out in succ_l.
See my fix, does it make sense to you?
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your investigation. I see, so there are actual cases where the compiler crashes. Hmm, but I'm not sure what the root cause is.

  • A dependence type is changed from anti to data only if the source instruction is PHI. So, in your example, the dependence type is not intended to be replaced unless the some_instr is PHI.
  • When the dependence is replaced with data, the source and the destination is swapped at the same time. In your case, the DDG behaves as if there is an edge from the phi to the some_instr. Therefore, this edge should be handled in the previous loop for (const auto &OE : DDG->getOutEdges(SU)) { ... }.

So I'm not sure why your fix changes the behavior. Could you give me more details of your case (e.g., what exactly are phi and some_instr)?

Copy link
Contributor

@mmarjieh mmarjieh Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's track what happens when sending this edge to your DDGEdge constructor:
Input:
some_instruction -> phi; anti dep

When evaluating the phi's successor to initialize the edge, DDGEdge will be called with the following arguments:
image

PredOrSucc: phi
IsSucc = true,
Dep: some_instruction; anti dep

First you swap Src and Dst:
image

Meaning now we have:
Src = phi
Dst = some_instruction
Pred's SUnit will be: phi

Then this code will be executed:
image
Meaning the edge lost its "anti" kind and now we model it as data.

My problem is that now we don't consider that
some_instruction is a successor of the phi in some parts of the pipeliner's code:
computeNodeOrder
succ_L

I believe that, in addition to checking the anti-deps on the input edges, we need to check edges that have a non-zero distance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, pred_L. Probably other cases too. I think the check for isAntiDep() can be replaced by getDistance(). That is, with the check for getDistance(), isAntiDep() isn't needed anymore.
Where does the assertion occur? With some more details it might be possible to create a test case. I assume some invalid node order is created? Would it be possible to show the pipeliner debug output for the use case (with and without the check for getDistance())?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The anti dependence from some_instr to phi becomes the data dependence from phi to some_instr, as I intended. And now, the vector returned by something like DDG->getOutEdges(phi) should contain that edge. That is, in your case, the some_instr should be appended to Succs in the for loop at line 1921.

https://github.com/kasuga-fj/llvm-project/blob/f6f6c72322bfef93e21e3b043b4c4fc4f8a13fba/llvm/lib/CodeGen/MachinePipeliner.cpp#L1921

I suspect that the DDG is not constructed as I intended, if your change (adding the distance check) fixes the problem. I think there is another issue where we should really address.

And, pred_L. Probably other cases too.

Yes, I agree with this.

I think the check for isAntiDep() can be replaced by getDistance(). That is, with the check for getDistance(), isAntiDep() isn't needed anymore.

I don't think so. Before this patch was merged, the pred_L was defined as follows:

static bool pred_L(SetVector<SUnit *> &NodeOrder,
                   SmallSetVector<SUnit *, 8> &Preds,
                   const NodeSet *S = nullptr) {
  Preds.clear();
  for (const SUnit *SU : NodeOrder) {
    for (const SDep &Pred : SU->Preds) {
      if (S && S->count(Pred.getSUnit()) == 0)
        continue;
      if (ignoreDependence(Pred, true))
        continue;
      if (NodeOrder.count(Pred.getSUnit()) == 0)
        Preds.insert(Pred.getSUnit());
    }
    // Back-edges are predecessors with an anti-dependence.
    for (const SDep &Succ : SU->Succs) {
      if (Succ.getKind() != SDep::Anti)
        continue;
      if (S && S->count(Succ.getSUnit()) == 0)
        continue;
      if (NodeOrder.count(Succ.getSUnit()) == 0)
        Preds.insert(Succ.getSUnit());
    }
  }
  return !Preds.empty();
}

The inline comment says that the second loop handles back-edges, but actually the loop body only checks if the edge is anti dependence. That is, in pred_L, anti forward-edges (I think many of them are WAR dependencies for some physical register) are handled as predecessors. The same goes for succ_L. I suspect this is a mishandling of back-edges, but to keep the original behavior, the anti dependence check is still needed.

Anyway, I also want to see some more details in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I'll send you a debug log but before I need to understand your change a bit more.
You said and I quote:
"The anti dependence from some_instr to phi becomes the data dependence from phi to some_instr, as I intended."
image

By looking at this code:
Src will be some_instr
Dst will be the phi.

So I believe you meant to tell me that data dep is from some_instr to phi?

Copy link
Contributor Author

@kasuga-fj kasuga-fj Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The anti dependence from some_instr to phi becomes the data dependence from phi to some_instr, as I intended."

By looking at this code:
Src will be some_instr
Dst will be the phi.

So I believe you meant to tell me that data dep is from some_instr to phi?

Yes, I meant that, but now I think something is incorrect. In your previous comment, you said:

Input:
some_instruction -> phi; anti dep

Also

PredOrSucc: phi
IsSucc = true,
Dep: some_instruction; anti dep

This seems to be wrong. IIUIC, isSucc should be false in this case. This argument doesn't mean the "actual semantics" of the dependency. It just indicates whether the Dep is contained in PredOrSucc->Preds or PredOrSucc->Succs. Inserting some validation code like the following at the top of the ctor might be helpful (I don't check if this code works fine).

const auto &Edges = (IsSucc ? PredOrSucc->Succs : PredOrSucc->Preds);
bool Valid = false;
for (const SDep &D : Edges)
  if (D == Dep)
    Valid = true;
assert(Valid && "isSucc may be incorrect");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmarjieh Is the problem still occurring?

}
}

/// Returns the SUnit from which the edge comes (source node).
SUnit *getSrc() const { return Pred.getSUnit(); }

/// Returns the SUnit to which the edge points (destination node).
SUnit *getDst() const { return Dst; }

/// Returns the latency value for the edge.
unsigned getLatency() const { return Pred.getLatency(); }

/// Sets the latency for the edge.
void setLatency(unsigned Latency) { Pred.setLatency(Latency); }

/// Returns the distance value for the edge.
unsigned getDistance() const { return Distance; }

/// Sets the distance value for the edge.
void setDistance(unsigned D) { Distance = D; }

/// Returns the register associated with the edge.
Register getReg() const { return Pred.getReg(); }

/// Returns true if the edge represents anti dependence.
bool isAntiDep() const { return Pred.getKind() == SDep::Kind::Anti; }

/// Returns true if the edge represents output dependence.
bool isOutputDep() const { return Pred.getKind() == SDep::Kind::Output; }

/// Returns true if the edge represents a dependence that is not data, anti or
/// output dependence.
bool isOrderDep() const { return Pred.getKind() == SDep::Kind::Order; }

/// Returns true if the edge represents unknown scheduling barrier.
bool isBarrier() const { return Pred.isBarrier(); }

/// Returns true if the edge represents an artificial dependence.
bool isArtificial() const { return Pred.isArtificial(); }

/// Tests if this is a Data dependence that is associated with a register.
bool isAssignedRegDep() const { return Pred.isAssignedRegDep(); }

/// Returns true for DDG nodes that we ignore when computing the cost
/// functions. We ignore the back-edge recurrence in order to avoid unbounded
/// recursion in the calculation of the ASAP, ALAP, etc functions.
bool ignoreDependence(bool IgnoreAnti) const;
};

/// Represents dependencies between instructions. This class is a wrapper of
/// `SUnits` and its dependencies to manipulate back-edges in a natural way.
/// Currently it only supports back-edges via PHI, which are expressed as
/// anti-dependencies in the original DAG.
/// FIXME: Support any other loop-carried dependencies
class SwingSchedulerDDG {
using EdgesType = SmallVector<SwingSchedulerDDGEdge, 4>;

struct SwingSchedulerDDGEdges {
EdgesType Preds;
EdgesType Succs;
};

void initEdges(SUnit *SU);

SUnit *EntrySU;
SUnit *ExitSU;

std::vector<SwingSchedulerDDGEdges> EdgesVec;
SwingSchedulerDDGEdges EntrySUEdges;
SwingSchedulerDDGEdges ExitSUEdges;

void addEdge(const SUnit *SU, const SwingSchedulerDDGEdge &Edge);

SwingSchedulerDDGEdges &getEdges(const SUnit *SU);
const SwingSchedulerDDGEdges &getEdges(const SUnit *SU) const;

public:
SwingSchedulerDDG(std::vector<SUnit> &SUnits, SUnit *EntrySU, SUnit *ExitSU);

const EdgesType &getInEdges(const SUnit *SU) const;

const EdgesType &getOutEdges(const SUnit *SU) const;
};

/// This class builds the dependence graph for the instructions in a loop,
/// and attempts to schedule the instructions using the SMS algorithm.
class SwingSchedulerDAG : public ScheduleDAGInstrs {
MachinePipeliner &Pass;

std::unique_ptr<SwingSchedulerDDG> DDG;

/// The minimum initiation interval between iterations for this schedule.
unsigned MII = 0;
/// The maximum initiation interval between iterations for this schedule.
Expand All @@ -130,7 +244,7 @@ class SwingSchedulerDAG : public ScheduleDAGInstrs {
unsigned II_setByPragma = 0;
TargetInstrInfo::PipelinerLoopInfo *LoopPipelinerInfo = nullptr;

/// A toplogical ordering of the SUnits, which is needed for changing
/// A topological ordering of the SUnits, which is needed for changing
/// dependences and iterating over the SUnits.
ScheduleDAGTopologicalSort Topo;

Expand Down Expand Up @@ -252,27 +366,7 @@ class SwingSchedulerDAG : public ScheduleDAGInstrs {
return ScheduleInfo[Node->NodeNum].ZeroLatencyHeight;
}

/// Return true if the dependence is a back-edge in the data dependence graph.
/// Since the DAG doesn't contain cycles, we represent a cycle in the graph
/// using an anti dependence from a Phi to an instruction.
bool isBackedge(SUnit *Source, const SDep &Dep) {
if (Dep.getKind() != SDep::Anti)
return false;
return Source->getInstr()->isPHI() || Dep.getSUnit()->getInstr()->isPHI();
}

bool isLoopCarriedDep(SUnit *Source, const SDep &Dep,
bool isSucc = true) const;

/// The distance function, which indicates that operation V of iteration I
/// depends on operations U of iteration I-distance.
unsigned getDistance(SUnit *U, SUnit *V, const SDep &Dep) {
// Instructions that feed a Phi have a distance of 1. Computing larger
// values for arrays requires data dependence information.
if (V->getInstr()->isPHI() && Dep.getKind() == SDep::Anti)
return 1;
return 0;
}
bool isLoopCarriedDep(const SwingSchedulerDDGEdge &Edge) const;

void applyInstrChange(MachineInstr *MI, SMSchedule &Schedule);

Expand All @@ -294,6 +388,8 @@ class SwingSchedulerDAG : public ScheduleDAGInstrs {

static bool classof(const ScheduleDAGInstrs *DAG) { return true; }

const SwingSchedulerDDG *getDDG() const { return DDG.get(); }

private:
void addLoopCarriedDependences(AAResults *AA);
void updatePhiDependences();
Expand Down Expand Up @@ -357,15 +453,16 @@ class NodeSet {
//
// Hold a map from each SUnit in the circle to the maximum distance from the
// source node by only considering the nodes.
const SwingSchedulerDDG *DDG = DAG->getDDG();
DenseMap<SUnit *, unsigned> SUnitToDistance;
for (auto *Node : Nodes)
SUnitToDistance[Node] = 0;

for (unsigned I = 1, E = Nodes.size(); I <= E; ++I) {
SUnit *U = Nodes[I - 1];
SUnit *V = Nodes[I % Nodes.size()];
for (const SDep &Succ : U->Succs) {
SUnit *SuccSUnit = Succ.getSUnit();
for (const SwingSchedulerDDGEdge &Succ : DDG->getOutEdges(U)) {
SUnit *SuccSUnit = Succ.getDst();
if (V != SuccSUnit)
continue;
if (SUnitToDistance[U] + Succ.getLatency() > SUnitToDistance[V]) {
Expand All @@ -377,13 +474,13 @@ class NodeSet {
SUnit *FirstNode = Nodes[0];
SUnit *LastNode = Nodes[Nodes.size() - 1];

for (auto &PI : LastNode->Preds) {
for (auto &PI : DDG->getInEdges(LastNode)) {
// If we have an order dep that is potentially loop carried then a
// back-edge exists between the last node and the first node that isn't
// modeled in the DAG. Handle it manually by adding 1 to the distance of
// the last node.
if (PI.getSUnit() != FirstNode || PI.getKind() != SDep::Order ||
!DAG->isLoopCarriedDep(LastNode, PI, false))
if (PI.getSrc() != FirstNode || !PI.isOrderDep() ||
!DAG->isLoopCarriedDep(PI))
continue;
SUnitToDistance[FirstNode] =
std::max(SUnitToDistance[FirstNode], SUnitToDistance[LastNode] + 1);
Expand Down Expand Up @@ -627,11 +724,13 @@ class SMSchedule {

/// Return the cycle of the earliest scheduled instruction in the dependence
/// chain.
int earliestCycleInChain(const SDep &Dep);
int earliestCycleInChain(const SwingSchedulerDDGEdge &Dep,
const SwingSchedulerDDG *DDG);

/// Return the cycle of the latest scheduled instruction in the dependence
/// chain.
int latestCycleInChain(const SDep &Dep);
int latestCycleInChain(const SwingSchedulerDDGEdge &Dep,
const SwingSchedulerDDG *DDG);

void computeStart(SUnit *SU, int *MaxEarlyStart, int *MinLateStart, int II,
SwingSchedulerDAG *DAG);
Expand Down Expand Up @@ -694,7 +793,7 @@ class SMSchedule {
MachineOperand &MO) const;

bool onlyHasLoopCarriedOutputOrOrderPreds(SUnit *SU,
SwingSchedulerDAG *DAG) const;
const SwingSchedulerDDG *DDG) const;
void print(raw_ostream &os) const;
void dump() const;
};
Expand Down
Loading
Loading