@@ -199,9 +199,62 @@ struct MCExtraProcessorInfo {
/// provides a detailed reservation table describing each cycle of instruction
/// execution. Subtargets may define any or all of the above categories of data
/// depending on the type of CPU and selected scheduler.
+///
+/// The machine independent properties defined here are used by the scheduler as
+/// an abstract machine model. A real micro-architecture has a number of
+/// buffers, queues, and stages. Declaring that a given machine-independent
+/// abstract property corresponds to a specific physical property across all
+/// subtargets can't be done. Nonetheless, the abstract model is
+/// useful. Furthermore, subtargets typically extend this model with processor
+/// specific resources to model any hardware features that can be exploited by
+/// scheduling heuristics and aren't sufficiently represented in the abstract.
+///
+/// The abstract pipeline is built around the notion of an "issue point". This
+/// is merely a reference point for counting machine cycles. The physical
+/// machine will have pipeline stages that delay execution. The scheduler does
+/// not model those delays because they are irrelevant as long as they are
+/// consistent. Inaccuracies arise when instructions have different execution
+/// delays relative to each other, in addition to their intrinsic latency. Those
+/// special cases can be handled by TableGen constructs such as ReadAdvance,
+/// which reduces latency when reading data, and ResourceCycles, which consumes
+/// a processor resource when writing data for a number of abstract cycles.
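The two TableGen constructs just mentioned can be sketched as follows. This is a minimal illustration, not in-tree code: the `My*` resource, SchedWrite, and SchedRead names are all hypothetical.

```tablegen
// Hypothetical execution unit and scheduling classes for this sketch.
def MyUnitMul : ProcResource<1>;
def MyWriteIMul : SchedWrite;
def MyReadMulAcc : SchedRead;

// A multiply whose result is ready after 4 cycles, and which occupies
// the multiplier for 3 abstract cycles (ResourceCycles) while writing.
def : WriteRes<MyWriteIMul, [MyUnitMul]> {
  let Latency = 4;
  let ResourceCycles = [3];
}

// Operands read through MyReadMulAcc see the producing instruction's
// result 2 cycles earlier than its nominal latency, modeling a
// forwarding path (ReadAdvance).
def : ReadAdvance<MyReadMulAcc, 2>;
```

In a real target these definitions would typically appear inside a `let SchedModel = ... in` scope so that they bind to one specific processor model.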
+///
+/// TODO: One tool currently missing is the ability to add a delay to
+/// ResourceCycles. That would be easy to add and would likely cover all cases
+/// currently handled by the legacy itinerary tables.
+///
+/// A note on out-of-order execution and, more generally, instruction
+/// buffers. Part of the CPU pipeline is always in-order. The issue point, which
+/// is the point of reference for counting cycles, only makes sense as an
+/// in-order part of the pipeline. Other parts of the pipeline are sometimes
+/// falling behind and sometimes catching up. It's only interesting to model
+/// those other, decoupled parts of the pipeline if they may be predictably
+/// resource constrained in a way that the scheduler can exploit.
+///
+/// The LLVM machine model distinguishes between in-order constraints and
+/// out-of-order constraints so that the target's scheduling strategy can apply
+/// appropriate heuristics. For a well-balanced CPU pipeline, out-of-order
+/// resources would not typically be treated as a hard scheduling
+/// constraint. For example, in the GenericScheduler, a delay caused by limited
+/// out-of-order resources is not directly reflected in the number of cycles
+/// that the scheduler sees between issuing an instruction and its dependent
+/// instructions. In other words, out-of-order resources don't directly increase
+/// the latency between pairs of instructions. However, they can still be used
+/// to detect potential bottlenecks across a sequence of instructions and bias
+/// the scheduling heuristics appropriately.
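One place this in-order/out-of-order distinction surfaces in TableGen is a processor resource's `BufferSize`. A sketch, with hypothetical unit names:

```tablegen
// BufferSize = 0: an unbuffered resource, i.e. a hard in-order
// constraint (hazard) that the scheduler must respect cycle-by-cycle.
def MyUnitAGU : ProcResource<2> { let BufferSize = 0; }

// BufferSize > 0: a decoupled unit fed from its own small queue; an
// out-of-order constraint used to bias heuristics, not a hard hazard.
def MyUnitDiv : ProcResource<1> { let BufferSize = 4; }

// Default (BufferSize = -1): the resource shares the model's general
// out-of-order window, MicroOpBufferSize.
def MyUnitALU : ProcResource<3>;
```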
struct MCSchedModel {
  // IssueWidth is the maximum number of instructions that may be scheduled in
- // the same per-cycle group.
+ // the same per-cycle group. This is meant to be a hard in-order constraint
+ // (a.k.a. "hazard"). In the GenericScheduler strategy, no more than
+ // IssueWidth micro-ops can ever be scheduled in a particular cycle.
+ //
+ // In practice, IssueWidth is useful to model any bottleneck between the
+ // decoder (after micro-op expansion) and the out-of-order reservation
+ // stations or the decoder bandwidth itself. If the total number of
+ // reservation stations is also a bottleneck, or if any other pipeline stage
+ // has a bandwidth limitation, then that can be naturally modeled by adding an
+ // out-of-order processor resource.
unsigned IssueWidth;
  static const unsigned DefaultIssueWidth = 1;
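A sketch of how a subtarget would set this field from TableGen; the model name and the accompanying values are hypothetical, chosen only to show the shape of a `SchedMachineModel` definition:

```tablegen
def MyCoreModel : SchedMachineModel {
  let IssueWidth = 4;          // at most 4 micro-ops per per-cycle group
  let MicroOpBufferSize = 60;  // out-of-order window; > 0 marks the model OOO
  let LoadLatency = 4;
  let MispredictPenalty = 14;
}
```

The TableGen backend emits these values into a generated MCSchedModel instance, so `IssueWidth` above ends up initializing the field declared here.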