Commit 9ec501d

[OpenMP] Refactor OMPScheduleType enum.
The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h, which are used by several kmp API functions. The enum values have an internal structure: each scheduling algorithm exists in four variants: unordered, ordered, nomerge unordered, and nomerge ordered.

This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for "monotonic" and "nonmonotonic", so that bit-flag operations can be applied to them. The enum now also contains all possible combinations according to kmp's sched_type. The derivation of the OMPScheduleType enum from clause parameters has been moved from MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make it available to clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating a worksharing-loop using OpenMPIRBuilder becomes `applyWorkshareLoop`, which derives the OMPScheduleType automatically and calls the appropriate emitter function.

While this is mostly an NFC refactor, it still applies the following functional changes:

* The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`.
* In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, contrary to what the preceding comment citing the OpenMP specification says. I assume this was an oversight.

The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed; for the moment, this avoids reusing previously existing enum value names such as `StaticChunked` to prevent confusion.

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D123403
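
The derivation described above can be sketched in standalone C++. This is a minimal illustration, not the code in OpenMPIRBuilder: `Kind`, `baseFor`, and `deriveScheduleType` are hypothetical names, the simd modifier is left out, and leaving both monotonicity bits unset for static or ordered loops is an assumption about how the "monotonic by default" case is encoded. The constants are the values shown in the OMPConstants.h hunk of this commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the schedule kind of the clause.
enum class Kind { Static, Dynamic, Guided, Auto, Runtime };

// Values copied from the OMPScheduleType enum in this commit.
constexpr std::uint32_t BaseStaticChunked = 1, BaseStatic = 2,
                        BaseDynamicChunked = 3, BaseGuidedChunked = 4,
                        BaseRuntime = 5, BaseAuto = 6;
constexpr std::uint32_t ModifierUnordered = 1u << 5;
constexpr std::uint32_t ModifierOrdered = 1u << 6;
constexpr std::uint32_t ModifierMonotonic = 1u << 29;
constexpr std::uint32_t ModifierNonmonotonic = 1u << 30;

// Pick the scheduling algorithm bits (hypothetical helper).
inline std::uint32_t baseFor(Kind K, bool HasChunk) {
  switch (K) {
  case Kind::Static:  return HasChunk ? BaseStaticChunked : BaseStatic;
  case Kind::Dynamic: return BaseDynamicChunked;
  case Kind::Guided:  return BaseGuidedChunked;
  case Kind::Auto:    return BaseAuto;
  case Kind::Runtime: return BaseRuntime;
  }
  return BaseStatic;
}

// Combine algorithm, ordering, and monotonicity bits (hypothetical helper).
inline std::uint32_t deriveScheduleType(Kind K, bool HasChunk,
                                        bool HasMonotonic,
                                        bool HasNonmonotonic,
                                        bool HasOrdered) {
  std::uint32_t Sched = baseFor(K, HasChunk);
  Sched |= HasOrdered ? ModifierOrdered : ModifierUnordered;
  if (HasMonotonic)
    return Sched | ModifierMonotonic;
  if (HasNonmonotonic)
    return Sched | ModifierNonmonotonic;
  // Assumption: static or ordered loops keep the runtime's default and
  // therefore carry no explicit monotonicity bit.
  if (K == Kind::Static || HasOrdered)
    return Sched;
  // Non-static, unordered schedules default to nonmonotonic.
  return Sched | ModifierNonmonotonic;
}

// Usage: a plain dynamic schedule picks up the nonmonotonic bit.
inline void example() {
  assert(deriveScheduleType(Kind::Dynamic, false, false, false, false) ==
         (35u | (1u << 30)));
}
```

Under these assumptions a plain `schedule(dynamic)` yields 35 with bit 30 set, which is the value the updated clang tests in this commit expect.
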
1 parent 58ceae9 commit 9ec501d

12 files changed: +445 -198 lines changed

clang/lib/CodeGen/CGStmtOpenMP.cpp

Lines changed: 5 additions & 3 deletions
@@ -3760,9 +3760,11 @@ void CodeGenFunction::EmitOMPForDirective(const OMPForDirective &S) {
           CGM.getOpenMPRuntime().getOMPBuilder();
       llvm::OpenMPIRBuilder::InsertPointTy AllocaIP(
           AllocaInsertPt->getParent(), AllocaInsertPt->getIterator());
-      OMPBuilder.applyWorkshareLoop(Builder.getCurrentDebugLocation(), CLI,
-                                    AllocaIP, NeedsBarrier, SchedKind,
-                                    ChunkSize);
+      OMPBuilder.applyWorkshareLoop(
+          Builder.getCurrentDebugLocation(), CLI, AllocaIP, NeedsBarrier,
+          SchedKind, ChunkSize, /*HasSimdModifier=*/false,
+          /*HasMonotonicModifier=*/false, /*HasNonmonotonicModifier=*/false,
+          /*HasOrderedClause=*/false);
       return;
     }

clang/test/OpenMP/irbuilder_for_unsigned_auto.c

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
 // CHECK-NEXT: store i32 %[[DOTCOUNT]], i32* %[[P_UPPERBOUND]], align 4
 // CHECK-NEXT: store i32 1, i32* %[[P_STRIDE]], align 4
 // CHECK-NEXT: %[[OMP_GLOBAL_THREAD_NUM:.+]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @1)
-// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 38, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 1)
+// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 1073741862, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 1)
 // CHECK-NEXT: br label %[[OMP_LOOP_PREHEADER_OUTER_COND:.+]]
 // CHECK-EMPTY:
 // CHECK-NEXT: [[OMP_LOOP_HEADER:.*]]:

clang/test/OpenMP/irbuilder_for_unsigned_dynamic.c

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
 // CHECK-NEXT: store i32 %[[DOTCOUNT]], i32* %[[P_UPPERBOUND]], align 4
 // CHECK-NEXT: store i32 1, i32* %[[P_STRIDE]], align 4
 // CHECK-NEXT: %[[OMP_GLOBAL_THREAD_NUM:.+]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @1)
-// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 35, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 1)
+// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 1073741859, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 1)
 // CHECK-NEXT: br label %[[OMP_LOOP_PREHEADER_OUTER_COND:.+]]
 // CHECK-EMPTY:
 // CHECK-NEXT: [[OMP_LOOP_HEADER:.*]]:

clang/test/OpenMP/irbuilder_for_unsigned_dynamic_chunked.c

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
 // CHECK-NEXT: store i32 %[[DOTCOUNT]], i32* %[[P_UPPERBOUND]], align 4
 // CHECK-NEXT: store i32 1, i32* %[[P_STRIDE]], align 4
 // CHECK-NEXT: %[[OMP_GLOBAL_THREAD_NUM:.+]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @1)
-// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 35, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 5)
+// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 1073741859, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 5)
 // CHECK-NEXT: br label %[[OMP_LOOP_PREHEADER_OUTER_COND:.+]]
 // CHECK-EMPTY:
 // CHECK-NEXT: [[OMP_LOOP_HEADER:.*]]:

clang/test/OpenMP/irbuilder_for_unsigned_runtime.c

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
 // CHECK-NEXT: store i32 %[[DOTCOUNT]], i32* %[[P_UPPERBOUND]], align 4
 // CHECK-NEXT: store i32 1, i32* %[[P_STRIDE]], align 4
 // CHECK-NEXT: %[[OMP_GLOBAL_THREAD_NUM:.+]] = call i32 @__kmpc_global_thread_num(%struct.ident_t* @1)
-// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 37, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 1)
+// CHECK-NEXT: call void @__kmpc_dispatch_init_4u(%struct.ident_t* @1, i32 %[[OMP_GLOBAL_THREAD_NUM]], i32 1073741861, i32 1, i32 %[[DOTCOUNT]], i32 1, i32 1)
 // CHECK-NEXT: br label %[[OMP_LOOP_PREHEADER_OUTER_COND:.+]]
 // CHECK-EMPTY:
 // CHECK-NEXT: [[OMP_LOOP_HEADER:.*]]:
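
The new schedule constants in the four test updates above (1073741862, 1073741859, 1073741861) are the old values (38, 35, 37) with bit 30, the nonmonotonic modifier, set. A small decoding sketch makes that visible; `DecodedSchedule` and `decodeSchedule` are hypothetical names used only for illustration, and the bit positions are taken from the OMPScheduleType enum in this commit.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions as defined by the modifier flags in this commit.
constexpr std::uint32_t kUnordered = 1u << 5;
constexpr std::uint32_t kOrdered = 1u << 6;
constexpr std::uint32_t kNomerge = 1u << 7;
constexpr std::uint32_t kMonotonic = 1u << 29;
constexpr std::uint32_t kNonmonotonic = 1u << 30;
constexpr std::uint32_t kModifierMask =
    kUnordered | kOrdered | kNomerge | kMonotonic | kNonmonotonic;

// Hypothetical helper type: one field per component of a schedule value.
struct DecodedSchedule {
  std::uint32_t Base; // scheduling algorithm, e.g. 3 = dynamic chunked
  bool Unordered;
  bool Ordered;
  bool Nomerge;
  bool Monotonic;
  bool Nonmonotonic;
};

// Split a raw schedule constant into algorithm and modifier flags.
inline DecodedSchedule decodeSchedule(std::uint32_t Raw) {
  DecodedSchedule D;
  D.Base = Raw & ~kModifierMask;
  D.Unordered = (Raw & kUnordered) != 0;
  D.Ordered = (Raw & kOrdered) != 0;
  D.Nomerge = (Raw & kNomerge) != 0;
  D.Monotonic = (Raw & kMonotonic) != 0;
  D.Nonmonotonic = (Raw & kNonmonotonic) != 0;
  return D;
}

// Usage: the value from the updated "auto" test decodes to BaseAuto (6),
// unordered, nonmonotonic.
inline void example() {
  DecodedSchedule D = decodeSchedule(1073741862u);
  assert(D.Base == 6u && D.Unordered && D.Nonmonotonic);
}
```
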

llvm/include/llvm/Frontend/OpenMP/OMPConstants.h

Lines changed: 107 additions & 27 deletions
@@ -74,34 +74,114 @@ enum class IdentFlag {

 /// \note This needs to be kept in sync with kmp.h enum sched_type.
 /// Todo: Update kmp.h to include this file, and remove the enums in kmp.h
-/// To complete this, more enum values will need to be moved here.
 enum class OMPScheduleType {
-  StaticChunked = 33,
-  Static = 34, // static unspecialized
-  DynamicChunked = 35,
-  GuidedChunked = 36, // guided unspecialized
-  Runtime = 37,
-  Auto = 38, // auto
-
-  StaticBalancedChunked = 45, // static with chunk adjustment (e.g., simd)
-  GuidedSimd = 46, // guided with chunk adjustment
-  RuntimeSimd = 47, // runtime with chunk adjustment
-
-  OrderedStaticChunked = 65,
-  OrderedStatic = 66, // ordered static unspecialized
-  OrderedDynamicChunked = 67,
-  OrderedGuidedChunked = 68,
-  OrderedRuntime = 69,
-  OrderedAuto = 70, // ordered auto
-
-  DistributeChunked = 91, // distribute static chunked
-  Distribute = 92, // distribute static unspecialized
-
-  ModifierMonotonic =
-      (1 << 29), // Set if the monotonic schedule modifier was present
-  ModifierNonmonotonic =
-      (1 << 30), // Set if the nonmonotonic schedule modifier was present
-  ModifierMask = ModifierMonotonic | ModifierNonmonotonic,
+  // For typed comparisons, not a valid schedule
+  None = 0,
+
+  // Schedule algorithms
+  BaseStaticChunked = 1,
+  BaseStatic = 2,
+  BaseDynamicChunked = 3,
+  BaseGuidedChunked = 4,
+  BaseRuntime = 5,
+  BaseAuto = 6,
+  BaseTrapezoidal = 7,
+  BaseGreedy = 8,
+  BaseBalanced = 9,
+  BaseGuidedIterativeChunked = 10,
+  BaseGuidedAnalyticalChunked = 11,
+  BaseSteal = 12,
+
+  // with chunk adjustment (e.g., simd)
+  BaseStaticBalancedChunked = 13,
+  BaseGuidedSimd = 14,
+  BaseRuntimeSimd = 15,
+
+  // static schedules algorithims for distribute
+  BaseDistributeChunked = 27,
+  BaseDistribute = 28,
+
+  // Modifier flags to be combined with schedule algorithms
+  ModifierUnordered = (1 << 5),
+  ModifierOrdered = (1 << 6),
+  ModifierNomerge = (1 << 7),
+  ModifierMonotonic = (1 << 29),
+  ModifierNonmonotonic = (1 << 30),
+
+  // Masks combining multiple flags
+  OrderingMask = ModifierUnordered | ModifierOrdered | ModifierNomerge,
+  MonotonicityMask = ModifierMonotonic | ModifierNonmonotonic,
+  ModifierMask = OrderingMask | MonotonicityMask,
+
+  // valid schedule type values, without monotonicity flags
+  UnorderedStaticChunked = BaseStaticChunked | ModifierUnordered, // 33
+  UnorderedStatic = BaseStatic | ModifierUnordered, // 34
+  UnorderedDynamicChunked = BaseDynamicChunked | ModifierUnordered, // 35
+  UnorderedGuidedChunked = BaseGuidedChunked | ModifierUnordered, // 36
+  UnorderedRuntime = BaseRuntime | ModifierUnordered, // 37
+  UnorderedAuto = BaseAuto | ModifierUnordered, // 38
+  UnorderedTrapezoidal = BaseTrapezoidal | ModifierUnordered, // 39
+  UnorderedGreedy = BaseGreedy | ModifierUnordered, // 40
+  UnorderedBalanced = BaseBalanced | ModifierUnordered, // 41
+  UnorderedGuidedIterativeChunked =
+      BaseGuidedIterativeChunked | ModifierUnordered, // 42
+  UnorderedGuidedAnalyticalChunked =
+      BaseGuidedAnalyticalChunked | ModifierUnordered, // 43
+  UnorderedSteal = BaseSteal | ModifierUnordered, // 44
+
+  UnorderedStaticBalancedChunked =
+      BaseStaticBalancedChunked | ModifierUnordered, // 45
+  UnorderedGuidedSimd = BaseGuidedSimd | ModifierUnordered, // 46
+  UnorderedRuntimeSimd = BaseRuntimeSimd | ModifierUnordered, // 47
+
+  OrderedStaticChunked = BaseStaticChunked | ModifierOrdered, // 65
+  OrderedStatic = BaseStatic | ModifierOrdered, // 66
+  OrderedDynamicChunked = BaseDynamicChunked | ModifierOrdered, // 67
+  OrderedGuidedChunked = BaseGuidedChunked | ModifierOrdered, // 68
+  OrderedRuntime = BaseRuntime | ModifierOrdered, // 69
+  OrderedAuto = BaseAuto | ModifierOrdered, // 70
+  OrderdTrapezoidal = BaseTrapezoidal | ModifierOrdered, // 71
+
+  OrderedDistributeChunked = BaseDistributeChunked | ModifierOrdered, // 91
+  OrderedDistribute = BaseDistribute | ModifierOrdered, // 92
+
+  NomergeUnorderedStaticChunked =
+      BaseStaticChunked | ModifierUnordered | ModifierNomerge, // 161
+  NomergeUnorderedStatic =
+      BaseStatic | ModifierUnordered | ModifierNomerge, // 162
+  NomergeUnorderedDynamicChunked =
+      BaseDynamicChunked | ModifierUnordered | ModifierNomerge, // 163
+  NomergeUnorderedGuidedChunked =
+      BaseGuidedChunked | ModifierUnordered | ModifierNomerge, // 164
+  NomergeUnorderedRuntime =
+      BaseRuntime | ModifierUnordered | ModifierNomerge, // 165
+  NomergeUnorderedAuto = BaseAuto | ModifierUnordered | ModifierNomerge, // 166
+  NomergeUnorderedTrapezoidal =
+      BaseTrapezoidal | ModifierUnordered | ModifierNomerge, // 167
+  NomergeUnorderedGreedy =
+      BaseGreedy | ModifierUnordered | ModifierNomerge, // 168
+  NomergeUnorderedBalanced =
+      BaseBalanced | ModifierUnordered | ModifierNomerge, // 169
+  NomergeUnorderedGuidedIterativeChunked =
+      BaseGuidedIterativeChunked | ModifierUnordered | ModifierNomerge, // 170
+  NomergeUnorderedGuidedAnalyticalChunked =
+      BaseGuidedAnalyticalChunked | ModifierUnordered | ModifierNomerge, // 171
+  NomergeUnorderedSteal =
+      BaseSteal | ModifierUnordered | ModifierNomerge, // 172
+
+  NomergeOrderedStaticChunked =
+      BaseStaticChunked | ModifierOrdered | ModifierNomerge, // 193
+  NomergeOrderedStatic = BaseStatic | ModifierOrdered | ModifierNomerge, // 194
+  NomergeOrderedDynamicChunked =
+      BaseDynamicChunked | ModifierOrdered | ModifierNomerge, // 195
+  NomergeOrderedGuidedChunked =
+      BaseGuidedChunked | ModifierOrdered | ModifierNomerge, // 196
+  NomergeOrderedRuntime =
+      BaseRuntime | ModifierOrdered | ModifierNomerge, // 197
+  NomergeOrderedAuto = BaseAuto | ModifierOrdered | ModifierNomerge, // 198
+  NomergeOrderedTrapezoidal =
+      BaseTrapezoidal | ModifierOrdered | ModifierNomerge, // 199
+
   LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue */ ModifierMask)
 };
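
To illustrate how the refactored enum composes, here is a standalone sketch using a reduced copy of the values. In LLVM the bitwise operators come from `LLVM_MARK_AS_BITMASK_ENUM`; this sketch defines its own operators instead, so `SchedType`, `raw`, and `checkDocumentedValues` are hypothetical names and not part of the LLVM API. It checks that a few algorithm/modifier combinations produce the numeric values given in the trailing comments of the enum.

```cpp
#include <cassert>
#include <cstdint>

// Reduced, illustrative copy of the enum; values taken from the diff.
enum class SchedType : std::uint32_t {
  BaseStaticChunked = 1,
  BaseDynamicChunked = 3,
  BaseTrapezoidal = 7,

  ModifierUnordered = 1u << 5,
  ModifierOrdered = 1u << 6,
  ModifierNomerge = 1u << 7,
  ModifierMonotonic = 1u << 29,
  ModifierNonmonotonic = 1u << 30,

  OrderingMask = ModifierUnordered | ModifierOrdered | ModifierNomerge,
  MonotonicityMask = ModifierMonotonic | ModifierNonmonotonic,
  ModifierMask = OrderingMask | MonotonicityMask
};

// Hand-written operators standing in for LLVM_MARK_AS_BITMASK_ENUM.
inline SchedType operator|(SchedType A, SchedType B) {
  return static_cast<SchedType>(static_cast<std::uint32_t>(A) |
                                static_cast<std::uint32_t>(B));
}
inline SchedType operator&(SchedType A, SchedType B) {
  return static_cast<SchedType>(static_cast<std::uint32_t>(A) &
                                static_cast<std::uint32_t>(B));
}
inline SchedType operator~(SchedType A) {
  return static_cast<SchedType>(~static_cast<std::uint32_t>(A));
}

// Numeric value as passed across the libomp binary interface.
inline std::uint32_t raw(SchedType S) { return static_cast<std::uint32_t>(S); }

// Verify a few combinations against the comments in the enum.
inline void checkDocumentedValues() {
  using S = SchedType;
  assert(raw(S::BaseStaticChunked | S::ModifierUnordered) == 33);
  assert(raw(S::BaseStaticChunked | S::ModifierOrdered) == 65);
  assert(raw(S::BaseTrapezoidal | S::ModifierOrdered) == 71);
  assert(raw(S::BaseStaticChunked | S::ModifierUnordered |
             S::ModifierNomerge) == 161);
  assert(raw(S::BaseStaticChunked | S::ModifierOrdered |
             S::ModifierNomerge) == 193);
}
```

With the flags separated like this, a question such as "is this schedule ordered?" becomes a single mask test, `(Sched & SchedType::ModifierOrdered) != SchedType{}`, rather than a comparison against every ordered enumerator.
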

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h

Lines changed: 14 additions & 5 deletions
@@ -344,6 +344,7 @@ class OpenMPIRBuilder {
                                 ArrayRef<CanonicalLoopInfo *> Loops,
                                 InsertPointTy ComputeIP);

+private:
   /// Modifies the canonical loop to be a statically-scheduled workshare loop.
   ///
   /// This takes a \p LoopInfo representing a canonical loop, such as the one
@@ -403,17 +404,15 @@ class OpenMPIRBuilder {
   ///                     the loop.
   /// \param Chunk    The size of loop chunk considered as a unit when
   ///                 scheduling. If \p nullptr, defaults to 1.
-  /// \param Ordered  Indicates whether the ordered clause is specified without
-  ///                 parameter.
   ///
   /// \returns Point where to insert code after the workshare construct.
   InsertPointTy applyDynamicWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI,
                                           InsertPointTy AllocaIP,
                                           omp::OMPScheduleType SchedType,
                                           bool NeedsBarrier,
-                                          Value *Chunk = nullptr,
-                                          bool Ordered = false);
+                                          Value *Chunk = nullptr);

+public:
   /// Modifies the canonical loop to be a workshare loop.
   ///
   /// This takes a \p LoopInfo representing a canonical loop, such as the one
@@ -436,13 +435,23 @@ class OpenMPIRBuilder {
   ///                     the loop.
   /// \param SchedKind Scheduling algorithm to use.
   /// \param ChunkSize The chunk size for the inner loop.
+  /// \param HasSimdModifier Whether the simd modifier is present in the
+  ///                        schedule clause.
+  /// \param HasMonotonicModifier Whether the monotonic modifier is present in
+  ///                             the schedule clause.
+  /// \param HasNonmonotonicModifier Whether the nonmonotonic modifier is
+  ///                                present in the schedule clause.
+  /// \param HasOrderedClause Whether the (parameterless) ordered clause is
+  ///                         present.
   ///
   /// \returns Point where to insert code after the workshare construct.
   InsertPointTy applyWorkshareLoop(
       DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP,
       bool NeedsBarrier,
       llvm::omp::ScheduleKind SchedKind = llvm::omp::OMP_SCHEDULE_Default,
-      Value *ChunkSize = nullptr);
+      Value *ChunkSize = nullptr, bool HasSimdModifier = false,
+      bool HasMonotonicModifier = false, bool HasNonmonotonicModifier = false,
+      bool HasOrderedClause = false);

   /// Tile a loop nest.
   ///
