[AArch64][SME] Allow inlining when streaming-mode attributes dont match up. #68415
…ch up.

The use-case here is to support things like:

`int foo(int x, int y) __arm_streaming { return std::max<int>(x, y); }`

where the call to non-streaming `std::max<int>(x, y)` can be safely inlined into the streaming function.

This is a first step and will need further work to allow more cases (e.g. more fine-grained analysis of the function calls to ensure they don't result in any incompatible instructions for the requested mode).
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-aarch64

Changes

The use-case here is to support things like:

`int foo(int x, int y) __arm_streaming { return std::max<int>(x, y); }`

where the call to non-streaming `std::max<int>(x, y)` can be safely inlined into the streaming function.

This is a first step and will need further work to allow more cases (e.g. more fine-grained analysis of the function calls to ensure they don't result in any incompatible instructions for the requested mode).

Full diff: https://github.com/llvm/llvm-project/pull/68415.diff

3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index cded28054f59259..d053350c08bf9ab 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -190,16 +190,49 @@ static cl::opt<bool> EnableFixedwidthAutovecInStreamingMode(
static cl::opt<bool> EnableScalableAutovecInStreamingMode(
"enable-scalable-autovec-in-streaming-mode", cl::init(false), cl::Hidden);
+static bool isSMEABIRoutineCall(const CallInst &CI) {
+ const auto *F = CI.getCalledFunction();
+ return F && StringSwitch<bool>(F->getName())
+ .Case("__arm_sme_state", true)
+ .Case("__arm_tpidr2_save", true)
+ .Case("__arm_tpidr2_restore", true)
+ .Case("__arm_za_disable", true)
+ .Default(false);
+}
+
+/// Returns true if the function has explicit operations that can only be lowered
+/// using incompatible instructions for the selected mode.
+/// This also returns true if the function F may use or modify ZA state.
+static bool hasPossibleIncompatibleOps(const Function *F) {
+ for (const BasicBlock &BB : *F) {
+ for (const Instruction &I : BB) {
+ // Be conservative for now and assume that any call to inline asm or to
+ // intrinsics could result in non-streaming ops (e.g. calls to
+ // @llvm.aarch64.* or @llvm.gather/scatter intrinsics). We can assume that
+ // all native LLVM instructions can be lowered to compatible instructions.
+ if (isa<CallInst>(I) && !I.isDebugOrPseudoInst() &&
+ (cast<CallInst>(I).isInlineAsm() || isa<IntrinsicInst>(I) ||
+ isSMEABIRoutineCall(cast<CallInst>(I))))
+ return true;
+ }
+ }
+ return false;
+}
+
bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
const Function *Callee) const {
SMEAttrs CallerAttrs(*Caller);
SMEAttrs CalleeAttrs(*Callee);
- if (CallerAttrs.requiresSMChange(CalleeAttrs,
- /*BodyOverridesInterface=*/true) ||
- CallerAttrs.requiresLazySave(CalleeAttrs) ||
- CalleeAttrs.hasNewZABody())
+ if (CalleeAttrs.hasNewZABody())
return false;
+ if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
+ CallerAttrs.requiresSMChange(CalleeAttrs,
+ /*BodyOverridesInterface=*/true)) {
+ if (hasPossibleIncompatibleOps(Callee))
+ return false;
+ }
+
const TargetMachine &TM = getTLI()->getTargetMachine();
const FeatureBitset &CallerBits =
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index 3df5400875ae288..f2f5768dbe9c6e9 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -102,11 +102,11 @@ entry:
; [ ] N -> SC
; [ ] N -> N + B
; [ ] N -> SC + B
-define void @normal_caller_streaming_callee_dont_inline() {
-; CHECK-LABEL: define void @normal_caller_streaming_callee_dont_inline
+define void @normal_caller_streaming_callee_inline() {
+; CHECK-LABEL: define void @normal_caller_streaming_callee_inline
; CHECK-SAME: () #[[ATTR1]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @streaming_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -136,11 +136,11 @@ entry:
; [ ] N -> SC
; [x] N -> N + B
; [ ] N -> SC + B
-define void @normal_caller_locally_streaming_callee_dont_inline() {
-; CHECK-LABEL: define void @normal_caller_locally_streaming_callee_dont_inline
+define void @normal_caller_locally_streaming_callee_inline() {
+; CHECK-LABEL: define void @normal_caller_locally_streaming_callee_inline
; CHECK-SAME: () #[[ATTR1]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @locally_streaming_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -153,11 +153,11 @@ entry:
; [ ] N -> SC
; [ ] N -> N + B
; [x] N -> SC + B
-define void @normal_caller_streaming_compatible_locally_streaming_callee_dont_inline() {
-; CHECK-LABEL: define void @normal_caller_streaming_compatible_locally_streaming_callee_dont_inline
+define void @normal_caller_streaming_compatible_locally_streaming_callee_inline() {
+; CHECK-LABEL: define void @normal_caller_streaming_compatible_locally_streaming_callee_inline
; CHECK-SAME: () #[[ATTR1]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @streaming_compatible_locally_streaming_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -170,11 +170,11 @@ entry:
; [ ] S -> SC
; [ ] S -> N + B
; [ ] S -> SC + B
-define void @streaming_caller_normal_callee_dont_inline() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_caller_normal_callee_dont_inline
+define void @streaming_caller_normal_callee_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define void @streaming_caller_normal_callee_inline
; CHECK-SAME: () #[[ATTR2]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @normal_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -255,11 +255,11 @@ entry:
; [ ] N + B -> SC
; [ ] N + B -> N + B
; [ ] N + B -> SC + B
-define void @locally_streaming_caller_normal_callee_dont_inline() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_caller_normal_callee_dont_inline
+define void @locally_streaming_caller_normal_callee_inline() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define void @locally_streaming_caller_normal_callee_inline
; CHECK-SAME: () #[[ATTR3]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @normal_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -340,11 +340,11 @@ entry:
; [ ] SC -> SC
; [ ] SC -> N + B
; [ ] SC -> SC + B
-define void @streaming_compatible_caller_normal_callee_dont_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_normal_callee_dont_inline
+define void @streaming_compatible_caller_normal_callee_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define void @streaming_compatible_caller_normal_callee_inline
; CHECK-SAME: () #[[ATTR0]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @normal_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -357,11 +357,11 @@ entry:
; [ ] SC -> SC
; [ ] SC -> N + B
; [ ] SC -> SC + B
-define void @streaming_compatible_caller_streaming_callee_dont_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_streaming_callee_dont_inline
+define void @streaming_compatible_caller_streaming_callee_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define void @streaming_compatible_caller_streaming_callee_inline
; CHECK-SAME: () #[[ATTR0]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @streaming_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -391,11 +391,11 @@ entry:
; [ ] SC -> SC
; [x] SC -> N + B
; [ ] SC -> SC + B
-define void @streaming_compatible_caller_locally_streaming_callee_dont_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_locally_streaming_callee_dont_inline
+define void @streaming_compatible_caller_locally_streaming_callee_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define void @streaming_compatible_caller_locally_streaming_callee_inline
; CHECK-SAME: () #[[ATTR0]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @locally_streaming_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -408,11 +408,11 @@ entry:
; [ ] SC -> SC
; [ ] SC -> N + B
; [x] SC -> SC + B
-define void @streaming_compatible_caller_streaming_compatible_locally_streaming_callee_dont_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_streaming_compatible_locally_streaming_callee_dont_inline
+define void @streaming_compatible_caller_streaming_compatible_locally_streaming_callee_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define void @streaming_compatible_caller_streaming_compatible_locally_streaming_callee_inline
; CHECK-SAME: () #[[ATTR0]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @streaming_compatible_locally_streaming_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -424,11 +424,11 @@ entry:
; [ ] SC + B -> SC
; [ ] SC + B -> N + B
; [ ] SC + B -> SC + B
-define void @streaming_compatible_locally_streaming_caller_normal_callee_dont_inline() "aarch64_pstate_sm_compatible" "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @streaming_compatible_locally_streaming_caller_normal_callee_dont_inline
+define void @streaming_compatible_locally_streaming_caller_normal_callee_inline() "aarch64_pstate_sm_compatible" "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define void @streaming_compatible_locally_streaming_caller_normal_callee_inline
; CHECK-SAME: () #[[ATTR4]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @normal_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -503,3 +503,81 @@ entry:
call void @streaming_compatible_locally_streaming_callee()
ret void
}
+
+define void @normal_callee_with_inlineasm() {
+; CHECK-LABEL: define void @normal_callee_with_inlineasm
+; CHECK-SAME: () #[[ATTR1]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: call void asm sideeffect "
+; CHECK-NEXT: ret void
+;
+entry:
+ call void asm sideeffect "; inlineasm", ""()
+ ret void
+}
+
+define void @streaming_caller_normal_callee_with_inlineasm_dont_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define void @streaming_caller_normal_callee_with_inlineasm_dont_inline
+; CHECK-SAME: () #[[ATTR2]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: call void @normal_callee_with_inlineasm()
+; CHECK-NEXT: ret void
+;
+entry:
+ call void @normal_callee_with_inlineasm()
+ ret void
+}
+
+define i64 @normal_callee_with_intrinsic_call() {
+; CHECK-LABEL: define i64 @normal_callee_with_intrinsic_call
+; CHECK-SAME: () #[[ATTR1]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[RES:%.*]] = call i64 @llvm.aarch64.sve.cntb(i32 4)
+; CHECK-NEXT: ret i64 [[RES]]
+;
+entry:
+ %res = call i64 @llvm.aarch64.sve.cntb(i32 4)
+ ret i64 %res
+}
+
+define i64 @streaming_caller_normal_callee_with_intrinsic_call_dont_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i64 @streaming_caller_normal_callee_with_intrinsic_call_dont_inline
+; CHECK-SAME: () #[[ATTR2]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[RES:%.*]] = call i64 @normal_callee_with_intrinsic_call()
+; CHECK-NEXT: ret i64 [[RES]]
+;
+entry:
+ %res = call i64 @normal_callee_with_intrinsic_call()
+ ret i64 %res
+}
+
+declare i64 @llvm.aarch64.sve.cntb(i32)
+
+define i64 @normal_callee_call_sme_state() {
+; CHECK-LABEL: define i64 @normal_callee_call_sme_state
+; CHECK-SAME: () #[[ATTR1]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[RES:%.*]] = call { i64, i64 } @__arm_sme_state()
+; CHECK-NEXT: [[RES_0:%.*]] = extractvalue { i64, i64 } [[RES]], 0
+; CHECK-NEXT: ret i64 [[RES_0]]
+;
+entry:
+ %res = call {i64, i64} @__arm_sme_state()
+ %res.0 = extractvalue {i64, i64} %res, 0
+ ret i64 %res.0
+}
+
+declare {i64, i64} @__arm_sme_state()
+
+define i64 @streaming_caller_normal_callee_call_sme_state_dont_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i64 @streaming_caller_normal_callee_call_sme_state_dont_inline
+; CHECK-SAME: () #[[ATTR2]] {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[RES:%.*]] = call i64 @normal_callee_call_sme_state()
+; CHECK-NEXT: ret i64 [[RES]]
+;
+entry:
+ %res = call i64 @normal_callee_call_sme_state()
+ ret i64 %res
+}
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
index a833e7a911ac03f..7b104977cff5a7b 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
@@ -3,10 +3,12 @@
declare void @inlined_body()
+;
; Define some functions that will be called by the functions below.
; These just call a '...body()' function. If we see the call to one of
; these functions being replaced by '...body()', then we know it has been
; inlined.
+;
define void @nonza_callee() {
; CHECK-LABEL: define void @nonza_callee
@@ -42,6 +44,7 @@ define void @new_za_callee() "aarch64_pstate_za_new" {
ret void
}
+;
; Now test that inlining only happens when no lazy-save is needed.
; Test for a number of combinations, where:
; N Not using ZA.
@@ -85,7 +88,7 @@ define void @new_za_caller_nonza_callee_dont_inline() "aarch64_pstate_za_new" {
; CHECK-LABEL: define void @new_za_caller_nonza_callee_dont_inline
; CHECK-SAME: () #[[ATTR2]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @nonza_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -130,7 +133,7 @@ define void @shared_za_caller_nonza_callee_dont_inline() "aarch64_pstate_za_shar
; CHECK-LABEL: define void @shared_za_caller_nonza_callee_dont_inline
; CHECK-SAME: () #[[ATTR1]] {
; CHECK-NEXT: entry:
-; CHECK-NEXT: call void @nonza_callee()
+; CHECK-NEXT: call void @inlined_body()
; CHECK-NEXT: ret void
;
entry:
@@ -167,3 +170,67 @@ entry:
call void @shared_za_callee()
ret void
}
+
+define void @private_za_callee_call_za_disable() {
+; CHECK-LABEL: define void @private_za_callee_call_za_disable
+; CHECK-SAME: () #[[ATTR0]] {
+; CHECK-NEXT: call void @__arm_za_disable()
+; CHECK-NEXT: ret void
+;
+ call void @__arm_za_disable()
+ ret void
+}
+
+define void @shared_za_caller_private_za_callee_call_za_disable() "aarch64_pstate_za_shared" {
+; CHECK-LABEL: define void @shared_za_caller_private_za_callee_call_za_disable
+; CHECK-SAME: () #[[ATTR1]] {
+; CHECK-NEXT: call void @private_za_callee_call_za_disable()
+; CHECK-NEXT: ret void
+;
+ call void @private_za_callee_call_za_disable()
+ ret void
+}
+
+define void @private_za_callee_call_tpidr2_save() {
+; CHECK-LABEL: define void @private_za_callee_call_tpidr2_save
+; CHECK-SAME: () #[[ATTR0]] {
+; CHECK-NEXT: call void @__arm_tpidr2_save()
+; CHECK-NEXT: ret void
+;
+ call void @__arm_tpidr2_save()
+ ret void
+}
+
+define void @shared_za_caller_private_za_callee_call_tpidr2_save_dont_inline() "aarch64_pstate_za_shared" {
+; CHECK-LABEL: define void @shared_za_caller_private_za_callee_call_tpidr2_save_dont_inline
+; CHECK-SAME: () #[[ATTR1]] {
+; CHECK-NEXT: call void @private_za_callee_call_tpidr2_save()
+; CHECK-NEXT: ret void
+;
+ call void @private_za_callee_call_tpidr2_save()
+ ret void
+}
+
+define void @private_za_callee_call_tpidr2_restore(ptr %ptr) {
+; CHECK-LABEL: define void @private_za_callee_call_tpidr2_restore
+; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT: call void @__arm_tpidr2_restore(ptr [[PTR]])
+; CHECK-NEXT: ret void
+;
+ call void @__arm_tpidr2_restore(ptr %ptr)
+ ret void
+}
+
+define void @shared_za_caller_private_za_callee_call_tpidr2_restore_dont_inline(ptr %ptr) "aarch64_pstate_za_shared" {
+; CHECK-LABEL: define void @shared_za_caller_private_za_callee_call_tpidr2_restore_dont_inline
+; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR1]] {
+; CHECK-NEXT: call void @private_za_callee_call_tpidr2_restore(ptr [[PTR]])
+; CHECK-NEXT: ret void
+;
+ call void @private_za_callee_call_tpidr2_restore(ptr %ptr)
+ ret void
+}
+
+declare void @__arm_za_disable()
+declare void @__arm_tpidr2_save()
+declare void @__arm_tpidr2_restore(ptr)
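
The reordered checks in `areInlineCompatible` from the diff above can be modelled outside LLVM roughly as follows. This is only a minimal sketch: `Fn`, `Call` and `CallKind` are hypothetical stand-ins rather than LLVM types, the results of `requiresLazySave` and `requiresSMChange` are passed in as plain booleans, and the subtarget feature-bit comparison that follows in the real function is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for llvm::CallInst / llvm::Function.
enum class CallKind { Plain, InlineAsm, Intrinsic, DebugIntrinsic };

struct Call {
  CallKind Kind;
  std::string CalleeName; // empty for inline asm
};

struct Fn {
  bool HasNewZABody = false;
  std::vector<Call> Calls; // only calls matter for this check
};

// Mirrors the StringSwitch in the patch: the SME ABI support routines.
bool isSMEABIRoutine(const std::string &Name) {
  return Name == "__arm_sme_state" || Name == "__arm_tpidr2_save" ||
         Name == "__arm_tpidr2_restore" || Name == "__arm_za_disable";
}

// Conservative scan: inline asm, any non-debug intrinsic call, or a call to
// an SME ABI routine is treated as possibly incompatible.
bool hasPossibleIncompatibleOps(const Fn &F) {
  for (const Call &C : F.Calls) {
    if (C.Kind == CallKind::DebugIntrinsic)
      continue;
    if (C.Kind == CallKind::InlineAsm || C.Kind == CallKind::Intrinsic ||
        isSMEABIRoutine(C.CalleeName))
      return true;
  }
  return false;
}

// New ordering: a new-ZA body always blocks inlining; a lazy save or a
// streaming-mode change only blocks it when the callee body looks risky.
bool areInlineCompatible(const Fn &Callee, bool RequiresLazySave,
                         bool RequiresSMChange) {
  if (Callee.HasNewZABody)
    return false;
  if ((RequiresLazySave || RequiresSMChange) &&
      hasPossibleIncompatibleOps(Callee))
    return false;
  return true; // the real function goes on to compare feature bits
}
```

Under this model, a callee that only calls `@inlined_body` is accepted across a mode mismatch, while the callees from the new `..._with_inlineasm_dont_inline`, `..._with_intrinsic_call_dont_inline` and `..._call_sme_state_dont_inline` tests are rejected.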
You can test this locally with the following command:
git-clang-format --diff 48ee6bf563924e2b4b620ed4c53b2d9f476f392c 2ec46c7d50dde0c0dddd39c3936c57310bb09d4e -- llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d053350c08bf..1be023473d4e 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -200,9 +200,9 @@ static bool isSMEABIRoutineCall(const CallInst &CI) {
.Default(false);
}
-/// Returns true if the function has explicit operations that can only be lowered
-/// using incompatible instructions for the selected mode.
-/// This also returns true if the function F may use or modify ZA state.
+/// Returns true if the function has explicit operations that can only be
+/// lowered using incompatible instructions for the selected mode. This also
+/// returns true if the function F may use or modify ZA state.
static bool hasPossibleIncompatibleOps(const Function *F) {
for (const BasicBlock &BB : *F) {
for (const Instruction &I : BB) {
LGTM
// all native LLVM instructions can be lowered to compatible instructions.
if (isa<CallInst>(I) && !I.isDebugOrPseudoInst() &&
(cast<CallInst>(I).isInlineAsm() || isa<IntrinsicInst>(I) ||
isSMEABIRoutineCall(cast<CallInst>(I))))
This would be a great place for a remark that explains why we can't inline.
Yes, that would be useful. We could update the `TargetTransformInfo::areInlineCompatible` interface to return an optional message, so that InlineCost can pass that into the `InlineResult::failure("conflicting attributes[: some specific reason here]")`. I'll look into that.
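
One possible shape for that idea, sketched outside LLVM: the check returns an optional reason string instead of a bare `bool`, and the caller appends it to the generic "conflicting attributes" message. Everything here is an assumption made for illustration (`CalleeSummary`, `whyNotInlineCompatible`, `formatFailure` and the reason strings are hypothetical); it is not the interface that was eventually proposed or merged.

```cpp
#include <optional>
#include <string>

// Hypothetical summary of the facts the check looks at.
struct CalleeSummary {
  bool HasNewZABody = false;
  bool NeedsModeChangeOrLazySave = false;
  bool HasInlineAsm = false;
  bool HasIntrinsicCall = false;
  bool CallsSMEABIRoutine = false;
};

// std::nullopt means "compatible"; otherwise the string says why not.
std::optional<std::string> whyNotInlineCompatible(const CalleeSummary &C) {
  if (C.HasNewZABody)
    return std::string("callee has a new-ZA body");
  if (!C.NeedsModeChangeOrLazySave)
    return std::nullopt;
  if (C.HasInlineAsm)
    return std::string("callee contains inline asm");
  if (C.HasIntrinsicCall)
    return std::string("callee calls an intrinsic");
  if (C.CallsSMEABIRoutine)
    return std::string("callee calls an SME ABI routine");
  return std::nullopt;
}

// Builds the text a caller could hand to something like
// InlineResult::failure(...) once a rejection has been decided.
std::string formatFailure(const std::optional<std::string> &Reason) {
  std::string Msg = "conflicting attributes";
  if (Reason)
    Msg += ": " + *Reason;
  return Msg;
}
```

Keeping the reason optional means existing targets that only answer yes or no would still produce the plain "conflicting attributes" message.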
gentle ping
LGTM! Forever onwards and upwards!
…s. (#68416)

This is a stacked PR following on from #68415

This patch has two purposes:
(1) It tries to make inlining more likely when it can avoid a streaming-mode change.
(2) It avoids inlining when inlining causes more streaming-mode changes.

An example of (1) is:
```
  void streaming_compatible_bar(void);

  void foo(void) __arm_streaming {
    /* other code */
    streaming_compatible_bar();
    /* other code */
  }

  void f(void) {
    foo();            // expensive streaming mode change
  }

  ->

  void f(void) {
    /* other code */
    streaming_compatible_bar();
    /* other code */
  }
```
where it wouldn't have inlined the function when foo would be a non-streaming function.

An example of (2) is:
```
  void streaming_bar(void) __arm_streaming;

  void foo(void) __arm_streaming {
    streaming_bar();
    streaming_bar();
  }

  void f(void) {
    foo();            // expensive streaming mode change
  }

  -> (do not inline into)

  void f(void) {
    streaming_bar();  // these are now two expensive streaming mode changes
    streaming_bar();
  }
```
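
The two examples can be read as comparing the number of streaming-mode changes before and after inlining. The sketch below illustrates that comparison only; it is not the cost model used in #68416. `Mode`, `Fn` and both counting helpers are hypothetical, and `requiresSMChange` is a simplified rule that ignores locally-streaming bodies.

```cpp
#include <vector>

enum class Mode { Normal, Streaming, StreamingCompatible };

// Simplified assumption: calling a streaming-compatible function never needs
// a mode change; otherwise a change is needed whenever the modes differ.
bool requiresSMChange(Mode Caller, Mode Callee) {
  if (Callee == Mode::StreamingCompatible)
    return false;
  return Caller != Callee;
}

// Hypothetical function summary: its own mode plus the modes of the
// functions it calls.
struct Fn {
  Mode Interface;
  std::vector<Mode> CalleeModes;
};

// Mode changes paid at the call site if Callee stays out of line.
int changesIfNotInlined(const Fn &Caller, const Fn &Callee) {
  return requiresSMChange(Caller.Interface, Callee.Interface) ? 1 : 0;
}

// Mode changes paid if Callee's calls are moved into Caller.
int changesIfInlined(const Fn &Caller, const Fn &Callee) {
  int N = 0;
  for (Mode M : Callee.CalleeModes)
    if (requiresSMChange(Caller.Interface, M))
      ++N;
  return N;
}
```

For example (1), `f` is normal and `foo` is streaming with one streaming-compatible call, so the count drops from 1 to 0 when `foo` is inlined. For example (2), `foo` makes two streaming calls, so the count rises from 1 to 2, which is the case the patch wants to avoid.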