Commit 4ebc0c5

[HLSL][Docs] Add documentation for HLSL functions
This adds a new document that covers the HLSL approach to function calls and parameter semantics. At time of writing this document is a proposal for the implementation.
1 parent c3fa4b7 commit 4ebc0c5

2 files changed

+317
-0
lines changed

clang/docs/HLSL/FunctionCalls.rst

Lines changed: 316 additions & 0 deletions
@@ -0,0 +1,316 @@
===================
HLSL Function Calls
===================

.. contents::
   :local:

Introduction
============

This document describes the design and implementation of HLSL's function call
semantics in Clang. This includes details related to argument conversion and
parameter lifetimes.

This document does not seek to serve as official documentation for HLSL's
call semantics, but does provide an overview to assist a reader. The
authoritative documentation for HLSL's language semantics is the `draft language
specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_.


Argument Semantics
==================

In HLSL, all function arguments are passed by value in and out of functions.
HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
``inout``). In a function declaration a parameter may be annotated in any of the
following ways:

#. <no parameter annotation> - denotes input
#. ``in`` - denotes input
#. ``out`` - denotes output
#. ``in out`` - denotes input and output
#. ``out in`` - denotes input and output
#. ``inout`` - denotes input and output


Parameters that are exclusively input behave like C/C++ parameters that are
passed by value.

For parameters that are output (or input and output), a temporary value is
created in the caller. The temporary value is then passed by-address. For
output-only parameters, the temporary is uninitialized when passed (it is
undefined behavior to not explicitly initialize an ``out`` parameter inside a
function). For input and output parameters, the temporary is initialized from
the lvalue argument expression through implicit or explicit casting from the
lvalue argument type to the parameter type.

On return of the function, the values of any parameter temporaries are written
back to the argument expression through an inverted conversion sequence (if an
``out`` parameter was not initialized in the function, the uninitialized value
may be written back).

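The copy-in/copy-out behavior described above can be modelled in plain C++.
The following is a minimal sketch under stated assumptions, not Clang's actual
lowering: the names ``CalleeInOut``, ``CalleeOut``, ``CallWithInOut`` and
``CallWithOut`` are hypothetical, and ordinary C++ conversions stand in for
HLSL's implicit casts.

.. code-block:: c++

  #include <cassert>

  // Stand-in for an HLSL function `void CalleeInOut(inout int X)`.
  // The parameter temporary is passed by address.
  void CalleeInOut(int *X) { *X += 1; }

  // Stand-in for an HLSL function `void CalleeOut(out int X)`.
  // The callee must initialize the parameter itself.
  void CalleeOut(int *X) { *X = 7; }

  // Models the call `CalleeInOut(Arg)` where Arg is a float lvalue.
  void CallWithInOut(float &Arg) {
    int Tmp = static_cast<int>(Arg); // copy-in: argument type -> parameter type
    CalleeInOut(&Tmp);               // the callee only sees the temporary
    Arg = static_cast<float>(Tmp);   // copy-out: inverted conversion on return
  }

  // Models the call `CalleeOut(Arg)`: no copy-in happens for `out`.
  void CallWithOut(float &Arg) {
    int Tmp;                         // deliberately left uninitialized
    CalleeOut(&Tmp);                 // the callee writes the temporary
    Arg = static_cast<float>(Tmp);   // copy-out
  }

  // Small usage check for the two helpers above.
  void CheckCopyInCopyOut() {
    float F = 2.75f;
    CallWithInOut(F); // 2.75 -> 2, incremented to 3, written back as 3.0
    assert(F == 3.0f);
    CallWithOut(F);   // the previous value of F is never read
    assert(F == 7.0f);
  }

Note that the argument is only touched twice, once before and once after the
call, which is what allows an implementation to drop the temporary when no
conversion is needed and no aliasing is possible.
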

Parameters of constant-sized array type are also passed with value semantics.
This requires input parameters of arrays to construct temporaries and the
temporaries go through array-to-pointer decay when initializing parameters.

Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
no-alias rules can enable some trivial optimizations.


Array Temporaries
-----------------

Given the following example:

.. code-block:: c++

  void fn(float a[4]) {
    a[0] = a[1] + a[2] + a[3];
  }

  float4 main() : SV_Target {
    float arr[4] = {1, 1, 1, 1};
    fn(arr);
    return float4(arr[0], arr[1], arr[2], arr[3]);
  }

In C or C++, the array parameter decays to a pointer, so after the call to
``fn``, the value of ``arr[0]`` is ``3``. In HLSL, the array is passed by value,
so modifications inside ``fn`` do not propagate out.

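The difference can be reproduced in standard C++ by contrasting a decaying
array parameter with a ``std::array`` passed by value. This is only an
approximation of the HLSL rule for illustration; the function names below are
hypothetical and ``std::array`` merely stands in for an HLSL constant-sized
array.

.. code-block:: c++

  #include <array>
  #include <cassert>

  // C/C++ behavior: `float a[4]` decays to `float *`, so the write is visible
  // to the caller.
  void FnDecay(float a[4]) { a[0] = a[1] + a[2] + a[3]; }

  // Approximation of the HLSL behavior: std::array is copied on the call, so
  // the write only affects the callee's copy.
  void FnByValue(std::array<float, 4> a) { a[0] = a[1] + a[2] + a[3]; }

  float AfterDecay() {
    float arr[4] = {1, 1, 1, 1};
    FnDecay(arr);
    return arr[0]; // modified through the decayed pointer
  }

  float AfterByValue() {
    std::array<float, 4> arr = {1, 1, 1, 1};
    FnByValue(arr);
    return arr[0]; // unchanged, the callee worked on a copy
  }

  // Small usage check contrasting the two behaviors.
  void CheckArrayPassing() {
    assert(AfterDecay() == 3.0f);
    assert(AfterByValue() == 1.0f);
  }
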

.. note::

  DXC supports unsized arrays passed directly as decayed pointers, which is an
  unfortunate behavior divergence.


Out Parameter Temporaries
-------------------------

.. code-block:: c++

  void Init(inout int X, inout int Y) {
    Y = 2;
    X = 1;
  }

  void main() {
    int V;
    Init(V, V); // MSVC ABI V == 2, Itanium V == 1
  }

In the above example the result of calling ``Init`` depends on the C++ ABI
implementation. In the MSVC C++ ABI (used for the HLSL DXIL target), call
arguments are emitted right-to-left and destroyed left-to-right. This means that
the parameter initialization and destruction occur in the order: {``Y``,
``X``, ``~X``, ``~Y``}. This causes the write-back of the value of ``Y`` to occur
last, so the resulting value of ``V`` is ``2``. In the Itanium C++ ABI, the
parameter ordering is reversed, so the initialization and destruction occur in
the order: {``X``, ``Y``, ``~Y``, ``~X``}. This causes the write-back of the
value of ``X`` to occur last, resulting in the value of ``V`` being set to ``1``.

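The effect of the two orderings can be made explicit by writing out the
temporaries and write-backs by hand. The sketch below is a plain C++ model, not
generated code: ``WriteBackYLast`` and ``WriteBackXLast`` are hypothetical
names, and ``V`` is initialized to ``0`` only so that the model stays
well-defined in C++.

.. code-block:: c++

  #include <cassert>

  // Stand-in for `void Init(inout int X, inout int Y)`.
  void Init(int *X, int *Y) {
    *Y = 2;
    *X = 1;
  }

  // Order {Y, X, ~X, ~Y}: the temporary for Y is written back last.
  int WriteBackYLast() {
    int V = 0;
    int TmpY = V; // Y initialized first
    int TmpX = V;
    Init(&TmpX, &TmpY);
    V = TmpX;     // ~X
    V = TmpY;     // ~Y wins
    return V;
  }

  // Order {X, Y, ~Y, ~X}: the temporary for X is written back last.
  int WriteBackXLast() {
    int V = 0;
    int TmpX = V; // X initialized first
    int TmpY = V;
    Init(&TmpX, &TmpY);
    V = TmpY;     // ~Y
    V = TmpX;     // ~X wins
    return V;
  }

  // Small usage check for both orderings.
  void CheckWriteBackOrder() {
    assert(WriteBackYLast() == 2);
    assert(WriteBackXLast() == 1);
  }
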

.. code-block:: c++

  void Trunc(inout int3 V) { }

  void main() {
    float3 F = {1.5, 2.6, 3.3};
    Trunc(F); // F == {1.0, 2.0, 3.0}
  }

In the above example, the argument expression ``F`` undergoes element-wise
conversion from a float vector to an integer vector to create a temporary
``int3``. On expiration the temporary undergoes element-wise conversion back to
the floating point vector type ``float3``. This results in an implicit
truncation of the vector even if the value is unused in the function.


.. code-block:: c++

  void UB(out int X) {}

  void main() {
    int X = 7;
    UB(X); // X is undefined!
  }

In this example an initialized value is passed to an ``out`` parameter.
Parameters marked ``out`` are not initialized by the argument expression or
implicitly by the function. They must be explicitly initialized. In this case
the parameter is not initialized in the function, so the temporary is still
uninitialized when it is copied back to the argument expression. This is
undefined behavior in HLSL, and may be illegal in generated programs.


Clang Implementation
====================

.. note::

  The implementation described here is a proposal. It has not yet been fully
  implemented, so the current state of Clang's sources may not reflect this
  design. A prototype implementation was built on DXC, which is Clang-3.7 based.
  The prototype can be found
  `here <https://github.com/microsoft/DirectXShaderCompiler/pull/5249>`_. Many
  of the changes in the prototype implementation restore Clang-3.7 code that
  had previously been modified to its original state.


The implementation in Clang depends on two new AST nodes and minor extensions to
Clang's existing support for Objective-C write-back arguments. The goal of this
design is to capture the semantic details of HLSL function calls in the AST, and
minimize the amount of magic that needs to occur during IR generation.

The two new AST nodes are ``HLSLArrayTemporaryExpr`` and ``HLSLOutParamExpr``,
which respectively represent the temporaries used for passing arrays by value
and the temporaries created for function outputs.


Array Temporaries
-----------------

The ``HLSLArrayTemporaryExpr`` represents temporary values for input
constant-sized array arguments. This applies for all constant-sized array
arguments regardless of whether or not the parameter is constant-sized or
unsized.

.. code-block:: c++

  void SizedArray(float a[4]);
  void UnsizedArray(float a[]);

  void main() {
    float arr[4] = {1, 1, 1, 1};
    SizedArray(arr);
    UnsizedArray(arr);
  }

In the example above, the following AST is generated for the call to
``SizedArray``:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(float [4])' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (float [4])' lvalue Function 'SizedArray' 'void (float [4])'
  `-HLSLArrayTemporaryExpr 'float [4]'
    `-DeclRefExpr 'float [4]' lvalue Var 'arr' 'float [4]'

In the example above, the following AST is generated for the call to
``UnsizedArray``:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(float [])' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (float [])' lvalue Function 'UnsizedArray' 'void (float [])'
  `-HLSLArrayTemporaryExpr 'float [4]'
    `-DeclRefExpr 'float [4]' lvalue Var 'arr' 'float [4]'

In both of these cases the argument expression is of known array size so we can
initialize an appropriately sized temporary.


It is illegal in HLSL to convert an unsized array to a sized array:

.. code-block:: c++

  void SizedArray(float a[4]);
  void UnsizedArray(float a[]) {
    SizedArray(a); // Cannot convert float[] to float[4]
  }

When converting a sized array to an unsized array, an array temporary can also
be inserted. Given the following code:

.. code-block:: c++

  void UnsizedArray(float a[]);
  void SizedArray(float a[4]) {
    UnsizedArray(a);
  }

An expected AST should be something like:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(float [])' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (float [])' lvalue Function 'UnsizedArray' 'void (float [])'
  `-HLSLArrayTemporaryExpr 'float [4]'
    `-DeclRefExpr 'float [4]' lvalue ParmVar 'a' 'float [4]'


Out Parameter Temporaries
-------------------------

Output parameters are defined in HLSL as *casting expiring values* (cx-values),
which is a term made up for HLSL. A cx-value is a temporary value which may be
the result of a cast, and stores its value back to an lvalue when the value
expires.

To represent this concept in Clang we introduce a new ``HLSLOutParamExpr``. An
``HLSLOutParamExpr`` has two forms, one with a single sub-expression and one
with two sub-expressions.

The single sub-expression form is used when the argument expression and the
function parameter are the same type, so no cast is required, as in this
example:

.. code-block:: c++

  void Init(inout int X) {
    X = 1;
  }

  void main() {
    int V;
    Init(V);
  }

The expected AST formulation for this code would be something like:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(int &)' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (int &)' lvalue Function 'Init' 'void (int &)'
  `-HLSLOutParamExpr 'int' lvalue inout
    `-DeclRefExpr 'int' lvalue Var 'V' 'int'

The ``HLSLOutParamExpr`` captures whether the value is ``inout`` or ``out`` to
denote whether or not the temporary is initialized from the sub-expression. If
no casting is required the sub-expression denotes the lvalue expression that the
cx-value will be copied to when the value expires.


The two sub-expression form of the AST node is required when the argument type
is not the same as the parameter type. Given this example:

.. code-block:: c++

  void Trunc(inout int3 V) { }

  void main() {
    float3 F = {1.5, 2.6, 3.3};
    Trunc(F);
  }

For this case the ``HLSLOutParamExpr`` will have sub-expressions to record both
casting expression sequences for the initialization and write back:

.. code-block:: text

  CallExpr 'void'
  |-ImplicitCastExpr 'void (*)(int3 &)' <FunctionToPointerDecay>
  | `-DeclRefExpr 'void (int3 &)' lvalue Function 'Trunc' 'void (int3 &)'
  `-HLSLOutParamExpr 'int3' lvalue inout
    |-ImplicitCastExpr 'float3' <IntegralToFloating>
    | `-ImplicitCastExpr 'int3' <LValueToRValue>
    |   `-OpaqueValueExpr 'int3' lvalue
    `-ImplicitCastExpr 'int3' <FloatingToIntegral>
      `-ImplicitCastExpr 'float3' <LValueToRValue>
        `-DeclRefExpr 'float3' lvalue Var 'F' 'float3'

In this formulation the write-back casts are captured as the first sub-expression
and they cast from an ``OpaqueValueExpr``. In IR generation we can use the
``OpaqueValueExpr`` as a placeholder for the ``HLSLOutParamExpr``'s temporary
value on function return.

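To make the roles of the two sub-expressions concrete, the following toy model
pairs an initialization cast with a write-back cast and applies them around a
call. It is a sketch under stated assumptions and not Clang code:
``OutParamModel``, ``EmitCall`` and ``TruncViaModel`` are hypothetical names,
scalars are used instead of vectors, and the local ``Tmp`` plays the role that
the ``OpaqueValueExpr`` placeholder plays in the AST.

.. code-block:: c++

  #include <cassert>

  // Hypothetical toy model, not a Clang class. It records the two conversion
  // sequences for an argument of type ArgT bound to a parameter of type ParamT.
  template <typename ArgT, typename ParamT>
  struct OutParamModel {
    ArgT &Arg;                           // lvalue that receives the write-back
    bool IsInOut;                        // inout copies in, out does not
    ParamT (*InitCast)(const ArgT &);    // models the second sub-expression
    ArgT (*WriteBack)(const ParamT &);   // models the first sub-expression
  };

  // Models emitting a call: create the temporary, call, then write back.
  template <typename ArgT, typename ParamT, typename Fn>
  void EmitCall(OutParamModel<ArgT, ParamT> &P, Fn Callee) {
    ParamT Tmp{};                        // value-initialized in this model only
    if (P.IsInOut)
      Tmp = P.InitCast(P.Arg);           // initialization cast sequence
    Callee(Tmp);                         // callee sees only the temporary
    P.Arg = P.WriteBack(Tmp);            // write-back cast from the placeholder
  }

  // Scalar analogue of calling `Trunc(F)` with an empty function body.
  float TruncViaModel(float F) {
    OutParamModel<float, int> P{
        F, /*IsInOut=*/true,
        [](const float &A) { return static_cast<int>(A); },
        [](const int &T) { return static_cast<float>(T); }};
    EmitCall(P, [](int &) { /* empty body, like Trunc */ });
    return F;
  }

  // Small usage check: the round trip truncates even though the callee is empty.
  void CheckTruncViaModel() { assert(TruncViaModel(2.6f) == 2.0f); }

Keeping both cast sequences in one node is what lets the write-back be emitted
after the call without re-deriving the conversion during IR generation.
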

In code generation this can be implemented with some targeted extensions to the
Objective-C write-back support, specifically by extending CGCall.cpp's
``EmitWriteback`` function to support casting expressions and emission of
aggregate lvalues.

clang/docs/HLSL/HLSLDocs.rst

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ HLSL Design and Implementation
 HLSLIRReference
 ResourceTypes
 EntryFunctions
+FunctionCalls
