Commit c475bf7

Add new unsigned proposal
1 parent 3b67982 commit c475bf7

1 file changed

+228
-0
lines changed

proposals/unsigned/unsigned.txt

Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
1+
To: J3 J3/24-XXX
2+
From:
3+
Subject: Adding an UNSIGNED type to Fortran
4+
Date: 2024-October-25
5+
6+
References: 24-116, 24-102, 07-007
7+
WG5 N2230 DIN Suggestions for F202Y.pdf
8+
WG5 N2142 Fortran 2020 Feature Survey Results 201710.pdf
9+
10+
# 1. Introduction
11+
12+
We propose adding a small set of features for an unsigned data type to
13+
Fortran 202y. Unsigned integers are a basic data type used in many
14+
programming languages, such as C. They are useful for a range of
15+
applications, including, but not limited to
16+
17+
- interfacing to C
18+
- interfacing to the operating system
19+
- random number generators
20+
- image processing
21+
- signal processing
22+
- hashing
23+
- cryptography (including multi-precision arithmetic)
24+
- data compression
25+
- binary file I/O
26+
27+
Unsigned integers were the fourth most requested item to add to Fortran
28+
202x in 2017. It is the sixth item on the DIN national body list for
29+
inclusion in Fortran 202y.
30+
31+
The use cases can be roughly divided into three classes:
32+
33+
- representing unsigned integers
34+
- bit operations
35+
- modular arithmetic (modulo 2^n for a data type with n bits)
36+
37+
The two fundamental designs are:
38+
39+
- adding a dedicated type for each use case with the appropriate
40+
behavior of arithmetic operators and intrinsic functions on overflow;
41+
the different types can possibly just be different kinds for
42+
`unsigned(kind=...)`:
43+
* `unsigned`: arithmetic operations do not wrap around on overflow, or are
44+
possibly not even defined
45+
* `bits`: bit operations are defined, arithmetic operations do not
46+
wrap around or are not defined
47+
* `modular`: arithmetic operations wrap around using modulo 2^n
48+
arithmetic
49+
- one `unsigned` type that is used for all three use cases; intrinsic
50+
functions are used to implement bit operations, various overflow modes
51+
(wraparound, checked, saturated), etc. One must choose some default
52+
behavior on arithmetic overflow, discussed below.
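
The three overflow modes named above (wraparound, checked, saturated) can be sketched in standard Fortran by emulating a 32-bit unsigned value inside a 64-bit signed integer. This is only an illustration of the semantics under discussion, not part of the proposal: the module name `overflow_modes`, the constant `umax32`, and the procedures `add_wrap`, `add_saturate` and `add_checked` are all hypothetical, and the arguments are assumed to lie in the range 0 to 2^32-1.

```fortran
module overflow_modes
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  private
  public :: umax32, add_wrap, add_saturate, add_checked

  ! Largest 32-bit unsigned value, 2^32 - 1, held in a 64-bit signed integer.
  integer(int64), parameter :: umax32 = 4294967295_int64

contains

  ! Wraparound: the sum is reduced modulo 2^32.
  pure function add_wrap(a, b) result(c)
    integer(int64), intent(in) :: a, b
    integer(int64) :: c
    c = modulo(a + b, umax32 + 1_int64)
  end function add_wrap

  ! Saturated: the sum is clamped to the largest representable value.
  pure function add_saturate(a, b) result(c)
    integer(int64), intent(in) :: a, b
    integer(int64) :: c
    c = min(a + b, umax32)
  end function add_saturate

  ! Checked: overflow is reported to the caller; c is set to 0 in that case.
  pure subroutine add_checked(a, b, c, overflow)
    integer(int64), intent(in)  :: a, b
    integer(int64), intent(out) :: c
    logical, intent(out)        :: overflow
    overflow = a + b > umax32
    c = merge(0_int64, a + b, overflow)
  end subroutine add_checked

end module overflow_modes

program demo_overflow_modes
  use, intrinsic :: iso_fortran_env, only: int64
  use overflow_modes
  implicit none
  integer(int64) :: c
  logical :: overflow

  print *, add_wrap(umax32, 1_int64)       ! wraps to 0
  print *, add_saturate(umax32, 1_int64)   ! stays at 4294967295
  call add_checked(umax32, 1_int64, c, overflow)
  print *, overflow                        ! T
end program demo_overflow_modes
```

With a single `unsigned` type, the choice among these behaviors would be made by calling the corresponding intrinsic rather than by declaring a different type.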
53+
54+
There is currently neither community nor committee agreement on which of the two
55+
fundamental designs to adopt, nor on what the default overflow behavior should
56+
be for arithmetic operations if we go with the second design.
57+
58+
Consequently, we are proposing to implement the second design with
59+
undefined behavior for arithmetic overflow, consistent with the existing
60+
signed integers in Fortran, which allows processors to optionally check
61+
for overflow. This proposal leaves the door open to later implement
62+
either the first design, or the second design with defined overflow
63+
behavior (wraparound). It is also a subset of features that most
64+
people seem to agree that we need.
65+
66+
The proposal adds a solution to all three use cases (data
67+
representation, bit operations, modular arithmetic) that processors can
68+
implement and users can start using. If later we decide to either add
69+
dedicated types/kinds `bits` and `modular`, or define default arithmetic
70+
operators' overflow to wrap around, no existing code will break.
71+
72+
## 1.1 Prior art
73+
74+
At least one Fortran compiler, Sun Fortran, supported unsigned integers.
75+
Documentation can be found at [Oracle]
76+
(https://docs.oracle.com/cd/E19205-01/819-5263/aevnb/index.html).
77+
This proposal borrows heavily from that prior art, without sticking
78+
to it in all details.
79+
80+
## 1.2 Inputs to this proposal
81+
82+
In addition to the references listed above, the discussion at the
83+
Fortran proposals site
84+
https://github.com/j3-fortran/fortran_proposals/issues/2
85+
influenced this proposal.
86+
87+
88+
# 2. Goal
89+
90+
Define a new type, UNSIGNED, with a small set of intrinsic operations
91+
and intrinsic functions that would satisfy most of the use cases listed
92+
above.
93+
94+
## 2.1 Value range limitation
95+
96+
An UNSIGNED with n bits has a value range between 0 and 2^n-1.
97+
(Note that Fortran model integers have values between -2^(n-1)+1 and
98+
2^(n-1)-1).
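
As a rough illustration of these ranges, the following standard Fortran program prints, for three common bit sizes, the signed maximum reported by HUGE next to the unsigned maximum 2^n-1. The unsigned maximum is computed in 64-bit signed arithmetic so that nothing overflows; the program name is hypothetical, and the kinds INT8, INT16 and INT32 are assumed to be available on the processor.

```fortran
program demo_ranges
  use, intrinsic :: iso_fortran_env, only: int8, int16, int32, int64
  implicit none

  ! Columns: number of bits n, largest signed value, largest unsigned value 2^n - 1.
  print *, bit_size(0_int8),  huge(0_int8),  2_int64**bit_size(0_int8)  - 1   ! 8   127         255
  print *, bit_size(0_int16), huge(0_int16), 2_int64**bit_size(0_int16) - 1   ! 16  32767       65535
  print *, bit_size(0_int32), huge(0_int32), 2_int64**bit_size(0_int32) - 1   ! 32  2147483647  4294967295
end program demo_ranges
```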
99+
100+
## 2.2 Arithmetic overflow is undefined
101+
102+
As with the existing (signed) integers, arithmetic overflow is
103+
undefined. This allows processors to optionally check for overflow.
104+
105+
The following intrinsic binary arithmetic operators are extended
106+
to support UNSIGNED values:
107+
+
108+
-
109+
*
110+
/
111+
112+
The unary - operator shall not be applied to UNSIGNED values.
113+
114+
The exponentiation operator ** shall not be applied to UNSIGNED values.
115+
116+
117+
## 2.3 Prohibit mixed-mode arithmetic with INTEGER and REAL
118+
119+
The intrinsic Fortran binary arithmetic operators shall have both
120+
operands be UNSIGNED if any of the operands is UNSIGNED.
121+
122+
The intrinsic Fortran binary relational operators (defined in R1014 rel-op)
123+
shall have both operands be UNSIGNED if either of the operands is UNSIGNED.
124+
125+
To perform mixed-mode arithmetic with INTEGER or REAL values,
126+
the UNSIGNED operand must be converted to an INTEGER or REAL
127+
value explicitly via the INT or REAL intrinsic functions.
128+
129+
130+
# 3. Avoiding traps and pitfalls
131+
132+
There are numerous well-known traps and pitfalls when using unsigned
133+
integers. We attempt to avoid these as follows:
134+
- comparison of signed vs. unsigned values: require conversion via
135+
an intrinsic function or other means.
136+
- overflow from assignment of large UNSIGNED values to similar-sized
137+
INTEGER entities: Either accept truncation or specify the KIND with a
138+
larger range to the INT intrinsic function.
139+
- confusion about modulo arithmetic, especially with respect to
140+
subtraction (e.g., 3u - 5u < 3u .EQV. .false.) is avoided
141+
because `3u - 5u` is undefined and compilers can optionally give a
142+
compile-time or runtime error.
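
Two of these pitfalls can be reproduced today with signed integers, which may help show what the rules are guarding against. The sketch below is standard Fortran and assumes a two's-complement processor; the program and variable names are hypothetical. It compares the same bit pattern with the signed `<` operator and with the bitwise BLT intrinsic, and then emulates `3u - 5u` under wraparound modulo 2^32 in 64-bit signed arithmetic.

```fortran
program demo_unsigned_pitfalls
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer(int64), parameter :: two32 = 4294967296_int64   ! 2^32
  integer(int32) :: all_ones
  integer(int64) :: diff

  ! Bit pattern with all 32 bits set: -1 as a signed value,
  ! 4294967295 if the same bits are read as unsigned.
  all_ones = not(0_int32)

  print *, all_ones < 1_int32        ! T: signed comparison
  print *, blt(all_ones, 1_int32)    ! F: bitwise (unsigned) comparison

  ! 3 - 5 under wraparound modulo 2^32.
  diff = modulo(3_int64 - 5_int64, two32)
  print *, diff                      ! 4294967294
  print *, diff < 3_int64            ! F: the "difference" exceeds the minuend
end program demo_unsigned_pitfalls
```

Requiring an explicit conversion before a signed/unsigned comparison forces the programmer to say which of the two readings is intended.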
143+
144+
145+
# 4. Proposal
146+
147+
- A type name tentatively called UNSIGNED, with the same KIND
148+
mechanism as for INTEGER, plus a SELECTED_UNSIGNED_KIND function,
149+
is added to implement unsigned integers.
150+
151+
- Unsigned integer literal constants are marked with a U suffix,
152+
with an optional KIND specifier attached via the usual underscore.
153+
154+
- Add a conversion function UINT, with an optional KIND.
155+
156+
- Prohibit binary operations between INTEGER and UNSIGNED or
157+
REAL and UNSIGNED without explicit conversion.
158+
159+
- Permit unsigned integer values in a SELECT CASE.
160+
161+
- Prohibit unsigned integers as index variables in a DO statement
162+
or as array indices.
163+
164+
- Allow unsigned integers to be read or written in list-directed,
165+
namelist or unformatted I/O, and by using the usual edit
166+
descriptors such as I, B, O and Z.
167+
168+
- Allow UNSIGNED arguments to some intrinsics:
169+
- BGE(UNSIGNED, UNSIGNED) and friends
170+
- BIT_SIZE(UNSIGNED)
171+
- BTEST(UNSIGNED, INTEGER)
172+
- DIGITS(UNSIGNED)
173+
- DSHIFTL(UNSIGNED, UNSIGNED, INTEGER)
174+
- DSHIFTR(UNSIGNED, UNSIGNED, INTEGER)
175+
- HUGE(UNSIGNED)
176+
- IAND(UNSIGNED, UNSIGNED), IEOR, IOR, NOT
177+
- IBCLR(UNSIGNED, INTEGER), IBITS, IBSET
178+
- ISHFT(UNSIGNED, INTEGER, INTEGER) and ISHFTC
179+
- LEADZ(UNSIGNED) and TRAILZ
180+
- MERGE_BITS(UNSIGNED, UNSIGNED, UNSIGNED)
181+
- MIN(UNSIGNED, ...) and MAX
182+
- MOD(UNSIGNED, UNSIGNED) and MODULO
183+
- MVBITS(UNSIGNED, INTEGER, INTEGER, UNSIGNED, INTEGER)
184+
- POPCNT(UNSIGNED) and POPPAR
185+
- RANGE(UNSIGNED)
186+
- SHIFTA(UNSIGNED, INTEGER), SHIFTL, SHIFTR
187+
- TRANSFER(UNSIGNED, UNSIGNED, INTEGER)
188+
189+
- Allow UNSIGNED arguments to some array intrinsics:
190+
- IALL(UNSIGNED array, INTEGER [, mask]) and friends
191+
- IPARITY(UNSIGNED array, INTEGER [, mask])
192+
- CSHIFT(UNSIGNED array, INTEGER, INTEGER)
193+
- DOT_PRODUCT(UNSIGNED array, UNSIGNED array)
194+
- EOSHIFT(UNSIGNED array, INTEGER, INTEGER)
195+
- FINDLOC(UNSIGNED array, UNSIGNED, ...)
196+
- MATMUL(UNSIGNED array, UNSIGNED array)
197+
- MAXLOC(UNSIGNED array, ...) and MINLOC
198+
- MAXVAL(UNSIGNED array, ...), MINVAL
199+
200+
- Extend ISO_C_BINDING with KIND numbers, for example,
201+
C_UINT, C_UINT8_T.
202+
203+
- Extend ISO_C_BINDING with other things we forgot to do.
204+
205+
- Extend ISO_Fortran_binding.h appropriately.
206+
207+
- Extend ISO_FORTRAN_ENV with KIND PARAMETERs, for example,
208+
UINT8, UINT16, UINT32.
209+
210+
- Conversion of an UNSIGNED value to an INTEGER outside the range of
211+
the integer is processor-dependent.
212+
213+
- Conversion of an INTEGER value to an UNSIGNED outside the range of
214+
the unsigned type is processor-dependent.
215+
216+
- Conversion of an UNSIGNED value to an INTEGER with a wider range
217+
is exact.
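
The exact widening conversion has a well-known counterpart in current Fortran code that receives a C `uint32_t` in a 32-bit INTEGER: the value is widened to 64 bits and the sign-extension bits are masked off. The sketch below shows that idiom, assuming a two's-complement processor; the module name `widen_unsigned` and the function `u32_to_i64` are hypothetical and not part of the proposal.

```fortran
module widen_unsigned
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  private
  public :: u32_to_i64

contains

  ! Interpret the 32 bits of i as an unsigned value and return it in a
  ! 64-bit signed integer, which can hold every value in 0 .. 2^32-1.
  elemental function u32_to_i64(i) result(v)
    integer(int32), intent(in) :: i
    integer(int64) :: v
    v = iand(int(i, int64), 4294967295_int64)   ! keep only the low 32 bits
  end function u32_to_i64

end module widen_unsigned

program demo_widen_unsigned
  use, intrinsic :: iso_fortran_env, only: int32
  use widen_unsigned
  implicit none

  print *, u32_to_i64(-1_int32)         ! 4294967295
  print *, u32_to_i64(huge(0_int32))    ! 2147483647
end program demo_widen_unsigned
```

Under the proposal, the same result would presumably be written as a call to INT with a wider KIND applied to an UNSIGNED value, with no masking needed.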
218+
219+
# 5. Relation to other proposals
220+
221+
This proposal is almost identical to J3/24-116 with the main difference
222+
that overflow in arithmetic operators +, -, *, / is undefined instead of
223+
wrapping around by default.
224+
225+
This proposal complements the BITS proposal, J3/07-007r2.pdf, as
226+
proposed in J3/22-195.txt. BITS restricts its operations to logical
227+
operations and comparisons on bit lengths. This proposal adds arithmetic
228+
operations. This proposal limits the bit lengths to common powers of two.
