|
| 1 | +To: J3 J3/24-XXX |
| 2 | +From: |
| 3 | +Subject: Adding an UNSIGNED type to Fortran |
| 4 | +Date: 2024-October-25 |
| 5 | + |
| 6 | +References: 24-116, 24-102, 07-007 |
| 7 | + WG5 N2230 DIN Suggestions for F202Y.pdf |
| 8 | + WG5 N2142 Fortran 2020 Feature Survey Results 201710.pdf |
| 9 | + |
| 10 | +# 1. Introduction |
| 11 | + |
| 12 | +We propose adding a small set of features for an unsigned data type to |
| 13 | +Fortran 202y. Unsigned integers are a basic data type used in many |
| 14 | +programming languages, such as C. They are useful for a range of |
| 15 | +applications, including, but not limited to: |
| 16 | + |
| 17 | +- interfacing to C |
| 18 | +- interfacing to the operating system |
| 19 | +- random number generators |
| 20 | +- image processing |
| 21 | +- signal processing |
| 22 | +- hashing |
| 23 | +- cryptography (including multi-precision arithmetic) |
| 24 | +- data compression |
| 25 | +- binary file I/O |
| 26 | + |
| 27 | +Unsigned integers were the fourth most requested item to add to Fortran |
| 28 | +202x in 2017. They are the sixth item on the DIN national body list for |
| 29 | +inclusion in Fortran 202y. |
| 30 | + |
| 31 | +The use cases can be roughly divided into three classes: |
| 32 | + |
| 33 | +- representing unsigned integers |
| 34 | +- bit operations |
| 35 | +- modular arithmetic (modulo 2^n for a data type with n bits) |
| 36 | + |
| 37 | +The two fundamental designs are: |
| 38 | + |
| 39 | +- adding a dedicated type for each use case with the appropriate |
| 40 | + behavior of arithmetic operators and intrinsic functions on overflow; |
| 41 | + the different types can possibly just be different kinds for |
| 42 | + `unsigned(kind=...)`: |
| 43 | + * `unsigned`: arithmetic operations do not wrap around on overflow, or |
| 44 | + are possibly not even defined |
| 45 | + * `bits`: bit operations are defined, arithmetic operations do not |
| 46 | + wrap around or are not defined |
| 47 | + * `modular`: arithmetic operations wrap around using modulo-2^n |
| 48 | + arithmetic |
| 49 | +- one `unsigned` type that is used for all three use cases; intrinsic |
| 50 | + functions are used to implement bit operations, various overflow modes |
| 51 | + (wraparound, checked, saturated), etc. One must choose some default |
| 52 | + behavior on arithmetic overflow, discussed below. |
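The three overflow modes named in the second design can be illustrated with a minimal C sketch for 32-bit addition. The function names `add_wrap`, `add_checked` and `add_saturate` are hypothetical and are not part of this proposal; C is used only because its unsigned arithmetic is already defined modulo 2^n.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound: C defines unsigned addition modulo 2^32. */
uint32_t add_wrap(uint32_t a, uint32_t b) {
    return (uint32_t)(a + b);
}

/* Checked: report overflow instead of producing a value. */
bool add_checked(uint32_t a, uint32_t b, uint32_t *sum) {
    if (a > UINT32_MAX - b) {
        return false;            /* a + b would exceed 2^32 - 1 */
    }
    *sum = a + b;
    return true;
}

/* Saturated: clamp to the largest representable value. */
uint32_t add_saturate(uint32_t a, uint32_t b) {
    return (a > UINT32_MAX - b) ? UINT32_MAX : (uint32_t)(a + b);
}
```

Under the undefined-overflow default discussed here, a processor would be free to behave like any of these, or to diagnose the overflow.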
| 53 | + |
| 54 | +There is currently neither community nor committee agreement on which of |
| 55 | +the two fundamental designs to adopt, nor on what the default overflow |
| 56 | +behavior should be for arithmetic operations if we go with the second design. |
| 57 | + |
| 58 | +Consequently, we are proposing to implement the second design with |
| 59 | +undefined behavior for arithmetic overflow, consistent with the existing |
| 60 | +signed integers in Fortran, which allows processors to optionally check |
| 61 | +for overflow. This proposal leaves the door open to later implement |
| 62 | +either the first design, or the second design with defined overflow |
| 63 | +behavior (wraparound). It is also a subset of features that most |
| 64 | +people seem to agree that we need. |
| 65 | + |
| 66 | +The proposal adds a solution to all three use cases (data |
| 67 | +representation, bit operations, modular arithmetic) that processors can |
| 68 | +implement and users can start using. If later we decide to either add |
| 69 | +dedicated types/kinds `bits` and `modular`, or define overflow of the |
| 70 | +default arithmetic operators to wrap around, no existing code will break. |
| 71 | + |
| 72 | +## 1.1 Prior art |
| 73 | + |
| 74 | +At least one Fortran compiler, Sun Fortran, supported unsigned integers. |
| 75 | +Documentation can be found at |
| 76 | +[Oracle](https://docs.oracle.com/cd/E19205-01/819-5263/aevnb/index.html). |
| 77 | +This proposal borrows heavily from that prior art, without sticking |
| 78 | +to it in all details. |
| 79 | + |
| 80 | +## 1.2 Inputs to this proposal |
| 81 | + |
| 82 | +In addition to the references listed above, the discussion at the |
| 83 | +Fortran proposals site |
| 84 | +https://github.com/j3-fortran/fortran_proposals/issues/2 |
| 85 | +influenced this proposal. |
| 86 | + |
| 87 | + |
| 88 | +# 2. Goal |
| 89 | + |
| 90 | +Define a new type, UNSIGNED, with a small set of intrinsic operations |
| 91 | +and intrinsic functions that would satisfy most of the use cases listed |
| 92 | +above. |
| 93 | + |
| 94 | +## 2.1 Value range limitation |
| 95 | + |
| 96 | +An UNSIGNED with n bits has a value range between 0 and 2^n-1. |
| 97 | +(Note that Fortran model integers have values between -2^(n-1)+1 and |
| 98 | +2^(n-1)-1). |
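As a quick check of the two ranges, the following C sketch computes the upper bounds for a given bit width n. The helper names `unsigned_max` and `model_integer_max` are hypothetical, and 64-bit C types are assumed so that every width up to 64 can be represented.

```c
#include <stdint.h>

/* Largest value of an n-bit UNSIGNED: 2^n - 1 (valid for 1 <= n <= 64). */
uint64_t unsigned_max(unsigned n) {
    /* Shifting a 64-bit value by 64 is undefined in C, so n = 64 is
       handled separately. */
    return (n >= 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* Largest value of an n-bit Fortran model integer: 2^(n-1) - 1
   (valid for 1 <= n <= 64). The model range is symmetric, so the
   smallest model value is the negation of this result. */
int64_t model_integer_max(unsigned n) {
    return (int64_t)((UINT64_C(1) << (n - 1)) - 1);
}
```

For n = 8 this gives 255 for UNSIGNED and 127 for the model integer, so an UNSIGNED holds roughly twice the positive range of an INTEGER of the same size.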
| 99 | + |
| 100 | +## 2.2 Arithmetic overflow is undefined |
| 101 | + |
| 102 | +Just like the current (signed) integers, arithmetic overflow is |
| 103 | +undefined. This allows processors to optionally check for overflow. |
| 104 | + |
| 105 | +The following intrinsic binary arithmetic operators are extended |
| 106 | +to support UNSIGNED values: |
| 107 | + + |
| 108 | + - |
| 109 | + * |
| 110 | + / |
| 111 | + |
| 112 | +The unary - operator shall not be applied to UNSIGNED values. |
| 113 | + |
| 114 | +The exponentiation operator ** shall not be applied to UNSIGNED values. |
| 115 | + |
| 116 | + |
| 117 | +## 2.3 Prohibit mixed-mode arithmetic with INTEGER and REAL |
| 118 | + |
| 119 | +The intrinsic Fortran binary arithmetic operators shall have both |
| 120 | +operands be UNSIGNED if either of the operands is UNSIGNED. |
| 121 | + |
| 122 | +The intrinsic Fortran binary relational operators (defined in R1014 rel-op) |
| 123 | +shall have both operands be UNSIGNED if either of the operands is UNSIGNED. |
| 124 | + |
| 125 | +To perform mixed-mode arithmetic with INTEGER or REAL values, |
| 126 | +the UNSIGNED operand must be converted to an INTEGER or REAL |
| 127 | +value explicitly via the INT or REAL intrinsic functions. |
| 128 | + |
| 129 | + |
| 130 | +# 3. Avoiding traps and pitfalls |
| 131 | + |
| 132 | +There are numerous well-known traps and pitfalls when using unsigned |
| 133 | +integers. We attempt to avoid these as follows: |
| 134 | +- comparison of signed vs. unsigned values: require conversion via |
| 135 | + an intrinsic function or other means. |
| 136 | +- overflow from assignment of large UNSIGNED values to similar-sized |
| 137 | + INTEGER entities: Either accept truncation or specify the KIND with a |
| 138 | + larger range to the INT intrinsic function. |
| 139 | +- confusion about modulo arithmetic, especially with respect to |
| 140 | + subtraction (e.g., 3u - 5u < 3u .EQV. .false.) is avoided |
| 141 | + because `3u - 5u` is undefined and compilers can optionally give a |
| 142 | + compile-time or runtime error. |
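Two of these pitfalls can be reproduced in C, where mixed comparisons are permitted and unsigned subtraction wraps. The sketch below is illustrative only; the function names are hypothetical, and the explicit cast spells out the conversion that C would otherwise apply implicitly.

```c
#include <stdbool.h>

/* Signed vs. unsigned comparison: C converts the signed operand to
   unsigned int before comparing, so -1 becomes UINT_MAX. The cast makes
   that implicit conversion visible. */
bool mixed_less_than(int s, unsigned int u) {
    return (unsigned int)s < u;
}

/* Subtraction under wraparound: a - b is reduced modulo 2^n, so the
   "difference" can be larger than a when b > a. */
bool difference_is_smaller(unsigned int a, unsigned int b) {
    return a - b < a;
}
```

Here `mixed_less_than(-1, 1u)` is false although -1 < 1 mathematically, and `difference_is_smaller(3u, 5u)` is false because `3u - 5u` wraps to a large value. The proposal avoids the first case by prohibiting the mixed comparison and the second by leaving the overflow undefined.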
| 143 | + |
| 144 | + |
| 145 | +# 4. Proposal |
| 146 | + |
| 147 | +- A type name tentatively called UNSIGNED, with the same KIND |
| 148 | + mechanism as for INTEGER, plus a SELECTED_UNSIGNED_KIND function, |
| 149 | + is added to implement unsigned integers. |
| 150 | + |
| 151 | +- Unsigned integer literal constants are marked with a U suffix, |
| 152 | + with an optional KIND specifier attached via the usual underscore. |
| 153 | + |
| 154 | +- Add a conversion function UINT, with an optional KIND. |
| 155 | + |
| 156 | +- Prohibit binary operations between INTEGER and UNSIGNED or |
| 157 | + REAL and UNSIGNED without explicit conversion. |
| 158 | + |
| 159 | +- Permit unsigned integer values in a SELECT CASE. |
| 160 | + |
| 161 | +- Prohibit unsigned integers as index variables in a DO statement |
| 162 | + or as array indices. |
| 163 | + |
| 164 | +- Allow unsigned integers to be read or written in list-directed, |
| 165 | + namelist or unformatted I/O, and by using the usual edit |
| 166 | + descriptors such as I, B, O and Z. |
| 167 | + |
| 168 | +- Allow UNSIGNED arguments to some intrinsics: |
| 169 | + - BGE(UNSIGNED, UNSIGNED) and friends |
| 170 | + - BIT_SIZE(UNSIGNED) |
| 171 | + - BTEST(UNSIGNED, INTEGER) |
| 172 | + - DIGITS(UNSIGNED) |
| 173 | + - DSHIFTL(UNSIGNED, UNSIGNED, INTEGER) |
| 174 | + - DSHIFTR(UNSIGNED, UNSIGNED, INTEGER) |
| 175 | + - HUGE(UNSIGNED) |
| 176 | + - IAND(UNSIGNED, UNSIGNED), IEOR, IOR, NOT |
| 177 | + - IBCLR(UNSIGNED, INTEGER), IBITS, IBSET |
| 178 | + - ISHFT(UNSIGNED, INTEGER, INTEGER) and ISHFTC |
| 179 | + - LEADZ(UNSIGNED) and TRAILZ |
| 180 | + - MERGE_BITS(UNSIGNED, UNSIGNED, UNSIGNED) |
| 181 | + - MIN(UNSIGNED, ...) and MAX |
| 182 | + - MOD(UNSIGNED, UNSIGNED) and MODULO |
| 183 | + - MVBITS(UNSIGNED, INTEGER, INTEGER, UNSIGNED, INTEGER) |
| 184 | + - POPCNT(UNSIGNED) and POPPAR |
| 185 | + - RANGE(UNSIGNED) |
| 186 | + - SHIFTA(UNSIGNED, INTEGER), SHIFTL, SHIFTR |
| 187 | + - TRANSFER(UNSIGNED, UNSIGNED, INTEGER) |
| 188 | + |
| 189 | +- Allow UNSIGNED arguments to some array intrinsics: |
| 190 | + - IALL(UNSIGNED array, INTEGER [, mask]) and friends |
| 191 | + - IPARITY(UNSIGNED array, INTEGER [, mask]) |
| 192 | + - CSHIFT(UNSIGNED array, INTEGER, INTEGER) |
| 193 | + - DOT_PRODUCT(UNSIGNED array, UNSIGNED array) |
| 194 | + - EOSHIFT(UNSIGNED array, INTEGER, INTEGER) |
| 195 | + - FINDLOC(UNSIGNED array, UNSIGNED, ...) |
| 196 | + - MATMUL(UNSIGNED array, UNSIGNED array) |
| 197 | + - MAXLOC(UNSIGNED array, ...) and MINLOC |
| 198 | + - MAXVAL(UNSIGNED array, ...) and MINVAL |
| 199 | + |
| 200 | +- Extend ISO_C_BINDING with KIND numbers, for example, |
| 201 | + C_UINT, C_UINT8_T. |
| 202 | + |
| 203 | +- Extend ISO_C_BINDING with other things we forgot to do. |
| 204 | + |
| 205 | +- Extend ISO_Fortran_binding.h appropriately. |
| 206 | + |
| 207 | +- Extend ISO_FORTRAN_ENV with KIND PARAMETERs, for example, |
| 208 | + UINT8, UINT16, UINT32. |
| 209 | + |
| 210 | +- Conversion of an UNSIGNED value to an INTEGER outside the range of |
| 211 | + the integer is processor-dependent. |
| 212 | + |
| 213 | +- Conversion of an INTEGER value to an UNSIGNED outside the range of |
| 214 | + the unsigned is processor-dependent. |
| 215 | + |
| 216 | +- Conversion of an UNSIGNED value to an INTEGER with a wider range |
| 217 | + is exact. |
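The three conversion rules above can be modelled in C with explicit range checks. This is a sketch under assumptions: the helper names are hypothetical, 32-bit and 64-bit fixed-width types stand in for Fortran kinds, and a `false` return marks the cases that the proposal leaves processor-dependent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same-width unsigned to signed: succeeds only when the value fits. */
bool uint32_to_int32(uint32_t u, int32_t *out) {
    if (u > (uint32_t)INT32_MAX) {
        return false;            /* outside the range of the integer */
    }
    *out = (int32_t)u;
    return true;
}

/* Signed to unsigned: negative values have no unsigned counterpart. */
bool int32_to_uint32(int32_t i, uint32_t *out) {
    if (i < 0) {
        return false;            /* outside the range of the unsigned */
    }
    *out = (uint32_t)i;
    return true;
}

/* Widening unsigned to signed: every 32-bit unsigned value fits in a
   64-bit signed integer, so the conversion is always exact. */
int64_t uint32_to_int64(uint32_t u) {
    return (int64_t)u;
}
```

A processor that checks conversions could report an error in the `false` cases, while one that does not might simply reinterpret the bits.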
| 218 | + |
| 219 | +# 5. Relation to other proposals |
| 220 | + |
| 221 | +This proposal is almost identical to J3/24-116 with the main difference |
| 222 | +that overflow in arithmetic operators +, -, *, / is undefined instead of |
| 223 | +wrapping around by default. |
| 224 | + |
| 225 | +This proposal complements the BITS proposal, J3/07-007r2.pdf, as |
| 226 | +proposed in J3/22-195.txt. BITS restricts its operations to logical |
| 227 | +operations and comparisons on bit lengths. This proposal adds arithmetic |
| 228 | +operations. This proposal limits the bit lengths to common powers of two. |