Commit b8506b5

Merge pull request #48 from JeffBezanson/teh/convert
Use widening to reliably throw InexactErrors
2 parents 65b5c31 + 32b8b93 commit b8506b5

8 files changed: +124 -47 lines changed

README.md

Lines changed: 40 additions & 22 deletions
@@ -1,44 +1,62 @@
 # FixedPointNumbers
 
-This library exports fixed-point number types.
-A [fixed-point number][wikipedia] represents a fractional, or non-integral, number.
-In contrast with the more widely known floating-point numbers, fixed-point
-numbers have a fixed number of digits (bits) after the decimal (radix) point.
-They are effectively integers scaled by a constant factor.
+This library exports fixed-point number types. A
+[fixed-point number][wikipedia] represents a fractional, or
+non-integral, number. In contrast with the more widely known
+floating-point numbers, with fixed-point numbers the decimal point
+doesn't "float": fixed-point numbers are effectively integers that are
+interpreted as being scaled by a constant factor. Consequently, they
+have a fixed number of digits (bits) after the decimal (radix) point.
 
 Fixed-point numbers can be used to perform arithmetic. Another practical
 application is to implicitly rescale integers without modifying the
 underlying representation.
 
 This library exports two categories of fixed-point types. Fixed-point types are
 used like any other number: they can be added, multiplied, raised to a power,
-etc. In many cases these operations result in conversion to floating-point types.
+etc. In some cases these operations result in conversion to floating-point types.
+
+# Type hierarchy and interpretation
 
-# Type hierarchy
 This library defines an abstract type `FixedPoint{T <: Integer, f}` as a
-subtype of `Real`. The parameter `T` is the underlying representation and `f`
+subtype of `Real`. The parameter `T` is the underlying machine representation and `f`
 is the number of fraction bits.
 
-For signed integers, there is a fixed-point type `Fixed{T, f}` and for unsigned
-integers, there is the `UFixed{T, f}` type.
-
-These types, built with `f` fraction bits, map the closed interval [0.0,1.0] to
-the span of numbers with `f` bits. For example, the `UFixed8` type (aliased to
-UFixed{UInt8,8}) is represented internally by a `UInt8`, and makes `0x00`
-equivalent to `0.0` and `0xff` to `1.0`. The type aliases `UFixed10`, `UFixed12`,
-`UFixed14`, and `UFixed16` are all based on `UInt16` and reach the value `1.0`
-at 10, 12, 14, and 16 bits, respectively (`0x03ff`, `0x0fff`, `0x3fff`, and
-`0xffff`).
-
-To construct such a number, use `convert(UFixed12, 1.3)`, `ufixed12(1.3)` (a
-convenience function), `UFixed{UInt16,12}(1.3)`, or the literal syntax
+For `T<:Signed` (a signed integer), there is a fixed-point type
+`Fixed{T, f}`; for `T<:Unsigned` (an unsigned integer), there is the
+`UFixed{T, f}` type. However, there are slight differences in behavior
+that go beyond signed/unsigned distinctions.
+
+The `Fixed{T,f}` types use 1 bit for sign, and `f` bits to represent
+the fraction. For example, `Fixed{Int8,7}` uses 7 bits (all bits
+except the sign bit) for the fractional part. The value of the number
+is interpreted as if the integer representation has been divided by
+`2^f`. Consequently, `Fixed{Int8,7}` numbers `x` satisfy
+
+```
+-1.0 = -128/128 ≤ x ≤ 127/128 ≈ 0.992.
+```
+
+because the range of `Int8` is from -128 to 127.
+
+In contrast, the `UFixed{T,f}`, with `f` fraction bits, map the closed
+interval [0.0,1.0] to the span of numbers with `f` bits. For example,
+the `UFixed8` type (aliased to `UFixed{UInt8,8}`) is represented
+internally by a `UInt8`, and makes `0x00` equivalent to `0.0` and
+`0xff` to `1.0`. Consequently, `UFixed` numbers are scaled by `2^f-1`
+rather than `2^f`. The type aliases `UFixed10`, `UFixed12`,
+`UFixed14`, and `UFixed16` are all based on `UInt16` and reach the
+value `1.0` at 10, 12, 14, and 16 bits, respectively (`0x03ff`,
+`0x0fff`, `0x3fff`, and `0xffff`).
+
+To construct such a number, use `convert(UFixed12, 1.3)`, `UFixed12(1.3)`, `UFixed{UInt16,12}(1.3)`, or the literal syntax
 `0x14ccuf12`. The latter syntax means to construct a `UFixed12` (it ends in
 `uf12`) from the `UInt16` value `0x14cc`.
 
 More generally, an arbitrary number of bits from any of the standard unsigned
 integer widths can be used for the fractional part. For example:
 `UFixed{UInt32,16}`, `UFixed{UInt64,3}`, `UFixed{UInt128,7}`.
 
-There currently is no literal syntax for signed `Fixed` numbers.
+There currently is no literal syntax for signed `Fixed` numbers.
 
 [wikipedia]: http://en.wikipedia.org/wiki/Fixed-point_arithmetic
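
The two scaling rules described in the revised README can be checked with plain integer arithmetic. The sketch below is written in current Julia syntax and does not use the package. The helper names `fixed_value` and `ufixed_value` are hypothetical and only model the interpretation stated in the text (divide by `2^f` for `Fixed`, by `2^f-1` for `UFixed`).

```julia
# Hypothetical helpers, not part of FixedPointNumbers: they only model how a
# raw integer is interpreted according to the README text.
fixed_value(raw::Signed, f::Integer) = raw / 2^f            # Fixed{T,f}: divide by 2^f
ufixed_value(raw::Unsigned, f::Integer) = raw / (2^f - 1)   # UFixed{T,f}: divide by 2^f - 1

fixed_value(typemin(Int8), 7)   # -1.0, the lower bound for Fixed{Int8,7}
fixed_value(typemax(Int8), 7)   # 0.9921875, the upper bound for Fixed{Int8,7}
ufixed_value(0xff, 8)           # 1.0, matching UFixed8
ufixed_value(0x0fff, 12)        # 1.0, matching UFixed12
ufixed_value(0x14cc, 12)        # close to 1.3, matching the literal 0x14ccuf12
```

Under this reading, 1.0 has no exact `Fixed{Int8,7}` representation, which is consistent with the `InexactError` tests added later in this commit.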

appveyor.yml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+environment:
+  matrix:
+  - JULIAVERSION: "julialang/bin/winnt/x86/0.4/julia-0.4-latest-win32.exe"
+  - JULIAVERSION: "julialang/bin/winnt/x64/0.4/julia-0.4-latest-win64.exe"
+  - JULIAVERSION: "julialang/bin/winnt/x86/0.5/julia-0.5-latest-win32.exe"
+  - JULIAVERSION: "julialang/bin/winnt/x64/0.5/julia-0.5-latest-win64.exe"
+  - JULIAVERSION: "julianightlies/bin/winnt/x86/julia-latest-win32.exe"
+  - JULIAVERSION: "julianightlies/bin/winnt/x64/julia-latest-win64.exe"
+
+branches:
+  only:
+    - master
+    - /release-.*/
+
+notifications:
+  - provider: Email
+    on_build_success: false
+    on_build_failure: false
+    on_build_status_changed: false
+
+install:
+# Download most recent Julia Windows binary
+  - ps: (new-object net.webclient).DownloadFile(
+        $("http://s3.amazonaws.com/"+$env:JULIAVERSION),
+        "C:\projects\julia-binary.exe")
+# Run installer silently, output to C:\projects\julia
+  - C:\projects\julia-binary.exe /S /D=C:\projects\julia
+
+build_script:
+# Need to convert from shallow to complete for Pkg.clone to work
+  - IF EXIST .git\shallow (git fetch --unshallow)
+  - C:\projects\julia\bin\julia -e "versioninfo();
+      Pkg.clone(pwd(), \"FixedPointNumbers\"); Pkg.build(\"FixedPointNumbers\")"
+
+test_script:
+  - C:\projects\julia\bin\julia -e "Pkg.test(\"FixedPointNumbers\")"

src/FixedPointNumbers.jl

Lines changed: 10 additions & 6 deletions
@@ -28,12 +28,6 @@ export
     UFixed12,
     UFixed14,
     UFixed16,
-    # constructors
-    ufixed8,
-    ufixed10,
-    ufixed12,
-    ufixed14,
-    ufixed16,
     # literal constructor constants
     uf8,
     uf10,
@@ -58,6 +52,16 @@ typemin{T<: FixedPoint}(::Type{T}) = T(typemin(rawtype(T)), 0)
 realmin{T<: FixedPoint}(::Type{T}) = typemin(T)
 realmax{T<: FixedPoint}(::Type{T}) = typemax(T)
 
+widen1(::Type{Int8}) = Int16
+widen1(::Type{UInt8}) = UInt16
+widen1(::Type{Int16}) = Int32
+widen1(::Type{UInt16}) = UInt32
+widen1(::Type{Int32}) = Int64
+widen1(::Type{UInt32}) = UInt64
+widen1(::Type{Int64}) = Int128
+widen1(::Type{UInt64}) = UInt128
+widen1(x::Integer) = x % widen1(typeof(x))
+
 include("fixed.jl")
 include("ufixed.jl")
 include("deprecations.jl")
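
As a rough illustration of what the `widen1` methods provide, the sketch below restates three of them in current Julia syntax, outside the package. It shows that the value form is a value-preserving step to the next wider integer type, which leaves room for a left shift by `f` bits. The tuple name `extremes` is only for illustration.

```julia
# Restatement of a few `widen1` methods from the hunk above (current syntax).
widen1(::Type{Int8}) = Int16
widen1(::Type{UInt8}) = UInt16
widen1(x::Integer) = x % widen1(typeof(x))   # extends the value; never throws

widen1(Int8(-128))   # Int16(-128): the sign is preserved
widen1(0xff)         # 0x00ff: a UInt8 becomes a UInt16

# Headroom: both Int8 extremes shifted left by 7 bits still fit in Int16.
extremes = (widen1(typemin(Int8)) << 7, widen1(typemax(Int8)) << 7)   # (-16384, 16256)
```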

src/deprecations.jl

Lines changed: 12 additions & 0 deletions
@@ -11,3 +11,15 @@ import Base.@deprecate_binding
 
 @deprecate_binding Fixed32 Fixed16
 @deprecate Fixed(x::Real) convert(Fixed{Int32, 16}, x)
+
+@deprecate ufixed8(x) UFixed8(x)
+@deprecate ufixed10(x) UFixed10(x)
+@deprecate ufixed12(x) UFixed12(x)
+@deprecate ufixed14(x) UFixed14(x)
+@deprecate ufixed16(x) UFixed16(x)
+
+Compat.@dep_vectorize_1arg Real ufixed8
+Compat.@dep_vectorize_1arg Real ufixed10
+Compat.@dep_vectorize_1arg Real ufixed12
+Compat.@dep_vectorize_1arg Real ufixed14
+Compat.@dep_vectorize_1arg Real ufixed16

src/fixed.jl

Lines changed: 2 additions & 2 deletions
@@ -30,8 +30,8 @@ abs{T,f}(x::Fixed{T,f}) = Fixed{T,f}(abs(x.i),0)
 
 
 # # conversions and promotions
-convert{T,f}(::Type{Fixed{T,f}}, x::Integer) = Fixed{T,f}(convert(T,x)<<f,0)
-convert{T,f}(::Type{Fixed{T,f}}, x::AbstractFloat) = Fixed{T,f}(trunc(T,x)<<f + round(T, rem(x,1)*(1<<f)),0)
+convert{T,f}(::Type{Fixed{T,f}}, x::Integer) = Fixed{T,f}(round(T, convert(widen1(T),x)<<f),0)
+convert{T,f}(::Type{Fixed{T,f}}, x::AbstractFloat) = Fixed{T,f}(round(T, trunc(widen1(T),x)<<f + rem(x,1)*(1<<f)),0)
 convert{T,f}(::Type{Fixed{T,f}}, x::Rational) = Fixed{T,f}(x.num)/Fixed{T,f}(x.den)
 
 convert{T,f}(::Type{BigFloat}, x::Fixed{T,f}) =
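
To see why the rewritten `convert` methods throw where the old ones wrapped silently, the following sketch mirrors both versions in current Julia syntax. It is a simplified model, not the package code: `old_raw` and `new_raw` are hypothetical names, they return the raw integer instead of a `Fixed` value, and only the `Int8` method of `widen1` is restated.

```julia
# Only the method needed for this sketch; the commit defines more.
widen1(::Type{Int8}) = Int16

# Old approach: the shift happens in T itself, so it can wrap without error.
old_raw(::Type{T}, f, x::Integer) where {T} = convert(T, x) << f

# New approach: shift in the wider type, then narrow with a checked conversion.
new_raw(::Type{T}, f, x::Integer) where {T} =
    convert(T, convert(widen1(T), x) << f)
new_raw(::Type{T}, f, x::AbstractFloat) where {T} =
    round(T, (trunc(widen1(T), x) << f) + rem(x, 1) * (1 << f))

old_raw(Int8, 7, 1)     # Int8(-128): the integer 1 silently becomes -1.0
new_raw(Int8, 7, 0.8)   # Int8(102), i.e. 102/128 = 0.796875
# new_raw(Int8, 7, 1), new_raw(Int8, 7, 1.0) and new_raw(Int8, 7, 0.999)
# all throw InexactError, because the intermediate value 128 does not fit in Int8.
```

These cases correspond to the `Fixed{Int8,7}` tests added in `test/fixed.jl`.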

src/ufixed.jl

Lines changed: 3 additions & 13 deletions
@@ -43,19 +43,9 @@ rawone(v) = reinterpret(one(v))
 convert{T<:UFixed}(::Type{T}, x::T) = x
 convert{T1<:UFixed}(::Type{T1}, x::UFixed) = reinterpret(T1, round(rawtype(T1), (rawone(T1)/rawone(x))*reinterpret(x)))
 convert(::Type{UFixed16}, x::UFixed8) = reinterpret(UFixed16, convert(UInt16, 0x0101*reinterpret(x)))
-convert{T<:UFixed}(::Type{T}, x::Real) = T(round(rawtype(T), rawone(T)*x),0)
-
-ufixed8(x) = convert(UFixed8, x)
-ufixed10(x) = convert(UFixed10, x)
-ufixed12(x) = convert(UFixed12, x)
-ufixed14(x) = convert(UFixed14, x)
-ufixed16(x) = convert(UFixed16, x)
-
-Compat.@dep_vectorize_1arg Real ufixed8
-Compat.@dep_vectorize_1arg Real ufixed10
-Compat.@dep_vectorize_1arg Real ufixed12
-Compat.@dep_vectorize_1arg Real ufixed14
-Compat.@dep_vectorize_1arg Real ufixed16
+convert{U<:UFixed}(::Type{U}, x::Real) = _convert(U, rawtype(U), x)
+_convert{U<:UFixed,T}(::Type{U}, ::Type{T}, x) = U(round(T, widen1(rawone(U))*x), 0)
+_convert{U<:UFixed }(::Type{U}, ::Type{UInt128}, x) = U(round(UInt128, rawone(U)*x), 0)
 
 
 convert(::Type{BigFloat}, x::UFixed) = reinterpret(x)*(1/BigFloat(rawone(x)))
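
For the unsigned case, the wraparound comes from the multiplication by the raw representation of one rather than from a shift. The sketch below models the `UFixed8` path only, in current Julia syntax. `RAWONE8`, `old_raw` and `new_raw` are hypothetical names, and the value `0xff` for the raw one is taken from the README text.

```julia
# Raw representation of 1.0 for UFixed8, per the README (0xff maps to 1.0).
const RAWONE8 = 0xff

# Only the widening needed for this sketch; the commit defines more methods.
widen1(x::UInt8) = x % UInt16

# Old approach: a UInt8 argument keeps the product in UInt8, where it can wrap.
old_raw(x) = round(UInt8, RAWONE8 * x)

# New approach: widen the scale factor first, so the product keeps its true
# value and the final narrowing is checked.
new_raw(x) = round(UInt8, widen1(RAWONE8) * x)

old_raw(0xff)   # 0x01: 255 * 255 wrapped modulo 256
new_raw(0.5)    # 0x80
new_raw(1.0)    # 0xff
# new_raw(0xff) and new_raw(2) both throw InexactError.
```

The separate `UInt128` method in the hunk skips the widening step, presumably because `widen1` has no method for `UInt128` in this commit.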

test/fixed.jl

Lines changed: 8 additions & 0 deletions
@@ -50,6 +50,14 @@ function test_fixed{T}(::Type{T}, f)
     end
 end
 
+@test_approx_eq_eps convert(Fixed{Int8,7}, 0.8) 0.797 0.001
+@test_approx_eq_eps convert(Fixed{Int8,7}, 0.9) 0.898 0.001
+@test_throws InexactError convert(Fixed{Int8, 7}, 0.999)
+@test_throws InexactError convert(Fixed{Int8, 7}, 1.0)
+@test_throws InexactError convert(Fixed{Int8, 7}, 1)
+@test_throws InexactError convert(Fixed{Int8, 7}, 2)
+@test_throws InexactError convert(Fixed{Int8, 7}, 128)
+
 for (TI, f) in [(Int8, 8), (Int16, 8), (Int16, 10), (Int32, 16)]
     T = Fixed{TI,f}
     println(" Testing $T")

test/ufixed.jl

Lines changed: 13 additions & 4 deletions
@@ -14,10 +14,10 @@ using Compat
 @test reinterpret(UFixed14, 0x1fa2) == 0x1fa2uf14
 @test reinterpret(UFixed16, 0x1fa2) == 0x1fa2uf16
 
-@test ufixed8(1.0) == 0xffuf8
-@test ufixed8(0.5) == 0x80uf8
-@test ufixed14(1.0) == 0x3fffuf14
-v = @compat ufixed12.([2])
+@test UFixed8(1.0) == 0xffuf8
+@test UFixed8(0.5) == 0x80uf8
+@test UFixed14(1.0) == 0x3fffuf14
+v = @compat UFixed12.([2])
 @test v == UFixed12[0x1ffeuf12]
 @test isa(v, Vector{UFixed12})
 
@@ -44,6 +44,15 @@ end
 @test typemax(UFixed{UInt64,3}) == typemax(UInt64) // (2^3-1)
 @test typemax(UFixed{UInt128,7}) == typemax(UInt128) // (2^7-1)
 
+@test_throws InexactError UFixed8(2)
+@test_throws InexactError UFixed8(255)
+@test_throws InexactError UFixed8(0xff)
+@test_throws InexactError UFixed16(2)
+@test_throws InexactError UFixed16(0xff)
+@test_throws InexactError UFixed16(0xffff)
+@test_throws InexactError convert(UFixed8, typemax(UFixed10))
+@test_throws InexactError convert(UFixed16, typemax(UFixed10))
+
 x = UFixed8(0.5)
 @test isfinite(x) == true
 @test isnan(x) == false
