|
1 | 1 | # Fast Fast Hadamard Transform
|
2 | 2 |
|
3 |
| -FFHT (Fast Fast Hadamard Transform) is a library that provides a heavily |
4 |
| -optimized C99 implementation of the Fast Hadamard Transform. FFHT also provides |
5 |
| -a thin Python wrapper for performing the Fast Hadamard Transform on |
6 |
| -one-dimensional [NumPy](http://www.numpy.org/) arrays. |
7 |
| - |
8 |
| -The Hadamard Transform is a linear orthogonal map defined on real vectors whose |
9 |
| -length is a _power of two_. For the precise definition, see the |
10 |
| -[Wikipedia entry](https://en.wikipedia.org/wiki/Hadamard_transform). The |
11 |
| -Hadamard Transform has been recently used a lot in various machine learning |
12 |
| -and numerical algorithms. |
13 |
| - |
14 |
| -FFHT uses [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) |
15 |
| -to speed up the computation. |
16 |
| - |
17 |
| -The header file `fht.h` exports two functions: `int fht_float(float *buf, int |
18 |
| -log_n)` and `int fht_double(double *buf, int log_n)`. The |
19 |
| -only difference between them is the type of vector entries. So, in what follows, |
20 |
| -we describe how the version for floats `fht_float` works. |
21 |
| - |
22 |
| -The function `fht_float` takes two parameters: |
23 |
| - |
24 |
| -* `buf` is a pointer to the data on which one needs to perform the Fast |
25 |
| -Hadamard Transform. |
26 |
| -* `log_n` is the binary logarithm of the length of `buf`. |
27 |
| -That is, the length is equal to `2^log_n`. |
28 |
| - |
29 |
| -The return value is -1 if the input is invalid and is zero otherwise. |
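The calling convention described above can be illustrated with a minimal sketch. The function `fht_float_ref` below is hypothetical and is not part of FFHT: it is a plain, unnormalized butterfly loop without any SIMD, written only to show how `buf` and `log_n` relate and how the -1/0 return value might be produced. The upper bound on `log_n` and the NULL check are assumptions made for this sketch, and whether FFHT itself scales its output is not stated in this README.

```c
#include <stddef.h>

/* Hypothetical reference routine (NOT the FFHT implementation).
 * It mirrors the signature of fht_float: the buffer holds 2^log_n floats
 * and is transformed in place. No normalization is applied. */
int fht_float_ref(float *buf, int log_n) {
    /* Assumed validity rules for this sketch only. */
    if (buf == NULL || log_n < 0 || log_n > 30) {
        return -1; /* invalid input */
    }
    size_t n = (size_t)1 << log_n;
    /* One pass per level: combine blocks of size `half` pairwise. */
    for (size_t half = 1; half < n; half <<= 1) {
        for (size_t start = 0; start < n; start += 2 * half) {
            for (size_t i = start; i < start + half; ++i) {
                float a = buf[i];
                float b = buf[i + half];
                buf[i] = a + b;        /* sum goes to the lower half */
                buf[i + half] = a - b; /* difference goes to the upper half */
            }
        }
    }
    return 0;
}
```

A caller would pass, for example, a four-element array together with `log_n = 2`; the real `fht_float` from `fht.h` is called the same way.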
30 |
| - |
31 |
| -A header-only version of the library is provided in `fht_header_only.h`. |
32 |
| - |
33 |
| -In addition to the Fast Hadamard Transform, we provide two auxiliary programs: |
34 |
| -`test_float` and `test_double`, which are implemented in C99. They exhaustively |
35 |
| -test and benchmark the library. |
36 |
| - |
37 |
| -FFHT has been tested on 64-bit versions of Linux, OS X and Windows (the latter |
38 |
| -is via Cygwin). |
39 |
| - |
40 |
| -To install the Python package, run `python setup.py install`. The script |
41 |
| -`example.py` shows how to use FFHT from Python. |
42 |
| - |
43 |
| -## Benchmarks |
44 |
| - |
45 |
| -Below are the times for the Fast Hadamard Transform for vectors of |
46 |
| -various lengths. The benchmarks were run on a machine with Intel |
47 |
| -Core i7-6700K and 2133 MHz DDR4 RAM. We compare FFHT, |
48 |
| -[FFTW 3.3.6](http://fftw.org/), and |
49 |
| -[fht](https://github.com/nbarbey/fht) by |
50 |
| -[Nicolas Barbey](https://github.com/nbarbey). |
51 |
| - |
52 |
| -Let us stress that FFTW is a great versatile tool, and the authors of FFTW did |
53 |
| -not try to optimize the performance of the Fast Hadamard Transform. On the other |
54 |
| -hand, FFHT does one thing (the Fast Hadamard Transform), but does it extremely |
55 |
| -well. |
56 |
| - |
57 |
| -Vector size | FFHT (float) | FFHT (double) | FFTW 3.3.6 (float) | FFTW 3.3.6 (double) | fht (float) | fht (double) |
58 |
| -:---: | :---: | :---: | :---: | :---: | :---: | :---: |
59 |
| -2<sup>10</sup> | 0.31 us | 0.49 us | 4.48 us | 7.72 us | 17.4 us | 19.3 us |
60 |
| -2<sup>20</sup> | 0.68 ms | 1.39 ms | 8.81 ms | 17.07 ms | 29.8 ms | 35.0 ms |
61 |
| -2<sup>27</sup> | 0.22 s | 0.50 s | 2.08 s | 3.57 s | 6.89 s | 7.49 s |
62 |
| - |
63 |
| -## Troubleshooting |
64 |
| - |
65 |
| -For some versions of OS X the native `clang` compiler (that mimics `gcc`) may |
66 |
| -not recognize the availability of AVX. A solution for this problem is to use a |
67 |
| -genuine `gcc` (say from [Homebrew](http://brew.sh/)) or to use `-march=corei7-avx` |
68 |
| -instead of `-march=native` for compiler flags. |
69 |
| - |
70 |
| -A symptom of this problem is that the `__AVX__` macro is undefined. |
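One quick way to check for this symptom is to test the macro at compile time. The helper below is a small sketch with a hypothetical name, `compiled_with_avx`; it only reports whether the compiler defined `__AVX__` for the translation unit under the flags in use, and says nothing about what the CPU supports at run time.

```c
/* Hypothetical helper: returns 1 if the compiler defined __AVX__ for this
 * translation unit (e.g. via -march=native or -march=corei7-avx), else 0. */
int compiled_with_avx(void) {
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}
```

If this returns 0 on a machine that is known to support AVX, the compiler flags are the likely cause.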
71 |
| - |
72 |
| -## Related Work |
73 |
| - |
74 |
| -FFHT has been created as a part of |
75 |
| -[FALCONN](https://github.com/falconn-lib/falconn): a library for similarity |
76 |
| -search over high-dimensional data. FALCONN's underlying algorithms are described |
77 |
| -and analyzed in the following research paper: |
78 |
| - |
79 |
| -> Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn and Ludwig |
80 |
| -> Schmidt, "Practical and Optimal LSH for Angular Distance", NIPS 2015, full |
81 |
| -> version available at [arXiv:1509.02897](http://arxiv.org/abs/1509.02897) |
82 |
| - |
83 |
| -This is the right paper to cite, if you use FFHT for your research projects. |
84 |
| - |
85 |
| -## Acknowledgments |
86 |
| - |
87 |
| -We thank Ruslan Savchenko for useful discussions. |
88 |
| - |
89 |
| -Thanks to: |
90 |
| - |
91 |
| -* Clement Canonne |
92 |
| -* Michal Forisek |
93 |
| -* Rati Gelashvili |
94 |
| -* Daniel Grier |
95 |
| -* Dhiraj Holden |
96 |
| -* Justin Holmgren |
97 |
| -* Aleksandar Ivanovic |
98 |
| -* Vladislav Isenbaev |
99 |
| -* Jacob Kogler |
100 |
| -* Ilya Kornakov |
101 |
| -* Anton Lapshin |
102 |
| -* Rio LaVigne |
103 |
| -* Oleg Martynov |
104 |
| -* Linar Mikeev |
105 |
| -* Cameron Musco |
106 |
| -* Sam Park |
107 |
| -* Sunoo Park |
108 |
| -* Amelia Perry |
109 |
| -* Andrew Sabisch |
110 |
| -* Abhishek Sarkar |
111 |
| -* Ruslan Savchenko |
112 |
| -* Vadim Semenov |
113 |
| -* Arman Yessenamanov |
114 |
| - |
115 |
| -for helping us with testing FFHT. |
| 3 | +This directory contains a fork of https://github.com/FALCONN-LIB/FFHT |
| 4 | +(License: https://github.com/FALCONN-LIB/FFHT/blob/master/LICENSE.md) |
| 5 | +focused on ARM64 NEON code generation. |