Commit 4046efa

[ExecuTorch] Clean commit of FFHT dependency
This is https://github.com/FALCONN-LIB/FFHT . Differential Revision: [D60194972](https://our.internmc.facebook.com/intern/diff/D60194972/) [ghstack-poisoned]
1 parent 7dd3c34 commit 4046efa

107 files changed: +59214 -0 lines changed

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
The MIT License (MIT)

Copyright (c) 2015 Alexandr Andoni, Piotr Indyk, Thijs Laarhoven,
Ilya Razenshteyn, Ludwig Schmidt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
CC = gcc
CFLAGS = -O3 -march=native -std=c99 -pedantic -Wall -Wextra -Wshadow -Wpointer-arith -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes

all: test_float test_double fast_copy.o fht.o

OBJ := fast_copy.o fht.o

%.o: %.c
	$(CC) $< -o $@ -c $(CFLAGS)

test_%: test_%.c $(OBJ)
	$(CC) $< $(OBJ) -o $@ $(CFLAGS)

test_double_header_only: test_double_header_only.c
	$(CC) $< -o $@ $(CFLAGS)

test_float_header_only: test_float_header_only.c
	$(CC) $< -o $@ $(CFLAGS)

clean:
	rm -f test_float test_double test_float_header_only test_double_header_only $(OBJ)
Lines changed: 115 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,115 @@
# Fast Fast Hadamard Transform

FFHT (Fast Fast Hadamard Transform) is a library that provides a heavily
optimized C99 implementation of the Fast Hadamard Transform. FFHT also provides
a thin Python wrapper that performs the Fast Hadamard Transform on
one-dimensional [NumPy](http://www.numpy.org/) arrays.

The Hadamard Transform is a linear orthogonal map defined on real vectors whose
length is a _power of two_. For the precise definition, see the
[Wikipedia entry](https://en.wikipedia.org/wiki/Hadamard_transform). The
Hadamard Transform has recently been used widely in machine learning
and numerical algorithms.

FFHT uses [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions)
to speed up the computation.

The header file `fht.h` exports two functions:
`int fht_float(float *buf, int log_n)` and
`int fht_double(double *buf, int log_n)`. The
only difference between them is the type of vector entries. So, in what follows,
we describe how the version for floats, `fht_float`, works.

The function `fht_float` takes two parameters:

* `buf` is a pointer to the data on which one needs to perform the Fast
Hadamard Transform.
* `log_n` is the binary logarithm of the length of `buf`.
That is, the length is equal to `2^log_n`.

The return value is -1 if the input is invalid and is zero otherwise.
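
To make the calling convention concrete, here is a minimal, unoptimized sketch of an in-place transform with the same shape of signature and the same return-code contract. The name `fht_float_reference` is hypothetical and not part of FFHT. The sketch assumes the unnormalized convention (no division by `sqrt(n)`) and the natural Hadamard ordering, and it uses plain butterfly loops rather than the AVX code paths FFHT relies on.

```c
#include <stddef.h>

/* Hypothetical reference helper, not part of FFHT.
 * Naive in-place Hadamard transform of 2^log_n floats (unnormalized).
 * Returns -1 on invalid input and 0 otherwise. */
int fht_float_reference(float *buf, int log_n) {
    if (buf == NULL || log_n < 0 || log_n > 30) {
        return -1;
    }
    size_t n = (size_t)1 << log_n;
    /* Each pass merges blocks of width h into blocks of width 2h. */
    for (size_t h = 1; h < n; h <<= 1) {
        for (size_t i = 0; i < n; i += 2 * h) {
            for (size_t j = i; j < i + h; ++j) {
                float a = buf[j];
                float b = buf[j + h];
                buf[j] = a + b;     /* sum goes to the lower half */
                buf[j + h] = a - b; /* difference goes to the upper half */
            }
        }
    }
    return 0;
}
```

Applying the sketch twice returns the input scaled by `n = 2^log_n`; dividing each entry by `sqrt(n)` after a single pass would give the orthonormal form of the map.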

A header-only version of the library is provided in `fht_header_only.h`.

In addition to the Fast Hadamard Transform, we provide two auxiliary programs:
`test_float` and `test_double`, which are implemented in C99. They exhaustively
test and benchmark the library.

FFHT has been tested on 64-bit versions of Linux, OS X and Windows (the latter
is via Cygwin).

To install the Python package, run `python setup.py install`. The script
`example.py` shows how to use FFHT from Python.

## Benchmarks

Below are the times for the Fast Hadamard Transform for vectors of
various lengths. The benchmarks were run on a machine with Intel
Core&nbsp;i7-6700K and 2133 MHz DDR4 RAM. We compare FFHT,
[FFTW 3.3.6](http://fftw.org/), and
[fht](https://github.com/nbarbey/fht) by
[Nicolas Barbey](https://github.com/nbarbey).

Let us stress that FFTW is a great versatile tool, and the authors of FFTW did
not try to optimize the performance of the Fast Hadamard Transform. On the other
hand, FFHT does one thing (the Fast Hadamard Transform), but does it extremely
well.

Vector size | FFHT (float) | FFHT (double) | FFTW 3.3.6 (float) | FFTW 3.3.6 (double) | fht (float) | fht (double)
:---: | :---: | :---: | :---: | :---: | :---: | :---:
2<sup>10</sup> | 0.31 us | 0.49 us | 4.48 us | 7.72 us | 17.4 us | 19.3 us
2<sup>20</sup> | 0.68 ms | 1.39 ms | 8.81 ms | 17.07 ms | 29.8 ms | 35.0 ms
2<sup>27</sup> | 0.22 s | 0.50 s | 2.08 s | 3.57 s | 6.89 s | 7.49 s

## Troubleshooting

For some versions of OS X the native `clang` compiler (that mimics `gcc`) may
not recognize the availability of AVX. A solution for this problem is to use a
genuine `gcc` (say from [Homebrew](http://brew.sh/)) or to use `-march=corei7-avx`
instead of `-march=native` for compiler flags.

A symptom of this problem is that the macro `__AVX__` is undefined.
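
One way to check for this symptom is to compile a tiny probe with the same compiler flags and see whether the macro is visible. The helper below is a hypothetical sketch, not something shipped with FFHT; it only reports what the preprocessor saw when this translation unit was compiled.

```c
/* Hypothetical probe, not part of FFHT.
 * Returns 1 if the compiler defined __AVX__ for this translation unit,
 * and 0 otherwise. */
int compiled_with_avx(void) {
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}
```

If the function returns 0 when built with `-march=native` on a machine that is known to support AVX, the compiler flags are the likely culprit rather than the library.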

## Related Work

FFHT has been created as a part of
[FALCONN](https://github.com/falconn-lib/falconn): a library for similarity
search over high-dimensional data. FALCONN's underlying algorithms are described
and analyzed in the following research paper:

> Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn and Ludwig
> Schmidt, "Practical and Optimal LSH for Angular Distance", NIPS 2015, full
> version available at [arXiv:1509.02897](http://arxiv.org/abs/1509.02897)

This is the paper to cite if you use FFHT in your research projects.

## Acknowledgments

We thank Ruslan Savchenko for useful discussions.

Thanks to:

* Clement Canonne
* Michal Forisek
* Rati Gelashvili
* Daniel Grier
* Dhiraj Holden
* Justin Holmgren
* Aleksandar Ivanovic
* Vladislav Isenbaev
* Jacob Kogler
* Ilya Kornakov
* Anton Lapshin
* Rio LaVigne
* Oleg Martynov
* Linar Mikeev
* Cameron Musco
* Sam Park
* Sunoo Park
* Amelia Perry
* Andrew Sabisch
* Abhishek Sarkar
* Ruslan Savchenko
* Vadim Semenov
* Arman Yessenamanov

for helping us with testing FFHT.
Lines changed: 128 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,128 @@
#include <Python.h>
#include <numpy/arrayobject.h>
#include "fht.h"

#define UNUSED(x) (void)(x)

static char module_docstring[] =
    "A C extension that computes the Fast Hadamard Transform";
static char fht_docstring[] =
    "Compute the Fast Hadamard Transform (FHT) for a given "
    "one-dimensional NumPy array.\n\n"
    "The Hadamard Transform is a linear orthogonal map defined on real vectors "
    "whose length is a _power of two_. For the precise definition, see the "
    "[Wikipedia entry](https://en.wikipedia.org/wiki/Hadamard_transform). The "
    "Hadamard Transform has recently been used widely in machine "
    "learning "
    "and numerical algorithms.\n\n"
    "The implementation uses "
    "[AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) "
    "to speed up the computation. If AVX is not supported on your machine, "
    "a simpler implementation without (explicit) vectorization is used.\n\n"
    "The function takes two parameters:\n\n"
    "* `buffer` is a NumPy array which is being transformed. It must be a "
    "one-dimensional array with `dtype` equal to `float32` or `float64` (the "
    "former is recommended unless you need high accuracy) and of size being a "
    "power "
    "of two. If your CPU supports AVX, then `buffer` must be aligned to 32 "
    "bytes. "
    "To allocate such an aligned buffer, use the function `create_aligned` "
    "from this "
    "module.\n"
    "* `chunk` is a positive integer that controls when the implementation "
    "switches "
    "from recursive to iterative algorithm. The overall algorithm is "
    "recursive, but as "
    "soon as the vector becomes no longer than `chunk`, the iterative "
    "algorithm is "
    "invoked. For technical reasons, `chunk` must be at least 8. A good choice "
    "is to "
    "set `chunk` to 1024. But to fine-tune the performance one should use a "
    "program "
    "`best_chunk` supplied with the library.\n";

static PyObject *ffht_fht(PyObject *self, PyObject *args);

static PyMethodDef module_methods[] = {
    {"fht", ffht_fht, METH_VARARGS, fht_docstring}, {NULL, NULL, 0, NULL}};

PyMODINIT_FUNC initffht(void);

PyMODINIT_FUNC initffht(void) {
  PyObject *m = Py_InitModule3("ffht", module_methods, module_docstring);
  if (!m) return;

  /* Initialize the NumPy C API before any PyArray_* call. */
  import_array();
}

static PyObject *ffht_fht(PyObject *self, PyObject *args) {
  UNUSED(self);

  PyObject *buffer_obj;

  /* The function accepts a single positional argument: the array. */
  if (!PyArg_ParseTuple(args, "O", &buffer_obj)) {
    return NULL;
  }

  PyArray_Descr *dtype;
  int ndim;
  npy_intp dims[NPY_MAXDIMS];
  PyArrayObject *arr = NULL;

  if (PyArray_GetArrayParamsFromObject(buffer_obj, NULL, 1, &dtype, &ndim, dims,
                                       &arr, NULL) < 0) {
    return NULL;
  }

  if (arr == NULL) {
    PyErr_SetString(PyExc_TypeError, "not a numpy array");
    return NULL;
  }

  dtype = PyArray_DESCR(arr);

  if (dtype->type_num != NPY_FLOAT && dtype->type_num != NPY_DOUBLE) {
    PyErr_SetString(PyExc_TypeError, "array must consist of floats or doubles");
    Py_DECREF(arr);
    return NULL;
  }

  if (PyArray_NDIM(arr) != 1) {
    PyErr_SetString(PyExc_TypeError, "array must be one-dimensional");
    Py_DECREF(arr);
    return NULL;
  }

  int n = PyArray_DIM(arr, 0);

  /* Reject empty arrays and lengths that are not a power of two. */
  if (n == 0 || (n & (n - 1))) {
    PyErr_SetString(PyExc_ValueError, "array's length must be a power of two");
    Py_DECREF(arr);
    return NULL;
  }

  /* Compute log2(n); n is known to be a power of two here. */
  int log_n = 0;
  while ((1 << log_n) < n) {
    ++log_n;
  }

  /* Transform the array data in place, dispatching on the element type. */
  void *raw_buffer = PyArray_DATA(arr);
  int res;
  if (dtype->type_num == NPY_FLOAT) {
    float *buffer = (float *)raw_buffer;
    res = fht_float(buffer, log_n);
  } else {
    double *buffer = (double *)raw_buffer;
    res = fht_double(buffer, log_n);
  }

  if (res) {
    PyErr_SetString(PyExc_RuntimeError, "FHT did not work properly");
    Py_DECREF(arr);
    return NULL;
  }

  Py_DECREF(arr);

  /* An empty format string makes Py_BuildValue return None. */
  return Py_BuildValue("");
}
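
The length validation and the `log_n` computation in `ffht_fht` can be isolated into two small helpers, which makes them easy to test without a Python interpreter. The names `is_power_of_two` and `exact_log2` are hypothetical and do not appear in the commit; this is a sketch of the same checks, using a right shift instead of the `1 << log_n` comparison.

```c
/* Hypothetical helpers, not part of the commit. */

/* Returns 1 if n is a positive power of two, 0 otherwise. */
int is_power_of_two(long n) {
    return n > 0 && (n & (n - 1)) == 0;
}

/* Returns log2(n) when n is a positive power of two, -1 otherwise. */
int exact_log2(long n) {
    if (!is_power_of_two(n)) {
        return -1;
    }
    int log_n = 0;
    while (n > 1) {
        n >>= 1; /* halve until only the lowest bit position remains */
        ++log_n;
    }
    return log_n;
}
```

Shifting `n` down rather than shifting `1` up avoids any question of overflow in the loop condition for large lengths.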
