PEP-517 and PEP-518 support (add pyproject.toml) #274
Merged
Changes from all commits (6 commits):
68b6257  PEP-517 support (groodt)
e94c5dc  Simplify include_dirs (groodt)
467c98f  Remove deprecated `setup.py test` (groodt)
2248ab4  pybind11 isn't needed at runtime, only build time (groodt)
8fe02c0  Support for packaging sdist (groodt)
73134a7  https git clone in README (groodt)
.gitignore
@@ -1,7 +1,8 @@
 hnswlib.egg-info/
 build/
 dist/
 tmp/
 python_bindings/tests/__pycache__/
 *.pyd
 hnswlib.cpython*.so
+var/
pyproject.toml (new file)
@@ -0,0 +1,9 @@
+[build-system]
+requires = [
+    "setuptools>=42",
+    "wheel",
+    "numpy>=1.10.0",
+    "pybind11>=2.0",
+]
+
+build-backend = "setuptools.build_meta"
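
With this file in place, a PEP 518 front end such as pip reads the [build-system] table, installs the listed requirements into an isolated build environment, and then drives the build through the setuptools.build_meta hooks defined by PEP 517. The snippet below is a minimal sketch of that last step, calling the backend hooks directly; it assumes it is run from the project root with the build requirements already installed (a real front end handles the isolation itself):

import setuptools.build_meta as backend

# Any extra requirements the backend reports on top of [build-system] requires.
print(backend.get_requires_for_build_wheel())

# Build a wheel into ./dist; the hook returns the file name of the built wheel.
wheel_name = backend.build_wheel("dist")
print("built:", wheel_name)

In day-to-day use the same path is exercised by `pip install .` or `python -m build`, which read pyproject.toml and set up the build environment automatically.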
Python bindings test module (@@ -1,131 +1,127 @@), shown after the change, with the imports moved from inside the test loop to module level:

import os
import unittest

import numpy as np

import hnswlib


class RandomSelfTestCase(unittest.TestCase):
    def testRandomSelf(self):
        for idx in range(16):
            print("\n**** Index save-load test ****\n")

            np.random.seed(idx)
            dim = 16
            num_elements = 10000

            # Generating sample data
            data = np.float32(np.random.random((num_elements, dim)))

            # Declaring index
            p = hnswlib.Index(space='l2', dim=dim)  # possible options are l2, cosine or ip

            # Initing index
            # max_elements - the maximum number of elements, should be known beforehand
            # (probably will be made optional in the future)
            #
            # ef_construction - controls index search speed/build speed tradeoff
            # M - is tightly connected with internal dimensionality of the data
            # strongly affects the memory consumption

            p.init_index(max_elements=num_elements, ef_construction=100, M=16)

            # Controlling the recall by setting ef:
            # higher ef leads to better accuracy, but slower search
            p.set_ef(100)

            p.set_num_threads(4)  # by default using all available cores

            # We split the data in two batches:
            data1 = data[:num_elements // 2]
            data2 = data[num_elements // 2:]

            print("Adding first batch of %d elements" % (len(data1)))
            p.add_items(data1)

            # Query the elements for themselves and measure recall:
            labels, distances = p.knn_query(data1, k=1)

            items = p.get_items(labels)

            # Check the recall:
            self.assertAlmostEqual(np.mean(labels.reshape(-1) == np.arange(len(data1))), 1.0, 3)

            # Check that the returned element data is correct:
            diff_with_gt_labels = np.mean(np.abs(data1 - items))
            self.assertAlmostEqual(diff_with_gt_labels, 0, delta=1e-4)

            # Serializing and deleting the index.
            # We need this part to check that serialization is working properly.

            index_path = 'first_half.bin'
            print("Saving index to '%s'" % index_path)
            p.save_index(index_path)
            print("Saved. Deleting...")
            del p
            print("Deleted")

            print("\n**** Mark delete test ****\n")
            # Reiniting, loading the index
            print("Reiniting")
            p = hnswlib.Index(space='l2', dim=dim)

            print("\nLoading index from '%s'\n" % index_path)
            p.load_index(index_path)
            p.set_ef(100)

            print("Adding the second batch of %d elements" % (len(data2)))
            p.add_items(data2)

            # Query the elements for themselves and measure recall:
            labels, distances = p.knn_query(data, k=1)
            items = p.get_items(labels)

            # Check the recall:
            self.assertAlmostEqual(np.mean(labels.reshape(-1) == np.arange(len(data))), 1.0, 3)

            # Check that the returned element data is correct:
            diff_with_gt_labels = np.mean(np.abs(data - items))
            self.assertAlmostEqual(diff_with_gt_labels, 0, delta=1e-4)  # deleting index.

            # Checking that all labels are returned correctly:
            sorted_labels = sorted(p.get_ids_list())
            self.assertEqual(np.sum(~np.asarray(sorted_labels) == np.asarray(range(num_elements))), 0)

            # Delete data1
            labels1, _ = p.knn_query(data1, k=1)

            for l in labels1:
                p.mark_deleted(l[0])
            labels2, _ = p.knn_query(data2, k=1)
            items = p.get_items(labels2)
            diff_with_gt_labels = np.mean(np.abs(data2 - items))
            self.assertAlmostEqual(diff_with_gt_labels, 0, delta=1e-3)  # console

            labels1_after, _ = p.knn_query(data1, k=1)
            for la in labels1_after:
                for lb in labels1:
                    if la[0] == lb[0]:
                        self.assertTrue(False)
            print("All the data in data1 are removed")

            # checking saving/loading index with elements marked as deleted
            del_index_path = "with_deleted.bin"
            p.save_index(del_index_path)
            p = hnswlib.Index(space='l2', dim=dim)
            p.load_index(del_index_path)
            p.set_ef(100)

            labels1_after, _ = p.knn_query(data1, k=1)
            for la in labels1_after:
                for lb in labels1:
                    if la[0] == lb[0]:
                        self.assertTrue(False)

            os.remove(index_path)
            os.remove(del_index_path)


if __name__ == "__main__":
    unittest.main()
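
With the deprecated `setup.py test` command removed (commit 467c98f), the tests are meant to be run with standard unittest tooling instead. A minimal sketch, assuming the test modules live under python_bindings/tests (the directory referenced by the .gitignore above), that a built hnswlib module is importable, and that the file-name pattern below matches the test files (an assumption):

import unittest

# Discover and run the binding tests with the stock unittest loader.
# Directory and pattern are assumptions based on this PR's .gitignore entries.
suite = unittest.defaultTestLoader.discover("python_bindings/tests", pattern="*test*.py")
unittest.TextTestRunner(verbosity=2).run(suite)

Each module also keeps its own `if __name__ == "__main__": unittest.main()` guard, so a single test file can still be executed directly.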