Adding unit tests for dspy.retrievers.Embeddings #8129
Added three unit tests for the embeddings:
1) test_embeddings_basic_search: Verifies that the retriever returns the correct top-k relevant passages and their indices for a single query.
2) test_embeddings_forward_batch: Ensures the retriever handles batch queries correctly, returning the top-k relevant passages and indices for each query.
3) test_normalization: Confirms that the embeddings are normalized to have a norm close to 1 after processing.
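For readers who want to see what these three checks amount to, here is a minimal sketch that does not use dspy at all. The names `toy_embed`, `normalize`, and `top_k` are hypothetical stand-ins written for illustration, not the functions in `dspy.retrievers.embeddings`, and the keyword-count embedding is an assumption chosen only to keep the example deterministic.

```python
import math

def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def toy_embed(texts):
    """Hypothetical embedder: one dimension per keyword, value = occurrence count."""
    keywords = ["cat", "dog", "car"]
    return [[float(t.lower().count(k)) for k in keywords] for t in texts]

def top_k(query, corpus, k):
    """Return (passages, indices) for the k corpus entries most similar to the query."""
    corpus_vecs = [normalize(v) for v in toy_embed(corpus)]
    q = normalize(toy_embed([query])[0])
    # Dot product of unit vectors is cosine similarity.
    scores = [sum(a * b for a, b in zip(q, v)) for v in corpus_vecs]
    indices = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:k]
    return [corpus[i] for i in indices], indices

corpus = ["The cat sat on the mat.", "The dog barked at the cat.", "The car is fast."]

# Single-query check: the two cat-related passages should rank first.
passages, indices = top_k("I saw a cat", corpus, k=2)

# Batch check: one (passages, indices) pair per query.
batch = [top_k(q, corpus, k=1) for q in ["I saw a cat", "My car broke down"]]

# Normalization check: every non-zero embedding has norm close to 1.
norms = [math.sqrt(sum(x * x for x in normalize(v))) for v in toy_embed(corpus)]
```

The real tests would make the same kinds of assertions against the retriever's returned passages and indices rather than against these helper functions.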
Thanks for the PR! Left some comments
tests/retrievers/test_embeddings.py
Outdated
from dspy.retrievers.embeddings import Embeddings

@pytest.fixture
nit: we don't need a fixture for the list since it's only used in this test.
tests/retrievers/test_embeddings.py
Outdated
assert isinstance(passage, str)
assert passage in dummy_corpus

def test_embeddings_forward_batch(dummy_corpus, dummy_embedder):
We can remove this test and the one below, since they test private methods.
1) Removed the tests that were calling private functions. 2) Added a new test to check robustness under high concurrency. 3) Updated the dummy embedder so that similar inputs map to nearby vectors and dissimilar inputs map to distant ones.
Hey @chenmoneygithub, can you take a look now? TIA.
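As an illustration of points 2 and 3, the sketch below shows one way a dummy embedder can keep similar data close and different data far apart, and how a concurrency check might compare threaded results against sequential ones. It is a standalone approximation: `TOPIC_VECTORS`, `dummy_embedder`, and `search` are hypothetical names invented here, and the actual test in the PR may be structured differently.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical topic vectors: "cat" and "dog" are close, "car" is far from both.
TOPIC_VECTORS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],
    "car": [0.0, 0.0, 1.0],
}

def dummy_embedder(texts):
    """Sum the topic vectors of every known keyword found in each text."""
    out = []
    for text in texts:
        lowered = text.lower()
        vec = [0.0, 0.0, 0.0]
        for word, topic in TOPIC_VECTORS.items():
            if word in lowered:
                vec = [a + b for a, b in zip(vec, topic)]
        out.append(vec)
    return out

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

corpus = ["A cat is sleeping.", "A dog is running.", "A car is parked."]
corpus_vecs = dummy_embedder(corpus)

def search(query):
    """Return the index of the corpus entry most similar to the query."""
    q = dummy_embedder([query])[0]
    return max(range(len(corpus)), key=lambda i: cosine(q, corpus_vecs[i]))

queries = ["cat", "dog", "car"] * 50

# Sequential baseline, then the same queries issued from many threads.
expected = [search(q) for q in queries]
with ThreadPoolExecutor(max_workers=8) as pool:
    concurrent_results = list(pool.map(search, queries))
```

Because `search` only reads shared state, the threaded results should match the sequential baseline exactly; a mismatch or exception here would point to a thread-safety problem.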
LGTM! I will do some minor cleanup and merge. Thanks for the contribution again!