feat: vector index refactor #528

Merged
jayhack merged 11 commits into develop from jay/vector-index-refactor
Feb 17, 2025

@jayhack jayhack commented Feb 17, 2025

Overview

Creates FileIndex and CodeIndex, which support the following API:

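# Imports (the FileIndex module path is assumed from the files changed in this PR)
from codegen import Codebase
from codegen.extensions.index.file_index import FileIndex

# Parse a codebase
codebase = Codebase.from_repo('fastapi/fastapi', language='python')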

# Create index
index = FileIndex(codebase)
index.create() # computes per-file embeddings

# Save index to .pkl
index.save('index.pkl')

# Load index into memory
index.load('index.pkl')

# Update index after changes
codebase.files[0].edit('# 🌈 Replacing File Content 🌈')
codebase.commit()
index.update() # re-computes 1 embedding

# Search with natural language
results = index.similarity_search(
    "How does FastAPI handle dependency injection?",
    k=5  # number of results
)

# Print results
for file, score in results:
    print(f"\nScore: {score:.3f} | File: {file.filepath}")
    print(f"Preview: {file.content[:200]}...")
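
To illustrate the create / update / save / load / search cycle above, here is a minimal, self-contained sketch. It is not the implementation in this PR: `ToyFileIndex` and `toy_embed` are hypothetical names, the "embedding" is a hashed bag-of-words vector standing in for a real embedding model, and change detection via a content hash is an assumption about how `update()` could avoid recomputing unchanged files.

```python
from __future__ import annotations

import hashlib
import math
import pickle
import re
from pathlib import Path

DIM = 256  # size of the toy embedding vector


def toy_embed(text: str) -> list[float]:
    """Hashed bag-of-words vector, L2-normalized. Stand-in for a real embedding model."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class ToyFileIndex:
    """Hypothetical per-file index over a {filepath: content} mapping."""

    def __init__(self, files: dict[str, str]):
        self.files = files
        self.vectors: dict[str, list[float]] = {}
        self.digests: dict[str, str] = {}

    def create(self) -> None:
        """Embed every file from scratch."""
        self.vectors.clear()
        self.digests.clear()
        self.update()

    def update(self) -> int:
        """Re-embed only new or changed files; return how many were recomputed."""
        recomputed = 0
        for path, content in self.files.items():
            digest = hashlib.sha256(content.encode()).hexdigest()
            if self.digests.get(path) != digest:
                self.vectors[path] = toy_embed(content)
                self.digests[path] = digest
                recomputed += 1
        # Forget files that no longer exist.
        for path in set(self.vectors) - set(self.files):
            del self.vectors[path]
            del self.digests[path]
        return recomputed

    def save(self, path: str) -> None:
        state = {"vectors": self.vectors, "digests": self.digests}
        Path(path).write_bytes(pickle.dumps(state))

    def load(self, path: str) -> None:
        state = pickle.loads(Path(path).read_bytes())
        self.vectors = state["vectors"]
        self.digests = state["digests"]

    def similarity_search(self, query: str, k: int = 5) -> list[tuple[str, float]]:
        """Rank files by cosine similarity (vectors are unit length, so a dot product)."""
        q = toy_embed(query)
        scored = [
            (path, sum(a * b for a, b in zip(q, vec)))
            for path, vec in self.vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]
```

In this sketch, editing one file and calling `update()` recomputes a single vector, which mirrors the "re-computes 1 embedding" comment above. The real index may track changes differently.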

Next

We can extend this to support:

  • CloudPickle (e.g. on Modal)
  • Other vector stores (Pinecone, etc.)
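
One way both extensions could be approached is to put storage behind a small interface, so the index does not care whether vectors live in a pickle or in an external service. The sketch below is an assumption, not something in this PR: `VectorStore`, `InMemoryStore`, and `top_k` are hypothetical names, and only the in-memory variant is shown. A Pinecone-backed class would need to implement the same three methods, and a CloudPickle variant would swap the serializer used in `to_bytes` / `from_bytes`.

```python
from __future__ import annotations

import pickle
from typing import Protocol


class VectorStore(Protocol):
    """Hypothetical storage interface; not part of this PR."""

    def upsert(self, key: str, vector: list[float]) -> None: ...
    def delete(self, key: str) -> None: ...
    def query(self, vector: list[float], k: int) -> list[tuple[str, float]]: ...


class InMemoryStore:
    """Dict-backed store that can be serialized to bytes with pickle."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, key: str, vector: list[float]) -> None:
        self._vectors[key] = list(vector)

    def delete(self, key: str) -> None:
        self._vectors.pop(key, None)

    def query(self, vector: list[float], k: int) -> list[tuple[str, float]]:
        # Dot-product ranking; assumes callers store comparable (e.g. normalized) vectors.
        scored = [
            (key, sum(a * b for a, b in zip(vector, stored)))
            for key, stored in self._vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]

    def to_bytes(self) -> bytes:
        return pickle.dumps(self._vectors)

    @classmethod
    def from_bytes(cls, blob: bytes) -> InMemoryStore:
        store = cls()
        store._vectors = pickle.loads(blob)
        return store


def top_k(store: VectorStore, query_vector: list[float], k: int = 5) -> list[tuple[str, float]]:
    """Search through the interface only, so any conforming backend can be passed in."""
    return store.query(query_vector, k)
```

Keeping the interface to `upsert`, `delete`, and `query` is a deliberate simplification here; a real backend would likely also need batching and metadata.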

@jayhack jayhack requested review from codegen-team and a team as code owners February 17, 2025 22:38

codecov bot commented Feb 17, 2025

Codecov Report

Attention: Patch coverage is 28.45188% with 171 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/codegen/extensions/index/file_index.py | 21.64% | 105 Missing ⚠️ |
| src/codegen/extensions/index/code_index.py | 33.67% | 65 Missing ⚠️ |
| src/codegen/extensions/tools/semantic_search.py | 50.00% | 1 Missing ⚠️ |

@jayhack jayhack enabled auto-merge (squash) February 17, 2025 22:54
@jayhack jayhack merged commit 652423a into develop Feb 17, 2025
25 of 26 checks passed
@jayhack jayhack deleted the jay/vector-index-refactor branch February 17, 2025 23:00

🎉 This PR is included in version 0.21.0 🎉

The release is available as a GitHub release.

Your semantic-release bot 📦🚀

rushilpatel0 pushed a commit that referenced this pull request Feb 18, 2025