feat: swe bench harness #590

Merged
merged 4 commits into from
Feb 21, 2025

jayhack

@jayhack jayhack commented Feb 20, 2025

Motivation

Adds a SWE Bench Harness to the codegen agent.

Content

  • Loads the SWE Bench dataset
  • For each entry in the dataset, a Modal instance is created where an agent can run
  • The output of each agent is stored and tested on Modal using swebench
  • Documentation in the README
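
The flow in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Entry`, `run_harness`, `to_jsonl`, and `placeholder_agent` are hypothetical names, and a local thread pool stands in for the per-entry Modal instances. The prediction keys (`instance_id`, `model_name_or_path`, `model_patch`) follow the format the swebench evaluator reads.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Entry:
    """One benchmark task; field names mirror SWE-bench dataset columns."""
    instance_id: str
    repo: str
    problem_statement: str


def run_harness(
    entries: Iterable[Entry],
    agent: Callable[[Entry], str],
    model_name: str,
    max_workers: int = 4,
) -> list[dict]:
    """Run `agent` once per entry and collect one prediction per entry.

    Assumption: a local thread pool stands in for the remote Modal
    instances described in the PR.
    """
    entries = list(entries)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        patches = list(pool.map(agent, entries))  # map preserves input order
    return [
        {
            "instance_id": e.instance_id,
            "model_name_or_path": model_name,
            "model_patch": patch,
        }
        for e, patch in zip(entries, patches)
    ]


def to_jsonl(predictions: list[dict]) -> str:
    """Serialize predictions as JSON Lines, one object per line."""
    return "\n".join(json.dumps(p) for p in predictions)


def placeholder_agent(entry: Entry) -> str:
    # Hypothetical stand-in: a real agent would return a unified diff.
    return f"# patch for {entry.instance_id}"
```

Calling `run_harness(entries, placeholder_agent, "demo")` and writing `to_jsonl(...)` to a file would give a predictions file that the evaluation step could then consume; the actual harness stores and tests outputs on Modal instead.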

Contributions from:

Please check the following before marking your PR as ready for review

  • I have updated the documentation or added new documentation as needed

@jayhack jayhack requested review from codegen-team and a team as code owners February 20, 2025 21:12

@jayhack jayhack left a comment


Nice! Left a few comments


codecov bot commented Feb 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.


@jemeza-codegen jemeza-codegen enabled auto-merge (squash) February 21, 2025 00:22
@jayhack jayhack assigned jayhack and jemeza-codegen and unassigned jayhack Feb 21, 2025

@jemeza-codegen jemeza-codegen left a comment


looks good

@jemeza-codegen jemeza-codegen merged commit 410ee85 into develop Feb 21, 2025
25 of 26 checks passed
@jemeza-codegen jemeza-codegen deleted the jmeza-swe-bench-harness branch February 21, 2025 00:33

🎉 This PR is included in version 0.29.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

3 participants