commit-0
diff --git a/docs/agent.md b/docs/agent.md
Lines changed: 44 additions & 0 deletions
diff --git a/docs/api.md b/docs/api.md
Lines changed: 135 additions & 0 deletions
diff --git a/docs/arch.png b/docs/arch.png
281 KB
diff --git a/docs/baseline.md b/docs/baseline.md
Lines changed: 7 additions & 0 deletions
diff --git a/docs/commit0.gif b/docs/commit0.gif
20.8 MB
diff --git a/docs/index.md b/docs/index.md
Lines changed: 18 additions & 6 deletions
diff --git a/docs/setupdist.md b/docs/setupdist.md
Lines changed: 37 additions & 0 deletions
diff --git a/docs/setuplocal.md b/docs/setuplocal.md
Lines changed: 1 addition & 1 deletion
@@ -0,0 +1,44 @@
+Commit0 provides a command-line `agent` for configuring and
+running AI agents to assist with code development and testing.
+In this example we use [Aider](https://aider.chat/) as the
+baseline code completion agent.
+
+```bash
+pip install aider-chat
+```
+
+First we assume there is an underlying `commit0`
+project that is configured. To create a new project,
+run the commit0 `setup` command.
+
+```bash
+commit0 setup lite
+```
+
+Next we need to configure the backend for the agent.
+Currently only the aider backend is supported. The `config`
+command also accepts options, such as `--model-name`.
+
+```bash
+export ANTHROPIC_API_KEY="..."
+agent config aider
+```
+
+Finally we run the underlying agent. This will create a display
+that shows the current progress of the agent.
+
+```bash
+agent run
+```
+
+
+### Extending
+Refer to `class Agents` in `agent/agents.py`. To design your own agent, subclass `Agents` and implement its `run` method.
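As a rough illustration of this extension point, the sketch below subclasses an agent base class and implements `run`. The `Agents` class defined here is a simplified stand-in, not the real class from `agent/agents.py`; its constructor argument, the `run` signature, and the `EchoAgent` subclass are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Agents(ABC):
    """Simplified stand-in for the base class in agent/agents.py (assumed shape)."""

    def __init__(self, max_iteration: int) -> None:
        self.max_iteration = max_iteration

    @abstractmethod
    def run(self, message: str) -> list[str]:
        """Run the agent on a prompt and return a log of what it did."""


class EchoAgent(Agents):
    """Hypothetical agent that only records each iteration."""

    def run(self, message: str) -> list[str]:
        return [f"iteration {i + 1}: {message}" for i in range(self.max_iteration)]


agent = EchoAgent(max_iteration=3)
log = agent.run("implement the stubbed functions")
```

Because `run` is abstract in this sketch, instantiating the base class directly raises a `TypeError`, which makes a missing implementation easy to spot.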
+
+## Notes
+
+
+* Aider automatically retries certain API errors. For details, see [here](https://github.com/paul-gauthier/aider/blob/75e1d519da9b328b0eca8a73ee27278f1289eadb/aider/sendchat.py#L17).
+* When increasing `--max-parallel-repos`, be mindful of aider's [60-second retry timeout](https://github.com/paul-gauthier/aider/blob/75e1d519da9b328b0eca8a73ee27278f1289eadb/aider/sendchat.py#L39). Set this value according to your API tier so that `RateLimitError`s do not stop processes.
+* Currently, the agent skips files with more than 1500 lines. See `agent/agent_utils.py#L199` for details.
+* Running a full `all` commit0 split costs approximately $100 with Claude Sonnet 3.5.
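The file-size note above can be pictured as a small filter. This is a sketch of the idea only, not the code in `agent/agent_utils.py`; the function name `should_skip` and the constant `MAX_LINES` are hypothetical.

```python
from pathlib import Path
import tempfile

MAX_LINES = 1500  # threshold quoted in the note above


def should_skip(path: Path, max_lines: int = MAX_LINES) -> bool:
    """Return True when the file has more than max_lines lines."""
    with path.open(encoding="utf-8", errors="ignore") as handle:
        return sum(1 for _ in handle) > max_lines


# Demonstrate on two throwaway files: one short, one just over the limit.
with tempfile.TemporaryDirectory() as tmp:
    small = Path(tmp) / "small.py"
    large = Path(tmp) / "large.py"
    small.write_text("x = 1\n" * 10)
    large.write_text("x = 1\n" * 1501)
    results = {"small": should_skip(small), "large": should_skip(large)}
```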
@@ -0,0 +1,135 @@
+## Commit0
+
+Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands:
+
+### Setup
+
+Use `commit0 setup [OPTIONS] REPO_SPLIT` to clone a repository split.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `repo_split` | str | Split of repositories to clone | |
+| `--dataset-name` | str | Name of the Huggingface dataset | `wentingzhao/commit0_combined` |
+| `--dataset-split` | str | Split of the Huggingface dataset | `test` |
+| `--base-dir` | str | Base directory to clone repos to | `repos/` |
+| `--commit0-dot-file-path` | str | Path where stateful commit0 configs are stored | `.commit0.yaml` |
+
+### Build
+
+Use `commit0 build [OPTIONS]` to build the Commit0 split chosen in the Setup stage.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `--num-workers` | int | Number of workers | `8` |
+| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
+| `--verbose` | int | Verbosity level (1 or 2) | `1` |
+
+### Get Tests
+
+Use `commit0 get-tests REPO_NAME` to get tests for a Commit0 repository.
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `repo_name` | str | Name of the repository to get tests for | |
+
+### Test
+
+Use `commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS]` to run tests on a Commit0 repository.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `repo_or_repo_path` | str | Name or directory of the repository to test | |
+| `test_ids` | str | Test IDs to run | |
+| `--branch` | str | Branch to test | |
+| `--backend` | str | Backend to use for testing | `modal` |
+| `--timeout` | int | Timeout for tests in seconds | `1800` |
+| `--num-cpus` | int | Number of CPUs to use | `1` |
+| `--reference` | bool | Test the reference commit | `False` |
+| `--coverage` | bool | Get coverage information | `False` |
+| `--rebuild` | bool | Rebuild an image | `False` |
+| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
+| `--verbose` | int | Verbosity level (1 or 2) | `1` |
+| `--stdin` | bool | Read test names from stdin | `False` |
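When scripting test runs, it can help to assemble the `commit0 test` invocation programmatically. The sketch below uses only flags listed in the table above; the helper `build_test_command` is hypothetical and not part of commit0, and the repository, test ID, and branch values are taken from the example in the setup docs.

```python
import shlex
from typing import Optional


def build_test_command(
    repo: str,
    test_ids: str,
    branch: Optional[str] = None,
    backend: str = "modal",
    timeout: int = 1800,
) -> list[str]:
    """Assemble argv for `commit0 test` using flags from the table above."""
    argv = ["commit0", "test", repo, test_ids]
    if branch is not None:
        argv += ["--branch", branch]
    argv += ["--backend", backend, "--timeout", str(timeout)]
    return argv


argv = build_test_command(
    "simpy", "tests/test_event.py::test_succeed", branch="my_branch"
)
command_line = shlex.join(argv)
# To execute it (requires commit0 to be installed):
#   subprocess.run(argv, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting issues; `shlex.join` is used here only to produce a readable string for logging.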
+
+### Evaluate
+
+Use `commit0 evaluate [OPTIONS]` to evaluate the Commit0 split chosen in the Setup stage.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `--branch` | str | Branch to evaluate | |
+| `--backend` | str | Backend to use for evaluation | `modal` |
+| `--timeout` | int | Timeout for evaluation in seconds | `1800` |
+| `--num-cpus` | int | Number of CPUs to use | `1` |
+| `--num-workers` | int | Number of workers to use | `8` |
+| `--reference` | bool | Evaluate the reference commit | `False` |
+| `--coverage` | bool | Get coverage information | `False` |
+| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
+| `--rebuild` | bool | Rebuild images | `False` |
+
+### Lint
+
+Use `commit0 lint [OPTIONS] REPO_OR_REPO_DIR` to lint files in a repository.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `repo_or_repo_dir` | str | Name or directory of the repository to lint | |
+| `--files` | List[Path] | Files to lint (optional) | |
+| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
+| `--verbose` | int | Verbosity level (1 or 2) | `1` |
+
+### Save
+
+Use `commit0 save [OPTIONS] OWNER BRANCH` to save the Commit0 split to GitHub.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `owner` | str | Owner of the repository | |
+| `branch` | str | Branch to save | |
+| `--github-token` | str | GitHub token for authentication | |
+| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
+
+## Agent
+
+### Config
+
+Use `agent config [OPTIONS] AGENT_NAME` to set up the configuration for an agent.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `agent_name` | str | Agent to use. Only [aider](https://aider.chat/) is supported for now. | `aider` |
+| `--model-name` | str | LLM to use. See [here](https://aider.chat/docs/llms.html) for all supported models. | `claude-3-5-sonnet-20240620` |
+| `--use-user-prompt` | bool | Use a custom prompt instead of the default prompt. | `False` |
+| `--user-prompt` | str | The prompt sent to agent. | See code for details. |
+| `--run-tests` | bool | Run tests after code modifications for feedback. Set up `docker` or `modal` before running tests; see the commit0 docs. | `False` |
+| `--max-iteration` | int | Maximum number of agent iterations. | `3` |
+| `--use-repo-info` | bool | Include the repository information. | `False` |
+| `--max-repo-info-length` | int | Maximum length of the repository information to use. | `10000` |
+| `--use-unit-tests-info` | bool | Include the unit tests information. | `False` |
+| `--max-unit-tests-info-length` | int | Maximum length of the unit tests information to use. | `10000` |
+| `--use-spec-info` | bool | Include the spec information. | `False` |
+| `--max-spec-info-length` | int | Maximum length of the spec information to use. | `10000` |
+| `--use-lint-info` | bool | Include the lint information. | `False` |
+| `--max-lint-info-length` | int | Maximum length of the lint information to use. | `10000` |
+| `--pre-commit-config-path` | str | Path to the pre-commit config file. This is needed for running `lint`. | `.pre-commit-config.yaml` |
+| `--agent-config-file` | str | Path to write the agent config. | `.agent.yaml` |
+
+### Running
+
+Use `agent run [OPTIONS] BRANCH` to execute an agent on a specific branch.
+Available options include:
+
+| Argument | Type | Description | Default |
+|----------|------|-------------|---------|
+| `branch` | str | Name of the branch to run the agent on. | |
+| `--backend` | str | Test backend to run the agent on. Ignore this option if `--run-tests` is not enabled in the agent config. | `modal` |
+| `--log-dir` | str | Log directory to store the logs. | `logs/aider` |
+| `--max-parallel-repos` | int | Maximum number of repositories for the agent to process in parallel. Runs sequentially if set to 1. | `1` |
+| `--display-repo-progress-num` | int | Number of repositories whose progress is displayed while running. | `5` |
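To make the meaning of `--max-parallel-repos` concrete, here is a minimal sketch of a concurrency cap. It is not how commit0 implements the option; `run_repo` and `run_all` are hypothetical placeholders, and the repository names other than `simpy` are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_repo(repo: str) -> str:
    """Placeholder for the per-repository agent work (hypothetical)."""
    return f"{repo}: done"


def run_all(repos: list[str], max_parallel_repos: int = 1) -> list[str]:
    """Process repos with at most max_parallel_repos running at once."""
    with ThreadPoolExecutor(max_workers=max_parallel_repos) as pool:
        # map() returns results in input order, whatever the completion order.
        return list(pool.map(run_repo, repos))


results = run_all(["simpy", "repo_b", "repo_c"], max_parallel_repos=2)
```

With a cap of 1, the pool has a single worker and the repositories are processed one after another, which matches the sequential behavior described in the table.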
@@ -0,0 +1,7 @@
+# Baseline
+
+Commit0 contains a baseline system based on
+the [Aider](https://aider.chat/) code generation
+system.
+
+...
@@ -3,19 +3,23 @@
 
 #
 
-Commit-0 is a real-world AI coding challenge.
-Can your agent generate a working library from commit 0?
+## Overview
+
+Commit-0 is a from-scratch AI coding challenge.
+Can you create a library from commit 0?
 
 The benchmark consists of 57 core Python libraries.
-Libraries are selected based on:
+The challenge is to rebuild these libraries and
+pass their unit tests.  All libraries have:
 
-* Significant unit-test coverage
+* Significant test coverage
 * Detailed specification and documentation
 * Lint and type checking
 
-The [commit0 tool](setup) allows you to:
+Commit-0 is an interactive environment that makes it easy
+to design and test new agents. You can:
 
-* Efficiently run interactive tests in isolated environemnts
+* Efficiently run tests in isolated environments
 * Distribute testing and development across cloud systems
 * Track and log all changes made throughout.
 
@@ -25,6 +29,14 @@ To install run:
 pip install commit0
 ```
 
+## Architecture
+
+![](arch.png)
+
+
+![](commit0.gif)
+
+## Libraries
 
 |  | Name |  Repo | Commit0 | Tests |  |
 |--|--------|-------|----|----|------|
 
@@ -44,3 +44,40 @@ you can commit to the branch and call with the --branch command.
 ```bash
 commit0 test simpy tests/test_event.py::test_succeed --branch my_branch
 ```
+
+## Local Mode
+
+To run in local mode, first make sure that you have [docker tools](https://docs.docker.com/desktop/install/mac-install/)
+installed. On Debian systems:
+
+```bash
+apt install docker.io
+```
+
+To get started, run the `setup` command with the dataset
+split that you are interested in working with.
+We'll start with the `lite` split.
+
+
+```bash
+commit0 setup lite
+```
+
+This will clone the code for a subset of libraries to your `repos/` directory.
+
+Next run the `build` command which will configure Docker containers for
+each of the libraries with isolated virtual environments. The command uses the
+[uv](https://github.com/astral-sh/uv) library for efficient builds.
+
+```bash
+commit0 build
+```
+
+The main operation you can do with these environments is to run tests.
+Here we run [a test](https://github.com/commit-0/simpy/blob/master/tests/test_event.py#L11) in the `simpy` library.
+
+```bash
+commit0 test simpy tests/test_event.py::test_succeed
+```
+
+See [distributed setup](/setupdist) for more commands.
@@ -33,4 +33,4 @@ Here we run [a test](https://github.com/commit-0/simpy/blob/master/tests/test_ev
 commit0 test simpy tests/test_event.py::test_succeed
 ```
 
-See [distributed setup](setupdist) for more commands.
+See [distributed setup](/setupdist) for more commands.