Commit bcaef7e

Merge pull request #59 from commit-0/docs3
Docs3
2 parents b9d089a + ca0f48a commit bcaef7e

10 files changed: +304 -12 lines changed

docs/agent.md

Lines changed: 44 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,44 @@
1+
Commit0 provides a command-line `agent` for configuring and
2+
running AI agents to assist with code development and testing.
3+
In this example we use [Aider](https://aider.chat/) as the
4+
baseline code completion agent.
5+
6+
```bash
7+
pip install aider-chat
8+
```
9+
10+
First we assume there is an underlying `commit0`
11+
project that is configured. To create a new project,
12+
run the commit0 `setup` command.
13+
14+
```bash
15+
commit0 setup lite
16+
```
17+
18+
Next we need to configure the backend for the agent.
19+
Currently only the aider backend is supported. The `config`
20+
command also accepts additional arguments.
21+
22+
```bash
23+
export ANTHROPIC_API_KEY="..."
24+
agent config aider
25+
```
26+
27+
Finally we run the underlying agent. This will create a display
28+
that shows the current progress of the agent.
29+
30+
```bash
31+
agent run
32+
```
33+
34+
35+
### Extending
36+
Refer to `class Agents` in `agent/agents.py`. You can design your own agent by subclassing `Agents` and implementing its `run` method.
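As a rough illustration of the extension point described above, the sketch below subclasses a stand-in `Agents` base class and implements `run`. The real base class lives in `agent/agents.py` and its constructor and `run` signature are not shown in this document, so the stand-in base class, the `message` and `files` parameters, and the `EchoAgent` name are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Agents(ABC):
    """Stand-in for the base class in agent/agents.py (hypothetical signature)."""

    def __init__(self, max_iteration: int) -> None:
        self.max_iteration = max_iteration

    @abstractmethod
    def run(self, message: str, files: list[str]) -> str:
        """Carry out one agent session and return a short summary."""


class EchoAgent(Agents):
    """Toy agent that only reports what it was asked to do."""

    def run(self, message: str, files: list[str]) -> str:
        # A real agent would call an LLM backend here and edit the files.
        return f"{message} ({len(files)} file(s), up to {self.max_iteration} iteration(s))"


agent = EchoAgent(max_iteration=3)
summary = agent.run("Implement the missing functions", ["simpy/events.py"])
```

Before adapting this, check the actual `Agents` definition; the method name `run` is the only part taken from the text above.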
37+
38+
## Notes
39+
40+
41+
* Aider automatically retries certain API errors. For details, see [here](https://github.com/paul-gauthier/aider/blob/75e1d519da9b328b0eca8a73ee27278f1289eadb/aider/sendchat.py#L17).
42+
* When increasing `--max-parallel-repos`, be mindful of aider's [60-second retry timeout](https://github.com/paul-gauthier/aider/blob/75e1d519da9b328b0eca8a73ee27278f1289eadb/aider/sendchat.py#L39). Set this value according to your API tier so that `RateLimitError`s do not stop processes.
43+
* Currently, the agent skips any file with more than 1500 lines. See `agent/agent_utils.py#L199` for details.
44+
* Running the full `all` commit0 split costs approximately $100 with Claude 3.5 Sonnet.
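The 1500-line limit in the notes above can be pictured with a small filter like the one below. This is only a sketch of the idea, not the code at `agent/agent_utils.py#L199`; the function names `count_lines` and `files_to_process` and the constant `MAX_LINES` are hypothetical.

```python
import tempfile
from pathlib import Path

MAX_LINES = 1500  # threshold quoted in the note above


def count_lines(path: Path) -> int:
    """Count lines without loading the whole file into memory."""
    with path.open("r", encoding="utf-8", errors="ignore") as handle:
        return sum(1 for _ in handle)


def files_to_process(paths: list[Path], max_lines: int = MAX_LINES) -> list[Path]:
    """Keep only files with at most `max_lines` lines."""
    return [p for p in paths if count_lines(p) <= max_lines]


with tempfile.TemporaryDirectory() as tmp:
    small = Path(tmp) / "small.py"
    large = Path(tmp) / "large.py"
    small.write_text("x = 1\n" * 10, encoding="utf-8")
    large.write_text("x = 1\n" * 1501, encoding="utf-8")
    # Only the file at or under the threshold survives the filter.
    kept = [p.name for p in files_to_process([small, large])]
```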

docs/api.md

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
1+
## Commit0
2+
3+
Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands:
4+
5+
### Setup
6+
7+
Use `commit0 setup [OPTIONS] REPO_SPLIT` to clone a repository split.
8+
Available options include:
9+
10+
| Argument | Type | Description | Default |
11+
|----------|------|-------------|---------|
12+
| `repo_split` | str | Split of repositories to clone | |
13+
| `--dataset-name` | str | Name of the Huggingface dataset | `wentingzhao/commit0_combined` |
14+
| `--dataset-split` | str | Split of the Huggingface dataset | `test` |
15+
| `--base-dir` | str | Base directory to clone repos to | `repos/` |
16+
| `--commit0-dot-file-path` | str | Path where stateful commit0 configs are stored | `.commit0.yaml` |
17+
18+
### Build
19+
20+
Use `commit0 build [OPTIONS]` to build the Commit0 split chosen in the Setup stage.
21+
Available options include:
22+
23+
| Argument | Type | Description | Default |
24+
|----------|------|-------------|---------|
25+
| `--num-workers` | int | Number of workers | `8` |
26+
| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
27+
| `--verbose` | int | Verbosity level (1 or 2) | `1` |
28+
29+
### Get Tests
30+
31+
Use `commit0 get-tests REPO_NAME` to get tests for a Commit0 repository.
32+
33+
| Argument | Type | Description | Default |
34+
|----------|------|-------------|---------|
35+
| `repo_name` | str | Name of the repository to get tests for | |
36+
37+
### Test
38+
39+
Use `commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS]` to run tests on a Commit0 repository.
40+
Available options include:
41+
42+
| Argument | Type | Description | Default |
43+
|----------|------|-------------|---------|
44+
| `repo_or_repo_path` | str | Directory of the repository to test | |
45+
| `test_ids` | str | Test IDs to run | |
46+
| `--branch` | str | Branch to test | |
47+
| `--backend` | str | Backend to use for testing | `modal` |
48+
| `--timeout` | int | Timeout for tests in seconds | `1800` |
49+
| `--num-cpus` | int | Number of CPUs to use | `1` |
50+
| `--reference` | bool | Test the reference commit | `False` |
51+
| `--coverage` | bool | Get coverage information | `False` |
52+
| `--rebuild` | bool | Rebuild an image | `False` |
53+
| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
54+
| `--verbose` | int | Verbosity level (1 or 2) | `1` |
55+
| `--stdin` | bool | Read test names from stdin | `False` |
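When scripting around the CLI, one option is to assemble the `commit0 test` invocation from the options in the table above. The helper below is a minimal sketch: `build_test_command` is a hypothetical name, it covers only `--branch`, `--backend`, and `--timeout`, and it assumes that options may follow the positional arguments, as in the `commit0 test simpy ... --branch my_branch` example elsewhere in these docs.

```python
import shlex
from typing import Optional


def build_test_command(
    repo: str,
    test_ids: str = "",
    branch: Optional[str] = None,
    backend: str = "modal",
    timeout: int = 1800,
) -> list[str]:
    """Assemble argv for `commit0 test` using flags from the table above."""
    cmd = ["commit0", "test", repo]
    if test_ids:
        cmd.append(test_ids)
    if branch is not None:
        cmd += ["--branch", branch]
    cmd += ["--backend", backend, "--timeout", str(timeout)]
    return cmd


cmd = build_test_command(
    "simpy", "tests/test_event.py::test_succeed", branch="my_branch"
)
printable = shlex.join(cmd)  # shell-safe string, handy for logging
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where `commit0` is installed and set up.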
56+
57+
### Evaluate
58+
59+
Use `commit0 evaluate [OPTIONS]` to evaluate the Commit0 split chosen in the Setup stage.
60+
Available options include:
61+
62+
| Argument | Type | Description | Default |
63+
|----------|------|-------------|---------|
64+
| `--branch` | str | Branch to evaluate | |
65+
| `--backend` | str | Backend to use for evaluation | `modal` |
66+
| `--timeout` | int | Timeout for evaluation in seconds | `1800` |
67+
| `--num-cpus` | int | Number of CPUs to use | `1` |
68+
| `--num-workers` | int | Number of workers to use | `8` |
69+
| `--reference` | bool | Evaluate the reference commit | `False` |
70+
| `--coverage` | bool | Get coverage information | `False` |
71+
| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
72+
| `--rebuild` | bool | Rebuild images | `False` |
73+
74+
### Lint
75+
76+
Use `commit0 lint [OPTIONS] REPO_OR_REPO_DIR` to lint files in a repository.
77+
Available options include:
78+
79+
| Argument | Type | Description | Default |
80+
|----------|------|-------------|---------|
81+
| `repo_or_repo_dir` | str | Directory of the repository to lint | |
82+
| `--files` | List[Path] | Files to lint (optional) | |
83+
| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
84+
| `--verbose` | int | Verbosity level (1 or 2) | `1` |
85+
86+
### Save
87+
88+
Use `commit0 save [OPTIONS] OWNER BRANCH` to save the Commit0 split to GitHub.
89+
Available options include:
90+
91+
| Argument | Type | Description | Default |
92+
|----------|------|-------------|---------|
93+
| `owner` | str | Owner of the repository | |
94+
| `branch` | str | Branch to save | |
95+
| `--github-token` | str | GitHub token for authentication | |
96+
| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` |
97+
98+
## Agent
99+
100+
### Config
101+
102+
Use `agent config [OPTIONS] AGENT_NAME` to set up the configuration for an agent.
103+
Available options include:
104+
105+
| Argument | Type | Description | Default |
106+
|----------|------|-------------|---------|
107+
| `agent_name` | str | Agent to use; only [aider](https://aider.chat/) is supported for now. | `aider` |
108+
| `--model-name` | str | LLM to use; see [here](https://aider.chat/docs/llms.html) for all supported models. | `claude-3-5-sonnet-20240620` |
109+
| `--use-user-prompt` | bool | Use a custom prompt instead of the default prompt. | `False` |
110+
| `--user-prompt` | str | The prompt sent to agent. | See code for details. |
111+
| `--run-tests` | bool | Run tests after code modifications for feedback. You need to set up `docker` or `modal` before running tests; refer to the commit0 docs. | `False` |
112+
| `--max-iteration` | int | Maximum number of agent iterations. | `3` |
113+
| `--use-repo-info` | bool | Include the repository information. | `False` |
114+
| `--max-repo-info-length` | int | Maximum length of the repository information to use. | `10000` |
115+
| `--use-unit-tests-info` | bool | Include the unit tests information. | `False` |
116+
| `--max-unit-tests-info-length` | int | Maximum length of the unit tests information to use. | `10000` |
117+
| `--use-spec-info` | bool | Include the spec information. | `False` |
118+
| `--max-spec-info-length` | int | Maximum length of the spec information to use. | `10000` |
119+
| `--use-lint-info` | bool | Include the lint information. | `False` |
120+
| `--max-lint-info-length` | int | Maximum length of the lint information to use. | `10000` |
121+
| `--pre-commit-config-path` | str | Path to the pre-commit config file. This is needed for running `lint`. | `.pre-commit-config.yaml` |
122+
| `--agent-config-file` | str | Path to write the agent config. | `.agent.yaml` |
123+
124+
### Running
125+
126+
Use `agent run [OPTIONS] BRANCH` to execute an agent on a specific branch.
127+
Available options include:
128+
129+
| Argument | Type | Description | Default |
130+
|----------|------|-------------|---------|
131+
| `branch` | str | Branch to run the agent on; you can specify the name of the branch. | |
132+
| `--backend` | str | Test backend to run the agent on; ignore this option if the agent is not configured with `--run-tests`. | `modal` |
133+
| `--log-dir` | str | Log directory to store the logs. | `logs/aider` |
134+
| `--max-parallel-repos` | int | Maximum number of repositories for the agent to run in parallel. Runs sequentially if set to 1. | `1` |
135+
| `--display-repo-progress-num` | int | Number of repositories whose progress is displayed while running. | `5` |
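The behaviour described for `--max-parallel-repos` (sequential at 1, bounded parallelism above that) can be sketched as follows. This is an illustration of the general pattern using a thread pool, not the agent's actual scheduler; `run_on_repos` and the `repo_b`/`repo_c` names are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_on_repos(
    repos: list[str],
    worker: Callable[[str], str],
    max_parallel_repos: int = 1,
) -> list[str]:
    """Apply `worker` to each repo, at most `max_parallel_repos` at a time."""
    if max_parallel_repos <= 1:
        # Sequential mode: one repository after another.
        return [worker(repo) for repo in repos]
    with ThreadPoolExecutor(max_workers=max_parallel_repos) as pool:
        # map() preserves input order even when tasks finish out of order.
        return list(pool.map(worker, repos))


# str.upper stands in for the per-repository agent run.
results = run_on_repos(["simpy", "repo_b", "repo_c"], str.upper, max_parallel_repos=2)
```

Keeping the limit low is the simplest way to stay under an API rate limit, since each parallel worker issues its own LLM requests.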

docs/arch.png

281 KB

docs/baseline.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
# Baseline
2+
3+
Commit0 contains a baseline system based on
4+
the [Aider](https://aider.chat/) code generation
5+
system.
6+
7+
...

docs/commit0.gif

20.8 MB

docs/index.md

Lines changed: 18 additions & 6 deletions
@@ -3,19 +3,23 @@
33

44
#
55

6-
Commit-0 is a real-world AI coding challenge.
7-
Can your agent generate a working library from commit 0?
6+
## Overview
7+
8+
Commit-0 is a from-scratch AI coding challenge.
9+
Can you create a library from commit 0?
810

911
The benchmark consists of 57 core Python libraries.
10-
Libraries are selected based on:
12+
The challenge is to rebuild these libraries and
13+
pass their unit tests. All libraries have:
1114

12-
* Significant unit-test coverage
15+
* Significant test coverage
1316
* Detailed specification and documentation
1417
* Lint and type checking
1518

16-
The [commit0 tool](setup) allows you to:
19+
Commit-0 is an interactive environment that makes it easy
20+
to design and test new agents. You can:
1721

18-
* Efficiently run interactive tests in isolated environemnts
22+
* Efficiently run tests in isolated environments
1923
* Distribute testing and development across cloud systems
2024
* Track and log all changes made throughout.
2125

@@ -25,6 +29,14 @@ To install run:
2529
pip install commit0
2630
```
2731

32+
## Architecture
33+
34+
![](arch.png)
35+
36+
37+
![](commit0.gif)
38+
39+
## Libraries
2840

2941
| | Name | Repo | Commit0 | Tests | |
3042
|--|--------|-------|----|----|------|

docs/setupdist.md

Lines changed: 37 additions & 0 deletions
@@ -44,3 +44,40 @@ you can commit to the branch and call with the --branch command.
4444
```bash
4545
commit0 test simpy tests/test_event.py::test_succeed --branch my_branch
4646
```
47+
48+
## Local Mode
49+
50+
To run in local mode, first make sure that you have [Docker tools](https://docs.docker.com/desktop/install/mac-install/)
51+
installed. On Debian systems:
52+
53+
```bash
54+
apt install docker.io
55+
```
56+
57+
To get started, run the `setup` command with the dataset
58+
split that you are interested in working with.
59+
We'll start with the `lite` split.
60+
61+
62+
```bash
63+
commit0 setup lite
64+
```
65+
66+
This will clone the code for a subset of libraries into your `repos/` directory.
67+
68+
Next run the `build` command which will configure Docker containers for
69+
each of the libraries with isolated virtual environments. The command uses the
70+
[uv](https://github.com/astral-sh/uv) library for efficient builds.
71+
72+
```bash
73+
commit0 build
74+
```
75+
76+
The main operation you can do with these environments is to run tests.
77+
Here we run [a test](https://github.com/commit-0/simpy/blob/master/tests/test_event.py#L11) in the `simpy` library.
78+
79+
```bash
80+
commit0 test simpy tests/test_event.py::test_succeed
81+
```
82+
83+
See [distributed setup](/setupdist) for more commands.
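The local-mode walkthrough above (setup, build, test) can also be chained from a script. The sketch below is one possible way to do that; `LOCAL_STEPS` and `run_steps` are hypothetical names, and the default `dry_run=True` only lists the commands so the flow can be inspected without `commit0` or Docker installed.

```python
import subprocess

# Commands taken from the walkthrough above, in the order they are run.
LOCAL_STEPS = [
    ["commit0", "setup", "lite"],
    ["commit0", "build"],
    ["commit0", "test", "simpy", "tests/test_event.py::test_succeed"],
]


def run_steps(steps: list[list[str]], dry_run: bool = True) -> list[str]:
    """Run each step in order, stopping at the first failure.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    executed = []
    for step in steps:
        if not dry_run:
            # check=True raises CalledProcessError if a step exits non-zero.
            subprocess.run(step, check=True)
        executed.append(" ".join(step))
    return executed


planned = run_steps(LOCAL_STEPS)
```

Calling `run_steps(LOCAL_STEPS, dry_run=False)` would execute the three commands for real, so it should only be used on a configured machine.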

docs/setuplocal.md

Lines changed: 1 addition & 1 deletion
@@ -33,4 +33,4 @@ Here we run [a test](https://github.com/commit-0/simpy/blob/master/tests/test_ev
3333
commit0 test simpy tests/test_event.py::test_succeed
3434
```
3535

36-
See [distributed setup](setupdist) for more commands.
36+
See [distributed setup](/setupdist) for more commands.
