| 1 | 1 | # Commit0 |
| 2 | + |
| 3 | +Commit0 is a from-scratch AI coding challenge. Can you create a library from commit 0? |
| 4 | + |
| 5 | +The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have: |
| 6 | + |
| 7 | +* Significant test coverage |
| 8 | +* Detailed specification and documentation |
| 9 | +* Lint and type checking |
| 10 | + |
| 11 | +Commit0 is an interactive environment that makes it easy to design and test new agents. You can: |
| 12 | + |
| 13 | +* Efficiently run tests in isolated environments |
| 14 | +* Distribute testing and development across cloud systems |
| 15 | +* Track and log all changes made along the way |
| 16 | + |
| 17 | +To install Commit0, run: |
| 18 | + |
| 19 | +``` |
| 20 | +pip install commit0 |
| 21 | +``` |
| 22 | + |
| 23 | +Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands: |
| 24 | + |
| 25 | +### Setup |
| 26 | + |
| 27 | +Use `commit0 setup [OPTIONS] REPO_SPLIT` to clone a repository split. |
| 28 | +Available options include: |
| 29 | + |
| 30 | +| Argument | Type | Description | Default | |
| 31 | +|----------|------|-------------|---------| |
| 32 | +| `repo_split` | str | Split of repositories to clone | | |
| 33 | +| `--dataset-name` | str | Name of the Hugging Face dataset | `wentingzhao/commit0_combined` |
| 34 | +| `--dataset-split` | str | Split of the Hugging Face dataset | `test` |
| 35 | +| `--base-dir` | str | Base directory to clone repos to | `repos/` |
| 36 | +| `--commit0-dot-file-path` | str | Path where stateful commit0 configs are stored | `.commit0.yaml` |
| 37 | + |
| 38 | +### Build |
| 39 | + |
| 40 | +Use `commit0 build [OPTIONS]` to build the Commit0 split chosen in the Setup stage. |
| 41 | +Available options include: |
| 42 | + |
| 43 | +| Argument | Type | Description | Default | |
| 44 | +|----------|------|-------------|---------| |
| 45 | +| `--num-workers` | int | Number of workers | `8` | |
| 46 | +| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` | |
| 47 | +| `--verbose` | int | Verbosity level (1 or 2) | `1` | |
| 48 | + |
| 49 | +### Get Tests |
| 50 | + |
| 51 | +Use `commit0 get-tests REPO_NAME` to get tests for a Commit0 repository. |
| 52 | + |
| 53 | +| Argument | Type | Description | Default | |
| 54 | +|----------|------|-------------|---------| |
| 55 | +| `repo_name` | str | Name of the repository to get tests for | | |
| 56 | + |
| 57 | +### Test |
| 58 | + |
| 59 | +Use `commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS]` to run tests on a Commit0 repository. |
| 60 | +Available options include: |
| 61 | + |
| 62 | +| Argument | Type | Description | Default | |
| 63 | +|----------|------|-------------|---------| |
| 64 | +| `repo_or_repo_path` | str | Name of the repository to test, or path to its directory | | |
| 65 | +| `test_ids` | str | Test IDs to run | | |
| 66 | +| `--branch` | str | Branch to test | | |
| 67 | +| `--backend` | str | Backend to use for testing | `modal` | |
| 68 | +| `--timeout` | int | Timeout for tests in seconds | `1800` | |
| 69 | +| `--num-cpus` | int | Number of CPUs to use | `1` | |
| 70 | +| `--reference` | bool | Test the reference commit | `False` | |
| 71 | +| `--coverage` | bool | Get coverage information | `False` | |
| 72 | +| `--rebuild` | bool | Rebuild an image | `False` | |
| 73 | +| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` | |
| 74 | +| `--verbose` | int | Verbosity level (1 or 2) | `1` | |
| 75 | +| `--stdin` | bool | Read test names from stdin | `False` | |
| 76 | + |
| 77 | +### Evaluate |
| 78 | + |
| 79 | +Use `commit0 evaluate [OPTIONS]` to evaluate the Commit0 split chosen in the Setup stage. |
| 80 | +Available options include: |
| 81 | + |
| 82 | +| Argument | Type | Description | Default | |
| 83 | +|----------|------|-------------|---------| |
| 84 | +| `--branch` | str | Branch to evaluate | | |
| 85 | +| `--backend` | str | Backend to use for evaluation | `modal` | |
| 86 | +| `--timeout` | int | Timeout for evaluation in seconds | `1800` | |
| 87 | +| `--num-cpus` | int | Number of CPUs to use | `1` | |
| 88 | +| `--num-workers` | int | Number of workers to use | `8` | |
| 89 | +| `--reference` | bool | Evaluate the reference commit | `False` | |
| 90 | +| `--coverage` | bool | Get coverage information | `False` | |
| 91 | +| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` | |
| 92 | +| `--rebuild` | bool | Rebuild images | `False` | |
| 93 | + |
| 94 | +### Lint |
| 95 | + |
| 96 | +Use `commit0 lint [OPTIONS] REPO_OR_REPO_DIR` to lint files in a repository. |
| 97 | +Available options include: |
| 98 | + |
| 99 | +| Argument | Type | Description | Default | |
| 100 | +|----------|------|-------------|---------| |
| 101 | +| `repo_or_repo_dir` | str | Name of the repository to lint, or path to its directory | | |
| 102 | +| `--files` | List[Path] | Files to lint (optional) | | |
| 103 | +| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` | |
| 104 | +| `--verbose` | int | Verbosity level (1 or 2) | `1` | |
| 105 | + |
| 106 | +### Save |
| 107 | + |
| 108 | +Use `commit0 save [OPTIONS] OWNER BRANCH` to save the Commit0 split to GitHub. |
| 109 | +Available options include: |
| 110 | + |
| 111 | +| Argument | Type | Description | Default | |
| 112 | +|----------|------|-------------|---------| |
| 113 | +| `owner` | str | Owner of the repository | | |
| 114 | +| `branch` | str | Branch to save | | |
| 115 | +| `--github-token` | str | GitHub token for authentication | | |
| 116 | +| `--commit0-dot-file-path` | str | Path to the commit0 dot file | `.commit0.yaml` | |
| 117 | + |
| 118 | +## Agent |
| 119 | + |
| 120 | +### Config |
| 121 | + |
| 122 | +Use `agent config [OPTIONS] AGENT_NAME` to set up the configuration for an agent. |
| 123 | +Available options include: |
| 124 | + |
| 125 | +| Argument | Type | Description | Default | |
| 126 | +|----------|------|-------------|---------| |
| 127 | +| `agent_name` | str | Agent to use. Only [aider](https://aider.chat/) is supported for now. | `aider` |
| 128 | +| `--model-name` | str | LLM to use. See the [aider documentation](https://aider.chat/docs/llms.html) for all supported models. | `claude-3-5-sonnet-20240620` |
| 129 | +| `--use-user-prompt` | bool | Use a custom prompt instead of the default prompt. | `False` |
| 130 | +| `--user-prompt` | str | The prompt sent to the agent. | See code for details. |
| 131 | +| `--run-tests` | bool | Run tests after code modifications for feedback. Set up `docker` or `modal` before running tests; see the commit0 docs. | `False` |
| 132 | +| `--max-iteration` | int | Maximum number of agent iterations. | `3` | |
| 133 | +| `--use-repo-info` | bool | Include the repository information. | `False` | |
| 134 | +| `--max-repo-info-length` | int | Maximum length of the repository information to use. | `10000` | |
| 135 | +| `--use-unit-tests-info` | bool | Include the unit tests information. | `False` | |
| 136 | +| `--max-unit-tests-info-length` | int | Maximum length of the unit tests information to use. | `10000` | |
| 137 | +| `--use-spec-info` | bool | Include the spec information. | `False` | |
| 138 | +| `--max-spec-info-length` | int | Maximum length of the spec information to use. | `10000` | |
| 139 | +| `--use-lint-info` | bool | Include the lint information. | `False` | |
| 140 | +| `--max-lint-info-length` | int | Maximum length of the lint information to use. | `10000` | |
| 141 | +| `--pre-commit-config-path` | str | Path to the pre-commit config file. This is needed for running `lint`. | `.pre-commit-config.yaml` | |
| 142 | +| `--agent-config-file` | str | Path to write the agent config. | `.agent.yaml` | |
| 143 | + |
| 144 | +### Running |
| 145 | + |
| 146 | +Use `agent run [OPTIONS] BRANCH` to execute an agent on a specific branch. |
| 147 | +Available options include: |
| 148 | + |
| 149 | +| Argument | Type | Description | Default | |
| 150 | +|----------|------|-------------|---------| |
| 151 | +| `branch` | str | Branch to run the agent on; you can specify the branch name. | |
| 152 | +| `--backend` | str | Test backend to run the agent on. Ignore this option if the agent is not configured with `--run-tests`. | `modal` |
| 153 | +| `--log-dir` | str | Log directory to store the logs. | `logs/aider` |
| 154 | +| `--max-parallel-repos` | int | Maximum number of repositories for the agent to run in parallel. Runs sequentially if set to 1. | `1` |
| 155 | +| `--display-repo-progress-num` | int | Number of repositories whose progress is displayed while running. | `5` |
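
The commands documented above can be chained into one end-to-end run. The following is a minimal sketch, not an official workflow: the split name `lite`, the repository name `simpy`, the branch name `my-agent`, and the `run` dry-run helper are all assumptions made for illustration, and only flags listed in the tables above are used. By default the script only prints each command; set `DRY_RUN=0` to actually execute them.

```shell
set -eu

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

SPLIT="lite"        # assumed split name
REPO="simpy"        # assumed repository inside that split
BRANCH="my-agent"   # hypothetical branch name

run commit0 setup "$SPLIT"                       # clone the split
run commit0 build --num-workers 8                # build the split
run commit0 get-tests "$REPO"                    # get the tests for one repository
run agent config --max-iteration 3 aider         # write the agent config
run agent run --max-parallel-repos 2 "$BRANCH"   # run the agent on the branch
run commit0 evaluate --branch "$BRANCH" --backend modal
```

Keeping the dry-run default makes it possible to review the exact command lines before any repositories are cloned or any backend is invoked.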