Commit 5a2d5e8

author
tkucar
committed
Merge remote-tracking branch 'origin/develop' into tom-cg-10450-enable-unauthenticated-codemod-test-runs
2 parents c109e83 + 0a79e04 commit 5a2d5e8

185 files changed

+2406
-1419
lines changed

.circleci/config.yml

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -61,7 +61,7 @@ commands:
6161
if [ "<<parameters.extra_repos>>" = "true" ]; then
6262
EXTRA_REPOS_ARG="--extra-repos"
6363
fi
64-
uv run --frozen gs codemod clone-repos ${EXTRA_REPOS_ARG} --token ${CODEGEN_BOT_GHE_TOKEN} --clean-cache
64+
uv run --frozen python -m tests.shared.codemod.commands clone-repos ${EXTRA_REPOS_ARG} --token ${CODEGEN_BOT_GHE_TOKEN} --clean-cache
6565
- save_cache:
6666
paths:
6767
- $GITHUB_WORKSPACE
@@ -70,15 +70,15 @@ commands:
7070
steps:
7171
- run:
7272
command: |
73-
uv run --frozen gs codemod fetch-verified-codemods --cli-api-key ${PROD_CLI_API_KEY}
73+
uv run --frozen python -m tests.shared.codemod.commands fetch-verified-codemods --cli-api-key ${PROD_CLI_API_KEY}
7474
cache-verified-codemod-repos:
7575
steps:
7676
- restore_cache:
7777
keys:
7878
- v1-verified-codemod-repos-{{ checksum "tests/integration/verified_codemods/codemod_data/repo_commits.json" }}-{{.Environment.CIRCLE_NODE_INDEX}}-{{.Environment.CIRCLE_NODE_TOTAL}}
7979
- run:
8080
command: |
81-
uv run --frozen gs codemod clone-repos --verified-codemod-repos --token ${CODEGEN_BOT_GHE_TOKEN}
81+
uv run --frozen python -m tests.shared.codemod.commands clone-repos --verified-codemod-repos --token ${CODEGEN_BOT_GHE_TOKEN}
8282
- save_cache:
8383
paths:
8484
- $GITHUB_WORKSPACE

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -64,4 +64,4 @@ graph-sitter-types/out/**
6464
graph-sitter-types/typings/**
6565
coverage.json
6666
tests/integration/verified_codemods/codemod_data/repo_commits.json
67-
67+
.codegen/*

.pre-commit-config.yaml

Lines changed: 3 additions & 3 deletions
@@ -19,7 +19,7 @@ repos:
1919
hooks:
2020
- id: biome-check
2121
additional_dependencies: ["@biomejs/[email protected]"]
22-
exclude: (src/codemods/eval)|(tests/unit/codegen/sdk/skills)|(tests/unit/codegen/sdk/output)|(tests/integration/verified_codemods)|(docs/samples)
22+
exclude: (src/codemods/eval)|(tests/unit/skills/snapshots)|(tests/unit/codegen/sdk/output)|(tests/integration/verified_codemods)|(docs/samples)
2323

2424
- repo: https://github.com/MarcoGorelli/cython-lint
2525
rev: v0.16.6
@@ -56,15 +56,15 @@ repos:
5656
- id: sync-pre-commit-deps
5757

5858
- repo: https://github.com/codespell-project/codespell
59-
rev: v2.2.4
59+
rev: v2.4.0
6060
hooks:
6161
- id: codespell
6262
additional_dependencies:
6363
- tomli
6464
files: "docs/.*/.*.mdx"
6565

6666
- repo: https://github.com/fpgmaas/deptry.git
67-
rev: "0.22.0"
67+
rev: "0.23.0"
6868
hooks:
6969
- id: deptry
7070
pass_filenames: false

CONTRIBUTING.md

Lines changed: 3 additions & 3 deletions
@@ -1,10 +1,10 @@
1-
# Contributing to Graph Sitter
1+
# Contributing to Codegen
22

3-
Thank you for your interest in contributing to Graph Sitter! This document outlines the process and guidelines for contributing.
3+
Thank you for your interest in contributing to Codegen! This document outlines the process and guidelines for contributing.
44

55
## Contributor License Agreement
66

7-
By contributing to Graph Sitter, you agree that:
7+
By contributing to Codegen, you agree that:
88

99
1. Your contributions will be licensed under the project's license.
1010
2. You have the right to license your contribution under the project's license.

Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
3434
uv run pre-commit install-hooks
3535
FROM base-image AS extra-repos
3636
ARG CODEGEN_BOT_GHE_TOKEN=""
37-
RUN uv run gs codemod clone-repos --clean-cache --extra-repos --token ${CODEGEN_BOT_GHE_TOKEN}
37+
RUN uv run python -m tests.shared.codemod.commands clone-repos --clean-cache --extra-repos --token ${CODEGEN_BOT_GHE_TOKEN}
3838
FROM base-image AS oss-repos
3939
ARG CODEGEN_BOT_GHE_TOKEN=""
40-
RUN uv run gs codemod clone-repos --clean-cache --token ${CODEGEN_BOT_GHE_TOKEN}
40+
RUN uv run python -m tests.shared.codemod.commands clone-repos --clean-cache --token ${CODEGEN_BOT_GHE_TOKEN}

README.md

Lines changed: 30 additions & 2 deletions
@@ -1,10 +1,11 @@
11
# Codegen
22

33
[![Documentation](https://img.shields.io/badge/docs-docs.codegen.com-blue)](https://docs.codegen.com)
4+
[![Slack Community](https://img.shields.io/badge/slack-community-4A154B?logo=slack)](https://community.codegen.com)
5+
[![Twitter Follow](https://img.shields.io/twitter/follow/codegen)](https://twitter.com/codegen)
46

57
[Codegen](https://docs.codegen.com) is a python library for manipulating codebases.
68

7-
Write code that transforms code. Codegen combines the parsing power of [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) with the graph algorithms of [rustworkx](https://github.com/Qiskit/rustworkx) to enable scriptable, multi-language code manipulation at scale.
89

910
```python
1011
from codegen import Codebase
@@ -21,16 +22,43 @@ for function in codebase.functions:
2122
function.move_to_file('deprecated.py')
2223
```
2324

24-
## Installation
25+
Write code that transforms code. Codegen combines the parsing power of [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) with the graph algorithms of [rustworkx](https://github.com/Qiskit/rustworkx) to enable scriptable, multi-language code manipulation at scale.
26+
27+
## Installation and Usage
2528
**This library requires Python 3.12 – 3.13.**
2629
```
30+
# Install inside existing project
2731
uv pip install codegen
32+
33+
# Install global CLI
34+
uv tool install codegen
35+
36+
# Create a codemod for a given repo
37+
cd path/to/repo
38+
codegen init
39+
codegen create test-function
40+
41+
# Run said codemod
42+
codegen run test-function
43+
44+
# Create an isolated venv with codegen => open jupyter
45+
codegen notebook
46+
```
47+
48+
## Usage
49+
50+
See the [getting started guide](https://docs.codegen.com/introduction/getting-started) for a full tutorial.
51+
52+
```
53+
from codegen import Codebase
54+
2855
```
2956

3057
## Resources
3158

3259
- [Docs](https://docs.codegen.com)
3360
- [Get Started](https://docs.codegen.com/introduction/getting-started)
61+
- [Contributing](CONTRIBUTING.md)
3462

3563

3664
## Why Codegen?

docs/blog/act-via-code.mdx

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
1+
---
2+
title: "Act via Code"
3+
icon: "code"
4+
iconType: "solid"
5+
description: "The path to fully-automated software engineering"
6+
---
7+
8+
<Frame caption="Voyager (2023) solved agentic tasks with code execution">
9+
<img src="/images/mine-amethyst.png" />
10+
</Frame>
11+
12+
13+
Two and a half years since the launch of the GPT-3 API, code assistants have emerged as potentially the premier use case of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising: code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. Developers actively working with tools like Cursor (myself included) have an exhilarating yet uncertain sense that the field of software engineering is approaching an inflection point.
14+
15+
Yet there's a striking gap between understanding and action for today's code assistants. When provided proper context, frontier LLMs can analyze massive enterprise codebases and propose practical paths towards sophisticated, large-scale improvements. But implementing changes that impact more than a small set of files with modern AI assistants is fundamentally infeasible. The good news is that for focused, file-level changes, we've found real success: AI-powered IDEs ([Windsurf](https://codeium.com/windsurf), [Cursor](https://www.cursor.com/)) are transforming how developers write and review code, while chat-based assistants are revolutionizing how we bootstrap and prototype new applications (via tools like [v0](https://v0.dev/), [lovable.dev](https://lovable.dev/), and [bolt.new](https://bolt.new/)).
16+
17+
However, there's a whole class of critical engineering tasks that remain out of reach - tasks that are fundamentally programmatic and deal with codebase structure at scale. A significant amount of effort on modern engineering teams is directed towards eliminating tech debt, managing large-scale migrations, analyzing dependency graphs, enforcing type coverage across the codebase, and similar tasks that require a global view of a codebase. Today's AI assistants can fully understand these challenges and even propose solutions, but they lack the mechanisms to actually implement them. The intelligence is there, but it's trapped in your IDE's text completion window.
18+
19+
20+
The bottleneck isn't intelligence — it's tooling. The solution requires letting AI systems programmatically interact with codebases and software systems through code execution environments. Code execution environments represent the most expressive tool we could offer an agent—enabling composition, abstraction, and systematic manipulation of complex systems. By combining code execution environments with custom APIs that correspond to powerful large-scale operations, we can unlock a new set of tasks in which agents can be significant contributors. When paired with ever-improving foundation models, this will lead to a step function improvement for code assistants, enabling their application in an entirely new set of valuable tasks.
21+
22+
## Beating Minecraft with Code Execution
23+
24+
In mid-2023, a research project called [Voyager](https://voyager.minedojo.org) made waves: it effectively solved Minecraft, performing several multiples better than the prior SOTA. This was a massive breakthrough as previous reinforcement learning systems had struggled for years with even basic Minecraft tasks.
25+
26+
While the AI community was focused on scaling intelligence, Voyager demonstrated something more fundamental: the right tools can unlock entirely new tiers of capability. The same GPT-4 model that struggled with Minecraft using standard agentic frameworks (like [ReAct](https://klu.ai/glossary/react-agent-model)) achieved remarkable results when allowed to write and execute code. This wasn't about raw intelligence—it was about giving the agent a more expressive way to act.
27+
28+
<Frame>
29+
<img src="/images/voyager-performance.png" />
30+
</Frame>
31+
32+
The breakthrough came from a simple yet powerful insight: let the AI write code. Instead of limiting the agent to primitive "tools," Voyager allowed GPT-4 to write and execute [JS programs](https://github.com/MineDojo/Voyager/tree/main/skill_library/trial2/skill/code) that controlled Minecraft actions through a clean API.
33+
34+
```javascript
35+
// Example "action program" from Voyager, 2023
36+
// written by gpt-4
37+
async function chopSpruceLogs(bot) {
38+
const spruceLogCount = bot.inventory.count(mcData.itemsByName.spruce_log.id);
39+
const logsToMine = 3 - spruceLogCount;
40+
if (logsToMine > 0) {
41+
bot.chat("Chopping down spruce logs...");
42+
await mineBlock(bot, "spruce_log", logsToMine);
43+
bot.chat("Chopped down 3 spruce logs.");
44+
} else {
45+
bot.chat("Already have 3 spruce logs in inventory.");
46+
}
47+
}
48+
```
49+
50+
This approach transformed the agent's capabilities. Rather than being constrained to atomic actions like `equipItem(...)` (this would be typical of "traditional" agent algorithms, such as ReAct), it could create higher-level operations like [craftShieldWithFurnace()](https://github.com/MineDojo/Voyager/blob/main/skill_library/trial2/skill/code/craftShieldWithFurnace.js) through composing the atomic APIs. Furthermore, Wang et al. implemented a memory mechanism, in which these successful "action programs" could later be recalled, copied, and built upon, effectively enabling the agent to accumulate experience.
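
The memory mechanism described above can be sketched in a few lines. This is a minimal illustration, not Voyager's implementation: the `SkillLibrary` class and its `add` and `retrieve` methods are hypothetical names, and the keyword-overlap ranking is a simple stand-in for whatever retrieval method a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class SkillLibrary:
    """Hypothetical store of successful action programs, keyed by description."""

    skills: dict[str, str] = field(default_factory=dict)  # description -> source

    def add(self, description: str, source: str) -> None:
        # Save a program that worked so it can be reused later.
        self.skills[description] = source

    def retrieve(self, task: str, k: int = 1) -> list[str]:
        # Rank stored skills by how many words they share with the task
        # (an assumed, deliberately naive similarity measure).
        task_words = set(task.lower().split())
        ranked = sorted(
            self.skills.items(),
            key=lambda item: len(task_words & set(item[0].lower().split())),
            reverse=True,
        )
        return [source for _, source in ranked[:k]]


library = SkillLibrary()
library.add("chop spruce logs", "async function chopSpruceLogs(bot) { /* ... */ }")
library.add("craft shield with furnace", "async function craftShieldWithFurnace(bot) { /* ... */ }")

# Recall the most relevant earlier program for a new task.
recalled = library.retrieve("chop three spruce logs")
```

Recalled programs can then be placed in the agent's prompt, where it can copy them or build on them for the new task.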
51+
52+
<Frame>
53+
<img src="/images/voyager-retrieval.png" />
54+
</Frame>
55+
56+
As the Voyager authors noted:
57+
58+
<Tip>*"We opt to use code as the action space instead of low-level motor commands because programs can naturally represent temporally extended and compositional actions, which are essential for many long-horizon tasks in Minecraft."*</Tip>
59+
60+
## Code is an Ideal Action Space
61+
62+
What these authors demonstrated is a fundamental insight that extends far beyond gaming. Letting AI act through code rather than atomic commands will lead to a step change in the capabilities of AI systems. Nowhere is this more apparent than in software engineering, where agents already understand complex transformations but lack the tools to execute them effectively.
63+
64+
Today's productionized code assistants operate through an interface where they can directly read/write to text files and perform other bespoke activities, like searching through file embeddings or running terminal commands.
65+
66+
In the act via code paradigm, all of these actions are expressed through writing and executing code, like the below:
67+
68+
```python
69+
# Implement `grep` via for loops and if statements
70+
for function in codebase.functions:
71+
if 'Page' in function.name:
72+
73+
# Implement systematic actions, like moving things around, through an API
74+
function.move_to_file('/pages/' + function.name + '.tsx')
75+
```
76+
77+
Provided a sufficiently comprehensive set of APIs, this paradigm has many clear advantages:
78+
79+
- **API-Driven Extensibility**: Any operation that can be expressed through an API becomes accessible to the agent. This means the scope of tasks an agent can handle grows with our ability to create clean APIs for complex operations.
80+
81+
- **Programmatic Efficiency**: Many agent tasks involve systematic operations across large codebases. Expressing these as programs rather than individual commands dramatically reduces computational overhead and allows for batch operations.
82+
83+
- **Composability**: Agents can build their own tools by combining simpler operations. This aligns perfectly with LLMs' demonstrated ability to compose and interpolate between examples to create novel solutions.
84+
85+
- **Constrained Action Space**: Well-designed APIs act as guardrails, making invalid operations impossible to express. The type system becomes a powerful tool for preventing entire classes of errors before they happen.
86+
87+
- **Objective Feedback**: Code execution provides immediate, unambiguous feedback through stack traces and error messages—not just confidence scores. This concrete error signal is invaluable for learning.
88+
89+
- **Natural Collaboration**: Programs are a shared language between humans and agents. Code explicitly encodes reasoning in a reviewable format, making actions transparent, debuggable, and easily re-runnable.
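
The objective-feedback point above can be made concrete with a small sketch. It assumes a hypothetical `run_action_program` helper and a toy `rename` function standing in for a real API; neither is part of Codegen. The idea is only that a failed program yields a stack trace the agent can read on its next attempt.

```python
import traceback


def run_action_program(source: str, api: dict) -> dict:
    """Run agent-written code against a provided API and report the outcome.

    Hypothetical helper for illustration; a real system would execute
    untrusted code inside a sandbox rather than calling exec() directly.
    """
    namespace = dict(api)  # copy so the program cannot mutate the caller's dict
    try:
        exec(source, namespace)
    except Exception:
        # The full traceback is the feedback signal handed back to the agent.
        return {"ok": False, "feedback": traceback.format_exc()}
    return {"ok": True, "feedback": ""}


def rename(name: str) -> str:
    # Toy stand-in for an API operation.
    return name + "Page"


good = run_action_program("result = rename('Home')", {"rename": rename})
bad = run_action_program("result = renmae('Home')", {"rename": rename})  # typo on purpose
```

Here `good["ok"]` is true, while `bad["feedback"]` carries a `NameError` traceback naming the misspelled call, which is more actionable than a confidence score.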
90+
91+
## Code Manipulation Programs
92+
93+
For software engineering, we believe the path forward is clear: agents need a framework that matches how developers think about and manipulate code. While decades of static analysis work gives us a strong foundation, traditional code modification frameworks weren't designed with AI-human collaboration in mind - they expose low-level APIs that don't match how developers (or AI systems) think about code changes.
94+
95+
We're building a framework with high-level APIs that correspond to how engineers actually think about code modifications. The APIs are clean and intuitive, following clear [principles](/introduction/guiding-principles) that eliminate sharp edges and handle edge cases automatically. Most importantly, the framework encodes rich structural understanding of code. Consider this example:
96+
97+
```python
98+
# Access to high-level semantic operations
99+
for component in codebase.jsx_components:
100+
# Rich structural analysis built-in
101+
if len(component.usages) == 0:
102+
# Systematic operations across the codebase
103+
component.rename(component.name + 'Page')
104+
```
105+
106+
This isn't just string manipulation - the framework understands React component relationships, tracks usage patterns, and can perform complex refactors while maintaining correctness. By keeping the codebase representation in memory, we can provide lightning-fast operations for both analysis and systematic edits.
107+
108+
The documentation for such a framework isn't just API reference - it's education for advanced intelligence about how to successfully manipulate code at scale. We're building for a future where AI systems are significant contributors to codebases, and they need to understand not just the "how" but the "why" behind code manipulation patterns.
109+
110+
Crucially, we believe these APIs will extend beyond the codebase itself into the broader software engineering ecosystem. When agents can seamlessly interact with tools like Datadog, AWS, and other development platforms through the same clean interfaces, we'll take a major step toward [autonomous software engineering](/introduction/about#our-mission). The highest leverage move isn't just giving agents the ability to modify code - it's giving them programmatic access to the entire software development lifecycle.
111+
112+
## Codegen is now OSS
113+
114+
We're excited to release [Codegen](https://github.com/codegen-sh/codegen-sdk) as open source under [Apache 2.0](https://github.com/codegen-sh/codegen-sdk?tab=Apache-2.0-1-ov-file) and build out this vision with the broader developer community. [Get started with Codegen](/introduction/getting-started) today, or join us in our [Slack community](https://community.codegen.com) if you have feedback or questions about a use case!
115+
116+
Jay Hack, Founder
