Commit 4088aab

jayhack and codegen-bot authored
docs: act-via-code (#105)
Co-authored-by: codegen-bot <[email protected]>
1 parent a4c468e commit 4088aab

24 files changed, with 545 additions and 104 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -64,4 +64,4 @@ graph-sitter-types/out/**
6464
graph-sitter-types/typings/**
6565
coverage.json
6666
tests/integration/verified_codemods/codemod_data/repo_commits.json
67-
67+
.codegen/*

docs/blog/act-via-code.mdx

Lines changed: 17 additions & 33 deletions
@@ -5,32 +5,32 @@ iconType: "solid"
55
description: "The path to advanced code manipulation agents"
66
---
77

8-
<Frame caption="Voyager (Jim Fan)">
8+
<Frame caption="Voyager (2023) solved agentic tasks with code execution">
99
<img src="/images/nether-portal.png" />
1010
</Frame>
1111

1212

13-
# Act via Code
13+
Two and a half years since the launch of the GPT-3 API, code assistants have emerged as potentially the premier use case of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising — code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. Experienced developers working with tools like Cursor (myself included) can tell that the field of software engineering is about to go through rapid change.
1414

15-
Two and a half years since the launch of the GPT-3 API, code assistants have emerged as the most powerful and practically useful applications of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising — code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. As model capabilities continue to scale, we're seeing compounding improvements in code understanding and generation.
16-
17-
Yet there's a striking gap between what AI agents can understand and what they can actually do. While they can reason about complex architectural changes, debug intricate issues, and propose sophisticated refactors, they often can't execute these ideas. The ceiling isn't intelligence or context—it's the ability to manipulate code at scale. Large-scale modifications remain unreliable or impossible, not because agents don't understand what to do, but because they lack the right interfaces to do it.
15+
Yet there's a striking gap between understanding and action. Today's AI agents can analyze enterprise codebases and propose sophisticated improvements—eliminating tech debt, untangling dependencies, improving modularity. But ask them to actually implement these changes across millions of lines of code, and they hit a wall. Their ceiling isn't intelligence—it's the ability to safely and reliably execute large-scale modifications on real, enterprise codebases.
1816

1917
The bottleneck isn't intelligence — it's tooling. By giving AI models the ability to write and execute code that modifies code, we're about to unlock an entire class of tasks that agents already understand but can't yet perform. Code execution environments represent the most expressive tool we could offer an agent—enabling composition, abstraction, and systematic manipulation of complex systems. When paired with ever-improving language models, this will unlock another step function improvement in AI capabilities.
2018

2119
## Beating Minecraft with Code Execution
2220

23-
In mid-2023, a research project called [Voyager](https://voyager.minedojo.org) made waves: it effectively solved Minecraft, performing several multiples better than the prior SOTA on many important dimensions. This was a massive breakthrough previous reinforcement learning systems had struggled for years with even basic Minecraft tasks.
21+
In mid-2023, a research project called [Voyager](https://voyager.minedojo.org) made waves: it effectively solved Minecraft, performing several multiples better than the prior SOTA. This was a massive breakthrough as previous reinforcement learning systems had struggled for years with even basic Minecraft tasks.
2422

25-
While the AI community was focused on scaling intelligence, Voyager demonstrated something more fundamental: the right tools can unlock entirely new tiers of capability. The same GPT-4 model that struggled with Minecraft using traditional frameworks achieved remarkable results when allowed to write and execute code. This wasn't about raw intelligence—it was about giving the agent a more expressive way to act.
23+
While the AI community was focused on scaling intelligence, Voyager demonstrated something more fundamental: the right tools can unlock entirely new tiers of capability. The same GPT-4 model that struggled with Minecraft using standard agentic frameworks (like [ReAct](https://klu.ai/glossary/react-agent-model)) achieved remarkable results when allowed to write and execute code. This wasn't about raw intelligence—it was about giving the agent a more expressive way to act.
2624

2725
<Frame>
2826
<img src="/images/voyager-performance.png" />
2927
</Frame>
3028

31-
The breakthrough came from a simple yet powerful insight: let the AI write code. Instead of limiting the agent to primitive "tools," Voyager allowed GPT-4 to write and execute [JS programs](https://github.com/MineDojo/Voyager/tree/main/skill_library/trial2/skill/code) that controlled Minecraft actions through a clean API:
29+
The breakthrough came from a simple yet powerful insight: let the AI write code. Instead of limiting the agent to primitive "tools," Voyager allowed GPT-4 to write and execute [JS programs](https://github.com/MineDojo/Voyager/tree/main/skill_library/trial2/skill/code) that controlled Minecraft actions through a clean API.
3230

3331
```javascript
32+
// Example "action program" from Voyager, 2023
33+
// written by gpt-4
3434
async function chopSpruceLogs(bot) {
3535
const spruceLogCount = bot.inventory.count(mcData.itemsByName.spruce_log.id);
3636
const logsToMine = 3 - spruceLogCount;
@@ -44,7 +44,7 @@ async function chopSpruceLogs(bot) {
4444
}
4545
```
4646

47-
This approach transformed the agent's capabilities. Rather than being constrained to atomic actions like `equipItem(...)`, it could create higher-level operations like [`craftShieldWithFurnace()`](https://github.com/MineDojo/Voyager/blob/main/skill_library/trial2/skill/code/craftShieldWithFurnace.js) through composing JS APIs. The system also implemented a memory mechanism, storing successful programs for reuse in similar situations—effectively building its own library of proven solutions it could later refer to and adapt to similar circumstances.
47+
This approach transformed the agent's capabilities. Rather than being constrained to atomic actions like `equipItem(...)`, it could create higher-level operations like [`craftShieldWithFurnace()`](https://github.com/MineDojo/Voyager/blob/main/skill_library/trial2/skill/code/craftShieldWithFurnace.js) through composing JS APIs. Furthermore, Wang et al. implemented a memory mechanism, in which successful "action programs" could later be recalled, copied, and built upon, effectively enabling the agent to accumulate experience.
4848

4949
<Frame>
5050
<img src="/images/voyager-retrieval.png" />
@@ -56,23 +56,21 @@ As the Voyager authors noted:
5656

5757
## Code is an Ideal Action Space
5858

59-
The implications of code as an action space extend far beyond gaming. Code provides a uniquely powerful interface between AI and real-world systems. When an agent writes code, it gains several critical advantages over traditional atomic tools.
59+
The implications of code as an action space extend far beyond gaming. This architectural insight — letting AI act through code rather than atomic commands — will lead to a step change in the capabilities of AI systems. Nowhere is this more apparent than in software engineering, where agents already understand complex transformations but lack the tools to execute them effectively.
60+
61+
When an agent writes code, it gains several critical advantages over traditional atomic tools:
6062

61-
### Code is Composable
62-
Code is the ultimate composable medium. Agents can build their own tools by combining simpler operations, wrapping any function as a building block for more complex behaviors. This aligns well with what is perhaps LLMs' premier capability: understanding and interpolating between examples to create new solutions.
63+
- **Composability**: Agents can build their own tools by combining simpler operations. This aligns perfectly with LLMs' demonstrated ability to compose and interpolate between examples to create novel solutions.
6364

64-
### Code Constrains the Action Space
65-
APIs can enforce guardrails that keep agents on track. By designing interfaces that make invalid operations impossible to express, we can prevent entire classes of errors before they happen. The type system becomes a powerful tool for shaping agent behavior.
65+
- **Constrained Action Space**: Well-designed APIs act as guardrails, making invalid operations impossible to express. The type system becomes a powerful tool for preventing entire classes of errors before they happen.
6666

67-
### Code Provides Objective Feedback
68-
Code execution gives immediate, unambiguous feedback. When something goes wrong, you get stack traces and error messages—not just a confidence score. This concrete error signal is invaluable for agents learning to navigate complex systems.
67+
- **Objective Feedback**: Code execution provides immediate, unambiguous feedback through stack traces and error messages—not just confidence scores. This concrete error signal is invaluable for learning.
6968

70-
### Code is a Natural Medium for Collaboration
71-
Programs are a shared language between humans and agents. Code explicitly encodes reasoning in a reviewable format, making agent actions transparent and debuggable. There's no magic—just deterministic execution that can be understood, modified, and improved by both humans and AI.
69+
- **Natural Collaboration**: Programs are a shared language between humans and agents. Code explicitly encodes reasoning in a reviewable format, making actions transparent, debuggable, and easily re-runnable.
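
The composability, constrained-action-space, and objective-feedback points above can be sketched in a few lines of Python. This is only an illustration: the `Inventory` class and the `chop`, `craft_planks`, `gather_planks`, and `run_action` helpers are hypothetical names invented here, not part of Voyager or Codegen.

```python
import traceback
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Hypothetical world state that the agent acts on."""
    items: dict = field(default_factory=dict)

    def add(self, name: str, count: int = 1) -> None:
        self.items[name] = self.items.get(name, 0) + count

    def take(self, name: str, count: int = 1) -> None:
        # Constrained action space: an invalid operation fails loudly.
        have = self.items.get(name, 0)
        if have < count:
            raise ValueError(f"need {count} {name}, have {have}")
        self.items[name] = have - count


# Atomic actions (hypothetical primitives).
def chop(inv: Inventory) -> None:
    inv.add("log")


def craft_planks(inv: Inventory) -> None:
    inv.take("log")
    inv.add("plank", 4)


# Composability: a higher-level action built from the primitives.
def gather_planks(inv: Inventory, logs: int) -> None:
    for _ in range(logs):
        chop(inv)
        craft_planks(inv)


def run_action(action, *args) -> str:
    """Run an agent-written action and return textual feedback."""
    try:
        action(*args)
        return "ok"
    except Exception:
        # Objective feedback: the full traceback is returned to the agent.
        return traceback.format_exc()
```

Calling `run_action(craft_planks, Inventory())` returns a traceback ending in a `ValueError`, which is the kind of concrete error signal an agent could use to revise its program.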
7270

7371
## For Software Engineering
7472

75-
This brings us to software engineering, where we see a massive gap between AI's theoretical capabilities and practical achievements. Many code modification tasks are fundamentally programmatic—dependency analysis, refactors, control flow analysis—yet we lack the tools to express them properly.
73+
Software engineering tasks are inherently programmatic and graph-based — dependency analysis, refactors, control flow analysis, etc. Yet today's AI agents interface with code primarily through string manipulation, missing the rich structure that developers and their tools rely on. By giving agents APIs that operate on the codebase's underlying graph structure rather than raw text, we can unlock a new tier of capabilities. Imagine agents that can rapidly traverse dependency trees, analyze control flow, and perform complex refactors while maintaining perfect awareness of the codebase's structure.
7674

7775
Consider how a developer thinks about refactoring: it's rarely about direct text manipulation. Instead, we think in terms of high-level operations: "move this function," "rename this variable everywhere," "split this module." These operations can be encoded into a powerful Python API:
7876

@@ -84,17 +82,3 @@ for component in codebase.jsx_components:
8482
# powerful edit APIs that handle edge cases
8583
component.rename(component.name + 'Page')
8684
```
87-
88-
This isn't just another code manipulation library—it's a scriptable language server that builds on proven foundations like LSP and codemods, but designed specifically for programmatic analysis and refactoring.
89-
90-
## What does this look like?
91-
92-
At Codegen, we've built exactly this system. Our approach centers on four key principles:
93-
94-
The foundation must be Python, enabling easy composition with existing tools and workflows. Operations must be in-memory for performance, handling large-scale changes efficiently. The system must be open source, allowing developers and AI researchers to extend and enhance it. And perhaps most importantly, it must be thoroughly documented—not just for humans, but for the next generation of AI agents that will build upon it.
95-
96-
## What does this enable?
97-
98-
We've already used this approach to merge hundreds of thousands of lines of code in enterprise codebases. Our tools have automated complex tasks like feature flag deletion, test suite reorganization, import cycle elimination, and dead code removal. But more importantly, we've proven that code-as-action-space isn't just theoretical—it's a practical approach to scaling software engineering.
99-
100-
This is just the beginning. With Codegen, we're providing the foundation for the next generation of code manipulation tools—built for both human developers and AI agents. We believe this approach will fundamentally change how we think about and implement large-scale code changes, making previously impossible tasks not just possible, but routine.

docs/blog/codemod-frameworks.mdx

Lines changed: 5 additions & 0 deletions
@@ -5,6 +5,11 @@ icon: "code-compare"
55
iconType: "solid"
66
---
77

8+
# Others to add
9+
- [Abracadabra](https://github.com/nicoespeon/abracadabra)
10+
- [Rope](https://rope.readthedocs.io/en/latest/overview.html#rope-overview)
11+
- [Grit](https://github.com/getgrit/gritql)
12+
813
Code transformation tools have evolved significantly over the years, each offering unique approaches to programmatic code manipulation. Let's explore the strengths and limitations of major frameworks in this space.
914

1015
## Python's AST Module

docs/blog/posts.mdx

Lines changed: 3 additions & 26 deletions
@@ -4,39 +4,16 @@ icon: "clock"
44
iconType: "solid"
55
---
66

7-
<Update
8-
label="2024-01-24"
9-
description="Static Analysis in the Age of AI Coding Assistants"
10-
title="Static Analysis in the Age of AI Coding Assistants"
11-
>
12-
## Static Analysis in the Age of AI Coding Assistants
13-
14-
Why traditional language servers aren't enough for the future of AI-powered code manipulation
15-
16-
</Update>
17-
18-
<Update
19-
label="2024-01-24"
20-
description="A Deep Dive into Codemod Frameworks"
21-
title="Codemod Frameworks"
22-
href="/blog/codemod-frameworks"
23-
>
24-
## Codemod Frameworks
25-
26-
Comparing popular tools for programmatic code transformation
27-
28-
</Update>
29-
307
<Update label="2024-01-24" description="Acting via Code">
318

329
## Act via Code
3310

34-
Programs are the natural convergence of LLMs and traditional computation.
11+
Why code as an action space will lead to a step function improvement in agent capabilities.
3512

3613
<Card
37-
img="/images/voyager.png"
14+
img="/images/nether-portal.png"
3815
title="Act via Code"
39-
href="https://codegen.com"
16+
href="/blog/act-via-code"
4017
/>
4118

4219
</Update>

docs/building-with-codegen/at-a-glance.mdx

Lines changed: 3 additions & 3 deletions
@@ -60,11 +60,11 @@ Learn how to use Codegen's core APIs to analyze and transform code.
6060
Understand function call patterns and manipulate call sites.
6161
</Card>
6262
<Card
63-
title="Imports & Exports"
63+
title="Imports"
6464
icon="file-import"
65-
href="/building-with-codegen/imports-and-exports"
65+
href="/building-with-codegen/imports"
6666
>
67-
Work with module imports, exports, and manage dependencies.
67+
Work with module imports and manage dependencies.
6868
</Card>
6969
<Card
7070
title="Traversing the Call Graph"
Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
1+
---
2+
title: "The Export API"
3+
sidebarTitle: "Exports"
4+
icon: "file-export"
5+
iconType: "solid"
6+
---
7+
8+
The [Export](/api-reference/core/Export) API provides tools for managing exports and module boundaries in TypeScript codebases.
9+
10+
## Export Statements vs Exports
11+
12+
Similar to imports, Codegen provides two levels of abstraction for working with exports:
13+
14+
- [ExportStatement](/api-reference/core/ExportStatement) - Represents a complete export statement
15+
- [Export](/api-reference/core/Export) - Represents individual exported symbols
16+
17+
```typescript
// One ExportStatement containing multiple Export objects
export { foo, bar as default, type User };
// Creates:
// - Export for 'foo'
// - Export for 'bar' as default
// - Export for 'User' as a type

// Direct exports create one ExportStatement per export
export const value = 42;
export function process() {}
```
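
To make the statement-versus-export distinction concrete, here is a rough Python sketch that splits one named export statement into per-symbol entries. It is not how Codegen parses TypeScript: `ExportEntry` and `split_export_clause` are hypothetical names, and the string handling ignores edge cases a real parser must cover (for example, a symbol that is itself named `type`).

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExportEntry:
    """Hypothetical stand-in for one exported symbol."""
    name: str
    alias: Optional[str] = None
    is_type: bool = False


def split_export_clause(statement: str) -> List[ExportEntry]:
    """Split `export { a, b as c, type T };` into one entry per symbol."""
    match = re.search(r"\{(.*?)\}", statement, re.DOTALL)
    if match is None:
        return []
    entries = []
    for part in match.group(1).split(","):
        tokens = part.split()
        if not tokens:
            continue
        # A leading `type` keyword marks an inline type export.
        is_type = tokens[0] == "type" and len(tokens) > 1
        if is_type:
            tokens = tokens[1:]
        # `name as alias` carries an alias; otherwise there is none.
        alias = tokens[2] if len(tokens) == 3 and tokens[1] == "as" else None
        entries.append(ExportEntry(tokens[0], alias, is_type))
    return entries
```

Applied to the statement above, the sketch yields three entries: `foo`, `bar` with alias `default`, and `User` flagged as a type.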
29+
30+
You can access these through your file's collections:
31+
32+
```python
# Access all export statements
for stmt in file.export_statements:
    print(f"Statement: {stmt.source}")

    # Access individual exports in the statement
    for exp in stmt.exports:
        print(f" Export: {exp.name}")
```
41+
42+
<Note>
43+
ExportStatement inherits from [Statement](/building-with-codegen/statements-and-code-blocks), providing operations like `remove()` and `insert_before()`. This is particularly useful when you want to manipulate the entire export declaration.
44+
</Note>
45+
46+
## Export Types
47+
48+
Codegen supports several types of exports:
49+
50+
```typescript
// Direct exports
export const value = 42; // Value export
export function myFunction() {} // Function export
export class MyClass {} // Class export
export type MyType = string; // Type export
export interface MyInterface {} // Interface export
export enum MyEnum {} // Enum export

// Re-exports
export { foo, bar } from './other-file'; // Named re-exports
export type { Type } from './other-file'; // Type re-exports
export * from './other-file'; // Wildcard re-exports
export * as utils from './other-file'; // Namespace re-exports

// Aliased exports
export { foo as foop }; // Basic alias
export { foo as default }; // Default export alias
export { bar as baz } from './other-file'; // Re-export with alias
```
70+
71+
## Working with Exports
72+
73+
The Export API provides methods to identify and filter exports:
74+
75+
```python
# Check export types
for exp in file.exports:
    if exp.is_type_export():
        print(f"Type export: {exp.name}")
    elif exp.is_default_export():
        print(f"Default export: {exp.name}")
    elif exp.is_wildcard_export():
        print(f"Wildcard export from: {exp.from_file.filepath}")

# Work with re-exports
for exp in file.exports:
    if exp.is_reexport():
        if exp.is_external_export:
            print(f"External re-export: {exp.name} from {exp.from_file.filepath}")
        else:
            print(f"Internal re-export: {exp.name}")
```
93+
94+
## Export Resolution
95+
96+
You can trace exports to their original symbols:
97+
98+
```python
for exp in file.exports:
    if exp.is_reexport():
        # Get original and current symbols
        current = exp.exported_symbol
        original = exp.resolved_symbol

        print(f"Re-exporting {original.name} from {exp.from_file.filepath}")
        print(f"Through: {' -> '.join(e.file.filepath for e in exp.export_chain)}")
```
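
As an illustration of what tracing a re-export chain involves, the following self-contained sketch follows re-exports through a plain dictionary. It is not Codegen's implementation: the `REEXPORTS` table, the file names, and the `resolve_export` helper are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical table: (file, exported name) -> (source file, source name).
# An entry means "this file re-exports the name from the source file".
REEXPORTS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("index.ts", "Button"): ("components/index.ts", "Button"),
    ("components/index.ts", "Button"): ("components/button.ts", "Button"),
}


def resolve_export(file: str, name: str) -> List[Tuple[str, str]]:
    """Follow re-exports until reaching a file that is not a re-exporter.

    Returns the chain of (file, name) hops, starting at the given export.
    """
    chain = [(file, name)]
    seen = {(file, name)}
    while chain[-1] in REEXPORTS:
        nxt = REEXPORTS[chain[-1]]
        if nxt in seen:
            # Guard against circular re-exports instead of looping forever.
            raise ValueError(f"circular re-export at {nxt}")
        seen.add(nxt)
        chain.append(nxt)
    return chain


chain = resolve_export("index.ts", "Button")
print(" -> ".join(path for path, _ in chain))
```

The last element of the chain plays the role of the resolved symbol, and the intermediate hops correspond to the export chain.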
108+
109+
## Common Operations
110+
111+
Here are common operations for working with exports:
112+
113+
```python
# Add new export
file.add_export("MyComponent")

# Add export with alias
file.add_export("MyComponent", alias="default")

# Convert to type export
export = file.get_export("MyType")
export.make_type_export()

# Remove export
export.remove()  # Removes export but keeps symbol

# Update export properties
export.update(
    name="NewName",
    is_type=True,
    is_default=False
)
```
134+
135+
## Managing Re-exports
136+
137+
Common patterns for working with re-exports:
138+
139+
```python
# Create public API
index_file = codebase.get_file("index.ts")

# Re-export from internal files
for internal_file in codebase.files:
    if internal_file.name != "index":
        for symbol in internal_file.symbols:
            if symbol.is_public:
                index_file.add_export(
                    symbol,
                    from_file=internal_file
                )

# Convert default to named exports
for exp in file.exports:
    if exp.is_default_export():
        exp.make_named_export()

# Consolidate re-exports
from collections import defaultdict

file_exports = defaultdict(list)
for exp in file.exports:
    if exp.is_reexport():
        file_exports[exp.from_file].append(exp)

for from_file, exports in file_exports.items():
    if len(exports) > 1:
        # Create consolidated re-export
        names = [exp.name for exp in exports]
        file.add_export_from_source(
            f"export {{ {', '.join(names)} }} from '{from_file.filepath}'"
        )
        # Remove individual exports
        for exp in exports:
            exp.remove()
```
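
The consolidation step can be tried in isolation with plain data. The sketch below applies the same group-then-join idea to `(name, source_path)` pairs instead of Codegen objects; `consolidate_reexports` is a hypothetical helper written for this example.

```python
from collections import defaultdict
from typing import List, Tuple


def consolidate_reexports(reexports: List[Tuple[str, str]]) -> List[str]:
    """Group (name, source_path) pairs into one statement per source."""
    by_source = defaultdict(list)
    for name, source in reexports:
        by_source[source].append(name)

    # Dicts preserve insertion order, so sources appear as first seen.
    return [
        f"export {{ {', '.join(names)} }} from '{source}';"
        for source, names in by_source.items()
    ]


statements = consolidate_reexports(
    [("foo", "./a"), ("bar", "./b"), ("baz", "./a")]
)
for statement in statements:
    print(statement)
```

Unlike the snippet above, this version also emits a statement for sources with a single export, which keeps the output a complete replacement for the input list.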
177+
178+
<Note>
179+
When managing exports, consider the impact on your module's public API. Not all symbols that can be exported should be exported.
180+
</Note>
