Skip to content

Commit ec2ef11

Browse files
jayhackrushilpatel0
authored andcommitted
docs: docs research agent (#514)
1 parent a75991b commit ec2ef11

File tree

2 files changed

+213
-0
lines changed

2 files changed

+213
-0
lines changed

docs/mint.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,7 @@
8484
"tutorials/at-a-glance",
8585
"tutorials/build-code-agent",
8686
"tutorials/slack-bot",
87+
"tutorials/deep-code-research",
8788
"tutorials/training-data",
8889
"tutorials/codebase-visualization",
8990
"tutorials/migrating-apis",

docs/tutorials/deep-code-research.mdx

Lines changed: 212 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,212 @@
1+
---
2+
title: "Deep Code Research with AI"
3+
sidebarTitle: "Code Research Agent"
4+
icon: "magnifying-glass"
5+
iconType: "solid"
6+
---
7+
8+
This guide demonstrates how to build an intelligent code research tool that can analyze and explain codebases using Codegen's and LangChain. The tool combines semantic code search, dependency analysis, and natural language understanding to help developers quickly understand new codebases.
9+
10+
<Info>View the full code on [GitHub](https://github.com/codegen-sh/codegen-sdk/tree/develop/codegen-examples/examples/deep_code_research)</Info>
11+
12+
<Tip>This example works with any public GitHub repository - just provide the repo name in the format owner/repo</Tip>
13+
14+
## Overview
15+
16+
The process involves three main components:
17+
18+
1. A CLI interface for interacting with the research agent
19+
2. A set of code analysis tools powered by Codegen
20+
3. An LLM-powered agent that combines the tools to answer questions
21+
22+
Let's walk through building each component.
23+
24+
## Step 1: Setting Up the Research Tools
25+
26+
First, let's import the necessary components and set up our research tools:
27+
28+
```python
29+
from codegen import Codebase
30+
from codegen.extensions.langchain.agent import create_agent_with_tools
31+
from codegen.extensions.langchain.tools import (
32+
ListDirectoryTool,
33+
RevealSymbolTool,
34+
SearchTool,
35+
SemanticSearchTool,
36+
ViewFileTool,
37+
)
38+
from langchain_core.messages import SystemMessage
39+
```
40+
41+
We'll create a function to initialize our codebase with a nice progress indicator:
42+
43+
```python
44+
def initialize_codebase(repo_name: str) -> Optional[Codebase]:
45+
"""Initialize a codebase with a spinner showing progress."""
46+
with console.status("") as status:
47+
try:
48+
status.update(f"[bold blue]Cloning {repo_name}...[/bold blue]")
49+
codebase = Codebase.from_repo(repo_name)
50+
status.update("[bold green]✓ Repository cloned successfully![/bold green]")
51+
return codebase
52+
except Exception as e:
53+
console.print(f"[bold red]Error initializing codebase:[/bold red] {e}")
54+
return None
55+
```
56+
57+
Then we'll set up our research tools:
58+
59+
```python
60+
# Create research tools
61+
tools = [
62+
ViewFileTool(codebase), # View file contents
63+
ListDirectoryTool(codebase), # Explore directory structure
64+
SearchTool(codebase), # Text-based search
65+
SemanticSearchTool(codebase), # Natural language search
66+
RevealSymbolTool(codebase), # Analyze symbol relationships
67+
]
68+
```
69+
70+
Each tool provides specific capabilities:
71+
- `ViewFileTool`: Read and understand file contents
72+
- `ListDirectoryTool`: Explore the codebase structure
73+
- `SearchTool`: Find specific code patterns
74+
- `SemanticSearchTool`: Search using natural language
75+
- `RevealSymbolTool`: Analyze dependencies and usages
76+
77+
## Step 2: Creating the Research Agent
78+
79+
Next, we'll create an agent that can use these tools intelligently. We'll give it a detailed prompt about its role:
80+
81+
```python
82+
RESEARCH_AGENT_PROMPT = """You are a code research expert. Your goal is to help users understand codebases by:
83+
1. Finding relevant code through semantic and text search
84+
2. Analyzing symbol relationships and dependencies
85+
3. Exploring directory structures
86+
4. Reading and explaining code
87+
88+
Always explain your findings in detail and provide context about how different parts of the code relate to each other.
89+
When analyzing code, consider:
90+
- The purpose and functionality of each component
91+
- How different parts interact
92+
- Key patterns and design decisions
93+
- Potential areas for improvement
94+
95+
Break down complex concepts into understandable pieces and use examples when helpful."""
96+
97+
# Initialize the agent
98+
agent = create_agent_with_tools(
99+
codebase=codebase,
100+
tools=tools,
101+
chat_history=[SystemMessage(content=RESEARCH_AGENT_PROMPT)],
102+
verbose=True
103+
)
104+
```
105+
106+
## Step 3: Building the CLI Interface
107+
108+
Finally, we'll create a user-friendly CLI interface using rich-click:
109+
110+
```python
111+
import rich_click as click
112+
from rich.console import Console
113+
from rich.markdown import Markdown
114+
115+
@click.group()
116+
def cli():
117+
"""🔍 Codegen Code Research CLI"""
118+
pass
119+
120+
@cli.command()
121+
@click.argument("repo_name", required=False)
122+
@click.option("--query", "-q", default=None, help="Initial research query.")
123+
def research(repo_name: Optional[str] = None, query: Optional[str] = None):
124+
"""Start a code research session."""
125+
# Initialize codebase
126+
codebase = initialize_codebase(repo_name)
127+
128+
# Create and run the agent
129+
agent = create_research_agent(codebase)
130+
131+
# Main research loop
132+
while True:
133+
if not query:
134+
query = Prompt.ask("[bold cyan]Research query[/bold cyan]")
135+
136+
result = agent.invoke({"input": query})
137+
console.print(Markdown(result["output"]))
138+
139+
query = None # Clear for next iteration
140+
```
141+
142+
## Using the Research Tool
143+
144+
You can use the tool in several ways:
145+
146+
1. Interactive mode (will prompt for repo):
147+
```bash
148+
python run.py research
149+
```
150+
151+
2. Specify a repository:
152+
```bash
153+
python run.py research "fastapi/fastapi"
154+
```
155+
156+
3. Start with an initial query:
157+
```bash
158+
python run.py research "fastapi/fastapi" -q "Explain the main components"
159+
```
160+
161+
Example research queries:
162+
- "Explain the main components and their relationships"
163+
- "Find all usages of the FastAPI class"
164+
- "Show me the dependency graph for the routing module"
165+
- "What design patterns are used in this codebase?"
166+
167+
<Tip>
168+
The agent maintains conversation history, so you can ask follow-up questions
169+
and build on previous findings.
170+
</Tip>
171+
172+
## Advanced Usage
173+
174+
### Custom Research Tools
175+
176+
You can extend the agent with custom tools for specific analysis needs:
177+
178+
```python
179+
from langchain.tools import BaseTool
180+
from pydantic import BaseModel, Field
181+
182+
class CustomAnalysisTool(BaseTool):
183+
"""Custom tool for specialized code analysis."""
184+
name = "custom_analysis"
185+
description = "Performs specialized code analysis"
186+
187+
def _run(self, query: str) -> str:
188+
# Custom analysis logic
189+
return results
190+
191+
# Add to tools list
192+
tools.append(CustomAnalysisTool())
193+
```
194+
195+
### Customizing the Agent
196+
197+
You can modify the agent's behavior by adjusting its prompt:
198+
199+
```python
200+
CUSTOM_PROMPT = """You are a specialized code reviewer focused on:
201+
1. Security best practices
202+
2. Performance optimization
203+
3. Code maintainability
204+
...
205+
"""
206+
207+
agent = create_agent_with_tools(
208+
codebase=codebase,
209+
tools=tools,
210+
chat_history=[SystemMessage(content=CUSTOM_PROMPT)],
211+
)
212+
```

0 commit comments

Comments
 (0)