Commit 4164602

Symbol attributions example cleaned up (#806)
README + run.py
1 parent 572ebee commit 4164602

2 files changed, with 92 additions and 0 deletions.
@@ -0,0 +1,92 @@
# Symbol Attributions

This example demonstrates how to analyze and track attribution information for symbols in a codebase, including identifying AI vs. human contributions and tracking edit history.

## What it does

This script performs several key functions:

1. **Codebase Analysis**
   - Loads and parses all Python files in the repository
   - Builds a dependency graph of symbols (classes, functions, etc.)
   - Analyzes import relationships and dependencies

   ```python
   from codegen import Codebase

   # Initialize a codebase object from a repository ("owner/name")
   codebase = Codebase.from_repo("your-org/your-repo", language="python")
   ```

2. **AI Impact Analysis**
   - Identifies commits made by AI bots vs. human contributors
   - Calculates statistics on AI contributions:
     - Percentage of AI commits
     - Files with significant AI contribution
     - Number of AI-touched symbols
   - Identifies high-impact AI-written code

   ```python
   # Commit authors that are treated as AI contributors
   ai_authors = ["devin[bot]", "codegen[bot]", "github-actions[bot]"]
   # Attach attribution data to every symbol in the codebase
   add_attribution_to_symbols(codebase, ai_authors)
   ```

3. **Symbol Attribution**
   - Tracks edit history for each symbol in the codebase
   - Records:
     - Last editor of each symbol
     - Complete editor history
     - Whether the symbol was AI-authored
   - Provides detailed attribution for most-used symbols

   ```python
   # Collect (symbol, usage count) pairs for every symbol used at least once
   symbols_with_usages = []
   for symbol in codebase.symbols:
       if hasattr(symbol, "usages") and len(symbol.usages) > 0:
           symbols_with_usages.append((symbol, len(symbol.usages)))
   ```

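The attribution fields listed in step 3 and the "most-used symbols" ranking can be illustrated with a minimal, self-contained sketch. It does not use the `codegen` API: `SymbolRecord` and `most_used` are hypothetical names invented for this illustration, and treating a symbol as AI-authored when any of its editors is an AI author is an assumption, not necessarily the rule the script applies.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SymbolRecord:
    """Hypothetical stand-in for a symbol plus its attribution data."""

    name: str
    usage_count: int
    editor_history: list[str] = field(default_factory=list)  # oldest edit first

    @property
    def last_editor(self) -> str | None:
        # The most recent entry in the history, if the symbol was ever edited
        return self.editor_history[-1] if self.editor_history else None

    def is_ai_authored(self, ai_authors: set[str]) -> bool:
        # Assumption: any AI editor in the history marks the symbol as AI-authored
        return any(editor in ai_authors for editor in self.editor_history)


def most_used(records: list[SymbolRecord], top_n: int = 5) -> list[SymbolRecord]:
    """Return the records with at least one usage, most-used first."""
    used = [record for record in records if record.usage_count > 0]
    return sorted(used, key=lambda record: record.usage_count, reverse=True)[:top_n]


ai_authors = {"devin[bot]", "codegen[bot]", "github-actions[bot]"}
records = [
    SymbolRecord("parse_config", 12, ["alice", "codegen[bot]"]),
    SymbolRecord("unused_helper", 0, ["bob"]),
    SymbolRecord("load_repo", 30, ["bob", "alice"]),
]

for record in most_used(records):
    print(
        record.name,
        record.usage_count,
        record.last_editor,
        record.is_ai_authored(ai_authors),
    )
```

With real data, the `(symbol, usage count)` pairs collected above would take the place of the hand-written `records` list.
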
## Example Output

The script provides detailed analytics including:

- Repository statistics (files, symbols, contributors)
- AI contribution summary (% of commits, impacted files)
- Top contributors list
- Detailed attribution for most-used symbols, showing:
  - Symbol name and type
  - File location
  - Usage count
  - Last editor
  - Editor history
  - AI authorship status

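As a rough illustration of how the AI contribution summary and the top contributors list could be derived, the sketch below works on a plain list of commit author names. The function name `summarize_commits`, the result keys, and the sample data are assumptions made for this example; the actual script may compute and name these values differently.

```python
from collections import Counter


def summarize_commits(commit_authors, ai_authors):
    """Summarize AI vs. human commits from a list of author names."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    ai_total = sum(n for author, n in counts.items() if author in ai_authors)
    return {
        "total_commits": total,
        "ai_commits": ai_total,
        # Guard against an empty history to avoid dividing by zero
        "ai_percentage": round(100 * ai_total / total, 1) if total else 0.0,
        "top_contributors": counts.most_common(3),
    }


ai_authors = {"devin[bot]", "codegen[bot]", "github-actions[bot]"}
commit_authors = [
    "alice", "codegen[bot]", "alice", "bob",
    "devin[bot]", "alice", "bob", "codegen[bot]",
]

summary = summarize_commits(commit_authors, ai_authors)
print(f"AI commits: {summary['ai_commits']}/{summary['total_commits']} "
      f"({summary['ai_percentage']}%)")
print("Top contributors:", summary["top_contributors"])
```
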
## Usage

Run the script in your repository:

```bash
python run.py
```

The script will automatically:

- Use the current directory if it's a Git repository
- Fall back to a sample repository otherwise
- Generate comprehensive attribution analysis
- Save detailed results to `ai_impact_analysis.json`

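The fallback and save steps can be sketched as follows. This is not the logic of `run.py` itself: `choose_target`, `save_results`, and the `SAMPLE_REPO` placeholder are hypothetical, and checking for a `.git` entry only in the given directory (not in its parents) is a simplifying assumption.

```python
import json
import tempfile
from pathlib import Path

SAMPLE_REPO = "your-org/your-repo"  # placeholder, not the script's real sample repo


def choose_target(directory: Path) -> str:
    """Return the directory if it contains a .git entry, else the sample repo."""
    if (directory / ".git").exists():
        return str(directory)
    return SAMPLE_REPO


def save_results(results: dict, output_dir: Path) -> Path:
    """Write the analysis results as JSON and return the output path."""
    output_path = output_dir / "ai_impact_analysis.json"
    output_path.write_text(json.dumps(results, indent=2))
    return output_path


# Demonstrate both branches inside a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    print(choose_target(workdir))  # no .git entry yet, so the sample repo is chosen
    (workdir / ".git").mkdir()
    print(choose_target(workdir) == str(workdir))  # now the directory itself is chosen
    saved = save_results({"ai_percentage": 37.5}, workdir)
    print(saved.name)
```
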
## Requirements

- A Git repository
- A Python codebase
- `codegen` installed

## Learn More

- [Codegen Symbols](https://docs.codegen.com/api-reference/core/Symbol#symbol)
- [Codegen Documentation](https://docs.codegen.com)

## Contributing

Feel free to submit issues and enhancement requests!
