PDF-Frame AI

⚠️ Work in Progress: This project is currently in active development and experimentation. The API and model outputs may change as we refine the system. We welcome feedback and contributions!

An AI-powered pdf-frame template generator that uses a fine-tuned language model to create pdf-frame templates from natural language instructions.

Motivation

pdf-frame is a framework for building visually rich PDFs and canvas graphics with a declarative syntax. However, crafting these templates, especially for intricate layouts, can be slow and complex. It demands fluency in both visualization concepts and the specifics of pdf-frame's syntax. This friction hinders rapid prototyping and makes onboarding new developers harder.

The motivation behind this project is to streamline this process using AI, allowing users to describe layouts in natural language and receive valid, ready-to-use template code in return.

Features

Generate PDF-Frame templates from natural language instructions
Fine-tuned from the StarCoder2-7B base model
GPU-accelerated inference
REST API interface via FastAPI
Support for charts, animations, and D3-based computations

Prerequisites

Python 3.10 or higher
CUDA-compatible GPU with at least 16GB VRAM
NVIDIA drivers and CUDA toolkit installed
Hugging Face account and access token
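
The Python and GPU requirements above can be checked from a short script before installing anything heavy. This is a minimal sketch and not part of the repository: it assumes `nvidia-smi` is on the PATH, and the helper names and the 15,000 MiB threshold are illustrative (cards sold as 16 GB usually report somewhat less than 16,384 MiB).

```python
import shutil
import subprocess
import sys
from typing import List

# Illustrative threshold (assumption): "16 GB" cards often report a bit under 16384 MiB.
MIN_VRAM_MIB = 15_000


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


def parse_vram_mib(nvidia_smi_output: str) -> List[int]:
    """Parse one total-memory value (MiB) per GPU, skipping non-numeric lines."""
    values = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_vram_mib() -> List[int]:
    """Ask nvidia-smi for each GPU's total memory; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = query_vram_mib()
    print("Python >= 3.10:", python_ok())
    print("GPU memory (MiB):", gpus or "no GPU detected")
    print("Enough VRAM:", any(mib >= MIN_VRAM_MIB for mib in gpus))
```

The script only reports what it finds; it does not verify the CUDA toolkit installation or the Hugging Face token.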

Installation

Clone the repository:

git clone https://github.com/I2Djs/pdf-frame-ai.git
cd pdf-frame-ai

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Training (Fine-tuning)

Log in to Hugging Face:

huggingface-cli login

You'll need to provide your Hugging Face access token. If you don't have one, you can create it at https://huggingface.co/settings/tokens.

Start the training process:

python finetune/train.py

The training will:

Download the base model, bigcode/starcoder2-7b, from Hugging Face
Download the training dataset from Hugging Face: https://huggingface.co/datasets/nswamy14/pdf-frame-dataset-1/resolve/main/pdf_frame_dataset_large.jsonl
Fine-tune the model on the dataset
Save the LoRA weights locally in the ./snaps/starcoder-pdf-frame-v2 directory
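
Before launching a long training run, it can help to inspect the dataset listed above. The following is a minimal sketch, not code from `finetune/train.py`: it reads a JSON Lines stream with the standard library and counts which top-level keys appear. The dataset's field names are not documented in this README, so the helper makes no assumption about them, and the function names are illustrative.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {number}: expected a JSON object")
        yield record


def key_counts(records: Iterable[dict]) -> Counter:
    """Count how many records contain each top-level key."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts
```

With a local copy of the file, opening it with `open(path, encoding="utf-8")` and passing the handle through `key_counts(iter_jsonl(handle))` shows which fields every record carries and which are only present in some.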

Inference (Generation)

Start the FastAPI server:

uvicorn inference.api:app --host 0.0.0.0 --port 8000

The server will:

Load the base model from Hugging Face
Load the local LoRA weights from ./snaps/starcoder-pdf-frame-v2
Start the inference server
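
Because the server depends on the locally saved LoRA weights, a quick check of that directory before starting uvicorn can save a slow failed startup. This sketch assumes the weights were written in the layout used by the Hugging Face PEFT library (an `adapter_config.json` plus an `adapter_model` weights file); the README does not state which library `train.py` uses, so treat the file names as an assumption. The function name is illustrative.

```python
from pathlib import Path
from typing import List

# Path taken from this README; adjust if train.py is configured differently.
ADAPTER_DIR = Path("./snaps/starcoder-pdf-frame-v2")


def missing_adapter_files(adapter_dir: Path) -> List[str]:
    """Return a list of problems found in a PEFT-style LoRA adapter directory."""
    if not adapter_dir.is_dir():
        return [f"{adapter_dir} does not exist"]
    problems = []
    if not (adapter_dir / "adapter_config.json").is_file():
        problems.append("adapter_config.json is missing")
    # Assumed PEFT layout: weights stored as .safetensors or, in older versions, .bin.
    weight_names = ("adapter_model.safetensors", "adapter_model.bin")
    if not any((adapter_dir / name).is_file() for name in weight_names):
        problems.append("adapter weights file is missing")
    return problems


if __name__ == "__main__":
    issues = missing_adapter_files(ADAPTER_DIR)
    print("Adapter directory looks complete" if not issues else "\n".join(issues))
```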

Command Line Interface

from inference.generate import generator

# Initialize the generator
generator.load_model()

# Generate a template
result = generator.generate(
    # Adjacent string literals are joined, so the prompt can span two source lines
    "generate a pie chart with pdf-frame template, to show sales coverage of the following products data: "
    "[{product: 'shoes', percent: 0.25}, {product: 'belts', percent: 0.15}, {product: 'tie', percent: 0.35}, {product: 'slippers', percent: 0.25}]"
)
print(result)

API Usage

# Generate a template
curl -X POST "http://localhost:8000/generate" \
     -H "Content-Type: application/json" \
     -d '{"prompt": "Generate a pdf-frame template for bar chart"}'
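
The same request can be sent from Python without extra dependencies. The sketch below uses only the standard library and mirrors the curl call above; the function names and the 120-second timeout are illustrative assumptions, and the `generated_text` key follows the response format documented under Endpoints.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # matches the uvicorn command in this README


def build_generate_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /generate request with a JSON body."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_template(prompt: str, base_url: str = BASE_URL, timeout: float = 120.0) -> str:
    """Send the prompt to a running server and return the generated template text."""
    request = build_generate_request(prompt, base_url)
    # The timeout is an assumption; generation time depends on the GPU and prompt.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["generated_text"]
```

With the server running, `print(generate_template("Generate a pdf-frame template for bar chart"))` prints the generated template.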

API Documentation

Once the server is running, visit http://localhost:8000/docs for interactive API documentation.

Endpoints

POST /generate: Generate a PDF-frame template

Request body:
```
{
    "prompt": "string"
}
```
Response:
```
{
    "generated_text": "string"
}
```

GET /health: Check server health

Response:

{
    "status": "healthy",
    "model_loaded": true
}
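
Loading a 7B model can take a while, so a client may want to poll the health endpoint before sending prompts. This is a hedged sketch using the standard library: it assumes the server answers `/health` while the model is still loading and reports `model_loaded` as false until it is ready, which this README does not confirm. The function names, attempt count, and delay are illustrative.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(payload: dict) -> bool:
    """True when the /health payload reports a healthy server with the model loaded."""
    return payload.get("status") == "healthy" and payload.get("model_loaded") is True


def wait_until_ready(base_url: str = "http://localhost:8000",
                     attempts: int = 60, delay_seconds: float = 5.0) -> bool:
    """Poll GET /health until the model is loaded or the attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as response:
                if is_ready(json.load(response)):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # server not reachable yet, or the reply was not valid JSON
        time.sleep(delay_seconds)
    return False
```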

Project Structure

pdf-frame-ai/
├── inference/
│   ├── generate.py    # Core generation logic
│   └── api.py         # FastAPI server
├── finetune/
│   └── train.py       # Model fine-tuning code
├── lora-models/       # Directory for saved LoRA weights
├── requirements.txt   # Python dependencies
└── README.md          # This file

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

StarCoder2 for the base model
Hugging Face Transformers for the model framework
FastAPI for the API framework
