Bragi: Together.ai Training Interface

A full-stack application for fine-tuning and training models with Together.ai, featuring a Next.js frontend and FastAPI backend.

Project Structure

bragi-app/
├── frontend/                # Next.js TypeScript frontend
│   ├── src/
│   │   ├── app/            # Next.js app router
│   │   ├── components/     # React components
│   │   ├── hooks/          # Custom React hooks
│   │   └── lib/            # Utility functions
│   ├── public/             # Static assets
│   └── package.json
├── backend/                # FastAPI Python backend
│   ├── api/                # API endpoints
│   │   ├── routes/         # Route modules
│   │   └── endpoints.py    # Main API endpoints
│   ├── src/                # Core logic
│   │   ├── formatter.py    # Training data formatter
│   │   ├── trainer.py      # Training management
│   │   ├── validator.py    # Data validation
│   │   ├── llm_manager.py  # LLM integration
│   │   ├── monitor.py      # Training monitoring
│   │   ├── websocket.py    # WebSocket utilities
│   │   ├── config.py       # Configuration
│   │   └── utils.py        # Utility functions
│   ├── data/               # Training data storage
│   ├── config/             # Configuration files
│   ├── logs/               # Log files
│   └── requirements.txt    # Python dependencies
├── docker/                 # Docker configuration
│   ├── backend.Dockerfile  # Backend Docker image
│   └── frontend.Dockerfile # Frontend Docker image
└── docker-compose.yml      # Docker Compose configuration

Prerequisites

  • Node.js 16+
  • Python 3.9+
  • Together.ai API key

Installation

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Copy example environment file and configure
cp .env.example .env
# Edit .env with your configuration

Frontend Setup

cd frontend
npm install
# or
yarn install
# Copy example environment file and configure
cp .env.example .env.local
# Edit .env.local with your configuration

Environment Setup

Backend (.env)

The backend uses a .env file for configuration:

# Together.ai API settings (optional, can be provided via UI)
TOGETHER_API_KEY=your_together_api_key_here
TOGETHER_MODEL=mistralai/Mixtral-8x7B-Instruct-v0.1

# Application settings
SECRET_KEY=your_secret_key_here  # Used for JWT tokens
ALLOWED_ORIGINS=http://localhost:3000,http://127.0.0.1:3000  # Comma-separated list of allowed origins for CORS

# Logging settings (optional)
LOG_LEVEL=INFO
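
As a rough sketch of how the backend could consume these variables (the function name and defaults here are illustrative, not the actual contents of `config.py`):

```python
import os

def load_settings() -> dict:
    """Read Bragi backend settings from environment variables.

    Variable names mirror the .env example above; the fallback values
    are illustrative assumptions, not the app's guaranteed defaults.
    """
    origins = os.getenv("ALLOWED_ORIGINS", "http://localhost:3000")
    return {
        # May be None: the key can also be supplied through the UI.
        "together_api_key": os.getenv("TOGETHER_API_KEY"),
        "together_model": os.getenv(
            "TOGETHER_MODEL", "mistralai/Mixtral-8x7B-Instruct-v0.1"
        ),
        "secret_key": os.getenv("SECRET_KEY"),
        # Comma-separated list, whitespace-tolerant.
        "allowed_origins": [o.strip() for o in origins.split(",") if o.strip()],
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
    }
```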

Frontend (.env.local)

The frontend uses .env.local for configuration:

NEXT_PUBLIC_API_URL=http://localhost:8000

Running the Application

Start the Backend

cd backend
uvicorn main:app --reload

Backend runs on http://localhost:8000

Start the Frontend

cd frontend
npm run dev
# or
yarn dev

Frontend runs on http://localhost:3000

Deployment

Using Docker

The application includes Docker configuration for easy deployment:

  1. Copy the example environment file:

    cp .env.example .env
  2. Edit the .env file with your Together.ai API key and other settings

  3. Start the application using Docker Compose:

    docker-compose up -d
  4. Access the application at http://localhost:3000

The Docker setup includes:

  • Automatic reloading for both frontend and backend during development
  • Volume mounting for persistent data storage
  • Health checks for the backend service
  • Networking between services

Manual Deployment

For production deployment:

  1. Build the frontend:

    cd frontend
    npm run build
  2. Set up a production web server (like Nginx) to serve the frontend and proxy API requests to the backend

  3. Run the backend with a production ASGI server:

    cd backend
    gunicorn -k uvicorn.workers.UvicornWorker main:app
  4. Set appropriate environment variables for both services

Features

Training Configuration

  • Model selection (currently supported Together.ai models include):
    • mistralai/Mixtral-8x7B-Instruct-v0.1
  • Training parameters:
    • Number of epochs
    • Batch size
    • Learning rate
    • Checkpoints
    • Learning rate scheduler
    • Warmup ratio
    • Max gradient norm
    • Weight decay
  • File upload for training data
  • Real-time training status monitoring
  • Interactive model testing via WebSocket
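
The training parameters above could be collected into a request body along these lines. Every field name here is an assumption for illustration, not the actual schema expected by the `/api/train` endpoints:

```python
import json

# Hypothetical payload shape; consult the backend routes for the real schema.
training_config = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "n_epochs": 3,
    "batch_size": 8,
    "learning_rate": 1e-5,
    "n_checkpoints": 1,
    "lr_scheduler": "linear",
    "warmup_ratio": 0.03,
    "max_grad_norm": 1.0,
    "weight_decay": 0.01,
}

# Serialize for an HTTP POST body.
payload = json.dumps(training_config)
```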

Data Formatting

The system expects training data in JSONL format with the following structure:

{
  "text": "<|im_start|>system\nYour system message\n<|im_end|>\n<|im_start|>user\nUser message\n<|im_end|>\n<|im_start|>assistant\nAssistant response\n<|im_end|>"
}
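
A minimal helper for producing lines in this format might look like the following. This is a sketch, not the app's `formatter.py`; only the ChatML-style markers come from the format above:

```python
import json

def to_training_record(system: str, user: str, assistant: str) -> str:
    """Serialize one conversation into a single JSONL line using the
    ChatML-style markers shown in the structure above."""
    text = (
        f"<|im_start|>system\n{system}\n<|im_end|>\n"
        f"<|im_start|>user\n{user}\n<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}\n<|im_end|>"
    )
    # json.dumps handles the escaping of newlines inside "text".
    return json.dumps({"text": text})

line = to_training_record("You are a helpful assistant.", "Hi!", "Hello!")
```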

Training Requirements

  • Minimum 100 training examples required
  • Valid JSON format for each line
  • Proper conversation structure with system, user, and assistant messages
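
The three requirements can be checked with a small validator along these lines (a sketch mirroring the stated rules, not the app's actual `validator.py`):

```python
import json

REQUIRED_MARKERS = ("<|im_start|>system", "<|im_start|>user", "<|im_start|>assistant")
MIN_EXAMPLES = 100

def validate_jsonl(lines):
    """Return a list of human-readable problems for the given JSONL lines.

    An empty list means the data meets the requirements above.
    """
    problems = []
    if len(lines) < MIN_EXAMPLES:
        problems.append(f"need at least {MIN_EXAMPLES} examples, got {len(lines)}")
    for i, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {i}: not valid JSON")
            continue
        text = record.get("text", "")
        missing = [m for m in REQUIRED_MARKERS if m not in text]
        if missing:
            problems.append(f"line {i}: missing {', '.join(missing)}")
    return problems
```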

API Endpoints

POST /api/validate-key

Validates Together.ai API key

POST /api/upload

Uploads training data file

POST /api/train/preview

Previews training configuration

POST /api/train/confirm

Starts training process with configuration

GET /api/status/{job_id}

Gets training status for a specific job

GET /api/available-models

Gets available models for training

WebSocket /api/ws/model-test

Interactive model testing endpoint

Security Considerations

API Key Management

  • No API keys are hardcoded in the application
  • Frontend temporarily stores API key in localStorage for session use only
  • Backend validates API key for each secured endpoint
  • API keys are passed via secure headers
  • API keys can be optionally set in the backend environment file for server-side use
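
Extracting the key from request headers can be as simple as the sketch below. The header names are assumptions; check the backend's endpoints for the header it actually reads:

```python
from typing import Optional

def extract_api_key(headers: dict) -> Optional[str]:
    """Pull a Together.ai API key from request headers.

    Prefers a standard bearer token, falling back to a custom header.
    Returns None when no key is present.
    """
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):].strip() or None
    return headers.get("X-API-Key") or None
```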

Environment Variables

All configuration is managed through environment variables:

  • Sensitive information like API keys and secrets are kept in .env files
  • These .env files are included in .gitignore to prevent accidental commits
  • Example .env files are provided without real credentials

Running in Production

When deploying to production:

  1. Always use HTTPS for the frontend and backend
  2. Set a strong, unique SECRET_KEY in the backend .env file
  3. Configure ALLOWED_ORIGINS properly to prevent CORS attacks
  4. Consider using a reverse proxy like Nginx for additional security
  5. Implement rate limiting for API endpoints
  6. Set up proper access controls if exposed to the internet

Privacy Considerations

  • Training data is stored locally by default
  • Files uploaded for training remain on your server
  • Together.ai will store your training data on their servers when you initiate training
  • Review Together.ai's privacy policy for how they handle your data

Development Notes

  • Uses the Together.ai SDK for training operations
  • Frontend built with Next.js, TypeScript, and Tailwind CSS
  • Backend built with FastAPI and Python
  • Real-time interactions via WebSockets
  • Data validation and formatting
  • Error handling and user feedback

Limitations

  • Currently uses the Together.ai SDK, which exposes only a limited set of training parameters
  • Minimum 100 training examples required
  • Only supports JSONL format for training data
  • Specific conversation structure required

Future Improvements

  • Switch to direct API calls for more training parameters
  • Add support for more model types
  • Implement training cost estimation
  • Add data validation preview
  • Support for different conversation formats
  • Enhanced dataset quality assessment
  • User authentication and multi-user support
  • Improved data privacy controls

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT License
