Bragi: Together.ai Training Interface

A full-stack application for fine-tuning and training models with Together.ai, featuring a Next.js frontend and FastAPI backend.

Project Structure

bragi-app/
├── frontend/                # Next.js TypeScript frontend
│   ├── src/
│   │   ├── app/            # Next.js app router
│   │   ├── components/     # React components
│   │   ├── hooks/          # Custom React hooks
│   │   └── lib/            # Utility functions
│   ├── public/             # Static assets
│   └── package.json
├── backend/                # FastAPI Python backend
│   ├── api/                # API endpoints
│   │   ├── routes/         # Route modules
│   │   └── endpoints.py    # Main API endpoints
│   ├── src/                # Core logic
│   │   ├── formatter.py    # Training data formatter
│   │   ├── trainer.py      # Training management
│   │   ├── validator.py    # Data validation
│   │   ├── llm_manager.py  # LLM integration
│   │   ├── monitor.py      # Training monitoring
│   │   ├── websocket.py    # WebSocket utilities
│   │   ├── config.py       # Configuration
│   │   └── utils.py        # Utility functions
│   ├── data/               # Training data storage
│   ├── config/             # Configuration files
│   ├── logs/               # Log files
│   └── requirements.txt    # Python dependencies
├── docker/                 # Docker configuration
│   ├── backend.Dockerfile  # Backend Docker image
│   └── frontend.Dockerfile # Frontend Docker image
└── docker-compose.yml      # Docker Compose configuration

Prerequisites

  • Node.js 16+
  • Python 3.9+
  • Together.ai API key

Installation

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Copy example environment file and configure
cp .env.example .env
# Edit .env with your configuration

Frontend Setup

cd frontend
npm install
# or
yarn install
# Copy example environment file and configure
cp .env.example .env.local
# Edit .env.local with your configuration

Environment Setup

Backend (.env)

The backend uses a .env file for configuration:

# Together.ai API settings (optional, can be provided via UI)
TOGETHER_API_KEY=your_together_api_key_here
TOGETHER_MODEL=mistralai/Mixtral-8x7B-Instruct-v0.1

# Application settings
SECRET_KEY=your_secret_key_here  # Used for JWT tokens
ALLOWED_ORIGINS=http://localhost:3000,http://127.0.0.1:3000  # Comma-separated list of allowed origins for CORS

# Logging settings (optional)
LOG_LEVEL=INFO
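
As a rough sketch of how the backend could consume these variables (the function name and defaults here are illustrative, not the actual contents of `config.py`):

```python
import os

def load_settings() -> dict:
    """Read Bragi backend settings from environment variables.

    Variable names mirror the .env example above; the fallback values
    are illustrative assumptions, not the app's guaranteed defaults.
    """
    origins = os.getenv("ALLOWED_ORIGINS", "http://localhost:3000")
    return {
        # May be None: the key can also be supplied through the UI.
        "together_api_key": os.getenv("TOGETHER_API_KEY"),
        "together_model": os.getenv(
            "TOGETHER_MODEL", "mistralai/Mixtral-8x7B-Instruct-v0.1"
        ),
        "secret_key": os.getenv("SECRET_KEY"),
        # Comma-separated list, whitespace-tolerant.
        "allowed_origins": [o.strip() for o in origins.split(",") if o.strip()],
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
    }
```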

Frontend (.env.local)

The frontend uses .env.local for configuration:

NEXT_PUBLIC_API_URL=http://localhost:8000

Running the Application

Start the Backend

cd backend
uvicorn main:app --reload

Backend runs on http://localhost:8000

Start the Frontend

cd frontend
npm run dev
# or
yarn dev

Frontend runs on http://localhost:3000

Deployment

Using Docker

The application includes Docker configuration for easy deployment:

  1. Copy the example environment file:

    cp .env.example .env
  2. Edit the .env file with your Together.ai API key and other settings

  3. Start the application using Docker Compose:

    docker-compose up -d
  4. Access the application at http://localhost:3000

The Docker setup includes:

  • Automatic reloading for both frontend and backend during development
  • Volume mounting for persistent data storage
  • Health checks for the backend service
  • Networking between services

Manual Deployment

For production deployment:

  1. Build the frontend:

    cd frontend
    npm run build
  2. Set up a production web server (like Nginx) to serve the frontend and proxy API requests to the backend

  3. Run the backend with a production ASGI server:

    cd backend
    gunicorn -k uvicorn.workers.UvicornWorker main:app
  4. Set appropriate environment variables for both services

Features

Training Configuration

  • Model selection (currently supported Together.ai models include):
    • mistralai/Mixtral-8x7B-Instruct-v0.1
  • Training parameters:
    • Number of epochs
    • Batch size
    • Learning rate
    • Checkpoints
    • Learning rate scheduler
    • Warmup ratio
    • Max gradient norm
    • Weight decay
  • File upload for training data
  • Real-time training status monitoring
  • Interactive model testing via WebSocket
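
The training parameters above could be collected into a request body along these lines. Every field name here is an assumption for illustration, not the actual schema expected by the `/api/train` endpoints:

```python
import json

# Hypothetical payload shape; consult the backend routes for the real schema.
training_config = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "n_epochs": 3,
    "batch_size": 8,
    "learning_rate": 1e-5,
    "n_checkpoints": 1,
    "lr_scheduler": "linear",
    "warmup_ratio": 0.03,
    "max_grad_norm": 1.0,
    "weight_decay": 0.01,
}

# Serialize for an HTTP POST body.
payload = json.dumps(training_config)
```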

Data Formatting

The system expects training data in JSONL format with the following structure:

{
  "text": "<|im_start|>system\nYour system message\n<|im_end|>\n<|im_start|>user\nUser message\n<|im_end|>\n<|im_start|>assistant\nAssistant response\n<|im_end|>"
}
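
A minimal helper for producing lines in this format might look like the following. This is a sketch, not the app's `formatter.py`; only the ChatML-style markers come from the format above:

```python
import json

def to_training_record(system: str, user: str, assistant: str) -> str:
    """Serialize one conversation into a single JSONL line using the
    ChatML-style markers shown in the structure above."""
    text = (
        f"<|im_start|>system\n{system}\n<|im_end|>\n"
        f"<|im_start|>user\n{user}\n<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}\n<|im_end|>"
    )
    # json.dumps handles the escaping of newlines inside "text".
    return json.dumps({"text": text})

line = to_training_record("You are a helpful assistant.", "Hi!", "Hello!")
```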

Training Requirements

  • Minimum 100 training examples required
  • Valid JSON format for each line
  • Proper conversation structure with system, user, and assistant messages
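
The three requirements can be checked with a small validator along these lines (a sketch mirroring the stated rules, not the app's actual `validator.py`):

```python
import json

REQUIRED_MARKERS = ("<|im_start|>system", "<|im_start|>user", "<|im_start|>assistant")
MIN_EXAMPLES = 100

def validate_jsonl(lines):
    """Return a list of human-readable problems for the given JSONL lines.

    An empty list means the data meets the requirements above.
    """
    problems = []
    if len(lines) < MIN_EXAMPLES:
        problems.append(f"need at least {MIN_EXAMPLES} examples, got {len(lines)}")
    for i, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {i}: not valid JSON")
            continue
        text = record.get("text", "")
        missing = [m for m in REQUIRED_MARKERS if m not in text]
        if missing:
            problems.append(f"line {i}: missing {', '.join(missing)}")
    return problems
```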

API Endpoints

POST /api/validate-key

Validates Together.ai API key

POST /api/upload

Uploads training data file

POST /api/train/preview

Previews training configuration

POST /api/train/confirm

Starts training process with configuration

GET /api/status/{job_id}

Gets training status for a specific job

GET /api/available-models

Gets available models for training

WebSocket /api/ws/model-test

Interactive model testing endpoint

Security Considerations

API Key Management

  • No API keys are hardcoded in the application
  • Frontend temporarily stores API key in localStorage for session use only
  • Backend validates API key for each secured endpoint
  • API keys are passed via secure headers
  • API keys can be optionally set in the backend environment file for server-side use
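
Extracting the key from request headers can be as simple as the sketch below. The header names are assumptions; check the backend's endpoints for the header it actually reads:

```python
from typing import Optional

def extract_api_key(headers: dict) -> Optional[str]:
    """Pull a Together.ai API key from request headers.

    Prefers a standard bearer token, falling back to a custom header.
    Returns None when no key is present.
    """
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):].strip() or None
    return headers.get("X-API-Key") or None
```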

Environment Variables

All configuration is managed through environment variables:

  • Sensitive information like API keys and secrets are kept in .env files
  • These .env files are included in .gitignore to prevent accidental commits
  • Example .env files are provided without real credentials

Running in Production

When deploying to production:

  1. Always use HTTPS for the frontend and backend
  2. Set a strong, unique SECRET_KEY in the backend .env file
  3. Configure ALLOWED_ORIGINS properly to prevent CORS attacks
  4. Consider using a reverse proxy like Nginx for additional security
  5. Implement rate limiting for API endpoints
  6. Set up proper access controls if exposed to the internet

Privacy Considerations

  • Training data is stored locally by default
  • Files uploaded for training remain on your server
  • Together.ai will store your training data on their servers when you initiate training
  • Review Together.ai's privacy policy for how they handle your data

Development Notes

  • Uses the Together.ai SDK for training operations
  • Frontend built with Next.js, TypeScript, and Tailwind CSS
  • Backend built with FastAPI and Python
  • Real-time interactions via WebSockets
  • Data validation and formatting
  • Error handling and user feedback

Limitations

  • Currently uses the Together.ai SDK, which exposes only a limited set of training parameters
  • Minimum 100 training examples required
  • Only supports JSONL format for training data
  • Specific conversation structure required

Future Improvements

  • Switch to direct API calls for more training parameters
  • Add support for more model types
  • Implement training cost estimation
  • Add data validation preview
  • Support for different conversation formats
  • Enhanced dataset quality assessment
  • User authentication and multi-user support
  • Improved data privacy controls

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT License
