Fine-tuned an alBERTa model on Gemini and OpenAI responses, using the LastMile AI AutoEval platform, to classify which of the two models produced a given response.
How was it fine-tuned?
- The training dataset consists of 800 natural-language questions and their answers. All questions were sampled from Google's Natural Questions (https://ai.google.com/research/NaturalQuestions).
- The natural questions were cleaned and standardized to a consistent format.
- All 800 natural-language questions were run through OpenAI (model=gpt-4o-mini) and Gemini (model=gemini-1.5-flash); a sketch of this collection step appears after this list.
- All responses were cleaned of symbols, markdown, and other stylistic markers that could too easily give away whether a response came from Gemini or OpenAI (see the cleaning sketch below).
- The final dataset contains 400 randomly selected OpenAI responses; the other 400 responses are from Gemini, keeping the two classes balanced.
- OpenAI responses were mapped to label 0 and Gemini responses to label 1 (see the dataset-assembly sketch below).
- Final dataset in CSV format: https://drive.google.com/file/d/1MYuEV_dFLlXw-pZu7Rrtbvwj26fvgC82/view?usp=sharing
- LastMile AI's AutoEval platform (https://lastmileai.dev/models) was used to fine-tune the alBERTa model on this dataset
- Model scores were calibrated, with the classification threshold set to 0.552 (see the calibration sketch below).
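
To illustrate the response-collection step, here is a minimal sketch of querying both providers for a single question. It assumes the official `openai` and `google-generativeai` Python SDKs with API keys in the environment; the actual collection script used for this dataset is not published, so prompts and error handling here are placeholders.

```python
import os

from openai import OpenAI
import google.generativeai as genai

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
gemini_model = genai.GenerativeModel("gemini-1.5-flash")

def ask_openai(question: str) -> str:
    """Return the gpt-4o-mini answer to a natural-language question."""
    resp = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

def ask_gemini(question: str) -> str:
    """Return the gemini-1.5-flash answer to the same question."""
    return gemini_model.generate_content(question).text
```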
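
The stylistic-cleaning step could look like the following. The specific regexes are assumptions, chosen to strip code fences, markdown emphasis, headers, and bullet markers, which tend to differ between the two providers; they are not the exact rules used to build the dataset.

```python
import re

def clean_response(text: str) -> str:
    """Strip markdown and stylistic markers that could reveal the source model."""
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)      # drop code fences
    text = re.sub(r"[*_`#>]+", "", text)                          # emphasis, headers, quotes
    text = re.sub(r"^\s*[-•]\s+", "", text, flags=re.MULTILINE)   # bullet markers
    return re.sub(r"\s+", " ", text).strip()                      # collapse whitespace
```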
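
The dataset assembly could be sketched as below. It assumes each question contributes exactly one response to the final set (400 questions answered by OpenAI, the other 400 by Gemini), and the `text`/`label` column names and the fixed random seed are illustrative choices, not taken from the published CSV.

```python
import random

import pandas as pd

# Hypothetical inputs: cleaned answers keyed by question, one dict per
# provider; toy data is used here so the sketch runs standalone.
openai_answers = {f"question {i}": f"openai answer {i}" for i in range(800)}
gemini_answers = {f"question {i}": f"gemini answer {i}" for i in range(800)}

random.seed(0)  # fixed seed for a reproducible split (an assumption)
questions = list(openai_answers)
random.shuffle(questions)

# 400 random questions contribute their OpenAI response (label 0); the
# remaining 400 contribute their Gemini response (label 1).
rows = [{"text": openai_answers[q], "label": 0} for q in questions[:400]]
rows += [{"text": gemini_answers[q], "label": 1} for q in questions[400:]]

pd.DataFrame(rows).sample(frac=1, random_state=0).to_csv("dataset.csv", index=False)
```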
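
The 0.552 threshold converts the model's continuous score into a binary label. One common calibration recipe, shown here as an assumed approach using scikit-learn rather than whatever AutoEval does internally, is to sweep candidate cut-offs on a held-out split and keep the one that maximizes F1:

```python
import numpy as np
from sklearn.metrics import f1_score

def calibrate_threshold(scores: np.ndarray, labels: np.ndarray) -> float:
    """Sweep cut-offs and keep the one with the best F1 on held-out data."""
    candidates = np.linspace(0.0, 1.0, 501)
    best = max(candidates, key=lambda t: f1_score(labels, scores >= t))
    return float(best)

# At inference time, the published threshold is applied to the raw score.
THRESHOLD = 0.552

def predict(score: float) -> int:
    """Return 1 (Gemini) if the score clears the threshold, else 0 (OpenAI)."""
    return int(score >= THRESHOLD)
```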