
Evals - Education and Airline scenarios #140

Open
wants to merge 4 commits into base: main


MahtabSarvmaili

  1. MCP Education Agent example - A tutor agent for educational scenarios
  that guides students with homework
  2. MCP Evaluation Framework - A comprehensive tool for evaluating agent
  performance with metrics for progress rate, grounding accuracy, and task
  completion
  3. Example results and visualizations for airline assistance scenario

Implements an evaluation framework for MCP agents that measures response time, success rate, and task completion across the example scenarios.
Includes visualization utilities and automated testing tools for benchmarking agents.
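
For readers unfamiliar with the metrics named in the description (progress rate, grounding accuracy, task completion), here is a minimal sketch of how they might be aggregated over a set of evaluated conversations. It is not taken from this PR: the `EpisodeResult` record, its field names, and the metric definitions (milestones reached, claims supported by tool output) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EpisodeResult:
    """One evaluated conversation (hypothetical record, not the PR's schema)."""
    milestones_reached: int   # assumed: sub-goals the agent achieved
    milestones_total: int     # assumed: sub-goals defined for the scenario
    grounded_claims: int      # assumed: agent claims backed by tool output
    total_claims: int         # assumed: all factual claims the agent made
    completed: bool           # whether the overall task was finished


def _ratio(numerator: int, denominator: int) -> float:
    # Avoid division by zero when a scenario defines no milestones or claims.
    return numerator / denominator if denominator else 0.0


def summarize(results: list[EpisodeResult]) -> dict[str, float]:
    """Average each per-episode metric across all episodes."""
    if not results:
        raise ValueError("no episodes to summarize")
    return {
        "progress_rate": mean(
            _ratio(r.milestones_reached, r.milestones_total) for r in results
        ),
        "grounding_accuracy": mean(
            _ratio(r.grounded_claims, r.total_claims) for r in results
        ),
        "task_completion": mean(1.0 if r.completed else 0.0 for r in results),
    }


if __name__ == "__main__":
    episodes = [
        EpisodeResult(2, 4, 3, 4, True),
        EpisodeResult(4, 4, 1, 2, False),
    ]
    print(summarize(episodes))
```

Averaging per-episode ratios (rather than pooling counts) weights every scenario equally regardless of length; the framework in this PR may make a different choice.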
2 participants