Available databases on ClickHouse

Huy Do edited this page Nov 18, 2024 · 11 revisions

The default database

The default database includes all GitHub events, for example workflow_run and workflow_job. It also includes several non-GitHub tables that were migrated there from Rockset.

  • failed_test_runs contains information about failed tests. It is populated by the upload_test_stats.py script.
  • job_annotation is used in HUD to manually classify a failure into a category such as INFRA_FLAKE or BROKEN_TRUNK.
  • merges contains information about merges from mergebot. It is used to compute an important KPI: the percentage of force merges.
  • rerun_disabled_tests is used by the rerun disabled tests bot to confirm whether a disabled test is still failing in trunk.
  • servicelab_torch_dynamo_perf_stats stores the internal service lab benchmark results. It belongs in the benchmark database; having it here is a mistake.
  • test_run_s3 keeps the test time for individual tests on, well, S3. This information is used later to build CI features that depend on test times, for example marking slow tests.
  • test_run_summary aggregates the information in test_run_s3 by test class and provides the aggregated test time per class used when computing CI test shards.
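
To illustrate the relationship between test_run_s3 and test_run_summary described above, here is a minimal Python sketch of rolling per-test times up into per-class totals and then spreading the classes over shards. The row fields (file, classname, name, time), the sample values, and both function names are hypothetical, and the greedy longest-first assignment is a generic heuristic chosen for illustration; it is not claimed to be the algorithm the CI actually uses.

```python
from collections import defaultdict


def summarize_by_class(rows):
    """Sum per-test durations into one total per (file, classname).

    `rows` stands in for records from test_run_s3; the field names
    used here are assumptions, not the real table schema.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[(row["file"], row["classname"])] += row["time"]
    return dict(totals)


def assign_shards(class_times, num_shards):
    """Greedy longest-first assignment of test classes to shards.

    This is a generic balancing heuristic used only for illustration.
    """
    shards = [{"total": 0.0, "classes": []} for _ in range(num_shards)]
    ordered = sorted(class_times.items(), key=lambda kv: kv[1], reverse=True)
    for key, seconds in ordered:
        # Put the next-slowest class on the currently lightest shard.
        target = min(shards, key=lambda s: s["total"])
        target["classes"].append(key)
        target["total"] += seconds
    return shards


# Hypothetical sample rows, one per individual test run.
rows = [
    {"file": "test_a.py", "classname": "TestFoo", "name": "test_one", "time": 1.5},
    {"file": "test_a.py", "classname": "TestFoo", "name": "test_two", "time": 2.5},
    {"file": "test_b.py", "classname": "TestBar", "name": "test_x", "time": 3.0},
    {"file": "test_c.py", "classname": "TestBaz", "name": "test_y", "time": 0.5},
]

summary = summarize_by_class(rows)
shards = assign_shards(summary, num_shards=2)
for index, shard in enumerate(shards):
    print(index, shard["total"], shard["classes"])
```

With these sample rows, TestFoo totals 4.0 seconds and lands alone on the first shard, while TestBar and TestBaz share the second shard for a combined 3.5 seconds.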

The benchmark database

The benchmark database holds all benchmark and metric data.
