Commit 8e4d3e7

Merge pull request #899 from rylev/schema-docs
Start documenting schema
2 parents feee9ec + b550d9a commit 8e4d3e7

1 file changed

+167
-0
lines changed

database/schema.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
# Schema

Below is an explanation of the current database schema. This schema is duplicated across the (currently) two database backends we support: sqlite and postgres.


## Overview

In general, the database is used to track three groups of things:
* Performance run statistics (e.g., instruction count) on a per benchmark, profile, and cache-state basis.
* Self profile data gathered with `-Zself-profile`.
* State when running GitHub bots and the performance runs (e.g., how long it took for a performance suite to run, errors encountered along the way, etc.).

Below are some diagrams showing the basic layout of the database schema for these three uses:

### Performance run statistics

```
  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
  │ benchmarks    │  │ collections   │  │ artifacts     │
  ├───────────────┤  ├───────────────┤  ├───────────────┤
┌►│ id *          │  │ id *          │◄┐│ id *          │◄┐
│ │ name          │  │ commit        │ ││ name          │ │
│ │ runs_on_stable│  │               │ ││ date          │ │
│ │               │  │               │ ││ type          │ │
│ └───────────────┘  └───────────────┘ │└───────────────┘ │
│                                      │                  │
│                                      │                  │
│ ┌───────────────┐  ┌───────────────┐ │                  │
│ │ pstat_series  │  │ pstats        │ │                  │
│ ├───────────────┤  ├───────────────┤ │                  │
│ │ id *          │◄┐│ id *          │ │                  │
└─┤ benchmark_id  │ └┤ series_id     │ │                  │
  │ profile       │  │ artifact_id   ├─┼──────────────────┘
  │ cache         │  │ collection_id ├─┘
  │ statistic     │  │ value         │
  └───────────────┘  └───────────────┘
```
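
As a rough illustration of how these tables join together, the following Python sketch builds an in-memory SQLite database and reads one statistic back along with its context. The column names follow the sample query output shown in the Tables section (`series`, `aid`, `cid`, and so on), which differ slightly from the labels in the diagram. The column types and key constraints are assumptions, not the project's actual DDL, and `build_example_db` and `stats_for_artifact` are hypothetical helper names.

```python
import sqlite3

# Assumed DDL: column names mirror the sample query output in this document;
# types and key constraints are guesses for illustration only.
SCHEMA = """
CREATE TABLE artifact (id INTEGER PRIMARY KEY, name TEXT, date INTEGER, type TEXT);
CREATE TABLE collection (id INTEGER PRIMARY KEY, perf_commit TEXT);
CREATE TABLE pstat_series (
    id INTEGER PRIMARY KEY, crate TEXT, profile TEXT, cache TEXT, statistic TEXT
);
CREATE TABLE pstat (
    series INTEGER REFERENCES pstat_series(id),
    aid INTEGER REFERENCES artifact(id),
    cid INTEGER REFERENCES collection(id),
    value REAL
);
"""


def build_example_db():
    """Create an in-memory database holding the sample rows from this document."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO artifact VALUES (1, 'LOCAL_TEST', NULL, 'release')")
    conn.execute(
        "INSERT INTO collection VALUES (1, 'd9fd96f409a15429757030f225b082744a72516c')"
    )
    conn.execute(
        "INSERT INTO pstat_series VALUES (1, 'helloworld', 'check', 'full', 'task-clock:u')"
    )
    conn.execute("INSERT INTO pstat VALUES (1, 1, 1, 24.93)")
    return conn


def stats_for_artifact(conn, artifact_name):
    """Return (crate, profile, cache, statistic, value) rows for one artifact."""
    query = """
        SELECT s.crate, s.profile, s.cache, s.statistic, p.value
        FROM pstat AS p
        JOIN pstat_series AS s ON s.id = p.series
        JOIN artifact AS a ON a.id = p.aid
        WHERE a.name = ?
    """
    return conn.execute(query, (artifact_name,)).fetchall()


if __name__ == "__main__":
    print(stats_for_artifact(build_example_db(), "LOCAL_TEST"))
```

Each `pstat` row only stores ids, so any human-readable view of a measurement needs the joins to `pstat_series` and `artifact` shown here.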

### Self profile data

**TODO**

### Miscellaneous State

**TODO**

## Tables

### benchmark

The different types of benchmarks that are run.

The table stores the name of the benchmark as well as whether it is capable of being run using the stable compiler. The benchmark name is used as a foreign key in many of the other tables.

```
sqlite> select * from benchmark limit 1;
name        stabilized
----------  ----------
helloworld  0
```

### artifact

A description of a rustc compiler artifact being benchmarked.

This description includes:
* name: usually a commit sha or a tag like "1.51.0", but it is free-form text and so can be anything.
* date: the date associated with this compiler artifact (usually present only when the name is a commit).
* type: currently one of "master" (i.e., we're testing a merge commit), "try" (someone is testing a PR), and "release" (usually a release candidate, though local compilers also get labeled like this).

```
sqlite> select * from artifact limit 1;
id          name        date        type
----------  ----------  ----------  ----------
1           LOCAL_TEST              release
```

### collection

A "collection" of benchmarks that differ only by the statistic collected.

This is a way to group statistics together, signifying that they belong to the same logical benchmark run.

Currently the collection also records the git sha of the collector binary running at the time.

```
sqlite> select * from collection limit 1;
id          perf_commit
----------  ----------------------------------------
1           d9fd96f409a15429757030f225b082744a72516c
```

### pstat_series

A unique combination of crate, profile, cache, and statistic.

* crate: the benchmarked crate, which might be a crate from crates.io or a crate made specifically to stress some part of the compiler.
* profile: what type of compilation is happening: check build, optimized build (a.k.a. release build), debug build, or doc build.
* cache: how much of the incremental cache is full. An empty incremental cache means that the compiler must do a full build.
* statistic: the type of stat being collected (e.g., `task-clock:u`).

```
sqlite> select * from pstat_series limit 1;
id          crate       profile     cache       statistic
----------  ----------  ----------  ----------  ------------
1           helloworld  check       full        task-clock:u
```
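
Because each series is a unique combination of these four fields, a collector can look a series up by its fields and create it only when it is missing. Below is a minimal Python sketch of that pattern. It assumes a `UNIQUE` constraint over the four columns (the real DDL may differ), and `open_db` and `series_id` are hypothetical helper names.

```python
import sqlite3


def open_db():
    """In-memory table with an assumed UNIQUE constraint over the four fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE pstat_series (
            id INTEGER PRIMARY KEY,
            crate TEXT NOT NULL,
            profile TEXT NOT NULL,
            cache TEXT NOT NULL,
            statistic TEXT NOT NULL,
            UNIQUE (crate, profile, cache, statistic)
        )
        """
    )
    return conn


def series_id(conn, crate, profile, cache, statistic):
    """Hypothetical helper: return the id of a series, inserting it if absent."""
    key = (crate, profile, cache, statistic)
    # INSERT OR IGNORE is a no-op when a row with the same four fields exists.
    conn.execute(
        "INSERT OR IGNORE INTO pstat_series (crate, profile, cache, statistic) "
        "VALUES (?, ?, ?, ?)",
        key,
    )
    row = conn.execute(
        "SELECT id FROM pstat_series "
        "WHERE crate = ? AND profile = ? AND cache = ? AND statistic = ?",
        key,
    ).fetchone()
    return row[0]
```

Calling `series_id` twice with the same four values returns the same id, so every measurement of that combination ends up attached to a single series row.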

### pstat

A statistic that is unique to a pstat_series, artifact and collection.

This stat is unique across a benchmarked crate, profile, cache state, statistic, rustc artifact, and benchmark "collection".

```
sqlite> select * from pstat limit 1;
series      aid         cid         value
----------  ----------  ----------  ----------
1           1           1           24.93
```
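
One way to picture the uniqueness described above is as a composite key over `(series, aid, cid)`. The Python sketch below assumes such a primary key (the document does not show the actual constraint) and uses hypothetical helpers `open_db` and `record_stat`; a second value for the same series, artifact, and collection is rejected rather than stored.

```python
import sqlite3


def open_db():
    """In-memory pstat table keyed by (series, aid, cid); the key is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE pstat (
            series INTEGER NOT NULL,
            aid INTEGER NOT NULL,
            cid INTEGER NOT NULL,
            value REAL NOT NULL,
            PRIMARY KEY (series, aid, cid)
        )
        """
    )
    return conn


def record_stat(conn, series, aid, cid, value):
    """Hypothetical helper: store one value; return False if the key already exists."""
    try:
        conn.execute(
            "INSERT INTO pstat (series, aid, cid, value) VALUES (?, ?, ?, ?)",
            (series, aid, cid, value),
        )
    except sqlite3.IntegrityError:
        # The (series, aid, cid) combination has already been recorded.
        return False
    return True
```

Under this assumption, re-running the same benchmark for the same artifact would need a new collection id to record a second value.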


### self_profile_query_series

**TODO**

### self_profile_query

**TODO**

### pull_request_builds

**TODO**

### artifact_collection_duration

Records how long benchmarking takes in seconds.

```
sqlite> select * from artifact_collection_duration limit 1;
aid         date_recorded  duration
----------  -------------  ----------
1           1625829965     4
```
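
The sample row is easier to read once its fields are converted. The sketch below assumes `date_recorded` is a Unix timestamp in seconds (the document does not state this, but the value is consistent with it); `describe_duration` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def describe_duration(aid, date_recorded, duration):
    """Format one artifact_collection_duration row.

    Assumes date_recorded is seconds since the Unix epoch (UTC) and that
    duration is in seconds, as stated in the table description.
    """
    recorded = datetime.fromtimestamp(date_recorded, tz=timezone.utc)
    took = timedelta(seconds=duration)
    return f"artifact {aid}: recorded {recorded.isoformat()}, took {took}"


if __name__ == "__main__":
    print(describe_duration(1, 1625829965, 4))
```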

### collector_progress

Keeps track of the collector's start and finish time as well as which step it's currently on.

```
sqlite> select * from collector_progress limit 1;
aid         step        start       end
----------  ----------  ----------  ----------
1           helloworld  1625829961  1625829965
```
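
Since each row carries a start and an end time, the time spent per step can be derived with a query. Below is a minimal Python sketch. It assumes `start` and `end` are Unix timestamps in seconds and that `end` is NULL while a step is still running (both are assumptions about the real schema); `open_db` and `step_durations` are hypothetical helper names.

```python
import sqlite3


def open_db():
    """In-memory collector_progress table holding the sample row from this document."""
    conn = sqlite3.connect(":memory:")
    # "end" is an SQL keyword, so it is quoted wherever it is used as a column name.
    conn.execute(
        'CREATE TABLE collector_progress (aid INTEGER, step TEXT, start INTEGER, "end" INTEGER)'
    )
    conn.execute(
        "INSERT INTO collector_progress VALUES (1, 'helloworld', 1625829961, 1625829965)"
    )
    return conn


def step_durations(conn, aid):
    """Return (step, seconds) for each finished step of one artifact.

    Assumes start/end are Unix timestamps in seconds and that an unfinished
    step has a NULL end; both are assumptions about the real schema.
    """
    query = """
        SELECT step, "end" - start
        FROM collector_progress
        WHERE aid = ? AND "end" IS NOT NULL
        ORDER BY start
    """
    return conn.execute(query, (aid,)).fetchall()
```

For the sample row this gives 4 seconds for the `helloworld` step, which matches the `duration` shown for the same artifact in `artifact_collection_duration`.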

### rustc_compilation

**TODO**

### error_series

**TODO**

### error

**TODO**
