Add some basic docs #895

rylev · 2021-07-07T16:02:51Z

Starting to add some docs to make onboarding into the codebase a bit easier.

rylev · 2021-07-07T16:04:46Z

site/src/load.rs

 #[derive(Debug, Deserialize)]
 pub struct Config {
    pub keys: Keys,
 }

+/// Site context object that contains global data
 pub struct InputData {


We might want to consider renaming this. InputData is descriptive enough to make me think it's something specific but not descriptive enough to know what that might be.

Considering it's sort of a collection of cached items we just want to keep around, something purposefully vague like SiteCtx might be better.

It's also interesting that it's referred to as Db elsewhere in the code (presumably because it has the pool and cached index.

Happy to see this renamed -- I didn't ever come up with a nice name for it. I think SiteCtx is not a bad idea.

rylev · 2021-07-07T16:05:45Z

site/src/load.rs

 #[derive(Debug, Default, Deserialize)]
 pub struct Keys {
+    /// GitHub API token from the `GITHUB_API_TOKEN` env variable
    pub github: Option<String>,


These fields might have better names like github_api_token and github_webhook_secret.

rylev · 2021-07-07T16:08:03Z

database/src/lib.rs

@@ -434,11 +434,11 @@ pub struct LabelId(pub u8, pub u32);
 #[derive(Serialize, Deserialize, Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
 pub struct ArtifactIdNumber(pub u32);

+/// Cached results of various queries.
 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 pub struct Index {
    commits: Indexed<Commit>,


I'd like to add comments to these fields, but I'm still not exactly sure what they are. For instance, commits seems to be from the artifact table with type master or try. I'm not exactly sure what that should be.

Can you say more about your confusion? These fields are intended to be essentially duplicates of the various tables (artifacts, error_series, pstat_series, etc.) so that we can avoid a network round-trip in the case of Postgres when fetching the ID used in the "main" table.

To some extent I think the presence of these is mostly an artifact of not wanting to write too much SQL -- it's much easier to have ad-hoc searching in Rust code than threading through SQL queries. In practice IIRC it was also a little faster, but whether that's still true or if I'm remembering right I'm less sure.

My comment is a bit old as I know understand more about the structure. Perhaps we should do a short write up on the database structure. We could then refer to that here.

rylev · 2021-07-07T16:39:32Z

collector/src/lib.rs

@@ -15,24 +15,33 @@ pub use self_profile::{QueryData, SelfProfile};
 #[derive(Debug, Copy, Clone, PartialEq, PartialOrd, Deserialize)]
 pub struct DeltaTime(#[serde(with = "round_float")] pub f64);

+/// The bound of a range changes in codebase


Bound doesn't seem to always be a bound in the true sense. If it's a Commit we're usually looking for exact matches. I'm not sure what Bound is really supposed to be.

It is supposed to be "exact" in the sense of identifying a particular commit in our storage, at least at the time of access.

Commits hashes are easy, obviously, though the string in Commit(String) can also be the artifact ID (e.g., 1.53.0 or whatever the user provides on the command line when recording data locally).

rylev · 2021-07-07T16:40:04Z

collector/src/lib.rs

    pub fn left_match(&self, commit: &Commit) -> bool {
-        let last_month = chrono::Utc::now().date().naive_utc() - chrono::Duration::days(30);
        match self {
            Bound::Commit(sha) => commit.sha == **sha,


Here, for example, we don't treat the Bound as a bound but rather an exact match.

Yeah, I think the thing that might be missing in the intuition is that for the date bounds, we don't want an exact match because there may not be commits on that day, but you still want to have the query for e.g. the last month or year of data to work, not error out saying "no commits on 2021-01-01" or something like that.

With commits there's not really such a problem; it doesn't make sense to have a partial match there as much.

collector/src/lib.rs

rylev · 2021-07-07T16:44:44Z

site/src/selector.rs

+///
+/// Searches in the indexed commits either from the right or left
+/// If not found in the indexed commits, searches through the artifacts cache
+pub fn data_for(data: &Index, is_left: bool, bound: Bound) -> Option<ArtifactId> {


This function was hard for me to grok at first. I mean it's simple to understand, but every time I came across it, I'd have to reread the entire body because the name didn't help me at all to remember what it does.

As always, happy to make any changes -- I've never had good experiences naming things in rustc-perf.

rylev · 2021-07-08T15:38:14Z

site/src/comparison.rs

@@ -257,18 +259,30 @@ pub async fn compare_given_commits(
    }))
 }

-/// Data associated with a specific date
+/// Data associated with a specific artifact
 #[derive(Debug, Clone, Serialize)]
 pub struct DateData {


This should probably be renamed ArtifactData as this is for a given artifact which may or may not have a date associated with it.

Seems reasonable. It was named prior to the notion of artifacts -- and "CommitData" was already used for something else I think at that time...

rylev · 2021-07-08T15:40:10Z

site/src/comparison.rs

 #[derive(Debug, Clone, Serialize)]
 pub struct DateData {
+    /// The artifact in question
+    pub commit: String,


This should probably be artifact: ArtifactId since commit is wrong (it might be an artifact "tag" and not a commit sha) and there's no need for it to be a string at this point.

rylev · 2021-07-08T15:41:46Z

site/src/comparison.rs

+    /// For the given `ArtifactId`, consume the first datapoint in each of the given `SeriesResponse`
+    ///
+    /// It is assumed that the provided ArtifactId is the same as artifact id returned as the next data
+    /// point from all of the series `SeriesResponse`s. If this is not true, this function will panic.
    async fn consume_one<'a, T>(


I finally understand why this is called consume_one but I don't really like the name....

Mark-Simulacrum · 2021-07-08T19:21:49Z

database/src/lib.rs

+/// The incremental cache state
+///
+/// These are usually reported to users in a "flipped" way. For example,
+/// `Cache::Empty` means we're doing a "full" build. We present this to users as "full".


FWIW it would be fine by me to rename the enum variants here -- they were named when we were trying different things but it's no big deal to rename if needed.

Mark-Simulacrum · 2021-07-09T00:22:12Z

site/src/comparison.rs

+    ///
+    /// * $profile refers to the flavor of compilation like debug, doc, opt(timized), etc.
+    /// * $cache refers to how much of the compilation must be done and how much is cached
+    /// (e.g., "incr-unchanged" == compiling with full incremental cache and no code having changed)


It's plausible we could have non-String data here too. IIRC it's like this because making sure all the JS code is up to date when changing the Rust serialization was a bit of a pain - and Serde can be a bit picky in terms of serializing enums and struct nicely when they're nested inside vecs/hashmaps etc

Mark-Simulacrum · 2021-07-09T00:27:16Z

Left comments on most of your comments - largely I think everything looks good in terms of existing docs, don't think I had any concerns.

rylev · 2021-07-09T08:14:23Z

I'm going to merge this since there's no real concerns. I'll submit a small PR renaming things. I definitely feel much better about the code now. As far as documentation goes, I think the only other major thing I'd like to do is to document the database structure. Other changes can just be made along the way.

Add some basic docs

368af5b

rylev requested a review from Mark-Simulacrum July 7, 2021 16:02

rylev commented Jul 7, 2021

View reviewed changes

More docs

2370f97

rylev commented Jul 7, 2021

View reviewed changes

collector/src/lib.rs Show resolved Hide resolved

rylev commented Jul 7, 2021

View reviewed changes

rylev added 2 commits July 7, 2021 18:54

More docs

7f2f18e

More docs

c8fcecc

rylev commented Jul 8, 2021

View reviewed changes

More docs

a038e11

Mark-Simulacrum reviewed Jul 8, 2021

View reviewed changes

Mark-Simulacrum reviewed Jul 9, 2021

View reviewed changes

rylev merged commit 2132441 into rust-lang:master Jul 9, 2021

rylev deleted the add-docs1 branch July 9, 2021 08:14

rylev mentioned this pull request Jul 9, 2021

Some Refactorings #898

Merged

Add some basic docs #895

Add some basic docs #895

Uh oh!

Conversation

rylev commented Jul 7, 2021

Uh oh!

rylev Jul 7, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Mark-Simulacrum commented Jul 9, 2021

Uh oh!

rylev commented Jul 9, 2021

Uh oh!

Uh oh!

rylev Jul 7, 2021 •

edited

Loading