Commit 2b701ef

Merge pull request #60 from jplatte/typos-wording
Fix typos, improve wording in latest blog post
2 parents 2182e7e + 84ca9c5 commit 2b701ef

4 files changed, +40 -40 lines changed

blog/_posts/2020-07-20-three-architectures-for-responsive-ide.adoc

Lines changed: 37 additions & 37 deletions
@@ -18,24 +18,24 @@ Specifically, we'll look at the backbone infrastructure of an IDE which serves t

 == Map Reduce

-The first architecture is reminiscent of map-reduce paradigm.
+The first architecture is reminiscent of the map-reduce paradigm.
 The idea is to split analysis into relatively simple indexing phase, and a separate full analysis phase.

 The core constraint of indexing is that it runs on a per-file basis.
-The indexer takes a text of a single file, parses it, and spits out some data about the file.
+The indexer takes the text of a single file, parses it, and spits out some data about the file.
 The indexer can't touch other files.

 Full analysis can read other files, and it leverages information from the index to save work.

 This all sounds way too abstract, so let's look at a specific example -- Java.
 In Java, each file starts with a package declaration.
 The indexer concatenates the name of the package with a class name to get a fully-qualified name (FQN).
-It also collects the set of method declared in the class, the list of superclasses and interfaces, etc.
+It also collects the set of methods declared in the class, the list of superclasses and interfaces, etc.

 Per-file data is merged into an index which maps FQNs to classes.
 Note that constructing this mapping is an embarrassingly parallel task -- all files are parsed independently.
 Moreover, this map is cheap to update.
-When a file change arrives, this file's contribution from the index is removed, the text of file is changed and the indexer runs on the new text and adds new contributions.
+When a file change arrives, this file's contribution from the index is removed, the text of the file is changed and the indexer runs on the new text and adds the new contributions.
 The amount of work to do is proportional to the number of changed files, and is independent from the total number of files.

 Let's see how FQN index can be used to quickly provide completion.
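
The FQN index described in this hunk could be sketched roughly as follows. This is only an illustration in Rust, not code from IntelliJ, Sorbet or rust-analyzer: the names `ClassStub`, `FqnIndex`, `update_file`, `lookup` and `stub` are hypothetical, and return types are kept as plain unresolved strings.

```rust
use std::collections::HashMap;

/// Per-file indexer output for one class (hypothetical shape).
/// Return types stay as unresolved strings.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassStub {
    pub fqn: String,
    /// Method name -> unresolved return type name.
    pub methods: HashMap<String, String>,
}

/// Index mapping FQNs to class stubs, remembering which file contributed what.
#[derive(Default)]
pub struct FqnIndex {
    by_file: HashMap<String, Vec<ClassStub>>,
    classes: HashMap<String, ClassStub>,
}

impl FqnIndex {
    /// Replaces one file's contribution and reports whether the index changed.
    /// The work depends only on this file's stubs, not on the project size.
    pub fn update_file(&mut self, path: &str, stubs: Vec<ClassStub>) -> bool {
        // Same stubs as before: nothing to do.
        if self.by_file.get(path) == Some(&stubs) {
            return false;
        }
        // Remove the old keys contributed by this file...
        if let Some(old) = self.by_file.remove(path) {
            for stub in &old {
                self.classes.remove(&stub.fqn);
            }
        }
        // ...and add the new ones.
        for stub in &stubs {
            self.classes.insert(stub.fqn.clone(), stub.clone());
        }
        self.by_file.insert(path.to_string(), stubs);
        true
    }

    pub fn lookup(&self, fqn: &str) -> Option<&ClassStub> {
        self.classes.get(fqn)
    }
}

/// Convenience constructor used for illustration only.
pub fn stub(fqn: &str, methods: &[(&str, &str)]) -> ClassStub {
    ClassStub {
        fqn: fqn.to_string(),
        methods: methods
            .iter()
            .map(|(name, ret)| (name.to_string(), ret.to_string()))
            .collect(),
    }
}
```

Calling `update_file` again with an identical stub list returns `false`, which corresponds to the case where an edit does not change the file's stub.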
@@ -72,27 +72,27 @@ public class Main {

 The user has just typed `Foo.f().`, and we need to figure out that the type of receiver expression is `Bar`, and suggest `g` as a completion.

-First, as the `Main.java` file is modified, we run the indexer on this single file.
-Nothing has changed (the file still contains `Main` class with static `main` method), so we don't need to update FQN index.
+First, as the file `Main.java` is modified, we run the indexer on this single file.
+Nothing has changed (the file still contains the class `Main` with a static `main` method), so we don't need to update the FQN index.

-Next, we need to resolve `Foo` name.
-We parse the file, notice an `import` and lookup `mypackage.Foo` in the FQN index.
+Next, we need to resolve the name `Foo`.
+We parse the file, notice an `import` and look up `mypackage.Foo` in the FQN index.
 In the index, we also find that `Foo` has a static method `f`, so we resolve the call as well.
 The index also stores the return type of `f`, but, and this is crucial, it stores it as a string `"Bar"`, and not as a direct reference to the class `Bar`.

 The reason for that is `+import java.util.*+` in `Foo.java`.
 `Bar` can refer either to `java.util.Bar` or to `mypackage.Bar`.
-The indexer doesn't know which one, because it can look *only* at the text of the `Foo.java`.
+The indexer doesn't know which one, because it can look *only* at the text of `Foo.java`.
 In other words, while the index does store the return types of methods, it stores them in an unresolved form.

-The next step is to resolve `Bar` identifier in the context of `Foo.java` file.
-This uses FQN index, and lands into the `mypackage.Bar` class.
-There the desired `g` method is found.
+The next step is to resolve the identifier `Bar` in the context of `Foo.java`.
+This uses the FQN index, and lands in the class `mypackage.Bar`.
+There the desired method `g` is found.

-All together, only three files were touched during completion.
-FQN index allowed to completely ignore all the other files in the project.
+Altogether, only three files were touched during completion.
+The FQN index allowed us to completely ignore all the other files in the project.

-One problem with the approach described thus far is that resolving types from the index requires non-trivial amount of work.
+One problem with the approach described thus far is that resolving types from the index requires a non-trivial amount of work.
 This work might be duplicated if, for example, `Foo.f` is called several times.
 The fix is to add a cache.
 Name resolution results are memoized, so that the cost is paid only once.
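
The lazy, memoized resolution step from this hunk might look something like the sketch below. It is a simplified illustration in Rust under stated assumptions: `FileScope`, `Resolver`, `resolve` and `invalidate` are hypothetical names, the set of known FQNs stands in for the FQN index, and the lookup rule (own package first, then wildcard imports in order) is a deliberate simplification of real Java name resolution.

```rust
use std::collections::{HashMap, HashSet};

/// What the resolver knows about one file (hypothetical shape).
pub struct FileScope {
    pub package: String,
    /// Wildcard-imported packages, e.g. "java.util" for `import java.util.*`.
    pub wildcard_imports: Vec<String>,
}

pub struct Resolver {
    /// All known FQNs; stands in for the FQN index.
    known: HashSet<String>,
    /// (file, simple name) -> resolution result, filled lazily.
    cache: HashMap<(String, String), Option<String>>,
    /// Number of cache misses, kept only to make the memoization observable.
    pub misses: usize,
}

impl Resolver {
    pub fn new(known: &[&str]) -> Self {
        Resolver {
            known: known.iter().map(|s| s.to_string()).collect(),
            cache: HashMap::new(),
            misses: 0,
        }
    }

    /// Resolves a simple name in the context of a file, paying the cost once.
    pub fn resolve(&mut self, file: &str, scope: &FileScope, name: &str) -> Option<String> {
        let key = (file.to_string(), name.to_string());
        if let Some(hit) = self.cache.get(&key) {
            return hit.clone();
        }
        self.misses += 1;
        // Simplified rule: try the file's own package, then each wildcard import.
        let result = std::iter::once(&scope.package)
            .chain(scope.wildcard_imports.iter())
            .map(|pkg| format!("{}.{}", pkg, name))
            .find(|fqn| self.known.contains(fqn));
        self.cache.insert(key, result.clone());
        result
    }

    /// Drops all memoized results; called on every change.
    pub fn invalidate(&mut self) {
        self.cache.clear();
    }
}
```

Resolving `Bar` twice in the context of `Foo.java` performs the lookup once; after `invalidate`, the next call recomputes it from scratch.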
@@ -103,28 +103,28 @@ To sum up, the first approach works like this:
 . Each file is being indexed, independently and in parallel, producing a "stub" -- a set of visible top-level declarations, with unresolved types.
 . All stubs are merged into a single index data structure.
 . Name resolution and type inference work primarily off the stubs.
-. Name resolution is lazy (we only resolved type from the stub when we need it) and memoized (each type is resolved only once).
+. Name resolution is lazy (we only resolve a type from the stub when we need it) and memoized (each type is resolved only once).
 . The caches are completely invalidated on every change
 . The index is updated incrementally:
-* if the edit doesn't change file's stub, no change to the index is required.
+* if the edit doesn't change the file's stub, no change to the index is required.
 * otherwise, old keys are removed and new keys are added

-Note an interesting interplay between "dumb" indexes which can be incrementally updated, and "smart" caches, which are re-computed from scratch.
+Note an interesting interplay between "dumb" indexes which can be updated incrementally, and "smart" caches, which are re-computed from scratch.

 This approach combines simplicity and stellar performance.
 The bulk of work is the indexing phase, and you can parallelize and even distribute it across several machine.
-The two example of this architecture are https://www.jetbrains.com/idea/[IntelliJ] and https://sorbet.org/[Sorbet].
+Two examples of this architecture are https://www.jetbrains.com/idea/[IntelliJ] and https://sorbet.org/[Sorbet].

 The main drawback of this approach is that it works only when it works -- not every language has a well-defined FQN concept.
-I think overall it's a good idea to design name resolution and module system (mostly boring parts of the language) such that they work well with map-reduce paradigm.
+I think overall it's a good idea to design name resolution and module systems (mostly boring parts of a language) such that they work well with the map-reduce paradigm.

-* Require `package` declarations or infer them from a file-system layout
+* Require `package` declarations or infer them from the file-system layout
 * Forbid meta-programming facilities which add new top-level declarations, or restrict them in such way that they can be used by the indexer.
 For example, preprocessor-like compiler plugins that access a single file at a time might be fine.
 * Make sure that each source element corresponds to a single semantic element.
 For example, if the language supports conditional compilation, make sure that it works during name resolution (like Kotlin's https://kotlinlang.org/docs/reference/platform-specific-declarations.html[expect/actual]) and not during parsing (like conditional compilation in most other languages).
 Otherwise, you'd have to index the same file with different conditional compilation settings, and that is messy.
-* Make sure that FQN are enough for most of the name resolution.
+* Make sure that FQNs are enough for most of the name resolution.

 The last point is worth elaborating. Let's look at the following Rust example:

@@ -158,12 +158,12 @@ However, to make sure that `s.f` indeed refers to `f` from `T`, we also need to
 The second approach places even more restrictions on the language.
 It requires:

-* "declaration before use" rule,
+* a "declaration before use" rule,
 * headers or equivalent interface files.

 Two such languages are {cpp} and OCaml.

-The idea of the approach is simple -- just use traditional compiler, by snapshotting its state immediately after imports for each compilation unit.
+The idea of the approach is simple -- just use a traditional compiler and snapshot its state immediately after imports for each compilation unit.
 An example:

 [source,c++]
@@ -175,11 +175,11 @@ void main() {
 }
 ----

-Here, the compiler fully processes `iostream` (and any further headers it includes), snapshots its state and proceeds with parsing the program itself.
+Here, the compiler fully processes `iostream` (and any further headers included), snapshots its state and proceeds with parsing the program itself.
 When the user types more characters, the compiler restarts from the point just after the include.
 As the size of each compilation unit itself is usually reasonable, the analysis is fast.

-If the user types something into the header file than the caches need to be invalidated.
+If the user types something into the header file, then the caches need to be invalidated.
 However, changes to headers are comparatively rare, most of the code lives in `.cpp` files.

 In a sense, headers correspond to the stubs of the first approach, with two notable differences:
@@ -191,7 +191,7 @@ In a sense, headers correspond to the stubs of the first approach, with two nota
 The two examples of this approach are https://github.com/ocaml/merlin[Merlin] of OCaml and https://clangd.llvm.org/[clangd].

 The huge benefit of this approach is that it allows re-use of an existing batch compiler.
-Two other approaches described in the article typically result in compiler re-writes.
+The two other approaches described in this article typically result in compiler re-writes.
 The drawback is that almost nobody likes headers and forward declarations.


@@ -231,8 +231,8 @@ bitflags! {
 ----

 `bitflags` is macro which comes from another crate and defines a top-level declaration.
-We can't put the results of macro expansion into the index, because it depends on macro definition in another file.
-We can put macro call itself into an index, but that is mostly useless, as the items, declared by the macro, would miss the index.
+We can't put the results of macro expansion into the index, because it depends on a macro definition in another file.
+We can put the macro call itself into an index, but that is mostly useless, as the items, declared by the macro, would miss the index.

 Here's another one:

@@ -245,14 +245,14 @@ mod bar;
 ----

 Modules `foo` and `bar` refer to the same file, `foo.rs`, which effectively means that items from `foo.rs` are duplicated.
-If `foo.rs` contains `struct S;` declaration, than `foo::S` and `bar::S` are different types.
+If `foo.rs` contains the declaration `struct S;`, then `foo::S` and `bar::S` are different types.
 You also can't fit that into an index, because those `mod` declarations are in a different file.

 The second approach doesn't work either.
 In {cpp}, the compilation unit is a single file.
 In Rust, the compilation unit is a whole crate, which consists of many files and is typically much bigger.
-And Rust has procedural macros, which means that even surface analysis of code can take unbounded amount of time.
-And there are no header files, so IDE has to process the whole crate.
+And Rust has procedural macros, which means that even surface analysis of code can take an unbounded amount of time.
+And there are no header files, so the IDE has to process the whole crate.
 Additionally, intra-crate name resolution is much more complicated (declaration before use vs. fixed point iteration intertwined with macro expansion).

 It seems that purely laziness based models do not work for Rust.
@@ -262,22 +262,22 @@ For this reason, in rust-analyzer we resort to a smart solution.
 We compensate for the deficit of laziness with incrementality.
 Specifically, we use a generic framework for incremental computation -- https://github.com/salsa-rs/salsa[salsa].

-The idea behind salsa is rather simple -- all function calls inside the compiler are instrumented to record which other functions were called during execution.
+The idea behind salsa is rather simple -- all function calls inside the compiler are instrumented to record which other functions were called during their execution.
 The recorded traces are used to implement fine-grained incrementality.
 If after modification the results of all of the dependencies are the same, the old result is reused.


 There's also an additional, crucial, twist -- if a function is re-executed due to a change in dependency, the new result is compared with the old one.
 If despite a different input they are the same, the propagation of invalidation stops.

-Using this engine, we were able to implement rather fancy update strategy.
-Unlike map reduce approach, our indices can store resolved types, which are invalidated only when top-level change occurs.
+Using this engine, we were able to implement a rather fancy update strategy.
+Unlike the map reduce approach, our indices can store resolved types, which are invalidated only when a top-level change occurs.
 Even after a top-level change, we are able to re-use results of most macro expansions.
-And typing inside top-level macro also doesn't invalidate caches unless the expansion of the macro introduces a different set of items.
+And typing inside of a top-level macro also doesn't invalidate caches unless the expansion of the macro introduces a different set of items.

 The main benefit of this approach is generality and correctness.
 If you have an incremental computation engine at your disposal, it becomes relatively easy to experiment with the way you structure the computation.
-The code looks mostly like a boring imperative compiler, and you are immune from cache invalidation bugs (we had one, due to procedural macro being non-deterministic).
+The code looks mostly like a boring imperative compiler, and you are immune to cache invalidation bugs (we had one, due to procedural macros being non-deterministic).

 The main drawback is extra complexity, slower performance (fine-grained tracking of dependencies takes time and memory) and a feeling that this is a somewhat uncharted territory yet :-)
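
The "crucial twist" in this hunk (stop propagating invalidation when a re-executed function returns the same result) can be illustrated with a hand-rolled toy. The sketch below does not use salsa and does not reflect its API; `Db`, `set_text`, `compute_items`, `analysis` and `analysis_runs` are hypothetical names, and "top-level items" are approximated by lines starting with `fn`.

```rust
/// Toy two-stage pipeline: text -> top-level item names -> "expensive" analysis.
pub struct Db {
    text: String,
    text_changed: bool,
    /// Memoized result of the first stage.
    items: Option<Vec<String>>,
    /// Memoized result of the second stage.
    analysis: Option<usize>,
    /// How often the second stage really ran; kept only for observation.
    pub analysis_runs: usize,
}

impl Db {
    pub fn new(text: &str) -> Self {
        Db {
            text: text.to_string(),
            text_changed: true,
            items: None,
            analysis: None,
            analysis_runs: 0,
        }
    }

    pub fn set_text(&mut self, text: &str) {
        if self.text != text {
            self.text = text.to_string();
            self.text_changed = true;
        }
    }

    /// First stage: names of lines that look like `fn name(...)`.
    fn compute_items(text: &str) -> Vec<String> {
        text.lines()
            .filter_map(|line| line.trim().strip_prefix("fn "))
            .map(|rest| rest.split('(').next().unwrap_or("").trim().to_string())
            .collect()
    }

    /// Second stage: stands in for costly work that depends only on the items.
    pub fn analysis(&mut self) -> usize {
        if self.text_changed || self.items.is_none() {
            let new_items = Self::compute_items(&self.text);
            self.text_changed = false;
            // Compare the new result with the old one: if the set of items
            // is unchanged, the cached analysis stays valid.
            if self.items.as_ref() != Some(&new_items) {
                self.items = Some(new_items);
                self.analysis = None;
            }
        }
        if let Some(cached) = self.analysis {
            return cached;
        }
        self.analysis_runs += 1;
        let result = self.items.as_ref().map_or(0, |items| items.len());
        self.analysis = Some(result);
        result
    }
}
```

Editing a function body changes the text but not the item list, so `analysis_runs` stays the same; adding a new `fn` line changes the item list and triggers one more run.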

thisweek/_posts/2020-07-06-changelog-32.adoc

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ This week we'd like to thank one of our individual sponsors, https://github.com/
 @anp maintains https://moxie.rs[moxie] project: a lightweight platform-agnostic declarative UI runtime, because incremental feedback latency and quality are core to building interactive software.


-**Become a sponsor:** https://opencollective.com/rust-analyzer/[opecollective.com/rust-analyzer]
+**Become a sponsor:** https://opencollective.com/rust-analyzer/[opencollective.com/rust-analyzer]

 == New Features

thisweek/_posts/2020-07-13-changelog-33.adoc

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Ferrous Systems offers advice, training, open source development, and proprietar
 This week, Ferrous Systems is running https://oxidizeconf.com/[Oxidize Global], an online conference for embedded systems in Rust!
 Thanks to Ferrous Systems for supporting open source projects like rust-analyzer!

-**Become a sponsor:** https://opencollective.com/rust-analyzer/[opecollective.com/rust-analyzer]
+**Become a sponsor:** https://opencollective.com/rust-analyzer/[opencollective.com/rust-analyzer]

 == New Features

thisweek/_posts/2020-07-20-changelog-34.adoc

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ Release: release:2020-07-20[]

 == Sponsors

-**Become a sponsor:** https://opencollective.com/rust-analyzer/[opecollective.com/rust-analyzer]
+**Become a sponsor:** https://opencollective.com/rust-analyzer/[opencollective.com/rust-analyzer]

 == New Features
