Commit 4c652be

Merge pull request #58 from matklad/three-ides: Three Architectures

= Three Architectures for a Responsive IDE
:sectanchors:
:experimental:
:page-layout: post

rust-analyzer is a new "IDE backend" for the https://www.rust-lang.org/[Rust] programming language.
Support rust-analyzer on https://opencollective.com/rust-analyzer/[Open Collective].

In this post, we'll learn how to make a snappy IDE, in three different ways :-)
It was inspired by this excellent article about using datalog for semantic analysis: https://petevilter.me/post/datalog-typechecking/
The post describes only the highest-level architecture.
There's **much** more to implementing a full-blown IDE.

Specifically, we'll look at the backbone infrastructure of an IDE which serves two goals:

* Quickly accepting new edits to source files.
* Providing type information about currently opened files for highlighting, completion, etc.

== Map Reduce

The first architecture is reminiscent of the map-reduce paradigm.
The idea is to split analysis into a relatively simple indexing phase and a separate full analysis phase.

The core constraint of indexing is that it runs on a per-file basis.
The indexer takes the text of a single file, parses it, and spits out some data about the file.
The indexer can't touch other files.

Full analysis can read other files, and it leverages information from the index to save work.

This all sounds way too abstract, so let's look at a specific example -- Java.
In Java, each file starts with a package declaration.
The indexer concatenates the name of the package with a class name to get a fully-qualified name (FQN).
It also collects the set of methods declared in the class, the list of superclasses and interfaces, etc.

Per-file data is merged into an index which maps FQNs to classes.
Note that constructing this mapping is an embarrassingly parallel task -- all files are parsed independently.
Moreover, this map is cheap to update.
When a file change arrives, this file's contribution to the index is removed, the text of the file is changed, and the indexer runs on the new text and adds the new contributions.
The amount of work to do is proportional to the number of changed files, and is independent of the total number of files.

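To make this bookkeeping concrete, here is a minimal sketch of such an index in Rust.
The names (`FqnIndex`, `update`, `lookup`) are invented for illustration; this is not real rust-analyzer code.

[source,rust]
----
use std::collections::HashMap;

type FileId = u32;

/// Global index mapping FQNs to the file that declares them.
#[derive(Default)]
struct FqnIndex {
    map: HashMap<String, FileId>,
    /// Each file's contribution, remembered so it can be removed on change.
    contributions: HashMap<FileId, Vec<String>>,
}

impl FqnIndex {
    /// Apply the result of (re)indexing one file: drop the file's old keys,
    /// add the new ones. The work is proportional to the size of one file,
    /// not of the whole project.
    fn update(&mut self, file: FileId, new_fqns: Vec<String>) {
        if let Some(old) = self.contributions.remove(&file) {
            for fqn in old {
                self.map.remove(&fqn);
            }
        }
        for fqn in &new_fqns {
            self.map.insert(fqn.clone(), file);
        }
        self.contributions.insert(file, new_fqns);
    }

    fn lookup(&self, fqn: &str) -> Option<FileId> {
        self.map.get(fqn).copied()
    }
}

fn main() {
    let mut index = FqnIndex::default();
    index.update(0, vec!["mypackage.Foo".to_string()]);
    index.update(1, vec!["mypackage.Bar".to_string()]);
    // An edit renames `Bar` to `Baz`: only file 1's contribution is redone.
    index.update(1, vec!["mypackage.Baz".to_string()]);
    assert_eq!(index.lookup("mypackage.Baz"), Some(1));
    assert_eq!(index.lookup("mypackage.Bar"), None);
    println!("ok");
}
----
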
Let's see how the FQN index can be used to quickly provide completion.

[source,java]
----
// File ./mypackage/Foo.java
package mypackage;

import java.util.*;

public class Foo {
    public static Bar f() {
        return new Bar();
    }
}

// File ./mypackage/Bar.java
package mypackage;

public class Bar {
    public void g() {}
}

// File ./Main.java
import mypackage.Foo;

public class Main {
    public static void main(String[] args) {
        Foo.f().
    }
}
----

The user has just typed `Foo.f().`, and we need to figure out that the type of the receiver expression is `Bar`, and suggest `g` as a completion.

First, as the `Main.java` file is modified, we run the indexer on this single file.
Nothing has changed (the file still contains the `Main` class with a static `main` method), so we don't need to update the FQN index.

Next, we need to resolve the name `Foo`.
We parse the file, notice an `import`, and look up `mypackage.Foo` in the FQN index.
In the index, we also find that `Foo` has a static method `f`, so we resolve the call as well.
The index also stores the return type of `f`, but, and this is crucial, it stores it as the string `"Bar"`, and not as a direct reference to the class `Bar`.

The reason for that is the `+import java.util.*+` in `Foo.java`.
`Bar` can refer either to `java.util.Bar` or to `mypackage.Bar`.
The indexer doesn't know which one, because it can look *only* at the text of `Foo.java`.
In other words, while the index does store the return types of methods, it stores them in an unresolved form.

The next step is to resolve the `Bar` identifier in the context of the `Foo.java` file.
This uses the FQN index, and lands in the `mypackage.Bar` class.
There, the desired `g` method is found.

Altogether, only three files were touched during completion.
The FQN index allowed us to completely ignore all the other files in the project.

One problem with the approach described thus far is that resolving types from the index requires a non-trivial amount of work.
This work might be duplicated if, for example, `Foo.f` is called several times.
The fix is to add a cache.
Name resolution results are memoized, so that the cost is paid only once.
The cache is blown away completely on any change -- with an index, reconstructing the cache is not that costly.

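To illustrate the index-plus-cache split, here is a toy sketch in Rust.
`Resolver`, its fields, and the resolution rule (only `package.Name` is tried; imports are ignored) are simplifications invented for this post:

[source,rust]
----
use std::collections::HashMap;

/// The "dumb" index stores return types as unresolved strings (e.g. "Bar");
/// the "smart" cache memoizes resolution and is dropped wholesale on edits.
#[derive(Default)]
struct Resolver {
    // FQN index: fully-qualified name -> class data (here, just the FQN back).
    fqn_index: HashMap<String, String>,
    // Memoized results of resolving an (unresolved name, package) pair.
    memo: HashMap<(String, String), Option<String>>,
    resolutions: u32, // instrumentation: how often we did real work
}

impl Resolver {
    /// Resolve an unresolved name like "Bar" in the context of a package.
    fn resolve(&mut self, name: &str, package: &str) -> Option<String> {
        let key = (name.to_string(), package.to_string());
        if let Some(cached) = self.memo.get(&key) {
            return cached.clone();
        }
        self.resolutions += 1;
        let fqn = format!("{package}.{name}");
        let result = self.fqn_index.get(&fqn).cloned();
        self.memo.insert(key, result.clone());
        result
    }

    /// On any edit: blow the cache away completely.
    /// The index makes recomputation cheap.
    fn on_change(&mut self) {
        self.memo.clear();
    }
}

fn main() {
    let mut r = Resolver::default();
    r.fqn_index.insert("mypackage.Bar".into(), "mypackage.Bar".into());
    // `Bar` is resolved several times; the work is paid only once.
    assert_eq!(r.resolve("Bar", "mypackage"), Some("mypackage.Bar".into()));
    assert_eq!(r.resolve("Bar", "mypackage"), Some("mypackage.Bar".into()));
    assert_eq!(r.resolutions, 1);
    r.on_change();
    r.resolve("Bar", "mypackage");
    assert_eq!(r.resolutions, 2);
    println!("resolutions = {}", r.resolutions);
}
----
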
To sum up, the first approach works like this:

. Each file is indexed, independently and in parallel, producing a "stub" -- a set of visible top-level declarations, with unresolved types.
. All stubs are merged into a single index data structure.
. Name resolution and type inference work primarily off the stubs.
. Name resolution is lazy (we only resolve a type from the stub when we need it) and memoized (each type is resolved only once).
. The caches are completely invalidated on every change.
. The index is updated incrementally:
* if the edit doesn't change the file's stub, no change to the index is required;
* otherwise, old keys are removed and new keys are added.

Note an interesting interplay between "dumb" indexes, which can be incrementally updated, and "smart" caches, which are re-computed from scratch.

This approach combines simplicity and stellar performance.
The bulk of the work is the indexing phase, and you can parallelize and even distribute it across several machines.
Two examples of this architecture are https://www.jetbrains.com/idea/[IntelliJ] and https://sorbet.org/[Sorbet].

The main drawback of this approach is that it works only when it works -- not every language has a well-defined FQN concept.
I think overall it's a good idea to design name resolution and the module system (mostly boring parts of the language) such that they work well with the map-reduce paradigm:

* Require `package` declarations, or infer them from the file-system layout.
* Forbid meta-programming facilities which add new top-level declarations, or restrict them in such a way that they can be used by the indexer.
For example, preprocessor-like compiler plugins that access a single file at a time might be fine.
* Make sure that each source element corresponds to a single semantic element.
For example, if the language supports conditional compilation, make sure that it works during name resolution (like Kotlin's https://kotlinlang.org/docs/reference/platform-specific-declarations.html[expect/actual]) and not during parsing (like conditional compilation in most other languages).
Otherwise, you'd have to index the same file with different conditional compilation settings, and that is messy.
* Make sure that FQNs are enough for most of the name resolution.

The last point is worth elaborating on.
Let's look at the following Rust example:

[source,rust]
----
// File: ./foo.rs
trait T {
    fn f(&self) {}
}

// File: ./bar.rs
struct S;

// File: ./somewhere/else.rs
impl T for S {}

// File: ./main.rs
use foo::T;
use bar::S;

fn main() {
    let s = S;
    s.f();
}
----

Here, we can easily find the `S` struct and the `T` trait (as they are imported directly).
However, to make sure that `s.f` indeed refers to `f` from `T`, we also need to find the corresponding `impl`, and that can be roughly anywhere!

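One conceivable way to cope (sketched below with invented names; this is not how any particular IDE does it) is an additional project-wide index of `impl` blocks, merged from per-file contributions just like the FQN index, which `s.f()` resolution can then consult:

[source,rust]
----
use std::collections::HashSet;

/// Because `impl T for S` can appear in any file, method resolution needs a
/// project-wide "impl index" collected from every file, in addition to the
/// FQN index. All names here are hypothetical.
#[derive(Default)]
struct ImplIndex {
    // (trait FQN, type FQN) pairs seen anywhere in the project.
    impls: HashSet<(String, String)>,
}

impl ImplIndex {
    /// Merge one file's contribution into the index.
    fn add_file_impls(&mut self, impls: &[(&str, &str)]) {
        for (tr, ty) in impls {
            self.impls.insert((tr.to_string(), ty.to_string()));
        }
    }

    /// Does `ty` implement `tr`? Needed to resolve `s.f()` to `T::f`.
    fn implements(&self, tr: &str, ty: &str) -> bool {
        self.impls.contains(&(tr.to_string(), ty.to_string()))
    }
}

fn main() {
    let mut index = ImplIndex::default();
    // Contribution from ./somewhere/else.rs:
    index.add_file_impls(&[("foo::T", "bar::S")]);
    assert!(index.implements("foo::T", "bar::S"));
    println!("ok");
}
----
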
== Leveraging Headers

The second approach places even more restrictions on the language.
It requires:

* a "declaration before use" rule,
* headers or equivalent interface files.

Two such languages are {cpp} and OCaml.

The idea of the approach is simple -- just use a traditional compiler, snapshotting its state immediately after the imports of each compilation unit.
An example:

[source,c++]
----
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::
}
----

Here, the compiler fully processes `iostream` (and any further headers it includes), snapshots its state, and proceeds with parsing the program itself.
When the user types more characters, the compiler restarts from the point just after the include.
As the size of each compilation unit itself is usually reasonable, the analysis is fast.

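Here is a toy model of this snapshotting trick.
`Scope` and `Analyzer` are stand-ins invented for illustration; a real implementation (e.g. clangd's preamble) snapshots actual compiler state, not a list of names:

[source,rust]
----
use std::collections::HashMap;

/// Stand-in for "compiler state after processing the includes".
#[derive(Clone)]
struct Scope {
    symbols: Vec<String>,
}

struct Analyzer {
    // Cache: header name -> snapshot of state after processing it.
    preambles: HashMap<String, Scope>,
    header_runs: u32, // instrumentation: expensive header work
}

impl Analyzer {
    fn process_header(&mut self, name: &str) -> Scope {
        if let Some(snapshot) = self.preambles.get(name) {
            return snapshot.clone(); // reuse the snapshot
        }
        self.header_runs += 1; // the expensive part runs only once
        let scope = Scope { symbols: vec!["std::cout".to_string()] };
        self.preambles.insert(name.to_string(), scope.clone());
        scope
    }

    /// Re-run on every keystroke: cheap, because the preamble is cached.
    fn analyze(&mut self, header: &str, body: &str) -> usize {
        let scope = self.process_header(header);
        scope.symbols.len() + body.lines().count()
    }
}

fn main() {
    let mut a = Analyzer { preambles: HashMap::new(), header_runs: 0 };
    a.analyze("iostream", "int main() { std::cout << \"Hi\"; }");
    a.analyze("iostream", "int main() { std::cout << \"Hi!\"; }"); // typed '!'
    assert_eq!(a.header_runs, 1); // header processed once, snapshot reused
    println!("header_runs = {}", a.header_runs);
}
----
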
If the user types something into the header file, then the caches need to be invalidated.
However, changes to headers are comparatively rare; most of the code lives in `.cpp` files.

In a sense, headers correspond to the stubs of the first approach, with two notable differences:

* It's the user who is tasked with producing a stub, not the tool.
* Unlike stubs, headers can't be mutually recursive.
Stubs store unresolved types, but includes can be snapshotted after complete analysis.

Two examples of this approach are https://github.com/ocaml/merlin[Merlin] for OCaml and https://clangd.llvm.org/[clangd].

The huge benefit of this approach is that it allows re-use of an existing batch compiler.
The other two approaches described in this article typically result in compiler re-writes.
The drawback is that almost nobody likes headers and forward declarations.

== Intermission: Laziness vs Incrementality

Note how neither of the two approaches is incremental in any interesting way.
It is mostly "if something has changed, let's clear the caches completely".
There's a tiny bit of incrementality in the index update in the first approach, but it is almost trivial -- remove old keys, add new keys.

This is because it's not incrementality that makes an IDE fast.
Rather, it's laziness -- the ability to skip huge swaths of code altogether.

With map-reduce, the index tells us exactly which small set of files is used from the current file and is worth looking at.
Headers shield us from most of the implementation code.

== Query-based Compiler

Welcome to my world...

Rust fits the described approaches like a square peg into a round hole.

Here's a small example:

[source,rust]
----
#[macro_use]
extern crate bitflags;

bitflags! {
    struct Flags: u32 {
        const A = 0b00000001;
        const B = 0b00000010;
        const C = 0b00000100;
        const ABC = Self::A.bits | Self::B.bits | Self::C.bits;
    }
}
----

`bitflags!` is a macro which comes from another crate and defines a top-level declaration.
We can't put the results of macro expansion into the index, because they depend on the macro definition in another file.
We can put the macro call itself into the index, but that is mostly useless, as the items declared by the macro would be missing from the index.

Here's another one:

[source,rust]
----
mod foo;

#[path = "foo.rs"]
mod bar;
----

Modules `foo` and `bar` refer to the same file, `foo.rs`, which effectively means that the items from `foo.rs` are duplicated.
If `foo.rs` contains a `struct S;` declaration, then `foo::S` and `bar::S` are different types.
You also can't fit that into an index, because those `mod` declarations are in a different file.

The second approach doesn't work either.
In {cpp}, the compilation unit is a single file.
In Rust, the compilation unit is a whole crate, which consists of many files and is typically much bigger.
And Rust has procedural macros, which means that even surface analysis of the code can take an unbounded amount of time.
And there are no header files, so an IDE has to process the whole crate.
Additionally, intra-crate name resolution is much more complicated (declaration before use vs. fixed-point iteration intertwined with macro expansion).

It seems that purely laziness-based models do not work for Rust.
The minimal feasible unit of laziness, a crate, is still too big.

For this reason, in rust-analyzer we resort to a smart solution.
We compensate for the deficit of laziness with incrementality.
Specifically, we use a generic framework for incremental computation -- https://github.com/salsa-rs/salsa[salsa].

The idea behind salsa is rather simple -- all function calls inside the compiler are instrumented to record which other functions were called during their execution.
The recorded traces are used to implement fine-grained incrementality.
If, after a modification, the results of all of a function's dependencies are the same, the old result is reused.

There's also an additional, crucial twist -- if a function is re-executed due to a change in a dependency, the new result is compared with the old one.
If, despite a different input, they are the same, the propagation of invalidation stops.

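A hand-rolled toy model of this early-cutoff behaviour (this is not salsa's real API; all names are invented): two derived queries, where an edit that doesn't change the intermediate result leaves the downstream result untouched:

[source,rust]
----
use std::collections::HashMap;

/// Inputs: file texts. Derived: per-file line counts, then a total.
/// Editing a file without changing its line count must *not* recompute
/// the total -- invalidation stops at the unchanged intermediate value.
#[derive(Default)]
struct Db {
    files: HashMap<&'static str, String>,      // inputs
    line_counts: HashMap<&'static str, usize>, // memoized derived values
    total: Option<usize>,                      // memoized top-level value
    recomputed_total: u32,                     // instrumentation
}

impl Db {
    fn set_file(&mut self, name: &'static str, text: String) {
        self.files.insert(name, text);
        // An input changed: re-run the derived query that reads it.
        let new_count = self.files[name].lines().count();
        let old = self.line_counts.insert(name, new_count);
        if old != Some(new_count) {
            // The derived value actually changed: invalidate dependents.
            self.total = None;
        }
        // Otherwise: early cutoff, `total` stays valid.
    }

    fn total_lines(&mut self) -> usize {
        if let Some(t) = self.total {
            return t;
        }
        self.recomputed_total += 1;
        let t = self.line_counts.values().sum();
        self.total = Some(t);
        t
    }
}

fn main() {
    let mut db = Db::default();
    db.set_file("foo.rs", "fn main() {}\n".to_string());
    db.set_file("bar.rs", "struct S;\n".to_string());
    assert_eq!(db.total_lines(), 2);
    // An edit that keeps the line count: `total_lines` is not recomputed.
    db.set_file("foo.rs", "fn main() { /* edited */ }\n".to_string());
    assert_eq!(db.total_lines(), 2);
    assert_eq!(db.recomputed_total, 1);
    println!("recomputed_total = {}", db.recomputed_total);
}
----
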
Using this engine, we were able to implement a rather fancy update strategy.
Unlike the map-reduce approach, our indices can store resolved types, which are invalidated only when a top-level change occurs.
Even after a top-level change, we are able to re-use the results of most macro expansions.
And typing inside a top-level macro also doesn't invalidate the caches, unless the expansion of the macro introduces a different set of items.

The main benefit of this approach is generality and correctness.
If you have an incremental computation engine at your disposal, it becomes relatively easy to experiment with the way you structure the computation.
The code looks mostly like a boring imperative compiler, and you are immune to cache invalidation bugs (we had one, due to a procedural macro being non-deterministic).

The main drawbacks are extra complexity, slower performance (fine-grained tracking of dependencies takes time and memory), and a feeling that this is still somewhat uncharted territory :-)

== Links

How IntelliJ works::
https://jetbrains.org/intellij/sdk/docs/basics/indexing_and_psi_stubs.html

How Sorbet works::
https://www.youtube.com/watch?v=Gdx6by6tcvw

How clangd works::
https://clangd.llvm.org/design/

How Merlin works::
https://arxiv.org/abs/1807.06702

How rust-analyzer works::
https://github.com/rust-analyzer/rust-analyzer/tree/master/docs/dev