Rewrote several sections to improve the clarity of the story

erichgess · erichgess · commit b11494ff628c · 2021-04-12T10:04:51.000-04:00
diff --git a/src/vision/status_quo/barbara_simulates_hydrodynamics.md b/src/vision/status_quo/barbara_simulates_hydrodynamics.md
@@ -17,14 +17,11 @@ Her first attempt to was to emulate a common CFD design pattern: using message p
 This solution was fine, but Barbara was not satisified.  She had two problems with it: first, she didn't like that the design would greedily use all the resources on the machine and, second, when her team added a new feature (tracer particles) that increased the complexity of how patches interact with each other and the number of messages that were passsed between threads increased to the point that it became a performance bottleneck and parallel processing became no faster than serial processing. So, Barbara decided to find a better solution.
 
 ### Solution Path
-What Barbara wanted to do was find a way to more efficiently use threads: have a fixed number of threads that each mapped to a core on the CPU and assign patches to those threads as patches became ready to compute. The `async` feature seemed to provide exactly that behavior. And to move away from the message passing design, because the number of messages being passed was proportional to the number of trace particles being traced.
+What Barbara wanted use the CPU more efficiently: she would decouple the work that needed to be done (the patches) from the workers (threads) this would allow her to more finely control how many resources were used. So, she began looking for a tool in Rust that would meet this design pattern. When she read about `async` and how it allowed the user to define units of work, called tasks, and send those to an executor which would manage the execution of those tasks across a set of workers, she thought she'd found exactly what she needed. Further reading indicate that `tokio` was the runtime of choice for `async` in the community and so she began building a new CFD tool with `async` and `tokio`. And to move away from the message passing design, because the number of messages being passed was proportional to the number of trace particles being traced.
 
-As Barbara began working on her new design with `tokio`, her use of `async` went from a general (from the textbook) use of basic `async` features to a more specific implementation leveraging exactly the features that were most suited for her needs.
-
-At first, Barbara was under a false impression about what async executors do. She had assumed that a multi-threaded executor could automatically move the execution of an async block to a worker thread. Then she discovered that async tasks must be explicitly spawned into a thread pool if they are to be executed on a worker thread. This meant that the algorithm to be parallelized became strongly coupled to both the spawner and the executor. Code that used to cleanly express a physics algorithm now had interspersed references to the task spawner, not only making it harder to understand, but also making it impossible to try different execution strategies, since with Tokio the spawner and executor are the same object (the Tokio runtime). Barbara feels that a better design for data parallelism would enable better separation of concerns: a group of interdependent compute tasks, and a strategy to execute them in parallel.
-
-In order to remove the need for message passing, Barbara moved to a shared state design: she would keep a table tracking the solution state for every grid patch and a specific patch would only start its computation task when solutions were written for all the patches it was dependent on. So, each task needs to access the table with the solution results of all the other tasks. Learning how to properly use shared data with `async` was a new challenge. The initial design:
+As Barbara began working on her new design with `tokio`, her use of `async` went from a general (from the textbook) use of basic `async` features to a more specific implementation leveraging exactly the features that were most suited for her needs. At first, Barbara was under a false impression about what async executors do. She had assumed that a multi-threaded executor could automatically move the execution of an async block to a worker thread. When this turned out to wrong, she went to Stackoverflow and learned that async tasks must be explicitly spawned into a thread pool if they are to be executed on a worker thread. This meant that the algorithm to be parallelized became strongly coupled to both the spawner and the executor. Code that used to cleanly express a physics algorithm now had interspersed references to the task spawner, not only making it harder to understand, but also making it impossible to try different execution strategies, since with Tokio the spawner and executor are the same object (the Tokio runtime). Barbara felt that a better design for data parallelism would enable better separation of concerns: a group of interdependent compute tasks, and a strategy to execute them in parallel.
 
+Along with moving the execution of the computational tasks to `async`, Barbara also used this as an opportunity to remove the message passing that was used to coordinate the computation of each patch. She used the `async` API to define dependencies between patches so that a patch would only begin computing its solution when its neighboring patches had completed. This also required setting up shared state that would store the solutions for all the patches as they were computed, so that dependents could access them. Learning how to properly use shared data with `async` was a new challenge. The initial design:
 ```rust
     let mut stage_primitive_and_scalar = |index: BlockIndex, state: BlockState<C>, hydro: H, geometry: GridGeometry| {
         let stage = async move {
@@ -43,9 +40,9 @@ Ok::<_, HydroError>( ( p.to_shared(), s.to_shared() ) )
 ```
 And she could not figure out why she had to add the `::<_, HydroError>` to some of the `Result` values.
 
-Once her team began using the new `async` design for their simulations, they noticed an important issue that impacted productivity: compilation time had now increased to between 30 and 60 seconds. The nature of their work requires frequent changes to code and recompilation and 30-60 seconds is long enough to have a noticeable impact on their quality of life.  What her and her team want is for compilation to be 2 to 3 seconds. Barbara believes that the use of `async` is a major contributor to the long compilation times.
+Finally, once her team began using the new `async` design for their simulations, they noticed an important issue that impacted productivity: compilation time had now increased to between 30 and 60 seconds. The nature of their work requires frequent changes to code and recompilation and 30-60 seconds is long enough to have a noticeable impact on their quality of life.  What her and her team want is for compilation to be 2 to 3 seconds. Barbara believes that the use of `async` is a major contributor to the long compilation times.
 
-This new solution works, but Barbara is not satisfied with how complex her code wound up becoming with the move to `async` and the fact that compilation time is now 30-60 seconds.  The state sharing adding a large amount of cruft with `Arc` and `async` is not well suited for using a dependency graph to schedule tasks so implementing this solution created a key component of her program that was difficult to understand and pervasive. Ultimately, her conclusion was that `async` is not appropriate for parallelizing computational tasks. She will be trying a new design based upon Rayon in the next version of her application.
+This new solution works, but Barbara is not satisfied with how complex her code became after the move to `async` and that compilation time is now 30-60 seconds.  The state sharing adding a large amount of cruft with `Arc` and `async` is not well suited for using a dependency graph to schedule tasks so implementing this solution created a key component of her program that was difficult to understand and pervasive. Ultimately, her conclusion was that `async` is not appropriate for parallelizing computational tasks. She will be trying a new design based upon Rayon in the next version of her application.
 
 ## 🤔 Frequently Asked Questions