api: Use streaming UTF-8 + JSON decoding for the response
Before this change, the JSON response body was fully downloaded before
being decoded into a Map using `jsonUtf8Decoder`. For larger response
bodies, such as that of the `/register` endpoint on CZO (~14MB
uncompressed), the CPU would often sit idle while waiting for the
entire response to download before decoding could start.
With this change, the response byte stream is piped directly into the
`jsonUtf8Decoder` stream transformer. This allows decoding to begin as
soon as the byte stream starts emitting chunks of data (a maximum
chunk length of 65536 bytes (64KiB) was observed in my local testing).
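The difference can be sketched as follows (a minimal illustration
using `dart:convert` with a simulated chunked body; the helper names
are mine, not the app's actual code — in `dart:convert`, fusing
`utf8.decoder` with `json.decoder` yields the optimized combined
decoder):

```dart
import 'dart:convert';

// Fused UTF-8 + JSON decoder, analogous to `jsonUtf8Decoder`;
// dart:convert special-cases this fusion to decode UTF-8 bytes
// directly into JSON values without an intermediate String.
final jsonUtf8Decoder = utf8.decoder.fuse(json.decoder);

// Before: buffer the whole body, then decode it in one shot.
Future<Object?> decodeBuffered(Stream<List<int>> body) async {
  final bytes = <int>[];
  await for (final chunk in body) {
    bytes.addAll(chunk); // CPU mostly idle until the download completes
  }
  return jsonUtf8Decoder.convert(bytes);
}

// After: pipe the byte stream through the decoder, so decoding
// starts as soon as the first chunk arrives.
Future<Object?> decodeStreaming(Stream<List<int>> body) =>
    body.transform(jsonUtf8Decoder).single;

// Simulate a chunked HTTP response body (chunks may split a JSON
// token, or even a multi-byte UTF-8 sequence; the decoder handles both).
Stream<List<int>> fakeBody() async* {
  yield utf8.encode('{"result": "succ');
  yield utf8.encode('ess", "msg": ""}');
}

Future<void> main() async {
  print(await decodeBuffered(fakeBody()));
  print(await decodeStreaming(fakeBody()));
}
```

Both paths produce the same decoded Map; the streaming path just
overlaps decoding with the download.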
Additionally, I ran some local benchmarks; the code is available here:
https://gist.github.com/rajveermalviya/7b4d92f84c68f0976ed07f6d797ac164
The response bodies of `/register` and
`/static/generated/emoji/emoji_api.xxx.json` were downloaded and
served locally by Nginx, configured to mimic the Zulip server with
the following config:
https://gist.github.com/rajveermalviya/2188c9a8d1a3e21c2efea186d61026b2
The results were as follows:
$ dart compile exe bin/dart_json_bench.dart && ./bin/dart_json_bench.exe
/register.json StreamingJsonBenchmark(RunTime): 77548.42307692308 us. (~77ms)
/register.json NonStreamingJsonBenchmark(RunTime): 116733.44444444444 us. (~116ms)
/emoji_api.json StreamingJsonBenchmark(RunTime): 1109.8724348308374 us. (~1ms)
/emoji_api.json NonStreamingJsonBenchmark(RunTime): 1138.2514220705348 us. (~1ms)
(The durations represent the time taken to make a single request,
calculated by averaging the total time over the number of iterations
performed within a 2-second period.)
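(That methodology amounts to a loop like the following — a hand-rolled
sketch, not the gist's actual harness:)

```dart
// Run the workload repeatedly for a ~2-second window and report the
// average wall-clock time per run, in microseconds.
Future<double> measureRunTimeUs(Future<void> Function() run) async {
  const window = Duration(seconds: 2);
  final sw = Stopwatch()..start();
  var iterations = 0;
  while (sw.elapsed < window) {
    await run();
    iterations++;
  }
  return sw.elapsedMicroseconds / iterations;
}

Future<void> main() async {
  // Example workload: a stand-in ~5ms "request".
  final us = await measureRunTimeUs(
      () => Future<void>.delayed(const Duration(milliseconds: 5)));
  print('RunTime: $us us.');
}
```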
In non-streaming mode for `/register.json`, the UTF-8 + JSON decoder
takes around ~55ms, and the remaining ~61ms is spent just downloading
the response. This means that in streaming mode, decoding incurs only
~16ms of extra time. That overhead would be overshadowed by download
time even more when running on a slower network (possibly even
approaching zero), because byte chunks would arrive much more slowly
than the time it takes to decode them. For smaller responses
(e.g. `/emoji_api.json`, ~60KiB) there isn't much difference observed,
understandably.