@@ -314,7 +314,7 @@ features.
``MLModelRunner`` implementations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- We currently feature 3 implementations:
+ We currently feature 4 implementations:

- ``ModelUnderTrainingRunner``. This requires the compiler be built with TFLite
  support. It allows loading a TFLite model dynamically and is primarily
@@ -338,15 +338,97 @@ requiring no out of tree build-time dependencies.
  presumably a python training algorithm. We do not envision using this in a
  production environment.

+ - ``NoInferenceModelRunner``. This serves as a store for feature values, and its
+   ``evaluate`` should never be called. It's used for training scenarios, when we
+   want to capture the behavior of the default (non-ML) heuristic.
+

Note that training leaves it to the training infrastructure to handle
distributed computing. The assumed architecture has python processes
communicating remotely between themselves, but managing local communication with
clang.

- ..
-   TODO(mtrofin):
-   - logging, and the use in interactive mode.
-   - discuss an example (like the inliner)
+ Logging Facility
+ ----------------
+
+ When training models, we need to expose the features we will want to use during
+ inference, as well as outcomes, to guide reward-based learning techniques. This
+ can happen in 2 forms:
+
+ - when running the compiler on some input, as a capture of the features and
+   actions taken by some policy or a model currently being used.
+   For example, see ``DevelopmentModeInlineAdvisor`` or ``DevelopmentModeEvictAdvisor``
+   in ``MLRegallocEvictAdvisor.cpp``. In more detail, in the former case, if
+   ``-training-log`` is specified, the features and actions (inline/no inline)
+   from each inlining decision are saved to the specified file. Since
+   ``MLModelRunner`` implementations hold on to feature values (they don't get
+   cleared by ``evaluate``), logging is easily supported by just looping over the
+   model runner's features and passing the tensor buffers to the logger. Note how
+   we use the ``NoInferenceModelRunner`` to capture the features observed when
+   using the default policy.
+
+ - as a serialization mechanism for the ``InteractiveModelRunner``. Here, we need
+   to pass the observed features over IPC (a file descriptor, likely a named
+   pipe).
+
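The point that feature buffers persist after ``evaluate`` is what makes logging cheap. The following is a conceptual Python sketch only (the real implementations are C++, and the ``FakeModelRunner`` and ``log_observation`` names are invented for illustration): a store-only runner holds one buffer per feature, and a logger simply iterates over those buffers after each decision.

```python
# Conceptual sketch: an MLModelRunner-style object retains its feature
# buffers (they are not cleared by evaluate), so a logger can read them
# after each decision. All names here are invented for illustration.
class FakeModelRunner:
    def __init__(self, feature_names):
        # One buffer per feature; buffers are reused, never cleared.
        self.buffers = {name: [0] for name in feature_names}

    def set_feature(self, name, value):
        self.buffers[name][0] = value

    def evaluate(self):
        # A NoInferenceModelRunner-like store would never reach here.
        raise NotImplementedError("store-only runner: evaluate is not called")


def log_observation(runner, log):
    # Logging is just looping over the runner's still-populated buffers.
    log.append({name: list(buf) for name, buf in runner.buffers.items()})


runner = FakeModelRunner(["caller_size", "callee_size"])
runner.set_feature("caller_size", 120)
runner.set_feature("callee_size", 15)
log = []
log_observation(runner, log)
```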
+ Both cases require serializing the same kind of data, and we support both with
+ ``Analysis/Utils/TrainingLogger``.
+
+ The goal of the logger design was to avoid any new dependency, and to optimize
+ for the tensor scenario - i.e. exchanging potentially large buffers of fixed
+ size, containing scalars. We explicitly assume the reader of the format has the
+ same endianness as the compiler host, and we further expect the reader and the
+ compiler to run on the same host. This is because we expect training scenarios
+ to have a (typically python) process managing the compiler process, and we leave
+ it to the training side to handle remoting.
+
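To illustrate the native-endianness assumption (a generic Python illustration, not part of ``TrainingLogger``): the ``struct`` module's ``=`` format character packs with the host's byte order, which is exactly what a reader running on the same host would use.

```python
import struct
import sys

# '=' packs with the host's native byte order, mirroring the assumption
# that the log reader shares the compiler host's endianness.
packed = struct.pack("=q", 258)  # 258 == 0x0102, as a 64-bit integer
if sys.byteorder == "little":
    assert packed[0] == 0x02  # least-significant byte first
else:
    assert packed[0] == 0x00  # most-significant byte first
# Round-tripping on the same host always recovers the value.
assert struct.unpack("=q", packed)[0] == 258
```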
+ The logger produces the following sequence:
+
+ - a header describing the structure of the log. This is a one-line textual JSON
+   dictionary with the following elements:
+
+   - ``features``: a list of JSON-serialized ``TensorSpec`` values. The position
+     in the list matters, as it will be the order in which values will be
+     subsequently recorded. If we are just logging (i.e. not using the
+     ``InteractiveModelRunner``), the last feature should be that of the action
+     (e.g. "inline/no inline", or "index of evicted live range")
+   - (optional) ``score``: a ``TensorSpec`` describing a value we will include to
+     help formulate a reward. This could be a size estimate or a latency estimate.
+   - (optional) ``advice``: a ``TensorSpec`` describing the action. This is used
+     for the ``InteractiveModelRunner``, in which case it shouldn't be in the
+     ``features`` list.
+
+ - a sequence of ``contexts``. Contexts are independent traces of the optimization
+   problem. For module passes there is only one context; for function passes, there
+   is a context per function. The start of a context is marked with a
+   one-line JSON dictionary of the form ``{"context": <context name, a string>}``
+
+ Each context has a sequence of:
+
+ - ``observations``. An observation is:
+
+   - one-line JSON ``{"observation": <observation number, 0-indexed>}``
+   - a binary dump of the tensor buffers, in the order in which they were
+     specified in the header.
+   - a new line character
+   - if ``score`` was specified in the header:
+
+     - a one-line JSON object ``{"outcome": <value>}``, where the ``value``
+       conforms to the ``TensorSpec`` defined for the ``score`` in the header.
+     - the outcome value, as a binary dump
+     - a new line character.
+
+ The format uses a mix of textual JSON (for headers) and binary dumps (for tensors)
+ because the headers are not expected to dominate the payload - the tensor values
+ are. We wanted to avoid burdening the log reader - likely python - with
+ additional dependencies; and the one-line JSON makes it rudimentarily possible
+ to inspect a log without additional tooling.
+
+ A python utility for reading logs, used for tests, is available at
+ ``Analysis/models/log_reader.py``. A utility showcasing the ``InteractiveModelRunner``,
+ which uses this reader as well, is at ``Analysis/models/interactive_host.py``.
+ The latter is also used in tests.
+
+ There is no C++ implementation of a log reader. We do not have a scenario
+ motivating one.

IR2Vec Embeddings
=================