[doc][mlgo] Document the logger (serialization) and expose the doc #141094
Merged
Conversation
@llvm/pr-subscribers-mlgo

Author: Mircea Trofin (mtrofin)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/141094.diff

2 Files Affected:
diff --git a/llvm/docs/MLGO.rst b/llvm/docs/MLGO.rst
index c88d8d68a7ce3..dd219f4d979fa 100644
--- a/llvm/docs/MLGO.rst
+++ b/llvm/docs/MLGO.rst
@@ -314,7 +314,7 @@ features.
``MLModelRunner`` implementations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-We currently feature 3 implementations:
+We currently feature 4 implementations:
- ``ModelUnderTrainingRunner``. This requires the compiler be built with TFLite
support. It allows loading a TFLite model dynamically and is primarily
@@ -338,12 +338,94 @@ requiring no out of tree build-time dependencies.
presumably a python training algorithm. We do not envision using this in a
production environment.
+- ``NoInferenceModelRunner``. This serves as a store for feature values, and its
+ ``evaluate`` should never be called. It's used for training scenarios, when we
+ want to capture the behavior of the default (non-ML) heuristic.
+
Note that training leaves it to the training infrastructure to handle
distributed computing. The assumed architecture has python processes
communicating remotely between themselves, but managing local communication with
clang.
-..
- TODO(mtrofin):
- - logging, and the use in interactive mode.
- - discuss an example (like the inliner)
+Logging Facility
+----------------
+
+When training models, we need to expose the features we will want to use during
+inference, as well as outcomes, to guide reward-based learning techniques. This
+can happen in two forms:
+
+- as an effect of running the compiler on some input, as a capture of the
+ features and actions taken by some policy or a model currently being used.
+ For example, see ``DevelopmentModeInlineAdvisor`` or ``DevelopmentModeEvictAdvisor``
+ in ``MLRegallocEvictAdvisor.cpp``. In more detail, in the former case, if
+ ``-training-log`` is specified, the features and actions (inline/no inline)
+ from each inlining decision are saved to the specified file. Since
+ ``MLModelRunner`` implementations hold on to feature values (they don't get
+ cleared by ``evaluate``), logging is easily supported by just looping over the
+ model runner's features and passing the tensor buffers to the logger. Note how
+ we use the ``NoInferenceModelRunner`` to capture the features observed when
+ using the default policy.
+
+- as a serialization mechanism for the ``InteractiveModelRunner``. Here, we need
+ to pass the observed features over IPC (a file descriptor, likely a named
+ pipe).
+
+Both cases require serializing the same kind of data and we support both with
+``Analysis/Utils/TrainingLogger``.
+
+The goal of the logger design was to avoid any new dependency and to optimize
+for the tensor scenario - i.e. exchanging potentially large, fixed-size buffers
+of scalars. We explicitly assume the reader of the format has the same
+endianness as the compiler host, and we further expect the reader and the
+compiler to run on the same host. This is because we expect training scenarios
+to have a (typically python) process managing the compiler process, and we
+leave it to the training side to handle remoting.
+
+The logger produces the following sequence:
+
+- a header describing the structure of the log. This is a one-line textual JSON
+ dictionary with the following elements:
+
+ - ``features``: a list of JSON-serialized ``TensorSpec`` values. The position
+ in the list matters, as it will be the order in which values will be
+ subsequently recorded. If we are just logging (i.e. not using the
+ ``InteractiveModelRunner``), the last feature should be that of the action
+ (e.g. "inline/no inline", or "index of evicted live range")
+ - (optional) ``score``: a ``TensorSpec`` describing a value we will include to
+ help formulate a reward. This could be a size estimate or a latency estimate.
+ - (optional) ``advice``: a ``TensorSpec`` describing the action. This is used
+ for the ``InteractiveModelRunner``, in which case it shouldn't be in the
+ ``features`` list.
+- a sequence of ``contexts``. Contexts are independent traces of the optimization
+  problem. For module passes, there is only one context; for function passes,
+ there is a context per function. The start of a context is marked with a
+ one-line JSON dictionary of the form ``{"context": <context name, a string>}``
+
+ Each context has a sequence of:
+
+ - ``observations``. An observation is:
+
+ - one-line JSON ``{"observation": <observation number. 0-indexed>}``
+ - a binary dump of the tensor buffers, in the order in which they were
+ specified in the header.
+ - a new line character
+ - if ``score`` was specified in the header:
+
+ - a one-line JSON object ``{"outcome": <value>}``, where the ``value``
+        conforms to the ``TensorSpec`` defined for the ``score`` in the header.
+ - the outcome value, as a binary dump
+ - a new line character.
+
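For illustration, the following is a minimal Python sketch that emits a log of this
shape: a single scalar ``int64_t`` feature, one context, and no ``score``. The feature
name and the exact ``TensorSpec`` JSON fields (``name``, ``port``, ``type``, ``shape``)
are assumptions made for the example rather than taken from the patch.

.. code-block:: python

    import json
    import struct

    def write_log(path, num_observations=2):
        """Emit a tiny log: one context, one int64 scalar feature, no score."""
        # Hypothetical feature spec; the field names mirror the assumed JSON
        # serialization of TensorSpec.
        features = [{"name": "some_feature", "port": 0, "type": "int64_t", "shape": [1]}]
        with open(path, "wb") as f:
            # One-line JSON header describing the structure of the log.
            f.write(json.dumps({"features": features}).encode() + b"\n")
            # Start of a (single) context.
            f.write(json.dumps({"context": "some_function"}).encode() + b"\n")
            for i in range(num_observations):
                # One-line JSON marking the observation, then the raw tensor
                # buffers in header order, then a newline character.
                f.write(json.dumps({"observation": i}).encode() + b"\n")
                f.write(struct.pack("q", 42 + i))  # native-endian int64 scalar
                f.write(b"\n")

    write_log("demo.log")

Each observation thus contributes one JSON line, 8 bytes of native-endian tensor
data, and a trailing newline.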
+The format uses a mix of textual JSON (for headers) and binary dumps (for tensors)
+because the headers are not expected to dominate the payload - the tensor values
+are. We wanted to avoid burdening the log reader - likely python - with
+additional dependencies; and the one-line JSON headers make it possible to
+inspect a log in a rudimentary way without additional tooling.
+
+A python utility for reading logs, used for tests, is available at
+``Analysis/models/log_reader.py``. A utility showcasing the ``InteractiveModelRunner``,
+which uses this reader as well, is at ``Analysis/models/interactive_host.py``.
+The latter is also used in tests.
+
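As a rough, self-contained sketch of what reading the format involves - this is not
the in-tree ``log_reader.py`` API, and it handles only the feature tensors, not the
optional ``score`` - a reader might look like:

.. code-block:: python

    import json
    import struct

    # Maps the (assumed) TensorSpec element type names to struct format chars.
    _FMT = {"int64_t": "q", "int32_t": "i", "float": "f", "double": "d"}

    def read_log(path):
        """Yield (context, observation_index, {feature_name: values}) tuples."""
        with open(path, "rb") as f:
            header = json.loads(f.readline())
            specs = header["features"]
            context = None
            while True:
                line = f.readline()
                if not line:
                    break
                event = json.loads(line)
                if "context" in event:
                    context = event["context"]
                    continue
                index = event["observation"]
                features = {}
                for spec in specs:
                    count = 1
                    for dim in spec["shape"]:
                        count *= dim
                    fmt = str(count) + _FMT[spec["type"]]
                    raw = f.read(struct.calcsize(fmt))
                    features[spec["name"]] = struct.unpack(fmt, raw)
                f.readline()  # consume the newline terminating the binary dump
                yield context, index, features

    for ctx, idx, feats in read_log("demo.log"):
        print(ctx, idx, feats)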
+There is no C++ implementation of a log reader. We do not have a scenario
+motivating one.
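To make the ``InteractiveModelRunner`` scenario more concrete, below is a toy host
loop over a pair of pipes. The reply format (the raw bytes of the ``advice`` tensor,
with no JSON wrapper) and the pipe setup are assumptions made for illustration;
``interactive_host.py`` is the reference utility.

.. code-block:: python

    import json
    import struct

    _FMT = {"int64_t": "q", "int32_t": "i", "float": "f", "double": "d"}

    def _buffer_size(spec):
        count = 1
        for dim in spec["shape"]:
            count *= dim
        return struct.calcsize(str(count) + _FMT[spec["type"]])

    def serve(from_compiler_path, to_compiler_path):
        """Toy advisor: answers 0 (e.g. "do not inline") to every observation."""
        with open(from_compiler_path, "rb") as from_compiler, \
             open(to_compiler_path, "wb", buffering=0) as to_compiler:
            header = json.loads(from_compiler.readline())
            feature_sizes = [_buffer_size(s) for s in header["features"]]
            while True:
                line = from_compiler.readline()
                if not line:
                    break
                event = json.loads(line)
                if "context" in event:
                    continue
                # An observation: skip the feature buffers and the trailing
                # newline, then reply with the advice tensor's raw bytes
                # (assumed here to be a single int64 scalar).
                for size in feature_sizes:
                    from_compiler.read(size)
                from_compiler.readline()
                to_compiler.write(struct.pack("q", 0))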
diff --git a/llvm/docs/Reference.rst b/llvm/docs/Reference.rst
index 565d5c6876d66..7f13aec2ec86b 100644
--- a/llvm/docs/Reference.rst
+++ b/llvm/docs/Reference.rst
@@ -41,7 +41,6 @@ LLVM and API reference documentation.
PDB/index
PointerAuth
ScudoHardenedAllocator
- MLGO
MemoryModelRelaxationAnnotations
MemTagSanitizer
Security
@@ -239,3 +238,7 @@ Additional Topics
:doc:`ConvergenceAndUniformity`
A description of uniformity analysis in the presence of irreducible
control flow, and its implementation.
+
+:doc:`MLGO`
+ Facilities for ML-Guided Optimization, such as collecting IR corpora from a
+  build, interfacing with ML models, and exposing features for training.
It looks like the docs build is failing. That needs to be fixed.
boomanaiden154 approved these changes May 22, 2025
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025