Skip to content

[llvm]Add a simple Telemetry framework #102323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 99 commits into from
Dec 20, 2024
Merged
Show file tree
Hide file tree
Changes from 42 commits
Commits
Show all changes
99 commits
Select commit Hold shift + click to select a range
dbb8b15
[llvm][lib]Propose a simple Telemetry framework.
oontvoo Aug 7, 2024
6e43f67
add header
oontvoo Aug 7, 2024
e25f5fc
fixed typo
oontvoo Aug 7, 2024
ad3906c
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Aug 9, 2024
24d07d6
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Aug 22, 2024
0057bcf
add tests and addressed review comments
oontvoo Aug 29, 2024
750e4ac
formatting changes
oontvoo Aug 29, 2024
1378ed4
more formatting
oontvoo Aug 29, 2024
02e750e
more formatting changes
oontvoo Aug 29, 2024
63e99fc
Added header comment to describe the package
oontvoo Aug 29, 2024
fa88512
reformat header
oontvoo Aug 29, 2024
0866f64
addressed review comments and added separate docs in llvm/docs/
oontvoo Sep 3, 2024
a8523f7
updated field names in tests
oontvoo Sep 4, 2024
47e8b06
reformated doc and added additional details
oontvoo Sep 4, 2024
690c6ab
fix formatting
oontvoo Sep 4, 2024
db668f4
details
oontvoo Sep 4, 2024
0290a14
formatted TelemetryTest.cpp
oontvoo Sep 4, 2024
14b5234
fix formatting in docs
oontvoo Sep 4, 2024
6ca87e5
convert comments to doxygen style
oontvoo Sep 4, 2024
4155add
fix doc toc
oontvoo Sep 4, 2024
11ead4a
group all the testing-params into a Context struct
oontvoo Sep 5, 2024
48228ee
fix conversion warnings
oontvoo Sep 5, 2024
ed8a3f1
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Sep 5, 2024
990e1ca
finish up todos
oontvoo Sep 5, 2024
479cd13
reformat docs
oontvoo Sep 20, 2024
6d8aee2
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Sep 24, 2024
3a17dbb
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
a6a6c7b
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
f30dec6
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
fd3da20
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
60616ef
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Sep 24, 2024
8c0ac5a
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
c8be829
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
1393c1f
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
eb07577
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
0376abc
fix comment
oontvoo Sep 24, 2024
e2f7c23
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
17dfac7
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
1ea0906
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
39fd0e7
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
67b7688
Addressed review comments: remove most of the member fields from Tele…
oontvoo Sep 24, 2024
efd25d8
use llvm::StringLiteral
oontvoo Sep 24, 2024
4a8276f
Addressed review comment:
oontvoo Sep 26, 2024
a16344b
pass test context as arg to ctor
oontvoo Sep 27, 2024
26ee5eb
construct expected json directly rather than parsing from strings
oontvoo Sep 30, 2024
b766e3c
remove unnecessary use of atomic seed
oontvoo Sep 30, 2024
c8cddab
addressed review comments
oontvoo Nov 21, 2024
5f8d65f
s/emitEntry/receiveEntry
oontvoo Nov 21, 2024
dffeacd
Addressed review comments:
oontvoo Nov 27, 2024
398ed3e
updated doc
oontvoo Nov 27, 2024
e462908
fix build warning
oontvoo Dec 4, 2024
df8a9af
more cleanup
oontvoo Dec 4, 2024
70f4742
spacing
oontvoo Dec 4, 2024
8eab77a
more docs
oontvoo Dec 4, 2024
f9e1a65
pass map arg as const ref
oontvoo Dec 9, 2024
2a1fbe5
Update llvm/docs/Telemetry.rst
oontvoo Dec 11, 2024
6e3a85d
Update llvm/docs/Telemetry.rst
oontvoo Dec 11, 2024
f9b1cce
addressed review comments
oontvoo Dec 11, 2024
0f38a74
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
ad57099
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
a2fcd4d
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
628e7e9
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
6ea4e92
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
08cf32f
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
3c52401
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
07f06c0
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
06e746e
addressed review comments:
oontvoo Dec 16, 2024
2476294
more styling fixe
oontvoo Dec 16, 2024
5ce406c
dest -> Dest
oontvoo Dec 16, 2024
fd57d06
fix caps
oontvoo Dec 16, 2024
a8a878b
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
54eeaba
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
32dfc6a
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
0893c5b
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
155f243
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
b255e3a
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
14d1c92
formatting
oontvoo Dec 17, 2024
bf2172d
update design chart
oontvoo Dec 17, 2024
c2dcb7d
replace write(std::map) with beginObject/endObject interface
oontvoo Dec 17, 2024
bbe9009
formatting
oontvoo Dec 17, 2024
e2036c6
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 18, 2024
05947d5
Update llvm/docs/Telemetry.rst
oontvoo Dec 18, 2024
f1100bb
add helper write() that takes a map. also updated docs
oontvoo Dec 18, 2024
2f64b29
added additional overloads for int types
oontvoo Dec 18, 2024
a5df7a0
formatting
oontvoo Dec 18, 2024
585fee8
remove more unused headers
oontvoo Dec 18, 2024
3812cda
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
0cdc240
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
da8056f
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
c116074
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
a4e2799
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
2763d46
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
75c1b4b
addressed review comments
oontvoo Dec 19, 2024
9780ec7
Addressed review comments:
oontvoo Dec 19, 2024
4715743
Update llvm/lib/Telemetry/Telemetry.cpp
oontvoo Dec 20, 2024
d231bf2
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 20, 2024
1941b1f
addressed review comments:
oontvoo Dec 20, 2024
bd3df5e
Add simple assert to make sure the map's key is string compat
oontvoo Dec 20, 2024
4351b94
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Dec 20, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
214 changes: 214 additions & 0 deletions llvm/docs/Telemetry.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,214 @@
===========================
Telemetry framework in LLVM
===========================

.. contents::
:local:

.. toctree::
:hidden:

===========================
Telemetry framework in LLVM
===========================

Objective
=========

Provides a common framework in LLVM for collecting various usage and performance
metrics.
It is located at `llvm/Telemetry/Telemetry.h`

Characteristics
---------------
* Configurable and extensible by:

* Tools: any tool that wants to use Telemetry can extend and customize it.
* Vendors: Toolchain vendors can also provide custom implementation of the
library, which could either override or extend the given tool's upstream
implementation, to best fit their organization's usage and privacy models.
* End users of such tool can also configure Telemetry (as allowed by their
vendor).


Important notes
----------------

* There is no concrete implementation of a Telemetry library in upstream LLVM.
We only provide the abstract API here. Any tool that wants telemetry will
implement one.

The rationale for this is that, all the tools in LLVM are very different in
what they care about (what/where/when to instrument data). Hence, it might not
be practical to have a single implementation.
However, in the future, if we see enough common pattern, we can extract them
into a shared place. This is TBD - contributions are welcomed.

* No implementation of Telemetry in upstream LLVM shall store any of the
collected data due to privacy and security reasons:

* Different organizations have different privacy models:

* Which data is sensitive, which is not?
* Whether it is acceptable for instrumented data to be stored anywhere?
(to a local file, what not?)

* Data ownership and data collection consents are hard to accommodate from
LLVM developers' point of view:

* Eg., data collected by Telemetry is not neccessarily owned by the user
of an LLVM tool with Telemetry enabled, hence the user's consent to data
collection is not meaningful. On the other hand, LLVM developers have no
reasonable ways to request consent from the "real" owners.


High-level design
=================

Key components
--------------

The framework consists of four important classes:

* `llvm::telemetry::Telemeter`: The class responsible for collecting and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RST uses double backticks for inline literals. If you click on "Display the rich diff" GitHub will show you the rendered version of the RST file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

transmitting telemetry data. This is the main point of interaction between the
framework and any tool that wants to enable telemery.
* `llvm::telemetry::TelemetryInfo`: Data courier
* `llvm::telemetry::Destination`: Data sink to which the Telemetry framework
sends data.
Its implementation is transparent to the framework.
It is up to the vendor to decide which pieces of data to forward and where
to forward them to their final storage.
* `llvm::telemetry::Config`: Configurations on the `Telemeter`.

.. image:: llvm_telemetry_design.png

How to implement and interact with the API
------------------------------------------

To use Telemetry in your tool, you need to provide a concrete implementation of the `Telemeter` class and `Destination`.

1) Define a custom `Telemeter` and `Destination`

.. code-block:: c++

// This destination just prints the given entry to a stdout.
// In "real life", this would be where you forward the data to your
// custom data storage.
class MyStdoutDestination : public llvm::telemetry::Destination {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you qualify Destination with its namespaces, but none of the other classes. You should probably omit it here.

public:
Error emitEntry(const TelemetryInfo* Entry) override {
return sendToBlackBox(Entry);

}

private:
Error sendToBlackBox(const TelemetryInfo* Entry) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following the benefit of the extra function here? Could the contents of this function just be inlined?

Also, since this is code in our documentation, it should probably follow LLVM coding standards (specifically here, the * character looks to be in the wrong place).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - i was just trying to demonstrate that the data can be forwarded to somewhere else ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if the comment is inlined into emitEntry along with the code, that would simplify the example without really losing the fact that the data can be sent somewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - simplified the example

// This could send the data anywhere.
// But we're simply sending it to stdout for the example.
llvm::outs() << entryToString(Entry) << "\n";
return llvm::success();
}

std::string entryToString(const TelemetryInfo* Entry) {
// make a string-representation of the given entry.
}
};

// This defines a custom TelemetryInfo that has an addition Msg field.
struct MyTelemetryInfo : public llvm::telemetry::TelemetryInfo {
std::string Msg;

json::Object serializeToJson() const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have override.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done -(the function will actually be removed )

json::Object Ret = TelemeteryInfo::serializeToJson();
Ret.emplace_back("MyMsg", Msg);
return std::move(Ret);
}

// TODO: implement getKind() and classof() to support dyn_cast operations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intended as additional instructions to someone using this class as a base, or something to come back to before merging?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is supposed to be instructions to users to define these.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks. To me "TODO" seemed more like something to tackle before merging. I'm okay with you leaving it like this, if you want, but would personally write the function definition, but leave the body blank with a comment like this or something: /* ... */

};

class MyTelemeter : public llvm::telemery::Telemeter {
public:
static std::unique_ptr<MyTelemeter> createInstatnce(llvm::telemetry::Config* config) {
// If Telemetry is not enabled, then just return null;
if (!config->EnableTelemetry) return nullptr;

std::make_unique<MyTelemeter>();
}
MyTelemeter() = default;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually needed?


void logStartup(llvm::StringRef ToolName, TelemetryInfo* Entry) override {
if (MyTelemetryInfo* M = dyn_cast<MyTelemetryInfo>(Entry)) {
M->Msg = "Starting up tool with name: " + ToolName;
emitToAllDestinations(M);
} else {
emitToAllDestinations(Entry);
}
}

void logExit(llvm::StringRef ToolName, TelemetryInfo* Entry) override {
if (MyTelemetryInfo* M = dyn_cast<MyTelemetryInfo>(Entry)) {
M->Msg = "Exitting tool with name: " + ToolName;
emitToAllDestinations(M);
} else {
emitToAllDestinations(Entry);
}
}

void addDestination(Destination* dest) override {
destinations.push_back(dest);
}

// You can also define additional instrumentation points.)
void logAdditionalPoint(TelemetryInfo* Entry) {
// .... code here
}
private:
void emitToAllDestinations(const TelemetryInfo* Entry) {
// Note: could do this in parallel, if needed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest dropping this note: ultimately the end implementation will determine parallelism or not, and I think that's somewhat obvious, given that there isn't really any real upstream implementation involved here.

for (Destination* Dest : Destinations)
Dest->emitEntry(Entry);
}
std::vector<Destination> Destinations;
};

2) Use the library in your tool.

Logging the tool init-process:

.. code-block:: c++

// At tool's init code
auto StartTime = std::chrono::time_point<std::chrono::steady_clock>::now();
llvm::telemetry::Config MyConfig = makeConfig(); // build up the appropriate Config struct here.
auto Telemeter = MyTelemeter::createInstance(&MyConfig);
std::string CurrentSessionId = ...; // Make some unique ID corresponding to the current session here.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I think this should be the Telemeter class's responsibility. It shouldn't be something tool developers need to do. Otherwise, the session IDs may not all follow the same scheme (e.g. one tool might use a timestamp-based system, another a UUID etc), even if they've been written by the same vendor.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving this some more thought, perhaps there should be a default implementation upstream to generate the session ID that is extensible in some manner. I imagine that the session IDs for some people might want to be something different to UUIDs. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like virtual std::string telemetry::Manager::makeSessionId()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, or even have a SessionID class that can be extended further. The simple returning a string option though is probably sufficient for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - Added this on the Config


// Any other tool's init code can go here
// ...

// Finally, take a snapshot of the time now so we know how long it took the
// init process to finish
auto EndTime = std::chrono::time_point<std::chrono::steady_clock>::now();
MyTelemetryInfo Entry;
Entry.SessionId = CurrentSessionId ; // Assign some unique ID here.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my previous comment, I think this should be done within the Telemeter class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - the Manager now assigns the ID to the entry before sending it to Destination

Entry.Stats = {StartTime, EndTime};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no Stats member of this class.

Telemeter->logStartup("MyTool", &Entry);

Similar code can be used for logging the tool's exit.

Additionall, at any other point in the tool's lifetime, it can also log telemetry:

.. code-block:: c++

// At some execution point:
auto StartTime = std::chrono::time_point<std::chrono::steady_clock>::now();

// ... other events happening here

auto EndTime = std::chrono::time_point<std::chrono::steady_clock>::now();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: indentation here and the lines below is inconsistent.

MyTelemetryInfo Entry;
Entry.SessionId = CurrentSessionId ; // Assign some unique ID here.
Entry.Stats = {StartTime, EndTime};
Telemeter->logAdditionalPoint(&Entry);
4 changes: 4 additions & 0 deletions llvm/docs/UserGuides.rst
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ intermediate LLVM representation.
SupportLibrary
TableGen/index
TableGenFundamentals
Telemetry
Vectorizers
WritingAnLLVMPass
WritingAnLLVMNewPMPass
Expand Down Expand Up @@ -293,3 +294,6 @@ Additional Topics

:doc:`Sandbox IR <SandboxIR>`
This document describes the design and usage of Sandbox IR, a transactional layer over LLVM IR.

:doc:`Telemetry`
This document describes the Telemetry framework in LLVM.
Binary file added llvm/docs/llvm_telemetry_design.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
136 changes: 136 additions & 0 deletions llvm/include/llvm/Telemetry/Telemetry.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
//===- llvm/Telemetry/Telemetry.h - Telemetry -------------------*- C++ -*-===//
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: coding standards have very recently changed such that the C++ bit of this line is no longer needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does the new header look like? (I've only seen this format ....)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
///
/// \file
/// This file provides the basic framework for Telemetry
///
/// It comprises of three important structs/classes:
///
/// - Telemeter: The class responsible for collecting and forwarding
/// telemery data.
/// - TelemetryInfo: data courier
/// - TelemetryConfig: this stores configurations on Telemeter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would omit this and make Telemetry.rst the canonical source of truth for this class. I'm worried the two are going to get out of sync.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

///
/// Refer to its documentation at llvm/docs/Telemetry.rst for more details.
//===---------------------------------------------------------------------===//

#ifndef LLVM_TELEMETRY_TELEMETRY_H
#define LLVM_TELEMETRY_TELEMETRY_H

#include <chrono>
#include <ctime>
#include <memory>
#include <optional>
#include <string>

#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/JSON.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not looked at the other headers, but I believe at least JSON.h isn't needed now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


namespace llvm {
namespace telemetry {

/// Configuration for the Telemeter class.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment needs updating for the new class name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// This stores configurations from both users and vendors and is passed
/// to the Telemeter upon construction. (Any changes to the config after
/// the Telemeter's construction will not have effect on it).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// the Telemeter's construction will not have effect on it).
/// the Telemeter's construction will not have effect any on it).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

///
/// This struct can be extended as needed to add additional configuration
/// points specific to a vendor's implementation.
struct Config {
// If true, telemetry will be enabled.
bool EnableTelemetry;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my comment below about the mechanism for setting this being vendor-specific, perhaps this should be an abstract method isTelemetryEnabled or similar. In this class's current state, the tool has to specifically opt in or out of telemetry, whereas I think the decision needs to depend on how the vendor wants to control it. With the abstract method, the vendor will define a Config subclass that specifies how this can be determined, e.g. via reading an environment variable. My only concern here is how would a "can enable/disable it on the command-line" behaviour be implemented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern here is how would a "can enable/disable it on the command-line" behaviour be implemented?

Short answer to your question: Yes, definitely.

The idea is to have a telemetry::makeConfig() function that can be overridden by tool or vendors (the default impl will, of course, return a mostly empty config with telemetry disabled).

In the makeConfig() function, the vendor can set whatever value they want - they could also extend the class to have additional flags, if needed.

Does that address your concern?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of, but not entirely.

Two thoughts:

  1. By making the EnableTelemetry member a public non-const variable, it's possible for tool code to change its value, regardless of the intended behaviour by the vendor. Obviously, this would be incorrect to do, but a developer adding telemetry might not realise that. Making it an abstract method would force the behaviour to be defined in the downstream vendor and make it unmodifiable.
  2. How would makeConfig access options specified on the command-line if they're not stored in global variables?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We can make EnableTelemetry a const variable.
  2. I'm not sure I understand the issue here. In my mind, the functin's impl can read from the commandline variable if it wants.

Eg.:

// Upstream (ie., default) impl
std::unique_ptr<Config> llvm::telemetry::makeConfig() {
return std::Make_unique<Config>{/*EnableTelemetry*/false};
}

// In some tool specific sub-directory:
std::unique_ptr<Config> mytool::telemetry::makeConfig() {
  InputArgList args = ....;
  bool enableTelemetry =  args.getLastArgValue(OPT_EnableTelemetry);
  return std::make_unique<Config>{enableTelemetry);
}



// Implementation-defined names of additional destinations to send
// telemetry data (Could be stdout, stderr, or some local paths, etc).
//
// These strings will be interpreted by the vendor's code.
// So the users must pick the from their vendor's pre-defined
// set of Destinations.
std::vector<std::string> AdditionalDestinations;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this contain a list of telemetry::Destination?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this field intended to contain the description of the Destinations. In other words, the Config struct is intended to contain descriptions and configurations from the tool (and its user) - it is then passed to the Telemetry framework to construct the object(s).

Because Destination classes are defined in vendor plugins, it's not ideal to make the tool (upstrream code) or users build up the Destinations themselves.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm beginning to think that this field doesn't belong in the generic upstream implementation. I think we can all agree that a field for enabling telemetry is common, though I also think it's fair to say that the mechanism for setting it will be vendor-specific. However, the destinations feel entirely like they should be vendor side, given we're not providing any concrete destinations in the upstream implementation.

For each destination a vendor defines, they'll either have it enabled by default, or disabled by default, with a user-facing option to enable it. It may be the case that they have NO destinations that are user-configurable, in which case this field is entirely dead weight. Even if it is configurable by the user somehow, it would seem like the logical way of doing this will be implementation dependent and therefore should be part of the vendor's Config subclass. For example, they might prefer a couple of boolean fields "enableStdoutLogging", "sendDataToServer" or something like that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @jh7370 comment. I don't think this belongs here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, can remove it.

};

/// For isa, dyn_cast, etc operations on TelemetryInfo.
typedef unsigned KindType;
/// This struct is used by TelemetryInfo to support isa<>, dyn_cast<>
/// operations.
/// It is defined as a struct (rather than an enum) because it is
/// expectend to be extended by subclasses which may have
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// expectend to be extended by subclasses which may have
/// expected to be extended by subclasses which may have

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// additional TelemetryInfo types defined to describe different events.
struct EntryKind {
static const KindType Base = 0;
};

/// TelemetryInfo is the data courier, used to move instrumented data
/// from the tool being monitored to the Telemery framework.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// from the tool being monitored to the Telemery framework.
/// from the tool being monitored to the Telemetry framework.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

///
/// This base class contains only the basic set of telemetry data.
/// Downstream implementations can add more fields as needed to describe
/// additional events.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

Suggested change
/// Downstream implementations can add more fields as needed to describe
/// additional events.
/// Downstream implementations can define subclasses to provide different bits
/// of information.

I'm suggesting this, because in our downstream implementation, we have lots of event types, each of which would be a variation of TelemtryInfo but with different data, whereas the way the comment is currently phrased implies there should be one event class, with lots of different fields, to cover all the different events.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - yes, the intent is to allow for different event classes , not one class with a bunch of fields.
(I meant to write "Subclasses can add more fields as needed ...")

///
/// For example, The LLDB debugger can define a DebugCommandInfo subclass
/// which has additional fields about the debug-command being instrumented,
/// such as `CommandArguments` or `CommandName`.
struct TelemetryInfo {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we've discussed time points before and I've been giving it a bit more thought today. I think there are three broad time-related variations of a piece of telemetry data:

  1. Things that describe some state related to the telemetry session that are not specific to a point in time.
  2. Things that describe some state at a specific point in time, or that describe an occurrence of something at a specific point of time.
  3. Things that describe the same as 2, but over a specific time range rather than a single point in time.

For a given event type, I don't think it makes sense for it to support more than one of these variations. So I think that implies either three subclasses, or probably simpler, three constructors for TelemetryInfo, like in the following pseudo-code.

TelemetryInfo(); // Sets timepoints to null.
TelemetryInfo(time_point tp); // Sets "start" time to tp, and end time to null.
TelemetryInfo(time_point start, time_point end); // Sets the corresponding start and end times.

I feel like this is generic enough that it deserves to be in the base framework, rather than having most/all downstream vendors having to implement the same things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly prefer to not include time in the base implementation. Our downstream implementation exclusively uses counts and has a different mechanism for collecting performance telemetry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're looking at an outdated snapshot of the patch somehow.
In my tree, the TelemetryInfo class currently only has one field, which is the SessionId.

Would that resolve your concern here?

// This represents a unique-id, conventionally corresponding to
// a tool's session - i.e., every time the tool starts until it exits.
//
// Note: a tool could have multiple sessions running at once, in which
// case, these shall be multiple sets of TelemetryInfo with multiple unique
// ids.
//
// Different usages can assign different types of IDs to this field.
std::string SessionId;

TelemetryInfo() = default;
~TelemetryInfo() = default;

virtual json::Object serializeToJson() const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function still feels wrong to me. I reckon it should look like this:

virtual void serialize(Serializer &serializer) const = 0;

Concrete implementations populate it with functions like this:

void serialize(Serializer &serializer) const override {
  serializer.write("key1", value);
  serializer.write("key2", value2);
}

The Serializer class is then an abstract class, with concrete implementations defined in partnership with the Destination classes. For example, you'd have a JsonSerializer that knows how to record key/value pairs as Json, and then Destination classes could be StdoutDestination or FileDestination. Since Json is likely to be a popular choice, we might want to have JsonSerializer defined upstream. We could have StdoutDestination and/or FileDestination upstream too (or more generically StreamDestination). By restricting the destination enabling to vendor-side code, there's no security risk here, because the vendor can simply not provide the hooks to configure these destinations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to clarify the rationale of this a bit. Previously(see comments on Sept 3), you raised an issue with the design of Destination class where you said it tried to do two things ("decide how and where to send the data"). To address that, I've taken the "how" (ie., "how to serialize" )part out and put it into this TelemetryInfo class. Now in retrospect, maybe the documentation should have been written simply as "The Destination class is responsible for receiving telemetry data" (where the data eventually go AFTER that is its own implementation). Would that have addressed your concern?

Second point, w.r.t your concern about the explosion of (serialization-methods) x (final destinations), I think that should be an implementation detail on the Destination class(es). If the vendor wants to delegate the serialization-work to some common util to be shared amongst different Destinations(which, of course, they should), then they can do that - but it doesn't need to be explicitly specified here in the upstream interface.

Third point, your proposal for having the Serializer class does not work well for LLDB (and for our internal) use cases:

  • If we want to write structured-data, then we'd end up replicating most of the llvm::json interface
  • If we want to serialize to protobuf, which is what we would use internally at Google, then it's not quite straightforward because the code here would have to tease out the right key string.

This was also why I initially wanted to "hide" the serialization step inside Destination. The idea is that the Destination class would take an Info object directly, pick out what fields it wants, and put them into whatever serialization form defined by the vendor. (Again, the idea of sharing the serialization code would still be possible, but it's an implementation detail and not part of the interface).

Does this sound reasonable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we've got slightly different visions of things and it's driving the way we're looking at the design differently. Reading between the lines of some of the comments and responses you've made, I get the impression you expect either all concrete TelemetryInfo subclasses to be defined by vendors, or at least the all downstream (i.e. all) Destination classes to know how to convert them to whatever the Destination wants. In my view, I'd prefer it if a) there was the option to have some (but certainly not all) TelemetryInfo subclasses in the upstream LLVM repository, and b) a way to avoid repeating more than is fundamentally necessary to do the serialization. This is a strategy we use successfully in our internal telemetry project.

In my mind, unless we have concrete instances of at least TelemetryInfo in the upstream codebase, and therefore the ability for upstream tools to dispatch telemetry with no more than a vendor-specified Destination (specified in just one place), the whole upstream framework becomes pointless, because downstream vendors still end up with the same level of patches needed to actually make use of the framework.

Regarding the details, I'd expect the theoretical JsonSerializer class to make use of the llvm::json interface, so converting the input types passed by the TelemetryInfo subclass serialize method into the appropriate Json types (aside: this is probably generic enough that it could live in the core LLVM code). Similarly, a ProtobufSerializer would do something equivalent. The keys are intended to be string literals typically that the TelemetryInfo subclasses specify in their serialize method - I'm not quite sure what you mean by the "right key string" in this context, as I'm not familiar with protobuf; it would help if you could elaborate some more what the issue here is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we've got slightly different visions of things and it's driving the way we're looking at the design differently.

I think we have very similar ideas of how this ought to work, but there's some miscommunication probably. (ie., , your second paragraph describes pretty much how it is implemented, at least for LLDB's telemetry)

Reading between the lines of some of the comments and responses you've made, I get the impression you expect either all concrete TelemetryInfo subclasses to be defined by vendors, or at least the all downstream (i.e. all) Destination classes to know how to convert them to whatever the Destination wants.

I want to point out that we should not conflate vendor-specific code with tool-specific code.
While it is true that I do not propose any concrete implementation here in llvm::telemetry, there is a concrete implementation in lldb::telemetry.

So - there are two levels of (potential) overrides here:

  • One: by a tool that wants to use telemetry (eg., LLDB, clang, etc).
  • Two: by different vendors (for the same tool).

I would expect, the bulk of code implementing subclasses of TelemetryInfo to be upstream in the individual tool's directory. That is to say, all of LLDB's subclasses of TelemetryInfo would live in lldb/telemetry/... (but not here in llvm/telemetry, because, as pointed out before, different tools care about different events and concepts).

Then, different vendors (for the same tool) will provide the concrete implementation of Destination. In this sense, I believe the Destination class does exactly what you envision the Serializer class would (ie., it takes a TelemetryInfo object and picks some or all of the data from that object and forward them elsewhere).

In other words, the only vendor-defined pieces are the concrete implementation of Destination
If you would like to have some share utils (eg., how to take a TelemetryInfo and turns it into a json object, you could still do that by putting the code in the tool's upstream directory, then your vendor-specific code can call into that)

Copy link
Member Author

@oontvoo oontvoo Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S: Concretely, it could look something like this:

// In upstream  LLDB: 
llvm::Error lldb::telemetry::Manager::atStartup( TelemetryInfo* info) {
       // Collect additional data and/or fill in the missing fields that the caller might have
       // left out.  (Eg., figure out which debuger-object this is since there could be many)

        // Forward the `info` to a bunch of pre-registered destinations:
       for (Destination* dest : m_destinations)
             dest->emitEntry(info);
}

llvm::Error lldb::telemetry::Manager::atMainExecutableLoad( MainExecutableInfo* info) {
        // Collect additional data and/or fill in the missing fields that the caller might have
       // left out
       // Eg., read the binary for its BUILD_UUID 
        .... 

        // Forward the `info` to a bunch of pre-registered destinations:
       for (Destination* dest : m_destinations)
             dest->emitEntry(info);
}

llvm::Error lldb::telemetry::Manager::atDebugCommandExecution( DebugCommandInfo* info) {
        // Collect additional data and/or fill in the missing fields that the caller might have
       // left out
       // Eg.,figure out whether this command was attached to a particular target or a standalone
      // ...

        // forward the `info` to a bunch of pre-registered destinations:
       for (Destination* dest : m_destinations)
             dest->emitEntry(info);
}

// In some hypothetical vendor code

llvm::Error lldb::google::InternalStorageDestination::emitEntry(TelemetryInfo* info)  override{
     switch(info->getKind()) {
              // ... dispatch to the corresponding type handler
    }
}

llvm::Error lldb::google::InternalStorageDestination::emitMainExecEntry(MainExecutableInfo* info) {

     google::MainExecutableProto p = make_proto<MainExecutableProto>();
    // pick out the fields we want to record from the argument and copy them into the proto.
   
    // send the proto to final storage:
   sendToInternalStorage(std::move(p));
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the details, I'd expect the theoretical JsonSerializer class to make use of the llvm::json interface, so converting the input types passed by the TelemetryInfo subclass serialize method into the appropriate Json types (aside: this is probably generic enough that it could live in the core LLVM code). Similarly, a ProtobufSerializer would do something equivalent. The keys are intended to be string literals typically that the TelemetryInfo subclasses specify in their serialize method - I'm not quite sure what you mean by the "right key string" in this context, as I'm not familiar with protobuf; it would help if you could elaborate some more what the issue here is.

Let me try to elaborate on this (I was discussing this with Vy last week). Protobufs are more strongly typed than json (you can basically think of them as structs which know how to serialize themselves to a byte stream). The natural way to work with them is to work with named accessors (my_proto.set_field1(value1); my_proto.set_field2(value2)). While they have a reflection API which allows accessing the fields using strings (which means a protobuf serializer could implement a virtual write(StringRef key, ...) function, it's a rather long-winded way of using them (you convert the field to a string in the TelemetryInfo object, and then immediately undo that in the reflection API. Since vendor "knows" about all of the telemetry types it wants to collect (*), it could do it more directly with my_proto.set_field1(info.field1).

The second source of confusion (now I'm moving on to the "right key string" part) is that this serializer.write API (at least in the way I/we understand it) is only suitable for implementing simple dictionary types (map<string, BasicType>). It's not very suitable for more complex value types like lists or (sub)dictionaries, as you'd have to add a new (virtual) function for each new data type. (This can be avoided by creating some sort of a polymorphic value type that can hold arbitrary structured data, but at that point, we're sort of reinventing json::Value.)

(*) we assume that to be the case since the vendor needs to decide (based on its use case, data collection policy, etc.) whether a particular field can be collected or not

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanations - I now see what you're thinking.

I want to point out that we should not conflate vendor-specific code with tool-specific code.

This is a fair point. The key thing this means is that code in the top-level framework that is designed to be common needs to be able to work with subclasses defined in tool-level and vendor code. In particular, I think it's reasonable to expect there to be TelemetryInfo subclasses defined in common LLVM code (so for example a DurationEvent might be something desired by LLD, clang, llvm-objdump etc), in tools (for events specific to particular tools like LLDB) and in vendor code (for events that downstream vendors want which might not be applicable upstream).

I think it's also reasonable to assume that more event types will be written over time (and existing events will be extended). What I think we want to avoid though is for the need for every Destination (or some other class) to need to be extended every time a new event type is implemented. Indeed, it's going to be near impossible for downstream vendors to keep tabs on additions to upstream events in the current structure, I think, so they won't know whether there's a need to review new fields etc, with the current design. I guess the flipside of this is that if some vendors need to review fields for legal reasons, we need to make it clear to them that something has changed that needs review. Mind you, I think there'd still be cases where vendors need to review things without any changes in TelemetryInfo classes at all: imagine the case where someone changes the thing used to populate a field upstream, e.g. to use full paths instead of filenames - for some vendors, having the full path may not be acceptable.

Sorry, that last paragraph is a bit rambly, but I'm leaving it as-is, so that you can see some of the thoughts going through my head.

Ultimately, I think my point is that the advantage of having something like TelemetryInfo::serialize is that the Destination classes don't necessarily all need modifying every time a new event field or type is added. I'm open to other ways of how this might be achieved.

I'll acknowledge that something strongly typed like protobuf doesn't work well with the serialize method approach, because of the extra indirection. However, you could have ProtobufDestination not use the serialize approach at all and dispatch by switching based on type, as described. If we allowed both options, it would admittedly mean the serialize method wouldn't always be used, but it would allow some Destination types to make use of it.

The second source of confusion (now I'm moving on to the "right key string" part) is that this serializer.write API (at least in the way I/we understand it) is only suitable for implementing simple dictionary types (map<string, BasicType>). It's not very suitable for more complex value types like lists or (sub)dictionaries, as you'd have to add a new (virtual) function for each new data type. (This can be avoided by creating some sort of a polymorphic value type that can hold arbitrary structured data, but at that point, we're sort of reinventing json::Value.)

FWIW, in our downstream implementation, our Serializer base class declares startArray/endArray/startObject/endObject methods that our concrete serializers then implement for their different types. There's also a concrete method defined in the base class for serializing iterable types like arrays and vectors, and another for maps (the latter currently only supports std::unordered_map, but there's no reason it couldn't be extended to all map-like objects). Clearly for other complex object types, you'd need to define how to serialize that object type, though I've only rarely needed to resort to this in our downstream code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've thought about this for awhile, I think we can go with James's Serializer proposal.
(To handle the protobuf case, we can, as pointed, simply not use the Serializer).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(NOTE: to be implemented ... this is not resolved yet)


// For isa, dyn_cast, etc, operations.
virtual KindType getKind() const { return EntryKind::Base; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you ever intend to create instances of the base class? Should the class be abstract (perhaps through making this method abstract (=0))?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not immediately in LLDB, but some simpler use case might just need the base class?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, though I find that unlikely, since this entry carries essentially no information. And one could easily define a concrete empty class, if that's what one really needed.

One advantage of making this abstract is that you then don't need the EntryKind::Base enum and can leave the choice of enum values completely in the hands of the user. The second advantage is that it prevents object slicing (since you can't construct actual instances of the base). I think that making all non-leaf classes abstract is a useful property, particularly as here you indent to make copies of these objects (which usually forbidden for class hierarchies precisely due to slicing).

static bool classof(const TelemetryInfo *T) {
if (T == nullptr)
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see null checks in the other classof implementations (I think that because we have dyn_cast and dyn_cast_or_null cast flavour versions).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, for a base class, I think the proper implementation of this function is return true (as everything is an instance of the base type).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still believe there shouldn't be a null check here. we have llvm::(dyn_)cast*_if_present* for that.

return T->getKind() == EntryKind::Base;
}
};

/// This class presents a data sink to which the Telemetry framework
/// sends data.
///
/// Its implementation is transparent to the framework.
/// It is up to the vendor to decide which pieces of data to forward
/// and where to forward them.
class Destination {
public:
virtual ~Destination() = default;
virtual Error emitEntry(const TelemetryInfo *Entry) = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a Destination should be emitting anything - it should be receiving things. So I'd call this receiveEntry or receiveEvent or whatever.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'll respond to this in the same comment w.r.t the serializer because these two are related and it's easier to group comments into one)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - renamed to receiveEntry.

virtual llvm::StringLiteral name() const = 0;
};

/// This class is the main interaction point between any LLVM tool
/// and this framework.
/// It is responsible for collecting telemetry data from the tool being
/// monitored and transmitting the data elsewhere.
class Telemeter {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird name. Could you try to use English names only, please.

Copy link
Member Author

@oontvoo oontvoo Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is English, AFAIK. https://www.oxfordlearnersdictionaries.com/us/definition/english/telemeter_2

P.S: Basically means a device for recording & sending telemetry, which is the appropriate name for this class

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shows what I know 😄

Not sure it's a well-known word though, so it doesn't really help convey meaning though, which is the key thing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't resolve conversations I've been involved with (see https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178 for a lengthy discussion). I'll resolve these when I'm happy with the resolution. In particular, as noted, I don't think the name "Telemeter" is going to convey the meaning clearly enough. Furthermore, as noted earlier, having all the classes prefixed with Telemetry in their name seems unnecessary, given their presence in the telemetry namespace. Could they simply be called Destination, Config, Event etc?

Copy link
Member Author

@oontvoo oontvoo Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed the Destiantion, Config, etc to get rid of the prefix.

What are your suggestions for alternatives for Telemeter?
(prefer something that does not have Log or Logger in it - as someone pointed out it might cause confusions)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S: How about TelemetryCollector?
(It doesn't fully communicate the aspect of "also transmitting the collected data" which Telemeter does, but I suppose that's close enough)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I had a similar reaction as @jh7370 when I saw the class name. Without the Telemetry prefix, it seems reasonable to use that as the class name? That seems to convey everything the class is doing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would go with Session, personally, because the class seems to most closely resemble our same-named class in our downstream implementation. I'd love to avoid Collector, mostly because for us our "telemetry collector" is actually a separate executable that monitors the file system for telemetry data files created by the individual tools, before forwarding that data onto the servers (in a roundabout way it actually does what this class is trying to do, just at a different point in the process, but I digress...)! However, neither of these points are especially strong, given they're more to do with how our downstream implementation works.

I see this class as being the thing that a) receives the user-created configuration data, b) receives the telemetry data, c) does things related to setup and teardown of the telemetry system. It then forwards that data onto the Destinations and generally manages things to do with telemetry. I therefore could be persuaded by Session, Manager, Service or even System.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Session, assuming this class is responsible for generating/storing the unique session ID. I would imagine in LLDB a session is created during Initialization and cleaned up during Termination.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I believe this is resolved - we've agreed to name it telemetry::Manager)

public:
// Invoked upon tool startup
virtual void atStartup(llvm::StringRef ToolPath, TelemetryInfo *Entry) = 0;

// Invoked upon tool exit.
virtual void atExit(llvm::StringRef ToolPath, TelemetryInfo *Entry) = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow why atStartup and atExit should receive a TelemetryInfo. I also don't follow why atExit would need a ToolPath argument.

Our telemetry Session class constructor takes an ApplicationInfo object, which, among other things, includes the tool name, the tool version and various properties about the tool that are more vendor specific. This data is attached to every one of our events, so that we can see in the database what the source of our different events are. Perhaps the StringRef ToolPath should be replaced with something similar?

Also, is atStartup really a separate function, or should it be part of the constructor? Should it then take a Config?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to paraphase the 3 questions (I think) posed in this comment:

  1. Why does atStartup and atExit have to take a TelemetryInfo?
    Answer: The intention is that the caller would put data into this TelemetryInfo and give to the telemetry::Manager via these atStartup or atExit. In otherwords, the TelemetryInfo (and its subclasses) acts as the "data courier" to move data from the tool (the caller) to the telemtry framework.

  2. Why would they need a Toolpath argument?
    Answer: No, in fact, it doesn't. This piece of data could be put into the TelemetryInfo object,

  3. does atStartup need to be separate function vs part of ctor (presumably of the telemetry::Manager class)

It could go either way. My preference for having a separate atStartup from the ctor is that it allows for flexibility on when the telemetry::Manager is created. But that doesn't mean down stream implementations can't change this one way or the other.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for the explanation. I'm happy with the 2 and 3 responses. For the 1 case, how might atStart and atExit actually differ internally, since all the data is stored in the TelemetryInfo input? Relatedly, how would you send events at other points in time?

Copy link
Member Author

@oontvoo oontvoo Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the 1 case, how might atStart and atExit actually differ internally, since all the data is stored in the TelemetryInfo input?

Do you mean, how would the implementation of atStart and atExit be different?
At the base level, perhaps not much. But different tools could have different things it wants to do at startup VS at exit.

Relatedly, how would you send events at other points in time?

The subclass of telemetry::Manager can define additional functions for that (eg., for LLDB's , we define lldb::telemetry::LldbManager::logMainExecutableLoad(ModuleTelemetryInfo) to be called when the main executable is loaded, or ::logDebugCommand (DebugCommandTelemetryInfo) to be called when a command is executed.)

These functions would be specific to each tool, hence not shared in the llvm::telemetry::Manager interface.
Also, their arguments (ModuleTelemetryInfo and DebugCommandTelemetryInfo) would be subclasses of llvm::telemetry::TelemetryInfo, as these classes can have additional fields/attributes to describe additional concepts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I expressed this in the past, but I still take issue with these two methods, I don't think they belong in the base implementation and should be specific instance of TelemtryInfo. This only makes sense in the context of a tool and not in the context of library like LLDB. When used from Python the calls to Initialize and Terminate do not have to correspond to startup or exit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what you'd like to be in this class?

Right now, it only has atStartup and atExit because they are the only two common actions most tools have.
Without them, the class would be empty?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think a generic dispatch (or alternatively named function) that takes a TelemetryInfo and forwards it to the configured Destination(s) would make more sense. Then in the tool/library code, it simply calls dispatch whenever required (which could include during startup/close down) and passes in the desired TelemetryInfo to do what it wants.

As well as the dispatch method, I think a std::vector<std::unique_ptr<Destination>> probably belongs in the Manager class (which dispatch uses). Otherwise, where are the Destination instances owned? If the downstream vendor has an implementation that needs to own stuff, then the Destination stored in Manager could just be a thing that references the downstream vendor piece.

dispatch would look something like:

void dispatch(TelemetryInfo *event) {
  // Maybe have an `onDispatch` hook that is vendor-defined (default do 
  // nothing to allow the vendor to do extra stuff here).
  for (Destination *dest : destinations)
    dest->receiveEntry(event);
}

Then in the tool code, you might have something like:

int tool_main(...) {
  ...
  manager.dispatch(StartupEvent());
  ...
  manager.dispatch(OtherEvent());
  ...
  manager.dispatch(ShutdownEvent());
}

Finally, I think it makes sense for the Manager constructor to take a Config, so that the downstream implementations of Manager can then hook into the stuff defined by Config in some manner. In particular, I could see an argument that the dispatch method should do nothing when telemetry is not enabled, so it would need access to that information.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - changed this to dispatch()


virtual void addDestination(Destination *Destination) = 0;
};

} // namespace telemetry
} // namespace llvm

#endif // LLVM_TELEMETRY_TELEMETRY_H
1 change: 1 addition & 0 deletions llvm/lib/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ add_subdirectory(ProfileData)
add_subdirectory(Passes)
add_subdirectory(TargetParser)
add_subdirectory(TextAPI)
add_subdirectory(Telemetry)
add_subdirectory(ToolDrivers)
add_subdirectory(XRay)
if (LLVM_INCLUDE_TESTS)
Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Telemetry/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
add_llvm_component_library(LLVMTelemetry
Telemetry.cpp

ADDITIONAL_HEADER_DIRS
"${LLVM_MAIN_INCLUDE_DIR}/llvm/Telemetry"
)
13 changes: 13 additions & 0 deletions llvm/lib/Telemetry/Telemetry.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
#include "llvm/Telemetry/Telemetry.h"

namespace llvm {
namespace telemetry {

llvm::json::Object TelemetryInfo::serializeToJson() const {
return json::Object{
{"SessionId", SessionId},
};
};

} // namespace telemetry
} // namespace llvm
Loading