Skip to content

[llvm]Add a simple Telemetry framework #102323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 99 commits into from
Dec 20, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
99 commits
Select commit Hold shift + click to select a range
dbb8b15
[llvm][lib]Propose a simple Telemetry framework.
oontvoo Aug 7, 2024
6e43f67
add header
oontvoo Aug 7, 2024
e25f5fc
fixed typo
oontvoo Aug 7, 2024
ad3906c
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Aug 9, 2024
24d07d6
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Aug 22, 2024
0057bcf
add tests and addressed review comments
oontvoo Aug 29, 2024
750e4ac
formatting changes
oontvoo Aug 29, 2024
1378ed4
more formatting
oontvoo Aug 29, 2024
02e750e
more formatting changes
oontvoo Aug 29, 2024
63e99fc
Added header comment to describe the package
oontvoo Aug 29, 2024
fa88512
reformat header
oontvoo Aug 29, 2024
0866f64
addressed review comments and added separate docs in llvm/docs/
oontvoo Sep 3, 2024
a8523f7
updated field names in tests
oontvoo Sep 4, 2024
47e8b06
reformated doc and added additional details
oontvoo Sep 4, 2024
690c6ab
fix formatting
oontvoo Sep 4, 2024
db668f4
details
oontvoo Sep 4, 2024
0290a14
formatted TelemetryTest.cpp
oontvoo Sep 4, 2024
14b5234
fix formatting in docs
oontvoo Sep 4, 2024
6ca87e5
convert comments to doxygen style
oontvoo Sep 4, 2024
4155add
fix doc toc
oontvoo Sep 4, 2024
11ead4a
group all the testing-params into a Context struct
oontvoo Sep 5, 2024
48228ee
fix conversion warnings
oontvoo Sep 5, 2024
ed8a3f1
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Sep 5, 2024
990e1ca
finish up todos
oontvoo Sep 5, 2024
479cd13
reformat docs
oontvoo Sep 20, 2024
6d8aee2
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Sep 24, 2024
3a17dbb
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
a6a6c7b
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
f30dec6
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
fd3da20
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
60616ef
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Sep 24, 2024
8c0ac5a
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
c8be829
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
1393c1f
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
eb07577
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
0376abc
fix comment
oontvoo Sep 24, 2024
e2f7c23
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
17dfac7
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
1ea0906
Update llvm/docs/Telemetry.rst
oontvoo Sep 24, 2024
39fd0e7
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Sep 24, 2024
67b7688
Addressed review comments: remove most of the member fields from Tele…
oontvoo Sep 24, 2024
efd25d8
use llvm::StringLiteral
oontvoo Sep 24, 2024
4a8276f
Addressed review comment:
oontvoo Sep 26, 2024
a16344b
pass test context as arg to ctor
oontvoo Sep 27, 2024
26ee5eb
construct expected json directly rather than parsing from strings
oontvoo Sep 30, 2024
b766e3c
remove unnecessary use of atomic seed
oontvoo Sep 30, 2024
c8cddab
addressed review comments
oontvoo Nov 21, 2024
5f8d65f
s/emitEntry/receiveEntry
oontvoo Nov 21, 2024
dffeacd
Addressed review comments:
oontvoo Nov 27, 2024
398ed3e
updated doc
oontvoo Nov 27, 2024
e462908
fix build warning
oontvoo Dec 4, 2024
df8a9af
more cleanup
oontvoo Dec 4, 2024
70f4742
spacing
oontvoo Dec 4, 2024
8eab77a
more docs
oontvoo Dec 4, 2024
f9e1a65
pass map arg as const ref
oontvoo Dec 9, 2024
2a1fbe5
Update llvm/docs/Telemetry.rst
oontvoo Dec 11, 2024
6e3a85d
Update llvm/docs/Telemetry.rst
oontvoo Dec 11, 2024
f9b1cce
addressed review comments
oontvoo Dec 11, 2024
0f38a74
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
ad57099
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
a2fcd4d
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
628e7e9
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
6ea4e92
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
08cf32f
Update llvm/docs/Telemetry.rst
oontvoo Dec 16, 2024
3c52401
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
07f06c0
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 16, 2024
06e746e
addressed review comments:
oontvoo Dec 16, 2024
2476294
more styling fixe
oontvoo Dec 16, 2024
5ce406c
dest -> Dest
oontvoo Dec 16, 2024
fd57d06
fix caps
oontvoo Dec 16, 2024
a8a878b
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
54eeaba
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
32dfc6a
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
0893c5b
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
155f243
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
b255e3a
Update llvm/docs/Telemetry.rst
oontvoo Dec 17, 2024
14d1c92
formatting
oontvoo Dec 17, 2024
bf2172d
update design chart
oontvoo Dec 17, 2024
c2dcb7d
replace write(std::map) with beginObject/endObject interface
oontvoo Dec 17, 2024
bbe9009
formatting
oontvoo Dec 17, 2024
e2036c6
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 18, 2024
05947d5
Update llvm/docs/Telemetry.rst
oontvoo Dec 18, 2024
f1100bb
add helper write() that takes a map. also updated docs
oontvoo Dec 18, 2024
2f64b29
added additional overloads for int types
oontvoo Dec 18, 2024
a5df7a0
formatting
oontvoo Dec 18, 2024
585fee8
remove more unused headers
oontvoo Dec 18, 2024
3812cda
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
0cdc240
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
da8056f
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
c116074
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
a4e2799
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
2763d46
Update llvm/unittests/Telemetry/TelemetryTest.cpp
oontvoo Dec 19, 2024
75c1b4b
addressed review comments
oontvoo Dec 19, 2024
9780ec7
Addressed review comments:
oontvoo Dec 19, 2024
4715743
Update llvm/lib/Telemetry/Telemetry.cpp
oontvoo Dec 20, 2024
d231bf2
Update llvm/include/llvm/Telemetry/Telemetry.h
oontvoo Dec 20, 2024
1941b1f
addressed review comments:
oontvoo Dec 20, 2024
bd3df5e
Add simple assert to make sure the map's key is string compat
oontvoo Dec 20, 2024
4351b94
Merge branch 'llvm:main' into llvm_telemetry
oontvoo Dec 20, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
99 changes: 99 additions & 0 deletions llvm/include/llvm/Telemetry/Telemetry.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
#ifndef LVM_TELEMETRY_TELEMETRY_H
#define LVM_TELEMETRY_TELEMETRY_H

#include <chrono>
#include <ctime>
#include <memory>
#include <optional>
#include <string>

#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/JSON.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not looked at the other headers, but I believe at least JSON.h isn't needed now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


namespace llvm {
namespace telemetry {

using SteadyTimePoint = std::chrono::time_point<std::chrono::steady_clock>;

struct TelemetryConfig {
// If true, telemetry will be enabled.
bool enable_telemetry;

// Additional destinations to send the logged entries.
// Could be stdout, stderr, or some local paths.
// Note: these are destinations are __in addition to__ whatever the default
// destination(s) are, as implemented by vendors.
std::vector<std::string> additional_destinations;
};

struct TelemetryEventStats {
// REQUIRED: Start time of event
SteadyTimePoint m_start;
// OPTIONAL: End time of event - may be empty if not meaningful.
std::optional<SteadyTimePoint> m_end;
// TBD: could add some memory stats here too?

TelemetryEventStats() = default;
TelemetryEventStats(SteadyTimePoint start) : m_start(start) {}
TelemetryEventStats(SteadyTimePoint start, SteadyTimePoint end)
: m_start(start), m_end(end) {}

std::string ToString() const;
};

struct ExitDescription {
int exit_code;
std::string description;

std::string ToString() const;
};

// The base class contains the basic set of data.
// Downstream implementations can add more fields as needed.
struct TelemetryInfo {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we've discussed time points before and I've been giving it a bit more thought today. I think there are three broad time-related variations of a piece of telemetry data:

  1. Things that describe some state related to the telemetry session that are not specific to a point in time.
  2. Things that describe some state at a specific point in time, or that describe an occurrence of something at a specific point of time.
  3. Things that describe the same as 2, but over a specific time range rather than a single point in time.

For a given event type, I don't think it makes sense for it to support more than one of these variations. So I think that implies either three subclasses, or probably simpler, three constructors for TelemetryInfo, like in the following pseudo-code.

TelemetryInfo(); // Sets timepoints to null.
TelemetryInfo(time_point tp); // Sets "start" time to tp, and end time to null.
TelemetryInfo(time_point start, time_point end); // Sets the corresponding start and end times.

I feel like this is generic enough that it deserves to be in the base framework, rather than having most/all downstream vendors having to implement the same things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly prefer to not include time in the base implementation. Our downstream implementation exclusively uses counts and has a different mechanism for collecting performance telemetry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're looking at an outdated snapshot of the patch somehow.
In my tree, the TelemetryInfo class currently only has one field, which is the SessionId.

Would that resolve your concern here?

// A "session" corresponds to every time the tool starts.
// All entries emitted for the same session will have
// the same session_uuid
std::string session_uuid;

TelemetryEventStats stats;

std::optional<ExitDescription> exit_description;

// Counting number of entries.
// (For each set of entries with the same session_uuid, this value should
// be unique for each entry)
size_t counter;

TelemetryInfo() = default;
~TelemetryInfo() = default;
virtual std::string ToString() const;
};

// Where/how to send the telemetry entries.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mixing of the "Where" and the "How" is a red-flag to me, because it suggests that the class is trying to do two things at once. It's related to a comment I made previously: these two concepts are somewhat orthogonal. It might be that we want to send data to a file or a string or a pipe or ... , and we might want to send it as JSON/YAML/some kind of structured messages etc. It's certainly true as noted before that mixing some "hows" with some "wheres" doesn't make sense, but there are enough cases where it does make sense that I don't think we should risk having to have a combinatorial explosion of classes e.g. JSONFileDestination, JSONStringDestination, YAMLFileDestination, YAMLStringDestination etc. (Yes, strings and files might both be modelled as streams, but that's not really the point).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - added the comment on the usage for the class , which is that it acts as a data sink for the framework :

  • the telemeter collects data from the tool being monitored and dumps it there
  • this "sink" (Destination class) can then forward it wherever, transparently to the framework. (Also, the class gets to decide which data to send and which to simply drop on the floor)

class TelemetryDestination {
public:
virtual ~TelemetryDestination() = default;
virtual Error EmitEntry(const TelemetryInfo *entry) = 0;
virtual std::string name() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect names to always be constant strings? If so we can have this return an llvm::StringLiteral?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - seems a bit restrictive for a general API, though.

};

class Telemeter {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird name. Could you try to use English names only, please.

Copy link
Member Author

@oontvoo oontvoo Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is English, AFAIK. https://www.oxfordlearnersdictionaries.com/us/definition/english/telemeter_2

P.S: Basically means a device for recording & sending telemetry, which is the appropriate name for this class

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shows what I know 😄

Not sure it's a well-known word though, so it doesn't really help convey meaning though, which is the key thing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't resolve conversations I've been involved with (see https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178 for a lengthy discussion). I'll resolve these when I'm happy with the resolution. In particular, as noted, I don't think the name "Telemeter" is going to convey the meaning clearly enough. Furthermore, as noted earlier, having all the classes prefixed with Telemetry in their name seems unnecessary, given their presence in the telemetry namespace. Could they simply be called Destination, Config, Event etc?

Copy link
Member Author

@oontvoo oontvoo Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed the Destiantion, Config, etc to get rid of the prefix.

What are your suggestions for alternatives for Telemeter?
(prefer something that does not have Log or Logger in it - as someone pointed out it might cause confusions)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S: How about TelemetryCollector?
(It doesn't fully communicate the aspect of "also transmitting the collected data" which Telemeter does, but I suppose that's close enough)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I had a similar reaction as @jh7370 when I saw the class name. Without the Telemetry prefix, it seems reasonable to use that as the class name? That seems to convey everything the class is doing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would go with Session, personally, because the class seems to most closely resemble our same-named class in our downstream implementation. I'd love to avoid Collector, mostly because for us our "telemetry collector" is actually a separate executable that monitors the file system for telemetry data files created by the individual tools, before forwarding that data onto the servers (in a roundabout way it actually does what this class is trying to do, just at a different point in the process, but I digress...)! However, neither of these points are especially strong, given they're more to do with how our downstream implementation works.

I see this class as being the thing that a) receives the user-created configuration data, b) receives the telemetry data, c) does things related to setup and teardown of the telemetry system. It then forwards that data onto the Destinations and generally manages things to do with telemetry. I therefore could be persuaded by Session, Manager, Service or even System.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Session, assuming this class is responsible for generating/storing the unique session ID. I would imagine in LLDB a session is created during Initialization and cleaned up during Termination.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I believe this is resolved - we've agreed to name it telemetry::Manager)

public:
virtual ~Telemeter() = default;

// Invoked upon tool startup
virtual void LogStartup(llvm::StringRef tool_path, TelemetryInfo *entry) = 0;

// Invoked upon tool exit.
virtual void LogExit(llvm::StringRef tool_path, TelemetryInfo *entry) = 0;

virtual void AddDestination(TelemetryDestination *destination) = 0;
};

} // namespace telemetry
} // namespace llvm

#endif // LVM_TELEMETRY_TELEMETRY_H
1 change: 1 addition & 0 deletions llvm/lib/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ add_subdirectory(ProfileData)
add_subdirectory(Passes)
add_subdirectory(TargetParser)
add_subdirectory(TextAPI)
add_subdirectory(Telemetry)
add_subdirectory(ToolDrivers)
add_subdirectory(XRay)
if (LLVM_INCLUDE_TESTS)
Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Telemetry/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
add_llvm_component_library(LLVMTelemetry
Telemetry.cpp

ADDITIONAL_HEADER_DIRS
"${LLVM_MAIN_INCLUDE_DIR}/llvm/Telemetry"
)
32 changes: 32 additions & 0 deletions llvm/lib/Telemetry/Telemetry.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
#include "llvm/Telemetry/Telemetry.h"

namespace llvm {
namespace telemetry {

std::string TelemetryEventStats::ToString() const {
std::string result;
llvm::raw_string_ostream os(result);
os << "start_timestamp: " << m_start.time_since_epoch().count()
<< ", end_timestamp: "
<< (m_end.has_value() ? std::to_string(m_end->time_since_epoch().count())
: "<NONE>");
return result;
}

std::string ExitDescription::ToString() const {
return "exit_code: " + std::to_string(exit_code) +
", description: " + description + "\n";
}

std::string TelemetryInfo::ToString() const {
return "[TelemetryInfo]\n" + (" session_uuid:" + session_uuid + "\n") +
(" stats: " + stats.ToString() + "\n") +
(" exit_description: " +
(exit_description.has_value() ? exit_description->ToString()
: "<NONE>") +
"\n") +
(" counter: " + std::to_string(counter) + "\n");
}

} // namespace telemetry
} // namespace llvm
Loading