|
5 | 5 | - [Architecture](#architecture)
|
6 | 6 | - [The Dispatcher](#the-dispatcher)
|
7 | 7 | - [The Subscriber](#the-subscriber)
|
| 8 | + - [Using the Reference Dispatcher and Subscriber](#using-the-reference-dispatcher-and-subscriber) |
8 | 9 | - [Tracing Framework and Callback APIs](#tracing-framework-and-callback-apis)
|
9 | 10 | - [Brief API Concepts](#brief-api-concepts)
|
10 | 11 | - [`xptiInitialize`](#xptiinitialize)
|
@@ -56,8 +57,10 @@ performance of the framework.
|
56 | 57 | To enable the build to use TBB for the framework and tests, use the commands as
|
57 | 58 | shown below:
|
58 | 59 |
|
59 |
| - cd xptifw |
60 |
| - cmake -DXPTI_ENABLE_TBB=ON -DXPTI_SOURCE_DIR=$SYCL_HOME/xpti ./ |
| 60 | + ```bash |
| 61 | + % cd xptifw |
| 62 | + % cmake -DXPTI_ENABLE_TBB=ON -DXPTI_SOURCE_DIR=$SYCL_HOME/xpti ./ |
| 63 | + ``` |
61 | 64 |
|
62 | 65 | > **NOTE:** This document is best viewed with [Markdown Reader](https://chrome.google.com/webstore/detail/markdown-reader/gpoigdifkoadgajcincpilkjmejcaanc)
|
63 | 66 | > plugin for Chrome or the [Markdown Preview Extension]() for Visual Studio Code.
|
@@ -232,6 +235,62 @@ combined value of the `unique_id` and `instance_id` should always be unique.
|
232 | 235 | > **NOTE:** The specification for a given event stream **must** be consulted
|
233 | 236 | > before implementing the callback handlers for various trace types.
|
234 | 237 |
|
| 238 | +### Using the Reference Dispatcher and Subscriber |
| 239 | + |
| 240 | +The XPTI framework package provides a reference implementation of the XPTI |
| 241 | +dispatcher and a sample subscriber that can be used to see what is being emitted |
| 242 | +by any stream generated using XPTI. If you wish to skip the rest of the |
| 243 | +document and inspect the generated stream, you can follow the steps outlined |
| 244 | +below. |
| 245 | + |
| 246 | +1. **Build the XPTI framework dispatcher:** The instructions below show how to |
| 247 | + build the library with standard containers. If you have access to TBB, you |
| 248 | + can enable it by adding `-DXPTI_ENABLE_TBB=ON` to the cmake command. |
| 249 | + |
| 250 | + ```bash |
| 251 | + % cd xptifw |
| 252 | + % cmake -DXPTI_SOURCE_DIR=$SYCL_HOME/xpti ./ |
| 253 | + % make |
| 254 | + ``` |
| 255 | + |
| 256 | + The binaries will be built and installed in `lib/Release`. These include the |
| 257 | + dispatcher, a sample subscriber that prints the contents of the stream, the |
| 258 | + unit tests, and a performance characterization application for the framework. |
| 259 | + |
| 260 | +2. **Run an instrumented SYCL application:** |
| 261 | + To enable the dispatcher and subscriber, set the following environment |
| 262 | + variables. The example below shows the commands for a Linux |
| 263 | + environment: |
| 264 | + |
| 265 | + ```bash |
| 266 | + % export XPTI_TRACE_ENABLE=1 |
| 267 | + % export XPTI_FRAMEWORK_DISPATCHER=/path/to/libxptifw.so |
| 268 | + % export XPTI_SUBSCRIBERS=/path/to/libbasic_collector.so |
| 269 | + ``` |
| 270 | + |
| 271 | + You can now run a SYCL application that has been linked with a runtime that |
| 272 | + supports the XPTI instrumentation and inspect the resulting stream. |
| 273 | + |
| 274 | +3. **Run the unit tests:** The included unit tests cover the exported API |
| 275 | + and include some correctness tests. |
| 276 | + |
| 277 | + ```bash |
| 278 | + % <xptifw-dir>/lib/Release/xpti_tests |
| 279 | + ``` |
| 280 | +4. **Understand the throughput of the framework:** This document discusses |
| 281 | + the performance of the framework in detail in the sections [Performance of the Framework](#performance-of-the-framework) and [Modeling and projection](#modeling-and-projection). For details on the command line arguments, |
| 282 | + please refer to these sections. |
| 283 | + |
| 284 | + ```bash |
| 285 | + % <xptifw-dir>/lib/Release/run_test --trace-points 1000 --type performance --overhead 1.5 --num-threads 0,1,2,3 --test-id 1,2 --tp-frequency 50 |
| 286 | + ``` |
| 287 | + |
| 288 | + The above command runs the performance tests, in which 1000 trace points |
| 289 | + are created and each trace point is visited twice. The trace point creation |
| 290 | + and notification costs are measured in single-threaded and multi-threaded |
| 291 | + scenarios, and the output shows the projected throughput of the framework |
| 292 | + in events/sec at 1.5% overhead to the application runtime. |
| 293 | + |
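The environment setup in step 2 can be wrapped in a small helper that refuses to enable tracing when either library path is wrong. This is only a sketch: the three variable names come from the steps above, while the function name `xpti_enable` and its argument order are hypothetical and not part of the XPTI package.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of XPTI): export the tracing variables
# only after checking that both shared libraries exist on disk.
#   $1 = path to the dispatcher library (e.g. libxptifw.so)
#   $2 = path to the subscriber library (e.g. libbasic_collector.so)
xpti_enable() {
  local dispatcher="$1" subscriber="$2"

  if [ ! -f "$dispatcher" ]; then
    echo "dispatcher not found: $dispatcher" >&2
    return 1
  fi
  if [ ! -f "$subscriber" ]; then
    echo "subscriber not found: $subscriber" >&2
    return 1
  fi

  # Variable names as documented in step 2 above.
  export XPTI_TRACE_ENABLE=1
  export XPTI_FRAMEWORK_DISPATCHER="$dispatcher"
  export XPTI_SUBSCRIBERS="$subscriber"
}
```

After sourcing the snippet, a call such as `xpti_enable /path/to/libxptifw.so /path/to/libbasic_collector.so` followed by the instrumented application would run it with tracing enabled. If either file is missing, the function returns a non-zero status and leaves the environment untouched.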
235 | 294 | ## Tracing Framework and Callback APIs
|
236 | 295 |
|
237 | 296 | The current version of the instrumentation API adopts a model where traces are
|