Skip to content

Commit e7c15cd

Browse files
committed
tracing: Added hardware latency tracer
The hardware latency tracer has been in the PREEMPT_RT patch for some time. It is used to detect possible SMIs or any other hardware interruptions that the kernel is unaware of. Note, NMIs may also be detected, but that may be good to note as well. The logic is pretty simple. It simply creates a thread that spins on a single CPU for a specified amount of time (width) within a periodic window (window). These numbers may be adjusted by their cooresponding names in /sys/kernel/tracing/hwlat_detector/ The defaults are window = 1000000 us (1 second) width = 500000 us (1/2 second) The loop consists of: t1 = trace_clock_local(); t2 = trace_clock_local(); Where trace_clock_local() is a variant of sched_clock(). The difference of t2 - t1 is recorded as the "inner" timestamp and also the timestamp t1 - prev_t2 is recorded as the "outer" timestamp. If either of these differences are greater than the time denoted in /sys/kernel/tracing/tracing_thresh then it records the event. When this tracer is started, and tracing_thresh is zero, it changes to the default threshold of 10 us. The hwlat tracer in the PREEMPT_RT patch was originally written by Jon Masters. I have modified it quite a bit and turned it into a tracer. Based-on-code-by: Jon Masters <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
1 parent 8861dd3 commit e7c15cd

File tree

7 files changed

+642
-1
lines changed

7 files changed

+642
-1
lines changed

kernel/trace/Kconfig

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -221,6 +221,41 @@ config SCHED_TRACER
221221
This tracer tracks the latency of the highest priority task
222222
to be scheduled in, starting from the point it has woken up.
223223

224+
config HWLAT_TRACER
225+
bool "Tracer to detect hardware latencies (like SMIs)"
226+
select GENERIC_TRACER
227+
help
228+
This tracer, when enabled will create one or more kernel threads,
229+
depening on what the cpumask file is set to, which each thread
230+
spinning in a loop looking for interruptions caused by
231+
something other than the kernel. For example, if a
232+
System Management Interrupt (SMI) takes a noticeable amount of
233+
time, this tracer will detect it. This is useful for testing
234+
if a system is reliable for Real Time tasks.
235+
236+
Some files are created in the tracing directory when this
237+
is enabled:
238+
239+
hwlat_detector/width - time in usecs for how long to spin for
240+
hwlat_detector/window - time in usecs between the start of each
241+
iteration
242+
243+
A kernel thread is created that will spin with interrupts disabled
244+
for "width" microseconds in every "widow" cycle. It will not spin
245+
for "window - width" microseconds, where the system can
246+
continue to operate.
247+
248+
The output will appear in the trace and trace_pipe files.
249+
250+
When the tracer is not running, it has no affect on the system,
251+
but when it is running, it can cause the system to be
252+
periodically non responsive. Do not run this tracer on a
253+
production system.
254+
255+
To enable this tracer, echo in "hwlat" into the current_tracer
256+
file. Every time a latency is greater than tracing_thresh, it will
257+
be recorded into the ring buffer.
258+
224259
config ENABLE_DEFAULT_TRACERS
225260
bool "Trace process context switches and events"
226261
depends on !GENERIC_TRACER

kernel/trace/Makefile

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,7 @@ obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
4141
obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
4242
obj-$(CONFIG_PREEMPT_TRACER) += trace_irqsoff.o
4343
obj-$(CONFIG_SCHED_TRACER) += trace_sched_wakeup.o
44+
obj-$(CONFIG_HWLAT_TRACER) += trace_hwlat.o
4445
obj-$(CONFIG_NOP_TRACER) += trace_nop.o
4546
obj-$(CONFIG_STACK_TRACER) += trace_stack.o
4647
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o

kernel/trace/trace.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1047,7 +1047,7 @@ void disable_trace_on_warning(void)
10471047
*
10481048
* Shows real state of the ring buffer if it is enabled or not.
10491049
*/
1050-
static int tracer_tracing_is_on(struct trace_array *tr)
1050+
int tracer_tracing_is_on(struct trace_array *tr)
10511051
{
10521052
if (tr->trace_buffer.buffer)
10531053
return ring_buffer_record_is_on(tr->trace_buffer.buffer);

kernel/trace/trace.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,7 @@ enum trace_type {
3838
TRACE_USER_STACK,
3939
TRACE_BLK,
4040
TRACE_BPUTS,
41+
TRACE_HWLAT,
4142

4243
__TRACE_LAST_TYPE,
4344
};
@@ -326,6 +327,7 @@ extern void __ftrace_bad_type(void);
326327
IF_ASSIGN(var, ent, struct print_entry, TRACE_PRINT); \
327328
IF_ASSIGN(var, ent, struct bprint_entry, TRACE_BPRINT); \
328329
IF_ASSIGN(var, ent, struct bputs_entry, TRACE_BPUTS); \
330+
IF_ASSIGN(var, ent, struct hwlat_entry, TRACE_HWLAT); \
329331
IF_ASSIGN(var, ent, struct trace_mmiotrace_rw, \
330332
TRACE_MMIO_RW); \
331333
IF_ASSIGN(var, ent, struct trace_mmiotrace_map, \
@@ -571,6 +573,7 @@ void tracing_reset_current(int cpu);
571573
void tracing_reset_all_online_cpus(void);
572574
int tracing_open_generic(struct inode *inode, struct file *filp);
573575
bool tracing_is_disabled(void);
576+
int tracer_tracing_is_on(struct trace_array *tr);
574577
struct dentry *trace_create_file(const char *name,
575578
umode_t mode,
576579
struct dentry *parent,

kernel/trace/trace_entries.h

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -322,3 +322,26 @@ FTRACE_ENTRY(branch, trace_branch,
322322
FILTER_OTHER
323323
);
324324

325+
326+
FTRACE_ENTRY(hwlat, hwlat_entry,
327+
328+
TRACE_HWLAT,
329+
330+
F_STRUCT(
331+
__field( u64, duration )
332+
__field( u64, outer_duration )
333+
__field_struct( struct timespec, timestamp )
334+
__field_desc( long, timestamp, tv_sec )
335+
__field_desc( long, timestamp, tv_nsec )
336+
__field( unsigned int, seqnum )
337+
),
338+
339+
F_printk("cnt:%u\tts:%010lu.%010lu\tinner:%llu\touter:%llu\n",
340+
__entry->seqnum,
341+
__entry->tv_sec,
342+
__entry->tv_nsec,
343+
__entry->duration,
344+
__entry->outer_duration),
345+
346+
FILTER_OTHER
347+
);

0 commit comments

Comments
 (0)