author Steven Rostedt (VMware) <rostedt@goodmis.org> 2020-06-30 13:05:29 -0400
committer Steven Rostedt (VMware) <rostedt@goodmis.org> 2020-07-01 22:12:07 -0400
commit bbeba3e58f040a4297a5ba88ebf6e2b16adc3657
tree 9561961bfd549a73c8324672c502b4e0b7847294 /tools/perf/scripts/python/check-perf-trace.py
parent 74e879373b377f15d4ecb45bf8316b77e8badc49
ring-buffer: Call trace_clock_local() directly for RETPOLINE kernels
After doing some benchmarks and examining the code, I found that the ring buffer clock calls were quite expensive, and noticed that they use retpolines. This is because the ring buffer clock is programmable: it is called through a function pointer that can be set. But in most cases it simply uses the fastest ns unit clock, which is trace_clock_local().

For RETPOLINE builds, checking if the ring buffer clock is set to trace_clock_local() and then calling it directly has brought the time of an event on my i7 box from an average of 93 nanoseconds down to 83 nanoseconds, and the minimum time from 81 nanoseconds to 68 nanoseconds!

Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
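The idea described in the message can be illustrated with a small userspace sketch. This is not the kernel patch itself: `demo_buffer`, `demo_clock_local()`, `demo_clock_other()`, `demo_time_stamp()` and the `DEMO_RETPOLINE` switch are hypothetical names invented for this example, and the "clocks" are plain counters so the behavior is deterministic. It only shows the pattern of comparing a settable function pointer against the common default and calling that default directly, so the hot path avoids an indirect call.

```c
#include <stdint.h>

/* Hypothetical stand-in for a build-time option such as RETPOLINE. */
#ifndef DEMO_RETPOLINE
#define DEMO_RETPOLINE 1
#endif

typedef uint64_t (*demo_clock_fn)(void);

/* Hypothetical buffer with a programmable clock, as the message describes. */
struct demo_buffer {
    demo_clock_fn clock;
};

static uint64_t demo_local_ticks;
static uint64_t demo_other_ticks;

/* Stand-in for the common, fastest clock (a counter, not a real clock). */
static uint64_t demo_clock_local(void)
{
    return ++demo_local_ticks;
}

/* Stand-in for any other clock a user might install. */
static uint64_t demo_clock_other(void)
{
    return 1000 + ++demo_other_ticks;
}

static uint64_t demo_time_stamp(const struct demo_buffer *buf)
{
    /*
     * Common case: the pointer equals the default clock, so call it by
     * name. A direct call needs no indirect branch, which is what a
     * retpoline would otherwise guard.
     */
    if (DEMO_RETPOLINE && buf->clock == demo_clock_local)
        return demo_clock_local();

    /* Fallback: any other clock still goes through the pointer. */
    return buf->clock();
}
```

Both paths return the same value the installed clock would have returned; only the call mechanism differs. The pointer comparison is cheap relative to a retpoline-protected indirect call, which is the trade-off the benchmark numbers above reflect.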
Diffstat (limited to 'tools/perf/scripts/python/check-perf-trace.py')
0 files changed, 0 insertions, 0 deletions