Age | Commit message (Collapse) | Author |
|
All tools will want those, so provide a convenient way to get them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-v16pe3sbf3wjmn152u18f649@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that we can have macros for the OPT_ entries and also for finding
those in an evlist, this way other tools will use this very easily.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-q0og1xoqqi0w38ve5u0a43k2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now other tools that want switching can use an evswitch for that, just
set it up and add it to the PERF_RECORD_SAMPLE processing function.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that we see that the simple userspace-based "slicing" of events
using delimiter events ("markers") works, lets move it to a separate
header to make it available to other tools, next step will be having
the switch on/off check done at the PERF_RECORD_SAMPLE processing
function moved too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-z0cyi9ifzlr37cedr9xztc1k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Counterpart of --switch-on:
# perf record -e sched:*,syscalls:sys_*_nanosleep sleep 1
[ perf record: Woken up 36 times to write data ]
[ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
#
# perf script
:20918 20918 [002] 109866.143696: sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
:20918 20918 [002] 109866.143702: sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
sleep 20919 [001] 109866.144081: sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
sleep 20919 [001] 109866.144408: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
sleep 20919 [001] 109867.144614: syscalls:sys_exit_nanosleep: 0x0
sleep 20919 [001] 109867.144753: sched:sched_process_exit: comm=sleep pid=20919 prio=120
#
# perf script --switch-off syscalls:sys_exit_nanosleep
:20918 20918 [002] 109866.143696: sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
:20918 20918 [002] 109866.143702: sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
sleep 20919 [001] 109866.144081: sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
sleep 20919 [001] 109866.144408: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
sleep 20919 [001] 109867.144753: sched:sched_process_exit: comm=sleep pid=20919 prio=120
#
# perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep
sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
#
# perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep --show-on-off
sleep 20919 [001] 109866.144408: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
sleep 20919 [001] 109866.144411: sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
sleep 20919 [001] 109866.144412: sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 109867.144568: sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
swapper 0 [001] 109867.144586: sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
sleep 20919 [001] 109867.144614: syscalls:sys_exit_nanosleep: 0x0
#
Now think about using this together with 'perf probe' to create custom on/off
events in your app :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-li3j01c4tmj9kw6ydsl8swej@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
One may want to see the --switch-on event as well, allow for that, using
the previous cset example:
# perf script --switch-on syscalls:sys_enter_nanosleep --show-on-off
sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120
#
# perf script --switch-on syscalls:sys_enter_nanosleep
sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120
#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>
Link: https://lkml.kernel.org/n/tip-0omwwoywj1v63gu8cz0tr0cy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sometime we want to only consider events after something happens, so
allow discarding events till such events is found, e.g.:
Record all scheduler tracepoints and the sys_enter_nanosleep syscall
event for the 'sleep 1' workload:
# perf record -e sched:*,syscalls:sys_enter_nanosleep sleep 1
[ perf record: Woken up 31 times to write data ]
[ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
#
So we have these events in the generated perf data file:
# perf evlist
sched:sched_kthread_stop
sched:sched_kthread_stop_ret
sched:sched_waking
sched:sched_wakeup
sched:sched_wakeup_new
sched:sched_switch
sched:sched_migrate_task
sched:sched_process_free
sched:sched_process_exit
sched:sched_wait_task
sched:sched_process_wait
sched:sched_process_fork
sched:sched_process_exec
sched:sched_stat_wait
sched:sched_stat_sleep
sched:sched_stat_iowait
sched:sched_stat_blocked
sched:sched_stat_runtime
sched:sched_pi_setprio
sched:sched_move_numa
sched:sched_stick_numa
sched:sched_swap_numa
sched:sched_wake_idle_without_ipi
syscalls:sys_enter_nanosleep
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
#
Then show all of the events that actually took place in this 'perf record' session:
# perf script
:13637 13637 [002] 108237.581529: sched:sched_waking: comm=perf pid=13638 prio=120 target_cpu=001
:13637 13637 [002] 108237.581537: sched:sched_wakeup: perf:13638 [120] success=1 CPU:001
sleep 13638 [001] 108237.581992: sched:sched_process_exec: filename=/usr/bin/sleep pid=13638 old_pid=13638
sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120
#
Now lets see only the ones that took place after a certain "marker":
# perf script --switch-on syscalls:sys_enter_nanosleep
sleep 13638 [001] 108237.582289: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
sleep 13638 [001] 108237.582291: sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
swapper 0 [001] 108238.582428: sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
swapper 0 [001] 108238.582458: sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
sleep 13638 [001] 108238.582698: sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
sleep 13638 [001] 108238.582782: sched:sched_process_exit: comm=sleep pid=13638 prio=120
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-f1oo0ufdhrkx6nhy2lj1ierm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the nr_members member from perf's evsel to libperf's perf_evsel.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-60-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the evlist__set_maps() function from tools/perf to libperf.
Committer notes:
Fix up reject due to earlier inversion in calling perf_evlist__init().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-57-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move 'threads' from tools/perf's evsel to libperf's perf_evsel struct.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-53-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Mov the 'cpus' field from tools/perf's evsel to libperf's perf_evsel.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-51-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'.
Committer notes:
Fixed up these:
tools/perf/arch/arm/util/auxtrace.c
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/arm-spe.c
tools/perf/arch/s390/util/auxtrace.c
tools/perf/util/cs-etm.c
Also
cc1: warnings being treated as errors
tests/sample-parsing.c: In function 'do_test':
tests/sample-parsing.c:162: error: missing initializer
tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
struct evsel evsel = {
.needs_swap = false,
- .core.attr = {
- .sample_type = sample_type,
- .read_format = read_format,
+ .core = {
+ . attr = {
+ .sample_type = sample_type,
+ .read_format = read_format,
+ },
[perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
gcc (GCC) 4.4.7
Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.
Committer notes:
Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename struct thread_map to struct perf_thread_map, so it could be part
of libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rename struct cpu_map to struct perf_cpu_map, so it could be part of
libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When we hit the end of a program block, need to count the last
instruction too for the IPC computation. This caused large errors for
small blocks.
% perf script -b ls / > /dev/null
Before:
% perf script -F +brstackinsn --xed
...
00007f94c9ac70d8 jz 0x7f94c9ac70e3 # PRED 3 cycles [36] 4.33 IPC
00007f94c9ac70e3 testb $0x20, 0x31d(%rbx)
00007f94c9ac70ea jnz 0x7f94c9ac70b0
00007f94c9ac70ec testb $0x8, 0x205ad(%rip)
00007f94c9ac70f3 jz 0x7f94c9ac6ff0 # PRED 1 cycles [37] 3.00 IPC
After:
% perf script -F +brstackinsn --xed
...
00007f94c9ac70d8 jz 0x7f94c9ac70e3 # PRED 3 cycles [15] 4.67 IPC
00007f94c9ac70e3 testb $0x20, 0x31d(%rbx)
00007f94c9ac70ea jnz 0x7f94c9ac70b0
00007f94c9ac70ec testb $0x8, 0x205ad(%rip)
00007f94c9ac70f3 jz 0x7f94c9ac6ff0 # PRED 1 cycles [16] 4.00 IPC
Suggested-by: Denis Bakhvalov <denis.bakhvalov@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190711181922.18765-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add scripting operation process_switch() to process switch events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190710085810.1650-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In pipe mode, session->header.env.arch is not populated until the events
are processed. Therefore, the following command crashes:
perf record -o - | perf script
(gdb) bt
It fails when we try to compare env.arch against uts.machine:
if (!strcmp(uts.machine, session->header.env.arch) ||
(!strcmp(uts.machine, "x86_64") &&
!strcmp(session->header.env.arch, "i386")))
native_arch = true;
In pipe mode, it is tricky to find env.arch at this stage. To keep it
simple, let's just assume native_arch is always true for pipe mode.
Reported-by: David Carrillo Cisneros <davidca@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org #v5.1+
Fixes: 3ab481a1cfe1 ("perf script: Support insn output for normal samples")
Link: http://lkml.kernel.org/r/20190621014438.810342-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Eroding a bit more the tools/perf/util/util.h hodpodge header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'perf kvm' command set up things so that we can record, report, top,
etc, but not 'script', so make 'perf script' be able to process samples
by allowing to pass guest kallsyms, vmlinux, modules, etc, and if at
least one of those is provided, set perf_guest to true so that guest
samples get properly resolved.
Testing it:
# perf kvm --guest --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules record -e cycles:Gk
^C[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 3.602 MB perf.data.guest (10492 samples) ]
#
# perf evlist -i perf.data.guest
cycles:Gk
# perf evlist -v -i perf.data.guest
cycles:Gk: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_host: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
#
# perf kvm --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules report --stdio -s sym | head -30
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 10K of event 'cycles:Gk'
# Event count (approx.): 2434201408
#
# Overhead Symbol
# ........ ..............................................
#
11.93% [g] avtab_search_node
3.95% [g] sidtab_context_to_sid
2.41% [g] n_tty_write
2.20% [g] _spin_unlock_irqrestore
1.37% [g] _aesni_dec4
1.33% [g] kmem_cache_alloc
1.07% [g] native_write_cr0
0.99% [g] kfree
0.95% [g] _spin_lock
0.91% [g] __memset
0.87% [g] schedule
0.83% [g] _spin_lock_irqsave
0.76% [g] __kmalloc
0.67% [g] avc_has_perm_noaudit
0.66% [g] kmem_cache_free
0.65% [g] glue_xts_crypt_128bit
0.59% [g] __d_lookup
0.59% [g] __audit_syscall_exit
0.56% [g] __memcpy
#
Then, when trying to use perf script to generate a python script and
then process the events after adding a python hook for non-tracepoint
events:
# perf script -i perf.data.guest -g python
generated Python script: perf-script.py
# vim perf-script.py
# tail -2 perf-script.py
def process_event(param_dict):
print(param_dict["symbol"])
#
# perf script -i perf.data.guest -s perf-script.py | head
in trace_begin
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
231
#
We'd see just the vmx_vmexit, i.e. the samples from the guest don't show
up.
After this patch:
# perf script --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules -i perf.data.guest -s perf-script.py 2> /dev/null | head -30
in trace_begin
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
save_args
do_timer
drain_array
inode_permission
avc_has_perm_noaudit
run_timer_softirq
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
kvm_guest_apic_eoi_write
run_posix_cpu_timers
_spin_lock
handle_pte_fault
rcu_irq_enter
delay_tsc
delay_tsc
native_read_tsc
apic_timer_interrupt
sys_open
internal_add_timer
list_del
rcu_exit_nohz
#
Jiri Olsa noticed we need to set 'perf_guest' to true if we want to
process guest samples and I made it be set if one of the guest files
settings get set via the command line options added in this patch, that
match those present in the 'perf kvm' command.
We probably want to have 'perf record', 'perf report' etc to notice that
there are guest samples and do the right thing, which is to look for
files with some suffix that make it be associated with the guest used to
collect the samples, i.e. if a vmlinux file is passed, we can get the
build-id from it, if not some other identifier or simply looking for
"kallsyms.guest", for instance, in the current directory.
Reported-by: Mariano Pache <npache@redhat.com>
Tested-by: Mariano Pache <npache@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Ali Raza <alirazabhutta.10@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Orran Krieger <okrieger@redhat.com>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Yunlong Song <yunlong.song@huawei.com>
Link: https://lkml.kernel.org/n/tip-d54gj64rerlxcqsrod05biwn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
No change in behaviour, just using the same kernel idiom for such
operation.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-a85lkptkt0ru40irpga8yf54@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps its better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.
This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.
Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Instruction trace decoders can optimize output based on what time
intervals will be filtered, so pass that information in
itrace_synth_ops.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190604130017.31207-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add field 'ipc' to display instructions-per-cycle.
Example:
perf record -e intel_pt/cyc/u ls
perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
ls 2670177.697113434: 7f0dfdbcd090 _start+0x0 mov %rsp, %rdi IPC: 0.00 (1/877)
ls 2670177.697113434: 7f0dfdbcd093 _start+0x3 callq 0x7f0dfdbce030
ls 2670177.697113434: 7f0dfdbce030 _dl_start+0x0 pushq %rbp
ls 2670177.697113434: 7f0dfdbce031 _dl_start+0x1 mov %rsp, %rbp
ls 2670177.697113434: 7f0dfdbce034 _dl_start+0x4 pushq %r15
ls 2670177.697113434: 7f0dfdbce036 _dl_start+0x6 pushq %r14
ls 2670177.697113434: 7f0dfdbce038 _dl_start+0x8 pushq %r13
ls 2670177.697113434: 7f0dfdbce03a _dl_start+0xa pushq %r12
ls 2670177.697113434: 7f0dfdbce03c _dl_start+0xc mov %rdi, %r12
ls 2670177.697113434: 7f0dfdbce03f _dl_start+0xf pushq %rbx
ls 2670177.697113434: 7f0dfdbce040 _dl_start+0x10 sub $0x38, %rsp
ls 2670177.697113434: 7f0dfdbce044 _dl_start+0x14 rdtsc
ls 2670177.697113434: 7f0dfdbce046 _dl_start+0x16 mov %eax, %eax
ls 2670177.697113434: 7f0dfdbce048 _dl_start+0x18 shl $0x20, %rdx
ls 2670177.697113434: 7f0dfdbce04c _dl_start+0x1c or %rax, %rdx
ls 2670177.697114471: 7f0dfdbce04f _dl_start+0x1f movq 0x27e22(%rip), %rax IPC: 0.00 (15/1685)
ls 2670177.697116177: 7f0dfdbce056 _dl_start+0x26 movq %rdx, 0x27683(%rip) IPC: 0.00 (1/881)
Note, the IPC values are low due to page faults at the beginning of
execution. The additional cycles are due to the time to enter the
kernel, not the actual kernel page fault handler.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add the --show-bpf-events command line option to show the eBPF related events:
PERF_RECORD_KSYMBOL
PERF_RECORD_BPF_EVENT
Usage:
# perf record -a
...
# perf script --show-bpf-events
...
swapper 0 [000] 0.000000: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0ef971d len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
swapper 0 [000] 0.000000: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 36
...
Committer testing:
# perf script --show-bpf-events | egrep -i 'PERF_RECORD_(BPF|KSY)'
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029a6c3 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 47
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029c1ae len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 48
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02ddd1c len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 49
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02dfc11 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 50
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc045da0a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 51
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc04ef4b4 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 52
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc09e15da len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 53
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0d2b1a3 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 54
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0fd9850 len 381 type 1 flags 0x0 name bpf_prog_819967866022f1e1_sys_enter
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 179
0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0feb1ec len 191 type 1 flags 0x0 name bpf_prog_c1bd85c092d6e4aa_sys_exit
0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 180
^C[root@quaco pt]# perf evlist
intel_pt//ku
dummy:u
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pad the DSO name in --call-trace so we don't have the indent screwed by
different DSO name lengths, as now for kernel there's also BPF code
displayed.
# perf-with-kcore record pt -e intel_pt//ku -- sleep 1
# perf-core/perf-with-kcore script pt --call-trace
Before:
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) kretprobe_perf_func
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) trace_call_bpf
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms]) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_get_current_pid_tgid
sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_ktime_get_ns
sleep 3660 [16] 57036.806464725: ([kernel.kallsyms]) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806464725: ([kernel.kallsyms]) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806465045: (bpf_prog_da4fe6b3d2c29b25_trace_return) __htab_map_lookup_elem
sleep 3660 [16] 57036.806465366: ([kernel.kallsyms]) memcmp
sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_probe_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) probe_kernel_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) __check_object_size
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) check_stack_object
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) copy_user_enhanced_fast_string
sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_probe_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) probe_kernel_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) __check_object_size
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) check_stack_object
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms]) copy_user_enhanced_fast_string
sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_get_current_uid_gid
sleep 3660 [16] 57036.806466008: ([kernel.kallsyms]) from_kgid
sleep 3660 [16] 57036.806466008: ([kernel.kallsyms]) from_kuid
sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return) bpf_perf_event_output
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) perf_event_output
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) perf_prepare_sample
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) perf_misc_flags
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms]) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806466328: ([kvm]) kvm_is_in_guest
sleep 3660 [16] 57036.806466649: ([kernel.kallsyms]) __perf_event_header__init_id.isra.0
sleep 3660 [16] 57036.806466649: ([kernel.kallsyms]) perf_output_begin
After:
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) kretprobe_perf_func
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) trace_call_bpf
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806464404: ([kernel.kallsyms] ) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_get_current_pid_tgid
sleep 3660 [16] 57036.806464725: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_ktime_get_ns
sleep 3660 [16] 57036.806464725: ([kernel.kallsyms] ) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806464725: ([kernel.kallsyms] ) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806465045: (bpf_prog_da4fe6b3d2c29b25_trace_return ) __htab_map_lookup_elem
sleep 3660 [16] 57036.806465366: ([kernel.kallsyms] ) memcmp
sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_probe_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) probe_kernel_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) __check_object_size
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) check_stack_object
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) copy_user_enhanced_fast_string
sleep 3660 [16] 57036.806465687: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_probe_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) probe_kernel_read
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) __check_object_size
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) check_stack_object
sleep 3660 [16] 57036.806465687: ([kernel.kallsyms] ) copy_user_enhanced_fast_string
sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_get_current_uid_gid
sleep 3660 [16] 57036.806466008: ([kernel.kallsyms] ) from_kgid
sleep 3660 [16] 57036.806466008: ([kernel.kallsyms] ) from_kuid
sleep 3660 [16] 57036.806466008: (bpf_prog_da4fe6b3d2c29b25_trace_return ) bpf_perf_event_output
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) perf_event_output
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) perf_prepare_sample
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) perf_misc_flags
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) __x86_indirect_thunk_rax
sleep 3660 [16] 57036.806466328: ([kernel.kallsyms] ) __x86_indirect_thunk_rax
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When comparing time stamps in 'perf script' traces it can be annoying to
work with the full perf time stamps.
Add a --reltime option that displays time stamps relative to the trace
start to make it easier to read the traces.
Note: not currently supported for --time. Report an error in this
case.
Before:
% perf script
swapper 0 [000] 245402.891216: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 245402.891223: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 245402.891227: 5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 245402.891231: 41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms])
swapper 0 [000] 245402.891235: 355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms])
swapper 0 [000] 245402.891239: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms])
After:
% perf script --reltime
swapper 0 [000] 0.000000: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 0.000006: 1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 0.000010: 5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
swapper 0 [000] 0.000014: 41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms])
swapper 0 [000] 0.000018: 355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms])
swapper 0 [000] 0.000022: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms])
Committer notes:
Do not use 'time' as the name of a variable, as this breaks the build on
older glibcs:
cc1: warnings being treated as errors
builtin-script.c: In function 'perf_sample__fprintf_start':
builtin-script.c:691: warning: declaration of 'time' shadows a global declaration
/usr/include/time.h:187: warning: shadowed declaration is here
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 20190314225002.30108-8-andi@firstfloor.org
Link: https://lkml.kernel.org/n/tip-bpahyi6pr9r399mvihu65fvc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Don't overflow array when the scripts directory is too large, or the
script file name is too long.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The --cpu option only filtered samples. Filter other perf events, such
as COMM, FORK, SWITCH by the CPU too.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190311144502.15423-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Upcoming changes add timestamp output in perf report. Add a --ns
argument similar to perf script to support nanoseconds resolution when
needed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf script -F +insn was only working for PT traces because the PT
instruction decoder was filling in the insn/insn_len sample attributes.
Support it for non PT samples too on x86 using the existing x86
instruction decoder.
This adds some extra checking to ensure that we don't try to decode
instructions when using perf.data from a different architecture.
% perf record -a sleep 1
% perf script -F ip,sym,insn --xed
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb50 intel_bts_enable_local retq
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff810f1f79 generic_exec_single xor %eax, %eax
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx
ffffffff81048610 native_apic_mem_write mov %edi, %edi
...
Committer testing:
Before:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)"
#
After:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
#
More examples:
# perf script -F ip,sym,insn --xed | grep -v native_write_msr | head
ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip)
ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13
ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx
ffffffffa4956bf3 nmi_cpu_backtrace popq %r12
ffffffffa4171d5c smp_call_function_single pause
ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d
ffffffffa4797e4d menu_select cmp $0x190, %rax
ffffffffa4171d5c smp_call_function_single pause
ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0
ffffffffa4797f7a menu_select shr $0x3, %rax
#
Which matches the annotate output modulo resolving callqs:
# perf annotate --stdio2 nmi_cpu_backtrace_handler
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period]
nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8105a7d0 <nmi_cpu_backtrace_handler>:
nmi_cpu_backtrace_handler():
nmi_trigger_cpumask_backtrace(mask, exclude_self,
nmi_raise_cpu_backtrace);
}
static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
{
24.45 → callq __fentry__
if (nmi_cpu_backtrace(regs))
mov %rsi,%rdi
75.55 → callq nmi_cpu_backtrace
return NMI_HANDLED;
movzbl %al,%eax
return NMI_DONE;
}
← retq
#
# perf annotate --stdio2 __hrtimer_next_event_base
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period]
__hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8115b910 <__hrtimer_next_event_base>:
__hrtimer_next_event_base():
static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
const struct hrtimer *exclude,
unsigned int active,
ktime_t expires_next)
{
→ callq __fentry__
<SNIP>
4a: add $0x1,%r14
77.31 mov 0x18(%rax),%rdx
shl $0x6,%r14
sub 0x38(%rbx,%r14,1),%rdx
if (expires < expires_next) {
cmp %r12,%rdx
↓ jge 68
<SNIP>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org
[ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ]
[ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Jiri points out that we don't need any time checking and time string
parsing if the --time option is not set. That makes sense.
This patch refactors the time range parsing code, move the duplicated
code from perf report and perf script to time_utils and check if --time
option is set before parsing the time string. This patch is no logic
change expected. So the usage of --time is same as before.
For example:
Select the first and second 10% time slices:
perf report --time 10%/1,10%/2
perf script --time 10%/1,10%/2
Select the slices from 0% to 10% and from 30% to 40%:
perf report --time 0%-10%,30%-40%
perf script --time 0%-10%,30%-40%
Select the time slices from timestamp 3971 to 3973
perf report --time 3971,3973
perf script --time 3971,3973
Committer testing:
Using the above examples, check before and after to see if it remains
the same:
$ perf record -F 10000 -- find . -name "*.[ch]" -exec cat {} + > /dev/null
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ]
$
$ perf report --time 10%/1,10%/2 > /tmp/report.before.1
$ perf script --time 10%/1,10%/2 > /tmp/script.before.1
$ perf report --time 0%-10%,30%-40% > /tmp/report.before.2
$ perf script --time 0%-10%,30%-40% > /tmp/script.before.2
$ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3
$ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3
For example, the 3rd test produces this slice:
$ cat /tmp/script.before.3
cat 3147 180457.375844: 2143 cycles:uppp: 7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so)
cat 3147 180457.375986: 2245 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat)
cat 3147 180457.376012: 2164 cycles:uppp: 7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so)
cat 3147 180457.376140: 2921 cycles:uppp: 558b70f3a554 [unknown] (/usr/bin/cat)
cat 3147 180457.376296: 2844 cycles:uppp: 7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so)
cat 3147 180457.376431: 2717 cycles:uppp: 558b70f3b0ca [unknown] (/usr/bin/cat)
cat 3147 180457.376667: 2630 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat)
cat 3147 180457.376795: 2442 cycles:uppp: 7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so)
cat 3147 180457.376927: 2376 cycles:uppp: ffffffff9aa00163 [unknown] ([unknown])
cat 3147 180457.376954: 2307 cycles:uppp: 7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so)
cat 3147 180457.377116: 3091 cycles:uppp: 7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so)
cat 3147 180457.377362: 2945 cycles:uppp: 558b70f3a3b0 [unknown] (/usr/bin/cat)
cat 3147 180457.377517: 2727 cycles:uppp: 558b70f3a9aa [unknown] (/usr/bin/cat)
$
Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the
above chunk translates into this 'perf report' output:
$ cat /tmp/report.before.3
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles:uppp' (time slices: 180457.375844,180457.377717)
# Event count (approx.): 33552
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
17.69% cat libc-2.28.so [.] malloc
14.53% cat cat [.] 0x000000000000586e
13.33% cat libc-2.28.so [.] _int_malloc
8.78% cat cat [.] 0x00000000000023b0
8.71% cat cat [.] 0x0000000000002554
8.13% cat cat [.] 0x00000000000029aa
8.10% cat cat [.] 0x00000000000030ca
7.28% cat libc-2.28.so [.] read
7.08% cat [unknown] [k] 0xffffffff9aa00163
6.39% cat libc-2.28.so [.] cfree@GLIBC_2.2.5
#
# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
#
$
Now lets see after applying this patch, nothing should change:
$ perf report --time 10%/1,10%/2 > /tmp/report.after.1
$ perf script --time 10%/1,10%/2 > /tmp/script.after.1
$ perf report --time 0%-10%,30%-40% > /tmp/report.after.2
$ perf script --time 0%-10%,30%-40% > /tmp/script.after.2
$ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3
$ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3
$ diff -u /tmp/report.before.1 /tmp/report.after.1
$ diff -u /tmp/script.before.1 /tmp/script.after.1
$ diff -u /tmp/report.before.2 /tmp/report.after.2
--- /tmp/report.before.2 2019-03-01 11:01:53.526094883 -0300
+++ /tmp/report.after.2 2019-03-01 11:09:18.231770467 -0300
@@ -352,5 +352,5 @@
#
-# (Tip: Generate a script for your data: perf script -g <lang>)
+# (Tip: Treat branches as callchains: perf report --branch-history)
#
$ diff -u /tmp/script.before.2 /tmp/script.after.2
$ diff -u /tmp/report.before.3 /tmp/report.after.3
--- /tmp/report.before.3 2019-03-01 11:03:08.890045588 -0300
+++ /tmp/report.after.3 2019-03-01 11:09:40.660224002 -0300
@@ -22,5 +22,5 @@
#
-# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
+# (Tip: List events using substring match: perf list <keyword>)
#
$ diff -u /tmp/script.before.3 /tmp/script.after.3
$
Cool, just the 'perf report' tips changed, QED.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1551435186-6008-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When using -F + syntax to add a field the existing defaults are
currently all marked user_set. This can cause errors when some field is
missing in the perf.data
This patch tracks the actually user set fields separately, so that we don't
error out in this case.
Before:
% perf record true
% perf script -F +metric
Samples for 'cycles:ppp' event do not have CPU attribute set. Cannot print 'cpu' field.
%
After:
5 perf record true
% perf script -F +metric
perf 28936 278636.237688: 1 cycles:ppp: ffffffff8117da99 perf_event_exec+0x59 (/lib/modules/4.20.0-odilo/build/vmlinux)
...
%
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224153722.27020-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.
This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.
Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add support to add/remove fields for specific event types in -F option.
It's now possible to use '+-' after event type, like:
# cat > test.c
#include <stdio.h>
int main(void)
{
printf("Hello world\n");
while(1) {}
}
^D
# gcc -g -o test test.c
# perf probe -x test 'test.c:5'
# perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
...
# perf script -Ftrace:+period,-cpu
test 3859 396291.117343: 10275 cpu/cpu-cycles,period=10000/: 7f..
test 3859 396291.118234: 11041 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118234: 1 probe_test:main:
test 3859 396291.118248: 8668 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118263: 10139 cpu/cpu-cycles,period=10000/: ffffff..
Committer testing:
Couldn't make the test above work, but tested it with:
# perf probe -x hello main
Added new event:
probe_hello:main (on main in /home/acme/c/hello)
You can now use it in all perf tools, such as:
perf record -e probe_hello:main -aR sleep 1
# perf record -e probe_hello:main ./hello
hello, world
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
# perf script
hello 21454 [002] 254116.874005: probe_hello:main: (401126)
#
# perf script -Ftrace:+period,-cpu
hello 21454 254116.874005: 1 probe_hello:main: (401126)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
While updating perf to work with Python3 and Python2 I noticed that the
stat-cpi script was dumping core.
$ perf stat -e cycles,instructions record -o /tmp/perf.data /bin/false
Performance counter stats for '/bin/false':
802,148 cycles
604,622 instructions 802,148 cycles
604,622 instructions
0.001445842 seconds time elapsed
$ perf script -i /tmp/perf.data -s scripts/python/stat-cpi.py
Segmentation fault (core dumped)
...
...
rblist=rblist@entry=0xb2a200 <rt_stat>,
new_entry=new_entry@entry=0x7ffcb755c310) at util/rblist.c:33
ctx=<optimized out>, type=<optimized out>, create=<optimized out>,
cpu=<optimized out>, evsel=<optimized out>) at util/stat-shadow.c:118
ctx=<optimized out>, type=<optimized out>, st=<optimized out>)
at util/stat-shadow.c:196
count=count@entry=727442, cpu=cpu@entry=0, st=0xb2a200 <rt_stat>)
at util/stat-shadow.c:239
config=config@entry=0xafeb40 <stat_config>,
counter=counter@entry=0x133c6e0) at util/stat.c:372
...
...
The issue is that since 1fcd03946b52 perf_stat__update_shadow_stats now calls
update_runtime_stat passing rt_stat rather than calling update_stats but
perf_stat__init_shadow_stats has never been called to initialize rt_stat in
the script path processing recorded stat data.
Since I can't see any reason why perf_stat__init_shadow_stats() is presently
initialized like it is in builtin-script.c::perf_sample__fprint_metric()
[4bd1bef8bba2f] I'm proposing it instead be initialized once in __cmd_script
Committer testing:
After applying the patch:
# perf script -i /tmp/perf.data -s tools/perf/scripts/python/stat-cpi.py
0.001970: cpu -1, thread -1 -> cpi 1.709079 (1075684/629394)
#
No segfault.
Signed-off-by: Tony Jones <tonyj@suse.de>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Fixes: 1fcd03946b52 ("perf stat: Update per-thread shadow stats")
Link: http://lkml.kernel.org/r/20190120191414.12925-1-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf script' crashes currently when printing mixed trace points and
other events because the trace format does not handle events without
trace meta data. Add a simple check to avoid that.
% cat > test.c
main()
{
printf("Hello world\n");
}
^D
% gcc -g -o test test.c
% sudo perf probe -x test 'test.c:3'
% perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
% perf script
<segfault>
Committer testing:
Before:
# perf probe -x /lib64/libc-2.28.so malloc
Added new event:
probe_libc:malloc (on malloc in /usr/lib64/libc-2.28.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc -aR sleep 1
# perf probe -l
probe_libc:malloc (on __libc_malloc@malloc/malloc.c in /usr/lib64/libc-2.28.so)
# perf record -e '{cpu/cpu-cycles,period=10000/,probe_libc:*}:S' sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.023 MB perf.data (40 samples) ]
# perf script
Segmentation fault (core dumped)
^C
#
After:
# perf script | head -6
sleep 2888 94796.944981: 16198 cpu/cpu-cycles,period=10000/: ffffffff925dc04f get_random_u32+0x1f (/lib/modules/5.0.0-rc2+/build/vmlinux)
sleep 2888 [-01] 94796.944981: probe_libc:malloc:
sleep 2888 94796.944983: 4713 cpu/cpu-cycles,period=10000/: ffffffff922763af change_protection+0xcf (/lib/modules/5.0.0-rc2+/build/vmlinux)
sleep 2888 [-01] 94796.944983: probe_libc:malloc:
sleep 2888 94796.944986: 9934 cpu/cpu-cycles,period=10000/: ffffffff922777e0 move_page_tables+0x0 (/lib/modules/5.0.0-rc2+/build/vmlinux)
sleep 2888 [-01] 94796.944986: probe_libc:malloc:
#
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190117194834.21940-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf creates a single 'struct thread' to represent the idle task. That
is because threads are identified by PID and TID, and the idle task
always has PID == TID == 0.
However, there are actually separate idle tasks for each CPU. That
creates a problem for thread stack processing which assumes that each
thread has a single stack, not one stack per CPU.
Fix that by passing through the CPU number, and in the case of the idle
"thread", pick the thread stack from an array based on the CPU number.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a fix for another instance of the skid problem Milian recently
found [1]
The LBRs don't freeze at the exact same time as the PMI is triggered.
The perf script brstackinsn code that dumps LBR assembler assumes that
the last branch in the LBR leads to the sample point. But with skid
it's possible that the CPU executes one or more branches before the
sample, but which do not appear in the LBR.
What happens then is either that the sample point is before the last LBR
branch. In this case the dumper sees a negative length and ignores it.
Or it the sample point is long after the last branch. Then the dumper
sees a very long block and dumps it upto its block limit (16k bytes),
which is noise in the output.
On typical sample session this can happen regularly.
This patch tries to detect and handle the situation. On the last block
that is dumped by the LBR dumper we always stop on the first branch. If
the block length is negative just scan forward to the first branch.
Otherwise scan until a branch is found.
The PT decoder already has a function that uses the instruction decoder
to detect branches, so we can just reuse it here.
Then when a terminating branch is found print an indication and stop
dumping. This might miss a few instructions, but at least shows no
runaway blocks.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Link: http://lkml.kernel.org/r/20181120050617.4119-1-andi@firstfloor.org
[ Resolved conflict with dd2e18e9ac20 ("perf tools: Support 'srccode' output") ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When looking at PT or brstackinsn traces with 'perf script' it can be
very useful to see the source code. This adds a simple facility to print
them with 'perf script', if the information is available through dwarf
% perf record ...
% perf script -F insn,ip,sym,srccode
...
4004c6 main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004c6 main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004b3 main
6 v++;
% perf record -b ...
% perf script -F insn,ip,sym,srccode,brstackinsn
...
main+22:
0000000000400543 insn: e8 ca ff ff ff # PRED
|18 f1();
f1:
0000000000400512 insn: 55
|10 {
0000000000400513 insn: 48 89 e5
0000000000400516 insn: b8 00 00 00 00
|11 f2();
000000000040051b insn: e8 d6 ff ff ff # PRED
f2:
00000000004004f6 insn: 55
|5 {
00000000004004f7 insn: 48 89 e5
00000000004004fa insn: 8b 05 2c 0b 20 00
|6 c = a / b;
0000000000400500 insn: 8b 0d 2a 0b 20 00
0000000000400506 insn: 99
0000000000400507 insn: f7 f9
0000000000400509 insn: 89 05 29 0b 20 00
000000000040050f insn: 90
|7 }
0000000000400510 insn: 5d
0000000000400511 insn: c3 # PRED
f1+14:
0000000000400520 insn: b8 00 00 00 00
|12 f2();
0000000000400525 insn: e8 cc ff ff ff # PRED
f2:
00000000004004f6 insn: 55
|5 {
00000000004004f7 insn: 48 89 e5
00000000004004fa insn: 8b 05 2c 0b 20 00
|6 c = a / b;
Not supported for callchains currently, would need some layout changes
there.
Committer notes:
Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
warning:
In file included from util/srccode.c:19:0:
/usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
#warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Branch stacks do not necessarily have the same cpumode as the 'ip'. Use
the fallback functions in those cases.
This patch depends on patch "perf tools: Add fallback functions for cases
where cpumode is insufficient".
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The iregs output was missing the newline at end as well as the leading
ABI output. This made it hard to compare the iregs and uregs values.
Instead, use a single function to output the register values and use it
for both, iregs and uregs, to ensure the output is consistent.
Before:
perf 7049 [-01] 1343.354347: 1 cycles:ppp:
ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
AX:0x80000000 BX:0x0 CX:0x0 DX:0x7 SI:0xf DI:0x286 BP:0xffff95bc8213a460 SP:0xffffacbf0ba97d18 IP:0xffffffffa7bc21cd FLAGS:0x28e CS:0x10 SS:0x18 R8:0x2 R9:0x21440 R10:0x33816fb3b8c R11:0x1 R12:0xffff95bc8213a460 R13:0xffff95bc8213a400 R14:0xffff95bc8213a400 R15:0x1 ABI:2 AX:0xffffffffffffffda BX:0xffffffffffffffff CX:0x7f84ad85798b DX:0x560209699d50 SI:0x7ffe2c7a6820 DI:0x7ffe2c7a8c9b BP:0x7ffe2c7a20d0 SP:0x7ffe2c7a2058 IP:0x7f84ad85798b FLAGS:0x206 CS:0x33 SS:0x2b R8:0x7ffe2c7a2030 R9:0x7f84ae55f010 R10:0x8 R11:0x206 R12:0xffffffffffffffff R13:0xffffffffffffffff R14:0xffffffffffffffff R15:0xffffffffffffffff
perf 7049 [-01] 1343.354363: 1 cycles:ppp:
...
After:
perf 7049 [-01] 1343.354347: 1 cycles:ppp:
ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
ABI:2 AX:0x80000000 BX:0x0 CX:0x0 DX:0x7 SI:0xf DI:0x286 BP:0xffff95bc8213a460 SP:0xffffacbf0ba97d18 IP:0xffffffffa7bc21cd FLAGS:0x28e CS:0x10 SS:0x18 R8:0x2 R9:0x21440 R10:0x33816fb3b8c R11:0x1 R12:0xffff95bc8213a460 R13:0xffff95bc8213a400 R14:0xffff95bc8213a400 R15:0x1
ABI:2 AX:0xffffffffffffffda BX:0xffffffffffffffff CX:0x7f84ad85798b DX:0x560209699d50 SI:0x7ffe2c7a6820 DI:0x7ffe2c7a8c9b BP:0x7ffe2c7a20d0 SP:0x7ffe2c7a2058 IP:0x7f84ad85798b FLAGS:0x206 CS:0x33 SS:0x2b R8:0x7ffe2c7a2030 R9:0x7f84ae55f010 R10:0x8 R11:0x206 R12:0xffffffffffffffff R13:0xffffffffffffffff R14:0xffffffffffffffff R15:0xffffffffffffffff
perf 7049 [-01] 1343.354363: 1 cycles:ppp:
...
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181107223437.9071-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This change makes it much easier to easily distinguish between
consecutive samples by keeping the empty line between them, like we see
when we do not enable uregs output.
Before:
cpp-inlining 28298 [-01] 54837.342780: 3068085 cycles:pp:
7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
...
ABI:2 AX:0x0 BX:0x40f56cf6 CX:0x294a3ae7 ...
cpp-inlining 28298 [-01] 54837.344493: 2881929 cycles:pp:
7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
...
ABI:2 AX:0x40d440c7 BX:0x40d440c7 CX:0x4d45e5da ...
After:
cpp-inlining 28298 [-01] 54837.342780: 3068085 cycles:pp:
7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
...
ABI:2 AX:0x0 BX:0x40f56cf6 CX:0x294a3ae7 ...
cpp-inlining 28298 [-01] 54837.344493: 2881929 cycles:pp:
7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
...
ABI:2 AX:0x40d440c7 BX:0x40d440c7 CX:0x4d45e5da ...
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181107093705.16346-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For 'perf script' brstackinsn also print a running cycles count. This
makes it easier to calculate cycle deltas for code sections measured
with LBRs.
% perf record -b -a sleep 1
% perf script -F +brstackinsn
...
00007f73ecc41083 insn: 74 06 # PRED 9 cycles [17] 1.11 IPC
00007f73ecc4108b insn: a8 10
00007f73ecc4108d insn: 74 71 # PRED 1 cycles [18] 1.00 IPC
00007f73ecc41100 insn: 48 8b 46 10
00007f73ecc41104 insn: 4c 8b 38
00007f73ecc41107 insn: 4d 85 ff
00007f73ecc4110a insn: 0f 84 b0 00 00 00
00007f73ecc41110 insn: 83 43 58 01
00007f73ecc41114 insn: 48 89 df
00007f73ecc41117 insn: e8 94 73 04 00 # PRED 6 cycles [24] 1.00 IPC
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Link: http://lkml.kernel.org/r/20180924170732.GA28040@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a ftrace style --graph-function argument to 'perf script' that
allows to print itrace function calls only below a given function. This
makes it easier to find the code of interest in a large trace.
% perf record -e intel_pt//k -a sleep 1
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) group_sched_in
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) event_sched_in.isra.107
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_set_state.part.71
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_update_time
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_pmu_disable
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_log_itrace_start
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_update_userpage
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) calc_timer_values
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) sched_clock_cpu
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) arch_perf_update_userpage
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __fentry__
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) using_native_sched_clock
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) sched_clock_stable
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180920180540.14039-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add short cut options to print PT call trace and call-ret-trace, for
calls and call and returns. Roughly corresponds to ftrace function
tracer and function graph tracer.
Just makes these common use cases nicer to use.
% perf record -a -e intel_pt// sleep 1
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_txn
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_event_set_state.part.71
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180920180540.14039-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
By default 'perf script' for itrace outputs sampled instructions or
branches. In my experience this is confusing to users because it's hard
to correlate with real program behavior. The sampling makes sense for
tools like 'perf report' that actually sample to reduce the run time,
but run time is normally not a problem for 'perf script'. It's better
to give an accurate representation of the program flow.
Default 'perf script' to output all calls for itrace. That's a much saner
default. The old behavior can be still requested with 'perf script'
--itrace=ibxwpe100000
v2: Fix ETM build failure
v3: Really fix ETM build failure (Kim Phillips)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Link: http://lkml.kernel.org/r/20180920180540.14039-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a --insn-trace short hand option for decoding and disassembling
instruction streams for intel_pt. This automatically pipes the output
into the xed disassembler to generate disassembled instructions. This
just makes this use model much nicer to use.
Before
% perf record -e intel_pt// ...
% perf script --itrace=i0ns --ns -F +insn,-event,-period | xed -F insn: -A -64
swapper 0 [000] 17276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 17276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 17276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 17276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 17276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 17276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 17276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 17276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 17276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
swapper 0 [000] 17276.429606186: ffffffff8101063e pt_event_add ([kernel.kallsyms]) cmpl $0x1, 0x1b0(%rbx)
swapper 0 [000] 17276.429606186: ffffffff81010645 pt_event_add ([kernel.kallsyms]) mov $0xffffffea, %eax
swapper 0 [000] 17276.429606186: ffffffff8101064a pt_event_add ([kernel.kallsyms]) mov $0x0, %edx
swapper 0 [000] 17276.429606186: ffffffff8101064f pt_event_add ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 17276.429606186: ffffffff81010650 pt_event_add ([kernel.kallsyms]) cmovnz %edx, %eax
swapper 0 [000] 17276.429606186: ffffffff81010653 pt_event_add ([kernel.kallsyms]) jmp 0xffffffff81010635
swapper 0 [000] 17276.429606186: ffffffff81010635 pt_event_add ([kernel.kallsyms]) retq
swapper 0 [000] 17276.429606186: ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms]) test %eax, %eax
Now:
% perf record -e intel_pt// ...
% perf script --insn-trace --xed
... same output ...
XED needs to be installed with:
$ git clone https://github.com/intelxed/mbuild.git mbuild
$ git clone https://github.com/intelxed/xed
$ cd xed
$ ./mfile.py
$ ./mfile.py examples
$ sudo ./mfile.py --prefix=/usr/local install
$ sudo cp obj/examples/xed /usr/local/bin
$ xed | head -3
ERROR: required argument(s) were missing
Copyright (C) 2017, Intel Corporation. All rights reserved.
XED version: [v10.0-328-g7d62c8c49b7b]
$
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20180920180540.14039-2-andi@firstfloor.org
[ Fixed up whitespace damage, added the 'mfile.py examples + cp obj/examples/xed ... ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|