diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2015-08-31 19:49:05 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2015-08-31 19:49:05 -0700 |
commit | 41d859a83c567a9c9f50a34082cc64aab0abb0cd (patch) | |
tree | ab911ea521701401413d041e1b92225f3dbdab41 /tools/perf/builtin-trace.c | |
parent | 4658000955d1864b54890214434e171949c7f1c5 (diff) | |
parent | bac2e4a96d1c0bcce5e9654dcc902f75576b9b03 (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Main perf kernel side changes:
- uprobes updates/fixes. (Oleg Nesterov)
- Add PERF_RECORD_SWITCH to indicate context switches and use it in
tooling. (Adrian Hunter)
- Support BPF programs attached to uprobes and first steps for BPF
tooling support. (Wang Nan)
- x86 generic x86 MSR-to-perf PMU driver. (Andy Lutomirski)
- x86 Intel PT, LBR and BTS updates. (Alexander Shishkin)
- x86 Intel Skylake support. (Andi Kleen)
- x86 Intel Knights Landing (KNL) RAPL support. (Dasaratharaman
Chandramouli)
- x86 Intel Broadwell-DE uncore support. (Kan Liang)
- x86 hw breakpoints robustization (Andy Lutomirski)
Main perf tooling side changes:
- Support Intel PT in several tools, enabling the use of the
processor trace feature introduced in Intel Broadwell processors:
(Adrian Hunter)
# dmesg | grep Performance
# [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
# perf record -e intel_pt//u -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.216 MB perf.data ]
# perf script # then navigate in the tool output to some area, like this one:
184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)
- Add support for using several Intel PT features (CYC, MTC packets),
the relevant documentation was updated in:
tools/perf/Documentation/intel-pt.txt
briefly describing those packets, its purposes, how to configure
them in the event config terms and relevant external documentation
for further reading. (Adrian Hunter)
- Introduce support for probing at an absolute address, for user and
kernel 'perf probe's, useful when one have the symbol maps on a
developer machine but not on an embedded system. (Wang Nan)
- Add Intel BTS support, with a call-graph script to show it and PT
in use in a GUI using 'perf script' python scripting with
postgresql and Qt. (Adrian Hunter)
- Allow selecting the type of callchains per event, including
disabling callchains in all but one entry in an event list, to save
space, and also to ask for the callchains collected in one event to
be used in other events. (Kan Liang)
- Beautify more syscall arguments in 'perf trace': (Arnaldo Carvalho
de Melo)
* A bunch more translate file/pathnames from pointers to strings.
* Convert numbers to strings for the 'keyctl' syscall 'option'
arg.
* Add missing 'clockid' entries.
- Introduce 'srcfile' sort key: (Andi Kleen)
# perf record -F 10000 usleep 1
# perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
<SNIP>
# Overhead Source File
26.49% copy_page_64.S
5.49% signal.c
0.51% msr.h
#
It can be combined with other fields, for instance, experiment with
'-s srcfile,symbol'.
There are some oddities in some distros and with some specific
DSOs, being investigated, so your mileage may vary.
- Support per-event 'freq' term: (Namhyung Kim)
$ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
$ perf evlist -F
cpu/instructions,freq=1234/: sample_freq=1234
cycles: sample_period=1000
$
- Deref sys_enter pointer args with contents from probe:vfs_getname,
showing pathnames instead of pointers in many syscalls in 'perf
trace'. (Arnaldo Carvalho de Melo)
- Stop collecting /proc/kallsyms in perf.data files, saving about
4.5MB on a typical x86-64 system, use the the symbol resolution
routines used in all the other tools (report, top, etc) now that we
can ask libtraceevent to use perf's symbol resolution code.
(Arnaldo Carvalho de Melo)
- Allow filtering out of perf's PID via 'perf record --exclude-perf'.
(Wang Nan)
- 'perf trace' now supports syscall groups, like strace, i.e:
$ trace -e file touch file
Will expand 'file' into multiple, file related, syscalls. More
work needed to add extra groups for other syscall groups, and also
to complement what was added for the 'file' group, included as a
proof of concept. (Arnaldo Carvalho de Melo)
- Add lock_pi stresser to 'perf bench futex', to test the kernel code
related to FUTEX_(UN)LOCK_PI. (Davidlohr Bueso)
- Let user have timestamps with per-thread recording in 'perf record'
(Adrian Hunter)
- ... and tons of other changes, see the shortlog and the Git log for
details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (240 commits)
perf evlist: Add backpointer for perf_env to evlist
perf tools: Rename perf_session_env to perf_env
perf tools: Do not change lib/api/fs/debugfs directly
perf tools: Add tracing_path and remove unneeded functions
perf buildid: Introduce sysfs/filename__sprintf_build_id
perf evsel: Add a backpointer to the evlist a evsel is in
perf trace: Add header with copyright and background info
perf scripts python: Add new compaction-times script
perf stat: Get correct cpu id for print_aggr
tools lib traceeveent: Allow for negative numbers in print format
perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
tracing/uprobes: Do not print '0x (null)' when offset is 0
perf probe: Support probing at absolute address
perf probe: Fix error reported when offset without function
perf probe: Fix list result when address is zero
perf probe: Fix list result when symbol can't be found
tools build: Allow duplicate objects in the object list
perf tools: Remove export.h from MANIFEST
perf probe: Prevent segfault when reading probe point with absolute address
perf tools: Update Intel PT documentation
...
Diffstat (limited to 'tools/perf/builtin-trace.c')
-rw-r--r-- | tools/perf/builtin-trace.c | 469 |
1 files changed, 362 insertions, 107 deletions
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 39ad4d0ca884..4e3abba03062 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1,8 +1,27 @@ +/* + * builtin-trace.c + * + * Builtin 'trace' command: + * + * Display a continuously updated trace of any workload, CPU, specific PID, + * system wide, etc. Default format is loosely strace like, but any other + * event may be specified using --event. + * + * Copyright (C) 2012, 2013, 2014, 2015 Red Hat Inc, Arnaldo Carvalho de Melo <acme@redhat.com> + * + * Initially based on the 'trace' prototype by Thomas Gleixner: + * + * http://lwn.net/Articles/415728/ ("Announcing a new utility: 'trace'") + * + * Released under the GPL v2. (and only v2, not any later version) + */ + #include <traceevent/event-parse.h> #include "builtin.h" #include "util/color.h" #include "util/debug.h" #include "util/evlist.h" +#include "util/exec_cmd.h" #include "util/machine.h" #include "util/session.h" #include "util/thread.h" @@ -26,6 +45,7 @@ #ifndef MADV_HWPOISON # define MADV_HWPOISON 100 + #endif #ifndef MADV_MERGEABLE @@ -247,42 +267,6 @@ out_delete: ({ struct syscall_tp *fields = evsel->priv; \ fields->name.pointer(&fields->name, sample); }) -static int perf_evlist__add_syscall_newtp(struct perf_evlist *evlist, - void *sys_enter_handler, - void *sys_exit_handler) -{ - int ret = -1; - struct perf_evsel *sys_enter, *sys_exit; - - sys_enter = perf_evsel__syscall_newtp("sys_enter", sys_enter_handler); - if (sys_enter == NULL) - goto out; - - if (perf_evsel__init_sc_tp_ptr_field(sys_enter, args)) - goto out_delete_sys_enter; - - sys_exit = perf_evsel__syscall_newtp("sys_exit", sys_exit_handler); - if (sys_exit == NULL) - goto out_delete_sys_enter; - - if (perf_evsel__init_sc_tp_uint_field(sys_exit, ret)) - goto out_delete_sys_exit; - - perf_evlist__add(evlist, sys_enter); - perf_evlist__add(evlist, sys_exit); - - ret = 0; -out: - return ret; - -out_delete_sys_exit: - perf_evsel__delete_priv(sys_exit); -out_delete_sys_enter: - perf_evsel__delete_priv(sys_enter); - goto out; -} - - struct syscall_arg { unsigned long val; struct thread *thread; @@ -604,6 +588,15 @@ static DEFINE_STRARRAY_OFFSET(epoll_ctl_ops, 1); static const char *itimers[] = { "REAL", "VIRTUAL", "PROF", }; static DEFINE_STRARRAY(itimers); +static const char *keyctl_options[] = { + "GET_KEYRING_ID", "JOIN_SESSION_KEYRING", "UPDATE", "REVOKE", "CHOWN", + "SETPERM", "DESCRIBE", "CLEAR", "LINK", "UNLINK", "SEARCH", "READ", + "INSTANTIATE", "NEGATE", "SET_REQKEY_KEYRING", "SET_TIMEOUT", + "ASSUME_AUTHORITY", "GET_SECURITY", "SESSION_TO_PARENT", "REJECT", + "INSTANTIATE_IOV", "INVALIDATE", "GET_PERSISTENT", +}; +static DEFINE_STRARRAY(keyctl_options); + static const char *whences[] = { "SET", "CUR", "END", #ifdef SEEK_DATA "DATA", @@ -634,7 +627,8 @@ static DEFINE_STRARRAY(sighow); static const char *clockid[] = { "REALTIME", "MONOTONIC", "PROCESS_CPUTIME_ID", "THREAD_CPUTIME_ID", - "MONOTONIC_RAW", "REALTIME_COARSE", "MONOTONIC_COARSE", + "MONOTONIC_RAW", "REALTIME_COARSE", "MONOTONIC_COARSE", "BOOTTIME", + "REALTIME_ALARM", "BOOTTIME_ALARM", "SGI_CYCLE", "TAI" }; static DEFINE_STRARRAY(clockid); @@ -779,6 +773,11 @@ static size_t syscall_arg__scnprintf_access_mode(char *bf, size_t size, #define SCA_ACCMODE syscall_arg__scnprintf_access_mode +static size_t syscall_arg__scnprintf_filename(char *bf, size_t size, + struct syscall_arg *arg); + +#define SCA_FILENAME syscall_arg__scnprintf_filename + static size_t syscall_arg__scnprintf_open_flags(char *bf, size_t size, struct syscall_arg *arg) { @@ -1006,14 +1005,23 @@ static struct syscall_fmt { bool hexret; } syscall_fmts[] = { { .name = "access", .errmsg = true, - .arg_scnprintf = { [1] = SCA_ACCMODE, /* mode */ }, }, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ + [1] = SCA_ACCMODE, /* mode */ }, }, { .name = "arch_prctl", .errmsg = true, .alias = "prctl", }, { .name = "brk", .hexret = true, .arg_scnprintf = { [0] = SCA_HEX, /* brk */ }, }, + { .name = "chdir", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, + { .name = "chmod", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, + { .name = "chroot", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, { .name = "clock_gettime", .errmsg = true, STRARRAY(0, clk_id, clockid), }, { .name = "close", .errmsg = true, .arg_scnprintf = { [0] = SCA_CLOSE_FD, /* fd */ }, }, { .name = "connect", .errmsg = true, }, + { .name = "creat", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "dup", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "dup2", .errmsg = true, @@ -1024,7 +1032,8 @@ static struct syscall_fmt { { .name = "eventfd2", .errmsg = true, .arg_scnprintf = { [1] = SCA_EFD_FLAGS, /* flags */ }, }, { .name = "faccessat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "fadvise64", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "fallocate", .errmsg = true, @@ -1034,11 +1043,13 @@ static struct syscall_fmt { { .name = "fchmod", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "fchmodat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "fchown", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "fchownat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "fcntl", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ [1] = SCA_STRARRAY, /* cmd */ }, @@ -1053,7 +1064,8 @@ static struct syscall_fmt { { .name = "fstat", .errmsg = true, .alias = "newfstat", .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "fstatat", .errmsg = true, .alias = "newfstatat", - .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "fstatfs", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "fsync", .errmsg = true, @@ -1063,13 +1075,18 @@ static struct syscall_fmt { { .name = "futex", .errmsg = true, .arg_scnprintf = { [1] = SCA_FUTEX_OP, /* op */ }, }, { .name = "futimesat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "getdents", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "getdents64", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "getitimer", .errmsg = true, STRARRAY(0, which, itimers), }, { .name = "getrlimit", .errmsg = true, STRARRAY(0, resource, rlimit_resources), }, + { .name = "getxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, + { .name = "inotify_add_watch", .errmsg = true, + .arg_scnprintf = { [1] = SCA_FILENAME, /* pathname */ }, }, { .name = "ioctl", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ #if defined(__i386__) || defined(__x86_64__) @@ -1082,22 +1099,44 @@ static struct syscall_fmt { #else [2] = SCA_HEX, /* arg */ }, }, #endif + { .name = "keyctl", .errmsg = true, STRARRAY(0, option, keyctl_options), }, { .name = "kill", .errmsg = true, .arg_scnprintf = { [1] = SCA_SIGNUM, /* sig */ }, }, + { .name = "lchown", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, + { .name = "lgetxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "linkat", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ }, }, + { .name = "listxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, + { .name = "llistxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, + { .name = "lremovexattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "lseek", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ [2] = SCA_STRARRAY, /* whence */ }, .arg_parm = { [2] = &strarray__whences, /* whence */ }, }, - { .name = "lstat", .errmsg = true, .alias = "newlstat", }, + { .name = "lsetxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, + { .name = "lstat", .errmsg = true, .alias = "newlstat", + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, + { .name = "lsxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "madvise", .errmsg = true, .arg_scnprintf = { [0] = SCA_HEX, /* start */ [2] = SCA_MADV_BHV, /* behavior */ }, }, + { .name = "mkdir", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "mkdirat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ + [1] = SCA_FILENAME, /* pathname */ }, }, + { .name = "mknod", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, { .name = "mknodat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* fd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "mlock", .errmsg = true, .arg_scnprintf = { [0] = SCA_HEX, /* addr */ }, }, { .name = "mlockall", .errmsg = true, @@ -1110,6 +1149,8 @@ static struct syscall_fmt { { .name = "mprotect", .errmsg = true, .arg_scnprintf = { [0] = SCA_HEX, /* start */ [2] = SCA_MMAP_PROT, /* prot */ }, }, + { .name = "mq_unlink", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* u_name */ }, }, { .name = "mremap", .hexret = true, .arg_scnprintf = { [0] = SCA_HEX, /* addr */ [3] = SCA_MREMAP_FLAGS, /* flags */ @@ -1121,14 +1162,17 @@ static struct syscall_fmt { { .name = "name_to_handle_at", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, { .name = "newfstatat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ + [1] = SCA_FILENAME, /* filename */ }, }, { .name = "open", .errmsg = true, - .arg_scnprintf = { [1] = SCA_OPEN_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ + [1] = SCA_OPEN_FLAGS, /* flags */ }, }, { .name = "open_by_handle_at", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ [2] = SCA_OPEN_FLAGS, /* flags */ }, }, { .name = "openat", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ + [1] = SCA_FILENAME, /* filename */ [2] = SCA_OPEN_FLAGS, /* flags */ }, }, { .name = "perf_event_open", .errmsg = true, .arg_scnprintf = { [1] = SCA_INT, /* pid */ @@ -1150,18 +1194,28 @@ static struct syscall_fmt { .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "read", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, + { .name = "readlink", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* path */ }, }, { .name = "readlinkat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ + [1] = SCA_FILENAME, /* pathname */ }, }, { .name = "readv", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "recvfrom", .errmsg = true, - .arg_scnprintf = { [3] = SCA_MSG_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ + [3] = SCA_MSG_FLAGS, /* flags */ }, }, { .name = "recvmmsg", .errmsg = true, - .arg_scnprintf = { [3] = SCA_MSG_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ + [3] = SCA_MSG_FLAGS, /* flags */ }, }, { .name = "recvmsg", .errmsg = true, - .arg_scnprintf = { [2] = SCA_MSG_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ + [2] = SCA_MSG_FLAGS, /* flags */ }, }, + { .name = "removexattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "renameat", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + { .name = "rmdir", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "rt_sigaction", .errmsg = true, .arg_scnprintf = { [0] = SCA_SIGNUM, /* sig */ }, }, { .name = "rt_sigprocmask", .errmsg = true, STRARRAY(0, how, sighow), }, @@ -1171,13 +1225,18 @@ static struct syscall_fmt { .arg_scnprintf = { [2] = SCA_SIGNUM, /* sig */ }, }, { .name = "select", .errmsg = true, .timeout = true, }, { .name = "sendmmsg", .errmsg = true, - .arg_scnprintf = { [3] = SCA_MSG_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ + [3] = SCA_MSG_FLAGS, /* flags */ }, }, { .name = "sendmsg", .errmsg = true, - .arg_scnprintf = { [2] = SCA_MSG_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ + [2] = SCA_MSG_FLAGS, /* flags */ }, }, { .name = "sendto", .errmsg = true, - .arg_scnprintf = { [3] = SCA_MSG_FLAGS, /* flags */ }, }, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ + [3] = SCA_MSG_FLAGS, /* flags */ }, }, { .name = "setitimer", .errmsg = true, STRARRAY(0, which, itimers), }, { .name = "setrlimit", .errmsg = true, STRARRAY(0, resource, rlimit_resources), }, + { .name = "setxattr", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, { .name = "shutdown", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "socket", .errmsg = true, @@ -1188,18 +1247,35 @@ static struct syscall_fmt { .arg_scnprintf = { [0] = SCA_STRARRAY, /* family */ [1] = SCA_SK_TYPE, /* type */ }, .arg_parm = { [0] = &strarray__socket_families, /* family */ }, }, - { .name = "stat", .errmsg = true, .alias = "newstat", }, + { .name = "stat", .errmsg = true, .alias = "newstat", + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, + { .name = "statfs", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* pathname */ }, }, + { .name = "swapoff", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* specialfile */ }, }, + { .name = "swapon", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* specialfile */ }, }, { .name = "symlinkat", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, { .name = "tgkill", .errmsg = true, .arg_scnprintf = { [2] = SCA_SIGNUM, /* sig */ }, }, { .name = "tkill", .errmsg = true, .arg_scnprintf = { [1] = SCA_SIGNUM, /* sig */ }, }, + { .name = "truncate", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* path */ }, }, { .name = "uname", .errmsg = true, .alias = "newuname", }, { .name = "unlinkat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ + [1] = SCA_FILENAME, /* pathname */ }, }, + { .name = "utime", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, { .name = "utimensat", .errmsg = true, - .arg_scnprintf = { [0] = SCA_FDAT, /* dirfd */ }, }, + .arg_scnprintf = { [0] = SCA_FDAT, /* dirfd */ + [1] = SCA_FILENAME, /* filename */ }, }, + { .name = "utimes", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FILENAME, /* filename */ }, }, + { .name = "vmsplice", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "write", .errmsg = true, .arg_scnprintf = { [0] = SCA_FD, /* fd */ }, }, { .name = "writev", .errmsg = true, @@ -1223,7 +1299,6 @@ struct syscall { int nr_args; struct format_field *args; const char *name; - bool filtered; bool is_exit; struct syscall_fmt *fmt; size_t (**arg_scnprintf)(char *bf, size_t size, struct syscall_arg *arg); @@ -1244,6 +1319,11 @@ static size_t fprintf_duration(unsigned long t, FILE *fp) return printed + fprintf(fp, "): "); } +/** + * filename.ptr: The filename char pointer that will be vfs_getname'd + * filename.entry_str_pos: Where to insert the string translated from + * filename.ptr by the vfs_getname tracepoint/kprobe. + */ struct thread_trace { u64 entry_time; u64 exit_time; @@ -1252,6 +1332,13 @@ struct thread_trace { unsigned long pfmaj, pfmin; char *entry_str; double runtime_ms; + struct { + unsigned long ptr; + short int entry_str_pos; + bool pending_open; + unsigned int namelen; + char *name; + } filename; struct { int max; char **table; @@ -1298,6 +1385,8 @@ fail: #define TRACE_PFMAJ (1 << 0) #define TRACE_PFMIN (1 << 1) +static const size_t trace__entry_str_size = 2048; + struct trace { struct perf_tool tool; struct { @@ -1307,6 +1396,10 @@ struct trace { struct { int max; struct syscall *table; + struct { + struct perf_evsel *sys_enter, + *sys_exit; + } events; } syscalls; struct record_opts opts; struct perf_evlist *evlist; @@ -1316,7 +1409,10 @@ struct trace { FILE *output; unsigned long nr_events; struct strlist *ev_qualifier; - const char *last_vfs_getname; + struct { + size_t nr; + int *entries; + } ev_qualifier_ids; struct intlist *tid_list; struct intlist *pid_list; struct { @@ -1340,6 +1436,7 @@ struct trace { bool show_tool_stats; bool trace_syscalls; bool force; + bool vfs_getname; int trace_pgfaults; }; @@ -1443,6 +1540,27 @@ static size_t syscall_arg__scnprintf_close_fd(char *bf, size_t size, return printed; } +static void thread__set_filename_pos(struct thread *thread, const char *bf, + unsigned long ptr) +{ + struct thread_trace *ttrace = thread__priv(thread); + + ttrace->filename.ptr = ptr; + ttrace->filename.entry_str_pos = bf - ttrace->entry_str; +} + +static size_t syscall_arg__scnprintf_filename(char *bf, size_t size, + struct syscall_arg *arg) +{ + unsigned long ptr = arg->val; + + if (!arg->trace->vfs_getname) + return scnprintf(bf, size, "%#x", ptr); + + thread__set_filename_pos(arg->thread, bf, ptr); + return 0; +} + static bool trace__filter_duration(struct trace *trace, double t) { return t < (trace->duration_filter * NSEC_PER_MSEC); @@ -1517,6 +1635,9 @@ static int trace__symbols_init(struct trace *trace, struct perf_evlist *evlist) if (trace->host == NULL) return -ENOMEM; + if (trace_event__register_resolver(trace->host, machine__resolve_kernel_addr) < 0) + return -errno; + err = __machine__synthesize_threads(trace->host, &trace->tool, &trace->opts.target, evlist->threads, trace__tool_process, false, trace->opts.proc_map_timeout); @@ -1578,19 +1699,6 @@ static int trace__read_syscall_info(struct trace *trace, int id) sc = trace->syscalls.table + id; sc->name = name; - if (trace->ev_qualifier) { - bool in = strlist__find(trace->ev_qualifier, name) != NULL; - - if (!(in ^ trace->not_ev_qualifier)) { - sc->filtered = true; - /* - * No need to do read tracepoint information since this will be - * filtered out. - */ - return 0; - } - } - sc->fmt = syscall_fmt__find(sc->name); snprintf(tp_name, sizeof(tp_name), "sys_enter_%s", sc->name); @@ -1619,13 +1727,27 @@ static int trace__read_syscall_info(struct trace *trace, int id) static int trace__validate_ev_qualifier(struct trace *trace) { - int err = 0; + int err = 0, i; struct str_node *pos; + trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier); + trace->ev_qualifier_ids.entries = malloc(trace->ev_qualifier_ids.nr * + sizeof(trace->ev_qualifier_ids.entries[0])); + + if (trace->ev_qualifier_ids.entries == NULL) { + fputs("Error:\tNot enough memory for allocating events qualifier ids\n", + trace->output); + err = -EINVAL; + goto out; + } + + i = 0; + strlist__for_each(pos, trace->ev_qualifier) { const char *sc = pos->s; + int id = audit_name_to_syscall(sc, trace->audit.machine); - if (audit_name_to_syscall(sc, trace->audit.machine) < 0) { + if (id < 0) { if (err == 0) { fputs("Error:\tInvalid syscall ", trace->output); err = -EINVAL; @@ -1635,13 +1757,17 @@ static int trace__validate_ev_qualifier(struct trace *trace) fputs(sc, trace->output); } + + trace->ev_qualifier_ids.entries[i++] = id; } if (err < 0) { fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'" "\nHint:\tand: 'man syscalls'\n", trace->output); + zfree(&trace->ev_qualifier_ids.entries); + trace->ev_qualifier_ids.nr = 0; } - +out: return err; } @@ -1833,9 +1959,6 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel, if (sc == NULL) return -1; - if (sc->filtered) - return 0; - thread = machine__findnew_thread(trace->host, sample->pid, sample->tid); ttrace = thread__trace(thread, trace->output); if (ttrace == NULL) @@ -1844,7 +1967,7 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel, args = perf_evsel__sc_tp_ptr(evsel, args, sample); if (ttrace->entry_str == NULL) { - ttrace->entry_str = malloc(1024); + ttrace->entry_str = malloc(trace__entry_str_size); if (!ttrace->entry_str) goto out_put; } @@ -1854,9 +1977,9 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel, ttrace->entry_time = sample->time; msg = ttrace->entry_str; - printed += scnprintf(msg + printed, 1024 - printed, "%s(", sc->name); + printed += scnprintf(msg + printed, trace__entry_str_size - printed, "%s(", sc->name); - printed += syscall__scnprintf_args(sc, msg + printed, 1024 - printed, + printed += syscall__scnprintf_args(sc, msg + printed, trace__entry_str_size - printed, args, trace, thread); if (sc->is_exit) { @@ -1864,8 +1987,11 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel, trace__fprintf_entry_head(trace, thread, 1, sample->time, trace->output); fprintf(trace->output, "%-70s\n", ttrace->entry_str); } - } else + } else { ttrace->entry_pending = true; + /* See trace__vfs_getname & trace__sys_exit */ + ttrace->filename.pending_open = false; + } if (trace->current != thread) { thread__put(trace->current); @@ -1891,9 +2017,6 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel, if (sc == NULL) return -1; - if (sc->filtered) - return 0; - thread = machine__findnew_thread(trace->host, sample->pid, sample->tid); ttrace = thread__trace(thread, trace->output); if (ttrace == NULL) @@ -1904,9 +2027,9 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel, ret = perf_evsel__sc_tp_uint(evsel, ret, sample); - if (id == trace->audit.open_id && ret >= 0 && trace->last_vfs_getname) { - trace__set_fd_pathname(thread, ret, trace->last_vfs_getname); - trace->last_vfs_getname = NULL; + if (id == trace->audit.open_id && ret >= 0 && ttrace->filename.pending_open) { + trace__set_fd_pathname(thread, ret, ttrace->filename.name); + ttrace->filename.pending_open = false; ++trace->stats.vfs_getname; } @@ -1961,7 +2084,56 @@ static int trace__vfs_getname(struct trace *trace, struct perf_evsel *evsel, union perf_event *event __maybe_unused, struct perf_sample *sample) { - trace->last_vfs_getname = perf_evsel__rawptr(evsel, sample, "pathname"); + struct thread *thread = machine__findnew_thread(trace->host, sample->pid, sample->tid); + struct thread_trace *ttrace; + size_t filename_len, entry_str_len, to_move; + ssize_t remaining_space; + char *pos; + const char *filename = perf_evsel__rawptr(evsel, sample, "pathname"); + + if (!thread) + goto out; + + ttrace = thread__priv(thread); + if (!ttrace) + goto out; + + filename_len = strlen(filename); + + if (ttrace->filename.namelen < filename_len) { + char *f = realloc(ttrace->filename.name, filename_len + 1); + + if (f == NULL) + goto out; + + ttrace->filename.namelen = filename_len; + ttrace->filename.name = f; + } + + strcpy(ttrace->filename.name, filename); + ttrace->filename.pending_open = true; + + if (!ttrace->filename.ptr) + goto out; + + entry_str_len = strlen(ttrace->entry_str); + remaining_space = trace__entry_str_size - entry_str_len - 1; /* \0 */ + if (remaining_space <= 0) + goto out; + + if (filename_len > (size_t)remaining_space) { + filename += filename_len - remaining_space; + filename_len = remaining_space; + } + + to_move = entry_str_len - ttrace->filename.entry_str_pos + 1; /* \0 */ + pos = ttrace->entry_str + ttrace->filename.entry_str_pos; + memmove(pos + filename_len, pos, to_move); + memcpy(pos, filename, filename_len); + + ttrace->filename.ptr = 0; + ttrace->filename.entry_str_pos = 0; +out: return 0; } @@ -2214,19 +2386,20 @@ static int trace__record(struct trace *trace, int argc, const char **argv) static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp); -static void perf_evlist__add_vfs_getname(struct perf_evlist *evlist) +static bool perf_evlist__add_vfs_getname(struct perf_evlist *evlist) { struct perf_evsel *evsel = perf_evsel__newtp("probe", "vfs_getname"); if (evsel == NULL) - return; + return false; if (perf_evsel__field(evsel, "pathname") == NULL) { perf_evsel__delete(evsel); - return; + return false; } evsel->handler = trace__vfs_getname; perf_evlist__add(evlist, evsel); + return true; } static int perf_evlist__add_pgfault(struct perf_evlist *evlist, @@ -2283,9 +2456,68 @@ static void trace__handle_event(struct trace *trace, union perf_event *event, st } } +static int trace__add_syscall_newtp(struct trace *trace) +{ + int ret = -1; + struct perf_evlist *evlist = trace->evlist; + struct perf_evsel *sys_enter, *sys_exit; + + sys_enter = perf_evsel__syscall_newtp("sys_enter", trace__sys_enter); + if (sys_enter == NULL) + goto out; + + if (perf_evsel__init_sc_tp_ptr_field(sys_enter, args)) + goto out_delete_sys_enter; + + sys_exit = perf_evsel__syscall_newtp("sys_exit", trace__sys_exit); + if (sys_exit == NULL) + goto out_delete_sys_enter; + + if (perf_evsel__init_sc_tp_uint_field(sys_exit, ret)) + goto out_delete_sys_exit; + + perf_evlist__add(evlist, sys_enter); + perf_evlist__add(evlist, sys_exit); + + trace->syscalls.events.sys_enter = sys_enter; + trace->syscalls.events.sys_exit = sys_exit; + + ret = 0; +out: + return ret; + +out_delete_sys_exit: + perf_evsel__delete_priv(sys_exit); +out_delete_sys_enter: + perf_evsel__delete_priv(sys_enter); + goto out; +} + +static int trace__set_ev_qualifier_filter(struct trace *trace) +{ + int err = -1; + char *filter = asprintf_expr_inout_ints("id", !trace->not_ev_qualifier, + trace->ev_qualifier_ids.nr, + trace->ev_qualifier_ids.entries); + + if (filter == NULL) + goto out_enomem; + + if (!perf_evsel__append_filter(trace->syscalls.events.sys_enter, "&&", filter)) + err = perf_evsel__append_filter(trace->syscalls.events.sys_exit, "&&", filter); + + free(filter); +out: + return err; +out_enomem: + errno = ENOMEM; + goto out; +} + static int trace__run(struct trace *trace, int argc, const char **argv) { struct perf_evlist *evlist = trace->evlist; + struct perf_evsel *evsel; int err = -1, i; unsigned long before; const bool forks = argc > 0; @@ -2293,13 +2525,11 @@ static int trace__run(struct trace *trace, int argc, const char **argv) trace->live = true; - if (trace->trace_syscalls && - perf_evlist__add_syscall_newtp(evlist, trace__sys_enter, - trace__sys_exit)) + if (trace->trace_syscalls && trace__add_syscall_newtp(trace)) goto out_error_raw_syscalls; if (trace->trace_syscalls) - perf_evlist__add_vfs_getname(evlist); + trace->vfs_getname = perf_evlist__add_vfs_getname(evlist); if ((trace->trace_pgfaults & TRACE_PFMAJ) && perf_evlist__add_pgfault(evlist, PERF_COUNT_SW_PAGE_FAULTS_MAJ)) { @@ -2356,11 +2586,22 @@ static int trace__run(struct trace *trace, int argc, const char **argv) else if (thread_map__pid(evlist->threads, 0) == -1) err = perf_evlist__set_filter_pid(evlist, getpid()); - if (err < 0) { - printf("err=%d,%s\n", -err, strerror(-err)); - exit(1); + if (err < 0) + goto out_error_mem; + + if (trace->ev_qualifier_ids.nr > 0) { + err = trace__set_ev_qualifier_filter(trace); + if (err < 0) + goto out_errno; + + pr_debug("event qualifier tracepoint filter: %s\n", + trace->syscalls.events.sys_exit->filter); } + err = perf_evlist__apply_filters(evlist, &evsel); + if (err < 0) + goto out_error_apply_filters; + err = perf_evlist__mmap(evlist, trace->opts.mmap_pages, false); if (err < 0) goto out_error_mmap; @@ -2462,10 +2703,21 @@ out_error_open: out_error: fprintf(trace->output, "%s\n", errbuf); goto out_delete_evlist; + +out_error_apply_filters: + fprintf(trace->output, + "Failed to set filter \"%s\" on event %s with %d (%s)\n", + evsel->filter, perf_evsel__name(evsel), errno, + strerror_r(errno, errbuf, sizeof(errbuf))); + goto out_delete_evlist; } out_error_mem: fprintf(trace->output, "Not enough memory to run!\n"); goto out_delete_evlist; + +out_errno: + fprintf(trace->output, "errno=%d,%s\n", errno, strerror(errno)); + goto out_delete_evlist; } static int trace__replay(struct trace *trace) @@ -2586,9 +2838,9 @@ static size_t thread__dump_stats(struct thread_trace *ttrace, printed += fprintf(fp, "\n"); - printed += fprintf(fp, " syscall calls min avg max stddev\n"); - printed += fprintf(fp, " (msec) (msec) (msec) (%%)\n"); - printed += fprintf(fp, " --------------- -------- --------- --------- --------- ------\n"); + printed += fprintf(fp, " syscall calls total min avg max stddev\n"); + printed += fprintf(fp, " (msec) (msec) (msec) (msec) (%%)\n"); + printed += fprintf(fp, " --------------- -------- --------- --------- --------- --------- ------\n"); /* each int_node is a syscall */ while (inode) { @@ -2605,8 +2857,8 @@ static size_t thread__dump_stats(struct thread_trace *ttrace, sc = &trace->syscalls.table[inode->i]; printed += fprintf(fp, " %-15s", sc->name); - printed += fprintf(fp, " %8" PRIu64 " %9.3f %9.3f", - n, min, avg); + printed += fprintf(fp, " %8" PRIu64 " %9.3f %9.3f %9.3f", + n, avg * n, min, avg); printed += fprintf(fp, " %9.3f %9.2f%%\n", max, pct); } @@ -2778,7 +3030,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused) .mmap_pages = UINT_MAX, .proc_map_timeout = 500, }, - .output = stdout, + .output = stderr, .show_comm = true, .trace_syscalls = true, }; @@ -2879,11 +3131,14 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused) if (ev_qualifier_str != NULL) { const char *s = ev_qualifier_str; + struct strlist_config slist_config = { + .dirname = system_path(STRACE_GROUPS_DIR), + }; trace.not_ev_qualifier = *s == '!'; if (trace.not_ev_qualifier) ++s; - trace.ev_qualifier = strlist__new(true, s); + trace.ev_qualifier = strlist__new(s, &slist_config); if (trace.ev_qualifier == NULL) { fputs("Not enough memory to parse event qualifier", trace.output); |