author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-02-22 13:59:43 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-02-22 13:59:43 -0800 |
commit | 3a36281a17199737b468befb826d4a23eb774445 (patch) | |
tree | 18bbdfcb0c30215883809f248d0334b7ea84d5ba /tools/lib | |
parent | 7c70f3a7488d2fa62d32849d138bf2b8420fe788 (diff) | |
parent | 3027ce36ccbae74f2e7c1afbfc3f69fee0c2a996 (diff) |
Merge tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"New features:
- Support instruction latency in 'perf report', with both memory
latency (weight) and instruction latency information, users can
locate expensive load instructions and understand time spent in
different stages.
- Extend 'perf c2c' to display the number of loads which were blocked
by data or address conflict.
- Add 'perf stat' support for L2 topdown events in systems such as
Intel's Sapphire Rapids server.
- Add support for PERF_SAMPLE_CODE_PAGE_SIZE in various tools, as a
sort key, for instance:
perf report --stdio --sort=comm,symbol,code_page_size
- New 'perf daemon' command to run long running sessions while
providing a way to control the enablement of events without
restarting a traditional 'perf record' session.
- Enable counting events for BPF programs in 'perf stat' just like
for other targets (tid, cgroup, cpu, etc), e.g.:
# perf stat -e ref-cycles,cycles -b 254 -I 1000
1.487903822 115,200 ref-cycles
1.487903822 86,012 cycles
2.489147029 80,560 ref-cycles
2.489147029 73,784 cycles
^C
The example above counts 'cycles' and 'ref-cycles' of BPF program
of id 254. It is similar to bpftool-prog-profile command, but more
flexible.
- Support the new layout for PERF_RECORD_MMAP2 to carry the DSO
build-id using infrastructure generalised from the eBPF subsystem,
removing the need for traversing the perf.data file to collect
build-ids at the end of 'perf record' sessions and helping with
long running sessions where binaries can get replaced in updates,
leading to possible mis-resolution of symbols.
- Support filtering by hex address in 'perf script'.
- Support DSO filter in 'perf script', like in other perf tools.
- Add namespaces support to 'perf inject'
- Add support for SDT (DTrace Style Markers) events on ARM64.
perf record:
- Fix handling of eventfd() when draining a buffer in 'perf record'.
- Improvements to the generation of metadata events for pre-existing
threads (mmaps, comm, etc), speeding up the work done at the start
of system wide or per CPU 'perf record' sessions.
Hardware tracing:
- Initial support for tracing KVM with Intel PT.
- Intel PT fixes for IPC
- Support Intel PT PSB (synchronization packets) events.
- Automatically group aux-output events to overcome --filter syntax.
- Enable PERF_SAMPLE_DATA_SRC on ARM's SPE.
- Update ARM's CoreSight hardware tracing OpenCSD library to v1.0.0.
perf annotate TUI:
- Fix handling of 'k' ("show line number") hotkey
- Fix jump parsing for C++ code.
perf probe:
- Add protection to avoid endless loop.
cgroups:
- Avoid reading cgroup mountpoint multiple times, caching it.
- Fix handling of cgroup v1/v2 in mixed hierarchy.
Symbol resolving:
- Add OCaml symbol demangling.
- Further fixes for handling PE executables when using perf with Wine
and .exe/.dll files.
- Fix 'perf unwind' DSO handling.
- Resolve symbols against debug file first, to deal with artifacts
related to LTO.
- Fix gap between kernel end and module start on powerpc.
Reporting tools:
- The DSO filter shouldn't show samples in unresolved maps.
- Improve debuginfod support in various tools.
build ids:
- Fix 16-byte build ids in 'perf buildid-cache', add a 'perf test'
entry for that case.
perf test:
- Support for PERF_SAMPLE_WEIGHT_STRUCT.
- Add test case for PERF_SAMPLE_CODE_PAGE_SIZE.
- Shell based tests for 'perf daemon's commands ('start', 'stop',
'reconfig', 'list', etc).
- ARM cs-etm 'perf test' fixes.
- Add parse-metric memory bandwidth testcase.
Compiler related:
- Fix 'perf probe' kretprobe issue caused by gcc 11 bug when used
with -fpatchable-function-entry.
- Fix ARM64 build with gcc 11's -Wformat-overflow.
- Fix unaligned access in sample parsing test.
- Fix printf conversion specifier for IP addresses on arm64, s390 and
powerpc.
Arch specific:
- Support exposing Performance Monitor Counter SPRs as part of
extended regs on powerpc.
- Add JSON 'perf stat' metrics for ARM64's imx8mp, imx8mq and imx8mn
DDR, fix imx8mm ones.
- Fix common and uarch events for ARM64's A76 and Ampere eMag"
* tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (148 commits)
perf buildid-cache: Don't skip 16-byte build-ids
perf buildid-cache: Add test for 16-byte build-id
perf symbol: Remove redundant libbfd checks
perf test: Output the sub testing result in cs-etm
perf test: Suppress logs in cs-etm testing
perf tools: Fix arm64 build error with gcc-11
perf intel-pt: Add documentation for tracing virtual machines
perf intel-pt: Split VM-Entry and VM-Exit branches
perf intel-pt: Adjust sample flags for VM-Exit
perf intel-pt: Allow for a guest kernel address filter
perf intel-pt: Support decoding of guest kernel
perf machine: Factor out machine__idle_thread()
perf machine: Factor out machines__find_guest()
perf intel-pt: Amend decoder to track the NR flag
perf intel-pt: Retain the last PIP packet payload as is
perf intel_pt: Add vmlaunch and vmresume as branches
perf script: Add branch types for VM-Entry and VM-Exit
perf auxtrace: Automatically group aux-output events
perf test: Fix unaligned access in sample parsing test
perf tools: Support arch specific PERF_SAMPLE_WEIGHT_STRUCT processing
...
Diffstat (limited to 'tools/lib')
-rw-r--r-- | tools/lib/api/fs/cgroup.c | 95 |
-rw-r--r-- | tools/lib/perf/include/perf/event.h | 18 |
2 files changed, 79 insertions, 34 deletions
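The cgroup.c change listed above replaces the fscanf()-based scan of /proc/mounts with line-by-line parsing via getline(). The per-line classification can be sketched in isolation as below. This is a minimal illustration, not code from the tree: `match_cgroup_line` and the `mount_match` enum are hypothetical names, and unlike the patch the sketch retries later occurrences of the subsystem name within the same line and checks the character after the filesystem type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of the kernel tree. */
enum mount_match { MATCH_NONE = 0, MATCH_CGROUP_V1 = 1, MATCH_CGROUP_V2 = 2 };

/*
 * Classify one /proc/mounts line of the form
 *   <devname> <mount point> <fs type> <options> ...
 * The line is modified in place: the mount point gets NUL-terminated and
 * *mnt points at it whenever the result is not MATCH_NONE.
 */
static enum mount_match match_cgroup_line(char *line, const char *subsys,
                                          char **mnt)
{
    char *p, *path, *opts;
    size_t n;

    assert(line && subsys && mnt);
    n = strlen(subsys);
    if (n == 0)
        return MATCH_NONE;

    p = strchr(line, ' ');              /* skip devname */
    if (p == NULL)
        return MATCH_NONE;
    path = ++p;                         /* start of the mount point */
    p = strchr(p, ' ');
    if (p == NULL)
        return MATCH_NONE;
    *p++ = '\0';                        /* terminate the mount point */

    if (strncmp(p, "cgroup", 6) != 0)   /* check filesystem type */
        return MATCH_NONE;
    if (p[6] == '2' && p[7] == ' ') {   /* cgroup v2: fallback candidate */
        *mnt = path;
        return MATCH_CGROUP_V2;
    }
    if (p[6] != ' ')
        return MATCH_NONE;

    /* cgroup v1: look for the subsystem among the mount options */
    opts = p + 7;
    for (p = strstr(opts, subsys); p != NULL; p = strstr(p + 1, subsys)) {
        /* the name must be delimited by a space, a comma, or end of string */
        if (strchr(" ,", p[-1]) && strchr(" ,", p[n])) {
            *mnt = path;
            return MATCH_CGROUP_V1;
        }
    }
    return MATCH_NONE;
}
```

A caller would feed each line read from /proc/mounts to this helper, stop at the first MATCH_CGROUP_V1, and otherwise keep the last MATCH_CGROUP_V2 mount point as the fallback, which mirrors the v1-then-v2 preference described in the patch comment.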
diff --git a/tools/lib/api/fs/cgroup.c b/tools/lib/api/fs/cgroup.c
index 889a6eb4aaca..1573dae4259d 100644
--- a/tools/lib/api/fs/cgroup.c
+++ b/tools/lib/api/fs/cgroup.c
@@ -8,12 +8,29 @@
 #include <string.h>
 #include "fs.h"
 
+struct cgroupfs_cache_entry {
+	char	subsys[32];
+	char	mountpoint[PATH_MAX];
+};
+
+/* just cache last used one */
+static struct cgroupfs_cache_entry cached;
+
 int cgroupfs_find_mountpoint(char *buf, size_t maxlen, const char *subsys)
 {
 	FILE *fp;
-	char mountpoint[PATH_MAX + 1], tokens[PATH_MAX + 1], type[PATH_MAX + 1];
-	char path_v1[PATH_MAX + 1], path_v2[PATH_MAX + 2], *path;
-	char *token, *saved_ptr = NULL;
+	char *line = NULL;
+	size_t len = 0;
+	char *p, *path;
+	char mountpoint[PATH_MAX];
+
+	if (!strcmp(cached.subsys, subsys)) {
+		if (strlen(cached.mountpoint) < maxlen) {
+			strcpy(buf, cached.mountpoint);
+			return 0;
+		}
+		return -1;
+	}
 
 	fp = fopen("/proc/mounts", "r");
 	if (!fp)
@@ -22,45 +39,63 @@ int cgroupfs_find_mountpoint(char *buf, size_t maxlen, const char *subsys)
 	/*
 	 * in order to handle split hierarchy, we need to scan /proc/mounts
 	 * and inspect every cgroupfs mount point to find one that has
-	 * perf_event subsystem
+	 * the given subsystem. If we found v1, just use it. If not we can
+	 * use v2 path as a fallback.
 	 */
-	path_v1[0] = '\0';
-	path_v2[0] = '\0';
+	mountpoint[0] = '\0';
 
-	while (fscanf(fp, "%*s %"__stringify(PATH_MAX)"s %"__stringify(PATH_MAX)"s %"
-				__stringify(PATH_MAX)"s %*d %*d\n",
-				mountpoint, type, tokens) == 3) {
+	/*
+	 * The /proc/mounts has the follow format:
+	 *
+	 *   <devname> <mount point> <fs type> <options> ...
+	 *
+	 */
+	while (getline(&line, &len, fp) != -1) {
+		/* skip devname */
+		p = strchr(line, ' ');
+		if (p == NULL)
+			continue;
+
+		/* save the mount point */
+		path = ++p;
+		p = strchr(p, ' ');
+		if (p == NULL)
+			continue;
 
-		if (!path_v1[0] && !strcmp(type, "cgroup")) {
+		*p++ = '\0';
 
-			token = strtok_r(tokens, ",", &saved_ptr);
+		/* check filesystem type */
+		if (strncmp(p, "cgroup", 6))
+			continue;
 
-			while (token != NULL) {
-				if (subsys && !strcmp(token, subsys)) {
-					strcpy(path_v1, mountpoint);
-					break;
-				}
-				token = strtok_r(NULL, ",", &saved_ptr);
-			}
+		if (p[6] == '2') {
+			/* save cgroup v2 path */
+			strcpy(mountpoint, path);
+			continue;
 		}
 
-		if (!path_v2[0] && !strcmp(type, "cgroup2"))
-			strcpy(path_v2, mountpoint);
+		/* now we have cgroup v1, check the options for subsystem */
+		p += 7;
 
-		if (path_v1[0] && path_v2[0])
-			break;
+		p = strstr(p, subsys);
+		if (p == NULL)
+			continue;
+
+		/* sanity check: it should be separated by a space or a comma */
+		if (!strchr(" ,", p[-1]) || !strchr(" ,", p[strlen(subsys)]))
+			continue;
+
+		strcpy(mountpoint, path);
+		break;
 	}
+	free(line);
 	fclose(fp);
 
-	if (path_v1[0])
-		path = path_v1;
-	else if (path_v2[0])
-		path = path_v2;
-	else
-		return -1;
+	strncpy(cached.subsys, subsys, sizeof(cached.subsys) - 1);
+	strcpy(cached.mountpoint, mountpoint);
 
-	if (strlen(path) < maxlen) {
-		strcpy(buf, path);
+	if (mountpoint[0] && strlen(mountpoint) < maxlen) {
+		strcpy(buf, mountpoint);
 		return 0;
 	}
 	return -1;
diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 988c539bedb6..d82054225fcc 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -23,10 +23,20 @@ struct perf_record_mmap2 {
 	__u64			 start;
 	__u64			 len;
 	__u64			 pgoff;
-	__u32			 maj;
-	__u32			 min;
-	__u64			 ino;
-	__u64			 ino_generation;
+	union {
+		struct {
+			__u32	 maj;
+			__u32	 min;
+			__u64	 ino;
+			__u64	 ino_generation;
+		};
+		struct {
+			__u8	 build_id_size;
+			__u8	 __reserved_1;
+			__u16	 __reserved_2;
+			__u8	 build_id[20];
+		};
+	};
 	__u32			 prot;
 	__u32			 flags;
 	char			 filename[PATH_MAX];
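
The event.h hunk overlays the build-id fields on the existing maj/min/ino/ino_generation fields, so both alternatives occupy the same 24 bytes and the rest of the record keeps its offsets. The standalone sketch below checks that arithmetic and shows one way a reader might copy the build-id out defensively. It is an illustration under assumptions: `struct mmap2_id` and `mmap2_copy_build_id` are hypothetical names, fixed-width `<stdint.h>` types stand in for the kernel's `__u8`/`__u32`/`__u64`, and the `has_build_id` flag stands in for whatever mechanism signals which alternative is active, which this diff does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch-only mirror of the union from the diff; not the real header. */
struct mmap2_id {
    union {
        struct {
            uint32_t maj;
            uint32_t min;
            uint64_t ino;
            uint64_t ino_generation;
        };
        struct {
            uint8_t  build_id_size;
            uint8_t  reserved_1;
            uint16_t reserved_2;
            uint8_t  build_id[20];
        };
    };
};

/* 4 + 4 + 8 + 8 == 1 + 1 + 2 + 20 == 24: the union adds no bytes. */
_Static_assert(sizeof(struct mmap2_id) == 24, "union must stay 24 bytes");
_Static_assert(offsetof(struct mmap2_id, build_id) == 4,
               "build_id starts right after the 4 header bytes");

/*
 * Copy the build-id into out[] when the record carries one.
 * Returns the number of bytes copied, or 0 when there is no build-id,
 * the recorded size is implausible, or out[] is too small.
 * has_build_id is a hypothetical flag supplied by the caller.
 */
static size_t mmap2_copy_build_id(const struct mmap2_id *id, int has_build_id,
                                  uint8_t *out, size_t out_len)
{
    size_t n;

    assert(id && out);
    if (!has_build_id)
        return 0;

    n = id->build_id_size;
    if (n > sizeof(id->build_id) || n > out_len)
        return 0;

    memcpy(out, id->build_id, n);
    return n;
}
```

Bounding `build_id_size` by the array size before copying matters because the size byte comes from recorded data and cannot be assumed to be valid.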