diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-21 15:45:14 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-21 15:45:14 -0700 |
commit | 29c73fc794c83505066ee6db893b2a83ac5fac63 (patch) | |
tree | b584f9eb09f308d0172e7179c1d920577af6ff1c /tools/arch | |
parent | 4865a27c66fda6a32511ec5492f4bbec437f512d (diff) | |
parent | ea558c86248b4955e5c5f3c0c921df450880605e (diff) |
Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"General:
- Integrate the shellcheck utility with the build of perf to allow
catching shell problems early in areas such as 'perf test', 'perf
trace' scrape scripts, etc
- Add 'uretprobe' variant in the 'perf bench uprobe' tool
- Add script to run instances of 'perf script' in parallel
- Allow parsing tracepoint names that start with digits, such as
9p/9p_client_req, etc. Make sure 'perf test' tests it even on
systems where those tracepoints aren't available
- Add Kan Liang to MAINTAINERS as a perf tools reviewer
- Add support for using the 'capstone' disassembler library in
various tools, such as 'perf script' and 'perf annotate'. This is
an alternative for the use of the 'xed' and 'objdump' disassemblers
Data-type profiling improvements:
- Resolve types for a->b->c by backtracking the assignments until it
finds DWARF info for one of those members
- Support for global variables, keeping a cache to speed up lookups
- Handle the 'call' instruction, dealing with effects on registers
and handling its return when tracking register data types
- Handle x86's segment based addressing like %gs:0x28, to support
things like per CPU variables, the stack canary, etc
- Data-type profiling got big speedups when using capstone for
disassembling. The objdump outoput parsing method is left as a
fallback when capstone fails or isn't available. There are patches
posted for 6.11 that to use a LLVM disassembler
- Support event group display in the TUI when annotating types with
--data-type, for instance to show memory load and store events for
the data type fields
- Optimize the 'perf annotate' data structures, reducing memory usage
- Add a initial 'perf test' for 'perf annotate', checking that a
target symbol appears on the output, specifying objdump via the
command line, etc
Vendor Events:
- Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand
Ridge, Ice Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra
Forest, Sky Lake X, Sky Lake and Snow Ridge X. Remove info metrics
erroneously in TopdownL1
- Add AMD's Zen 5 core and uncore events and metrics. Those come from
the "Performance Monitor Counters for AMD Family 1Ah Model 00h- 0Fh
Processors" document, with events that capture information on op
dispatch, execution and retirement, branch prediction, L1 and L2
cache activity, TLB activity, etc
- Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/
AmpereOneX
Miscellaneous:
- Sync header copies with the kernel sources
- Move some header copies used only for generating translation string
tables for ioctl cmds and other syscall integer arguments to a new
directory under tools/perf/beauty/, to separate from copies in
tools/include/ that are used to build the tools
- Introduce scrape script for several syscall 'flags'/'mask'
arguments
- Improve cpumap utilization, fixing up pairing of refcounts, using
the right iterators (perf_cpu_map__for_each_cpu), etc
- Give more details about raw event encodings in 'perf list', show
tracepoint encoding in the detailed output
- Refactor the DSOs handling code, reducing memory usage
- Document the BPF event modifier and add a 'perf test' for it
- Improve the event parser, better error messages and add further
'perf test's for it
- Add reference count checking to 'struct comm_str' and 'struct
mem_info'
- Make ARM64's 'perf test' entries for the Neoverse N1 more robust
- Tweak the ARM64's Coresight 'perf test's
- Improve ARM64's CoreSight ETM version detection and error reporting
- Fix handling of symbols when using kcore
- Fix PAI (Processor Activity Instrumentation) counter names for s390
virtual machines in 'perf report'
- Fix -g/--call-graph option failure in 'perf sched timehist'
- Add LIBTRACEEVENT_DIR build option to allow building with
libtraceevent installed in non-standard directories, such as when
doing cross builds
- Various 'perf test' and 'perf bench' fixes
- Improve 'perf probe' error message for long C++ probe names"
* tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (260 commits)
tools lib subcmd: Show parent options in help
perf pmu: Count sys and cpuid JSON events separately
perf stat: Don't display metric header for non-leader uncore events
perf annotate-data: Ensure the number of type histograms
perf annotate: Fix segfault on sample histogram
perf daemon: Fix file leak in daemon_session__control
libsubcmd: Fix parse-options memory leak
perf lock: Avoid memory leaks from strdup()
perf sched: Rename 'switches' column header to 'count' and add usage description, options for latency
perf tools: Ignore deleted cgroups
perf parse: Allow tracepoint names to start with digits
perf parse-events: Add new 'fake_tp' parameter for tests
perf parse-events: pass parse_state to add_tracepoint
perf symbols: Fix ownership of string in dso__load_vmlinux()
perf symbols: Update kcore map before merging in remaining symbols
perf maps: Re-use __maps__free_maps_by_name()
perf symbols: Remove map from list before updating addresses
perf tracepoint: Don't scan all tracepoints to test if one exists
perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORT
perf thread: Fixes to thread__new() related to initializing comm
...
Diffstat (limited to 'tools/arch')
-rw-r--r-- | tools/arch/x86/include/asm/cpufeatures.h | 7 | ||||
-rw-r--r-- | tools/arch/x86/include/asm/irq_vectors.h | 140 | ||||
-rw-r--r-- | tools/arch/x86/include/asm/msr-index.h | 9 | ||||
-rw-r--r-- | tools/arch/x86/include/uapi/asm/prctl.h | 43 |
4 files changed, 14 insertions, 185 deletions
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index a38f8f9ba657..3c7434329661 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -461,11 +461,15 @@ /* * Extended auxiliary flags: Linux defined - for features scattered in various - * CPUID levels like 0x80000022, etc. + * CPUID levels like 0x80000022, etc and Linux defined features. * * Reuse free bits when adding new feature flags! */ #define X86_FEATURE_AMD_LBR_PMC_FREEZE (21*32+ 0) /* AMD LBR and PMC Freeze */ +#define X86_FEATURE_CLEAR_BHB_LOOP (21*32+ 1) /* "" Clear branch history at syscall entry using SW loop */ +#define X86_FEATURE_BHI_CTRL (21*32+ 2) /* "" BHI_DIS_S HW control available */ +#define X86_FEATURE_CLEAR_BHB_HW (21*32+ 3) /* "" BHI_DIS_S HW control enabled */ +#define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* "" Clear branch history at vmexit using SW loop */ /* * BUG word(s) @@ -515,4 +519,5 @@ #define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */ #define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */ #define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */ +#define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/tools/arch/x86/include/asm/irq_vectors.h b/tools/arch/x86/include/asm/irq_vectors.h deleted file mode 100644 index d18bfb238f66..000000000000 --- a/tools/arch/x86/include/asm/irq_vectors.h +++ /dev/null @@ -1,140 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef _ASM_X86_IRQ_VECTORS_H -#define _ASM_X86_IRQ_VECTORS_H - -#include <linux/threads.h> -/* - * Linux IRQ vector layout. - * - * There are 256 IDT entries (per CPU - each entry is 8 bytes) which can - * be defined by Linux. They are used as a jump table by the CPU when a - * given vector is triggered - by a CPU-external, CPU-internal or - * software-triggered event. - * - * Linux sets the kernel code address each entry jumps to early during - * bootup, and never changes them. This is the general layout of the - * IDT entries: - * - * Vectors 0 ... 31 : system traps and exceptions - hardcoded events - * Vectors 32 ... 127 : device interrupts - * Vector 128 : legacy int80 syscall interface - * Vectors 129 ... LOCAL_TIMER_VECTOR-1 - * Vectors LOCAL_TIMER_VECTOR ... 255 : special interrupts - * - * 64-bit x86 has per CPU IDT tables, 32-bit has one shared IDT table. - * - * This file enumerates the exact layout of them: - */ - -/* This is used as an interrupt vector when programming the APIC. */ -#define NMI_VECTOR 0x02 - -/* - * IDT vectors usable for external interrupt sources start at 0x20. - * (0x80 is the syscall vector, 0x30-0x3f are for ISA) - */ -#define FIRST_EXTERNAL_VECTOR 0x20 - -#define IA32_SYSCALL_VECTOR 0x80 - -/* - * Vectors 0x30-0x3f are used for ISA interrupts. - * round up to the next 16-vector boundary - */ -#define ISA_IRQ_VECTOR(irq) (((FIRST_EXTERNAL_VECTOR + 16) & ~15) + irq) - -/* - * Special IRQ vectors used by the SMP architecture, 0xf0-0xff - * - * some of the following vectors are 'rare', they are merged - * into a single vector (CALL_FUNCTION_VECTOR) to save vector space. - * TLB, reschedule and local APIC vectors are performance-critical. - */ - -#define SPURIOUS_APIC_VECTOR 0xff -/* - * Sanity check - */ -#if ((SPURIOUS_APIC_VECTOR & 0x0F) != 0x0F) -# error SPURIOUS_APIC_VECTOR definition error -#endif - -#define ERROR_APIC_VECTOR 0xfe -#define RESCHEDULE_VECTOR 0xfd -#define CALL_FUNCTION_VECTOR 0xfc -#define CALL_FUNCTION_SINGLE_VECTOR 0xfb -#define THERMAL_APIC_VECTOR 0xfa -#define THRESHOLD_APIC_VECTOR 0xf9 -#define REBOOT_VECTOR 0xf8 - -/* - * Generic system vector for platform specific use - */ -#define X86_PLATFORM_IPI_VECTOR 0xf7 - -/* - * IRQ work vector: - */ -#define IRQ_WORK_VECTOR 0xf6 - -/* 0xf5 - unused, was UV_BAU_MESSAGE */ -#define DEFERRED_ERROR_VECTOR 0xf4 - -/* Vector on which hypervisor callbacks will be delivered */ -#define HYPERVISOR_CALLBACK_VECTOR 0xf3 - -/* Vector for KVM to deliver posted interrupt IPI */ -#define POSTED_INTR_VECTOR 0xf2 -#define POSTED_INTR_WAKEUP_VECTOR 0xf1 -#define POSTED_INTR_NESTED_VECTOR 0xf0 - -#define MANAGED_IRQ_SHUTDOWN_VECTOR 0xef - -#if IS_ENABLED(CONFIG_HYPERV) -#define HYPERV_REENLIGHTENMENT_VECTOR 0xee -#define HYPERV_STIMER0_VECTOR 0xed -#endif - -#define LOCAL_TIMER_VECTOR 0xec - -#define NR_VECTORS 256 - -#ifdef CONFIG_X86_LOCAL_APIC -#define FIRST_SYSTEM_VECTOR LOCAL_TIMER_VECTOR -#else -#define FIRST_SYSTEM_VECTOR NR_VECTORS -#endif - -#define NR_EXTERNAL_VECTORS (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR) -#define NR_SYSTEM_VECTORS (NR_VECTORS - FIRST_SYSTEM_VECTOR) - -/* - * Size the maximum number of interrupts. - * - * If the irq_desc[] array has a sparse layout, we can size things - * generously - it scales up linearly with the maximum number of CPUs, - * and the maximum number of IO-APICs, whichever is higher. - * - * In other cases we size more conservatively, to not create too large - * static arrays. - */ - -#define NR_IRQS_LEGACY 16 - -#define CPU_VECTOR_LIMIT (64 * NR_CPUS) -#define IO_APIC_VECTOR_LIMIT (32 * MAX_IO_APICS) - -#if defined(CONFIG_X86_IO_APIC) && defined(CONFIG_PCI_MSI) -#define NR_IRQS \ - (CPU_VECTOR_LIMIT > IO_APIC_VECTOR_LIMIT ? \ - (NR_VECTORS + CPU_VECTOR_LIMIT) : \ - (NR_VECTORS + IO_APIC_VECTOR_LIMIT)) -#elif defined(CONFIG_X86_IO_APIC) -#define NR_IRQS (NR_VECTORS + IO_APIC_VECTOR_LIMIT) -#elif defined(CONFIG_PCI_MSI) -#define NR_IRQS (NR_VECTORS + CPU_VECTOR_LIMIT) -#else -#define NR_IRQS NR_IRQS_LEGACY -#endif - -#endif /* _ASM_X86_IRQ_VECTORS_H */ diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 05956bd8bacf..e72c2b872957 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -61,10 +61,13 @@ #define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */ #define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */ #define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT) +#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable Branch History Injection behavior */ +#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT) /* A mask for bits which the kernel toggles when controlling mitigations */ #define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \ - | SPEC_CTRL_RRSBA_DIS_S) + | SPEC_CTRL_RRSBA_DIS_S \ + | SPEC_CTRL_BHI_DIS_S) #define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ #define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */ @@ -163,6 +166,10 @@ * are restricted to targets in * kernel. */ +#define ARCH_CAP_BHI_NO BIT(20) /* + * CPU is not affected by Branch + * History Injection. + */ #define ARCH_CAP_PBRSB_NO BIT(24) /* * Not susceptible to Post-Barrier * Return Stack Buffer Predictions. diff --git a/tools/arch/x86/include/uapi/asm/prctl.h b/tools/arch/x86/include/uapi/asm/prctl.h deleted file mode 100644 index 384e2cc6ac19..000000000000 --- a/tools/arch/x86/include/uapi/asm/prctl.h +++ /dev/null @@ -1,43 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -#ifndef _ASM_X86_PRCTL_H -#define _ASM_X86_PRCTL_H - -#define ARCH_SET_GS 0x1001 -#define ARCH_SET_FS 0x1002 -#define ARCH_GET_FS 0x1003 -#define ARCH_GET_GS 0x1004 - -#define ARCH_GET_CPUID 0x1011 -#define ARCH_SET_CPUID 0x1012 - -#define ARCH_GET_XCOMP_SUPP 0x1021 -#define ARCH_GET_XCOMP_PERM 0x1022 -#define ARCH_REQ_XCOMP_PERM 0x1023 -#define ARCH_GET_XCOMP_GUEST_PERM 0x1024 -#define ARCH_REQ_XCOMP_GUEST_PERM 0x1025 - -#define ARCH_XCOMP_TILECFG 17 -#define ARCH_XCOMP_TILEDATA 18 - -#define ARCH_MAP_VDSO_X32 0x2001 -#define ARCH_MAP_VDSO_32 0x2002 -#define ARCH_MAP_VDSO_64 0x2003 - -/* Don't use 0x3001-0x3004 because of old glibcs */ - -#define ARCH_GET_UNTAG_MASK 0x4001 -#define ARCH_ENABLE_TAGGED_ADDR 0x4002 -#define ARCH_GET_MAX_TAG_BITS 0x4003 -#define ARCH_FORCE_TAGGED_SVA 0x4004 - -#define ARCH_SHSTK_ENABLE 0x5001 -#define ARCH_SHSTK_DISABLE 0x5002 -#define ARCH_SHSTK_LOCK 0x5003 -#define ARCH_SHSTK_UNLOCK 0x5004 -#define ARCH_SHSTK_STATUS 0x5005 - -/* ARCH_SHSTK_ features bits */ -#define ARCH_SHSTK_SHSTK (1ULL << 0) -#define ARCH_SHSTK_WRSS (1ULL << 1) - -#endif /* _ASM_X86_PRCTL_H */ |