summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2017-07-03Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Add CONFIG_REFCOUNT_FULL=y to allow the disabling of the 'full' (robustness checked) refcount_t implementation with slightly lower runtime overhead. (Kees Cook) The lighter weight variant is the default. The two variants use the same API. Having this variant was a precondition by some maintainers to merge refcount_t cleanups. - Add lockdep support for rtmutexes (Peter Zijlstra) - liblockdep fixes and improvements (Sasha Levin, Ben Hutchings) - ... misc fixes and improvements" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) locking/refcount: Remove the half-implemented refcount_sub() API locking/refcount: Create unchecked atomic_t implementation locking/rtmutex: Don't initialize lockdep when not required locking/selftest: Add RT-mutex support locking/selftest: Remove the bad unlock ordering test rt_mutex: Add lockdep annotations MAINTAINERS: Claim atomic*_t maintainership locking/x86: Remove the unused atomic_inc_short() methd tools/lib/lockdep: Remove private kernel headers tools/lib/lockdep: Hide liblockdep output from test results tools/lib/lockdep: Add dummy current_gfp_context() tools/include: Add IS_ERR_OR_NULL to err.h tools/lib/lockdep: Add empty __is_[module,kernel]_percpu_address tools/lib/lockdep: Include err.h tools/include: Add (mostly) empty include/linux/sched/mm.h tools/lib/lockdep: Use LDFLAGS tools/lib/lockdep: Remove double-quotes from soname tools/lib/lockdep: Fix object file paths used in an out-of-tree build tools/lib/lockdep: Fix compilation for 4.11 tools/lib/lockdep: Don't mix fd-based and stream IO ...
2017-07-03Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The sole purpose of these changes is to shrink and simplify the RCU code base, which has suffered from creeping bloat over the past couple of years. The end result is a net removal of ~2700 lines of code: 79 files changed, 1496 insertions(+), 4211 deletions(-) Plus there's a marked reduction in the Kconfig space complexity as well, here's the number of matches on 'grep RCU' in the .config: before after x86-defconfig 17 15 x86-allmodconfig 33 20" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (86 commits) rcu: Remove RCU CPU stall warnings from Tiny RCU rcu: Remove event tracing from Tiny RCU rcu: Move RCU debug Kconfig options to kernel/rcu rcu: Move RCU non-debug Kconfig options to kernel/rcu rcu: Eliminate NOCBs CPU-state Kconfig options rcu: Remove debugfs tracing srcu: Remove Classic SRCU srcu: Fix rcutorture-statistics typo rcu: Remove SPARSE_RCU_POINTER Kconfig option rcu: Remove the now-obsolete PROVE_RCU_REPEATEDLY Kconfig option rcu: Remove typecheck() from RCU locking wrapper functions rcu: Remove #ifdef moving rcu_end_inkernel_boot from rcupdate.h rcu: Remove nohz_full full-system-idle state machine rcu: Remove the RCU_KTHREAD_PRIO Kconfig option rcu: Remove *_SLOW_* Kconfig options srcu: Use rnp->lock wrappers to replace explicit memory barriers rcu: Move rnp->lock wrappers for SRCU use rcu: Convert rnp->lock wrappers to macros for SRCU use rcu: Refactor #includes from include/linux/rcupdate.h bcm47xx: Fix build regression ...
2017-07-03Merge branch 'core-objtool-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: "This is an extensive rewrite of the objdump tool to track all stack pointer modifications through the machine instructions of disassembled functions found in kernel .o files. This re-design removes the prior dependency on CONFIG_FRAME_POINTERS, with the goal to prepare the tool to generate kernel debuginfo data in the future. There's also an increase in checking/tracking robustness as a side effect as well. No (intended) changes to existing functionality" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Silence warnings for functions which use IRET objtool: Implement stack validation 2.0 objtool, x86: Add several functions and files to the objtool whitelist objtool: Move checking code to check.c
2017-07-03Merge tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuidLinus Torvalds
Pull uuid subsystem from Christoph Hellwig: "This is the new uuid subsystem, in which Amir, Andy and I have started consolidating our uuid/guid helpers and improving the types used for them. Note that various other subsystems have pulled in this tree, so I'd like it to go in early. UUID/GUID summary: - introduce the new uuid_t/guid_t types that are going to replace the somewhat confusing uuid_be/uuid_le types and make the terminology fit the various specs, as well as the userspace libuuid library. (me, based on a previous version from Amir) - consolidated generic uuid/guid helper functions lifted from XFS and libnvdimm (Amir and me) - conversions to the new types and helpers (Amir, Andy and me)" * tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid: (34 commits) ACPI: hns_dsaf_acpi_dsm_guid can be static mmc: sdhci-pci: make guid intel_dsm_guid static uuid: Take const on input of uuid_is_null() and guid_is_null() thermal: int340x_thermal: fix compile after the UUID API switch thermal: int340x_thermal: Switch to use new generic UUID API acpi: always include uuid.h ACPI: Switch to use generic guid_t in acpi_evaluate_dsm() ACPI / extlog: Switch to use new generic UUID API ACPI / bus: Switch to use new generic UUID API ACPI / APEI: Switch to use new generic UUID API acpi, nfit: Switch to use new generic UUID API MAINTAINERS: add uuid entry tmpfs: generate random sb->s_uuid scsi_debug: switch to uuid_t nvme: switch to uuid_t sysctl: switch to use uuid_t partitions/ldm: switch to use uuid_t overlayfs: use uuid_t instead of uuid_be fs: switch ->s_uuid to uuid_t ima/policy: switch to use uuid_t ...
2017-07-01Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "The last fix for perf for this cycles: - Prevent a segfault when kernel.kptr_restrict=2 is set by avoiding a null pointer dereference" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf machine: Fix segfault for kernel.kptr_restrict=2
2017-06-30objtool: Silence warnings for functions which use IRETJosh Poimboeuf
Previously, objtool ignored functions which have the IRET instruction in them. That's because it assumed that such functions know what they're doing with respect to frame pointers. With the new "objtool 2.0" changes, it stopped ignoring such functions, and started complaining about them: arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0x1b: unsupported instruction in callable function arch/x86/kernel/alternative.o: warning: objtool: text_poke()+0x1a8: unsupported instruction in callable function arch/x86/kernel/ftrace.o: warning: objtool: do_sync_core()+0x16: unsupported instruction in callable function arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: machine_check_poll()+0x166: unsupported instruction in callable function arch/x86/kernel/cpu/mcheck/mce.o: warning: objtool: do_machine_check()+0x147: unsupported instruction in callable function Silence those warnings for now. They can be re-enabled later, once we have unwind hints which will allow the code to annotate the IRET usages. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0") Link: http://lkml.kernel.org/r/20170630140934.mmwtpockvpupahro@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-30objtool: Implement stack validation 2.0Josh Poimboeuf
This is a major rewrite of objtool. Instead of only tracking frame pointer changes, it now tracks all stack-related operations, including all register saves/restores. In addition to making stack validation more robust, this also paves the way for undwarf generation. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/678bd94c0566c6129bcc376cddb259c4c5633004.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-30objtool: Move checking code to check.cJosh Poimboeuf
In preparation for the new 'objtool undwarf generate' command, which will rely on 'objtool check', move the checking code from builtin-check.c to check.c where it can be used by other commands. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Jiri Slaby <jslaby@suse.cz> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/294c5c695fd73c1a5000bbe5960a7c9bec4ee6b4.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Need to access netdev->num_rx_queues behind an accessor in netvsc driver otherwise the build breaks with some configs, from Arnd Bergmann. 2) Add dummy xfrm_dev_event() so that build doesn't fail when CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu. 3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan Carpenter. 4) Fix MCDI command size for filter operations in sfc driver, from Martin Habets. 5) Fix UFO segmenting so that we don't calculate incorrect checksums, from Michal Kubecek. 6) When ipv6 datagram connects fail, reset destination address and port. From Wei Wang. 7) TCP disconnect must reset the cached receive DST, from WANG Cong. 8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric Dumazet. 9) fman driver has to depend on HAS_DMA, from Madalin Bucur. 10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann. 11) Fix negative page counts with GFO, from Michal Kubecek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) sfc: fix attempt to translate invalid filter ID net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() bpf: prevent leaking pointer via xadd on unpriviledged arcnet: com20020-pci: add missing pdev setup in netdev structure arcnet: com20020-pci: fix dev_id calculation arcnet: com20020: remove needless base_addr assignment Trivial fix to spelling mistake in arc_printk message arcnet: change irq handler to lock irqsave rocker: move dereference before free mlxsw: spectrum_router: Fix NULL pointer dereference net: sched: Fix one possible panic when no destroy callback virtio-net: serialize tx routine during reset net: usb: asix88179_178a: Add support for the Belkin B2B128 fsl/fman: add dependency on HAS_DMA net: prevent sign extension in dev_get_stats() tcp: reset sk_rx_dst in tcp_disconnect() net: ipv6: reset daddr and dport in sk if connect() fails bnx2x: Don't log mc removal needlessly bnxt_en: Fix netpoll handling. bnxt_en: Add missing logic to handle TPA end error conditions. ...
2017-06-29bpf: prevent leaking pointer via xadd on unpriviledgedDaniel Borkmann
Leaking kernel addresses on unpriviledged is generally disallowed, for example, verifier rejects the following: 0: (b7) r0 = 0 1: (18) r2 = 0xffff897e82304400 3: (7b) *(u64 *)(r1 +48) = r2 R2 leaks addr into ctx Doing pointer arithmetic on them is also forbidden, so that they don't turn into unknown value and then get leaked out. However, there's xadd as a special case, where we don't check the src reg for being a pointer register, e.g. the following will pass: 0: (b7) r0 = 0 1: (7b) *(u64 *)(r1 +48) = r0 2: (18) r2 = 0xffff897e82304400 ; map 4: (db) lock *(u64 *)(r1 +48) += r2 5: (95) exit We could store the pointer into skb->cb, loose the type context, and then read it out from there again to leak it eventually out of a map value. Or more easily in a different variant, too: 0: (bf) r6 = r1 1: (7a) *(u64 *)(r10 -8) = 0 2: (bf) r2 = r10 3: (07) r2 += -8 4: (18) r1 = 0x0 6: (85) call bpf_map_lookup_elem#1 7: (15) if r0 == 0x0 goto pc+3 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp 8: (b7) r3 = 0 9: (7b) *(u64 *)(r0 +0) = r3 10: (db) lock *(u64 *)(r0 +0) += r6 11: (b7) r0 = 0 12: (95) exit from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp 11: (b7) r0 = 0 12: (95) exit Prevent this by checking xadd src reg for pointer types. Also add a couple of test cases related to this. Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26perf machine: Fix segfault for kernel.kptr_restrict=2Jiri Olsa
Michael reported the segfault when kernel.kptr_restrict=2 is set. $ perf record ls ... perf: Segmentation fault Obtained 16 stack frames. ./perf(dump_stack+0x2d) [0x5068df] ./perf(sighandler_dump_stack+0x2d) [0x5069bf] ./perf() [0x43e47b] /lib64/libc.so.6(+0x3594f) [0x7f762004794f] /lib64/libc.so.6(strlen+0x26) [0x7f762009ef86] /lib64/libc.so.6(__strdup+0xd) [0x7f762009ecbd] ./perf(maps__set_kallsyms_ref_reloc_sym+0x4d) [0x51590f] ./perf(machine__create_kernel_maps+0x136) [0x50a7de] ./perf(perf_session__create_kernel_maps+0x2c) [0x510a81] ./perf(perf_session__new+0x13d) [0x510e23] ./perf() [0x43fd61] ./perf(cmd_record+0x704) [0x441823] ./perf() [0x4bc1a0] ./perf() [0x4bc40d] ./perf() [0x4bc55f] ./perf(main+0x2d5) [0x4bc939] Segmentation fault (core dumped) The reason is that with kernel.kptr_restrict=2, we don't get the symbol from machine__get_running_kernel_start, which we want to use in maps__set_kallsyms_ref_reloc_sym and we crash. Check the symbol name value before calling maps__set_kallsyms_ref_reloc_sym() and succeed without ref_reloc_sym being set. It's safe because we check its existence before we use it. Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170626095153.553-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-22perf probe: Fix probe definition for inlined functionsBjörn Töpel
In commit 613f050d68a8 ("perf probe: Fix to probe on gcc generated functions in modules"), the offset from symbol is, incorrectly, added to the trace point address. This leads to incorrect probe trace points for inlined functions and when using relative line number on symbols. Prior this patch: $ perf probe -m nf_nat -D in_range p:probe/in_range nf_nat:in_range.isra.9+0 $ perf probe -m i40e -D i40e_clean_rx_irq p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212 $ perf probe -m i40e -D i40e_clean_rx_irq:16 p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626 After: $ perf probe -m nf_nat -D in_range p:probe/in_range nf_nat:in_range.isra.9+0 $ perf probe -m i40e -D i40e_clean_rx_irq p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106 $ perf probe -m i40e -D i40e_clean_rx_irq:16 p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665 Committer testing: Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are the functions that while not being explicitely marked as inline, were inlined by the compiler: # pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head __ew32 e1000_regdump e1000e_dump_ps_pages e1000_desc_unused e1000e_systim_to_hwtstamp e1000e_rx_hwtstamp e1000e_update_rdt_wa e1000e_update_tdt_wa e1000_put_txbuf e1000_consume_page Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of them: # perf probe -m e1000e -D e1000e_rx_hwtstamp p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74 # perf probe -m e1000e -D e1000_consume_page p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876 p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506 p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074 Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for that function: $ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko <SNIP> <1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram) <13e27c> DW_AT_name : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq <13e287> DW_AT_low_pc : 0x17a30 <3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13e6f0> DW_AT_abstract_origin: <0x13ed2c> <13e6f4> DW_AT_low_pc : 0x17be6 <SNIP> <1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram) <13ed2e> DW_AT_name : (indirect string, offset: 0xa54c3): e1000_consume_page So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s address, gives us the offset we should use in the probe definition: 0x17be6 - 0x17a30 = 438 but above we have 876, which is twice as much. Lets see the second inline expansion of e1000_consume_page() in e1000_clean_jumbo_rx_irq(): <3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13e86f> DW_AT_abstract_origin: <0x13ed2c> <13e873> DW_AT_low_pc : 0x17d21 0x17d21 - 0x17a30 = 753 So we where adding it at twice the offset from the containing function as we should. And then after this patch: # perf probe -m e1000e -D e1000e_rx_hwtstamp p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37 # perf probe -m e1000e -D e1000_consume_page p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438 p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753 p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353 # Which matches the two first expansions and shows that because we were doubling the offset it would spill over the next function: readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 673: 0000000000017a30 1626 FUNC LOCAL DEFAULT 2 e1000_clean_jumbo_rx_irq 674: 0000000000018090 2013 FUNC LOCAL DEFAULT 2 e1000_clean_rx_irq_ps This is the 3rd inline expansion of e1000_consume_page() in e1000_clean_jumbo_rx_irq(): <3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13ec78> DW_AT_abstract_origin: <0x13ed2c> <13ec7c> DW_AT_low_pc : 0x17f79 0x17f79 - 0x17a30 = 1353 So: 0x17a30 + 2 * 1353 = 0x184c2 And: 0x184c2 - 0x18090 = 1074 Which explains the bogus third expansion for e1000_consume_page() to end up at: p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074 All fixed now :-) [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Fixes: 613f050d68a8 ("perf probe: Fix to probe on gcc generated functions in modules") Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-22Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-19NTB: ntb_test: fix bug printing ntb_perf resultsLogan Gunthorpe
The code mistakenly prints the local perf results for the remote test so the script reports identical results for both directions. Fix this by ensuring we print the remote result. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Fixes: a9c59ef77458 ("ntb_test: Add a selftest script for the NTB subsystem") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Three fixes for the perf user space side: - Fix the probing of precise_ip level, which got broken recently for x86. - Unbreak the ARCH=x86_64 build - Report module before trying to unwind into the module code, which avoids broken stack frames displayed" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf unwind: Report module before querying isactivation in dwfl unwind perf tools: Fix build with ARCH=x86_64 perf evsel: Fix probing of precise_ip level for default cycles event
2017-06-18Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Thomas Gleixner: "A single fix which adds fortify_panic to the list of no return functions" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Add fortify_panic as __noreturn function
2017-06-16perf unwind: Report module before querying isactivation in dwfl unwindMilian Wolff
The PC returned by dwfl_frame_pc() may map into a not-yet-reported module. We have to report it before we continue unwinding. But when we query for the isactivation flag in dwfl_frame_pc, libdw will actually do one more unwinding step internally which can then break and lead to missed frames or broken stacks. With libunwind we get e.g.: ~~~~~ heaptrack_gui 2228 135073.400474: 613969 cycles: 108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0) 109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0) 147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0) 109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0) 10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0) 211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0) 92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0) 2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0) 297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0) f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0) 1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0) 78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui) 20439 __libc_start_main (/usr/lib/libc-2.25.so) 78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui) heaptrack_gui 2228 135073.401156: 569521 cycles: 131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0) 1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0) 21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0) 2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0) 279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0) e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0) f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0) f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0) 298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0) f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0) 1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0) 78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui) 20439 __libc_start_main (/usr/lib/libc-2.25.so) 78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui) ~~~~~ Note the two frames 1589e8 and 78622 in the first sample. These are missing when unwinding with libdw. The second sample's breakage is more obvious: ~~~~~ heaptrack_gui 2228 135073.400474: 613969 cycles: 108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0) 109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0) 147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0) 109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0) 10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0) 211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0) 92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0) 2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0) 297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0) f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0) 20439 __libc_start_main (/usr/lib/libc-2.25.so) 78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui) heaptrack_gui 2228 135073.401156: 569521 cycles: 131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0) 1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0) 21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0) 2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0) 279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0) e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0) 723dbf [unknown] ([unknown]) ~~~~~ This patch fixes this issue and the libdw unwinder mimicks the libunwind behavior more closely. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-16objtool: Add fortify_panic as __noreturn functionKees Cook
CONFIG_FORTIFY_SOURCE=y implements fortify_panic() as a __noreturn function, so objtool needs to know about it too. Suggested-by: Daniel Micay <danielmicay@gmail.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1497532835-32704-1-git-send-email-jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) The netlink attribute passed in to dev_set_alias() is not necessarily NULL terminated, don't use strlcpy() on it. From Alexander Potapenko. 2) Fix implementation of atomics in arm64 bpf JIT, from Daniel Borkmann. 3) Correct the release of netdevs and driver private data in certain circumstances. 4) Sanitize netlink message length properly in decnet, from Mateusz Jurczyk. 5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From Yuval Mintz. 6) Hash secret is never initialized in ipv6 ILA translation code, from Arnd Bergmann. I guess those clang warnings about unused inline functions are useful for something! 7) Fix endian selection in bpf_endian.h, from Daniel Borkmann. 8) Sanitize sockaddr length before dereferncing any fields in AF_UNIX and CAIF. From Mateusz Jurczyk. 9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario Molitor. 10) Do not leak netdev on dev_alloc_name() errors in mac80211, from Johannes Berg. 11) Fix locking in sctp_for_each_endpoint(), from Xin Long. 12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle. 13) Fix use after free in ip_mc_clear_src(), from WANG Cong. 14) Fix regressions caused by ICMP rate limiting changes in 4.11, from Jesper Dangaard Brouer. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits) i40e: Fix a sleep-in-atomic bug net: don't global ICMP rate limit packets originating from loopback net/act_pedit: fix an error code net: update undefined ->ndo_change_mtu() comment net_sched: move tcf_lock down after gen_replace_estimator() caif: Add sockaddr length check before accessing sa_family in connect handler qed: fix dump of context data qmi_wwan: new Telewell and Sierra device IDs net: phy: Fix MDIO_THUNDER dependencies netconsole: Remove duplicate "netconsole: " logging prefix igmp: acquire pmc lock for ip_mc_clear_src() r8152: give the device version net: rps: fix uninitialized symbol warning mac80211: don't send SMPS action frame in AP mode when not needed mac80211/wpa: use constant time memory comparison for MACs mac80211: set bss_info data before configuring the channel mac80211: remove 5/10 MHz rate code from station MLME mac80211: Fix incorrect condition when checking rx timestamp mac80211: don't look at the PM bit of BAR frames i40e: fix handling of HW ATR eviction ...
2017-06-14perf tools: Fix build with ARCH=x86_64Jiada Wang
With commit: 0a943cb10ce78 (tools build: Add HOSTARCH Makefile variable) when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of ARCH=x86, so the perf build process searchs header files from tools/arch/x86_64/include, which doesn't exist. The following build failure is seen: In file included from util/event.c:2:0: tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory compilation terminated. Fix this issue by using SRCARCH instead of ARCH in perf, just like the main kernel Makefile and tools/objtool's. Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Eugeniu Rosca <erosca@de.adit-jv.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Rui Teng <rui.teng@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 0a943cb10ce7 ("tools build: Add HOSTARCH Makefile variable") Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-14perf evsel: Fix probing of precise_ip level for default cycles eventArnaldo Carvalho de Melo
Since commit 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done in the routine used to probe the max precise level when no events were passed to 'perf record' or 'perf top', i.e.: perf_evsel__new_cycles() perf_event_attr__set_max_precise_ip() The x86 code, in x86_pmu_hw_config(), which is called all the way from sys_perf_event_open() did, starting with the aforementioned commit: /* There's no sense in having PEBS for non sampling events: */ if (!is_sampling_event(event)) return -EINVAL; Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using just the non precise cycles variant. To make sure that this is the case, I tested it, before this patch, with: # perf probe -L x86_pmu_hw_config <x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0> 0 int x86_pmu_hw_config(struct perf_event *event) 1 { 2 if (event->attr.precise_ip) { <SNIP> 17 if (event->attr.precise_ip > precise) 18 return -EOPNOTSUPP; /* There's no sense in having PEBS for non sampling events: */ 21 if (!is_sampling_event(event)) 22 return -EINVAL; } <SNIP> # perf probe x86_pmu_hw_config:22 Added new events: probe:x86_pmu_hw_config (on x86_pmu_hw_config:22) probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22) You can now use it in all perf tools, such as: perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1 # perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1 0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4 41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5 41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6 41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ] # I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times. So fix it by just setting attr.sample_period Now, after this patch: # perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1 [ perf record: Woken up 1 times to write data ] 0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_event_open_cloexec_flag (/home/acme/bin/perf) 0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_evlist__config (/home/acme/bin/perf) 0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_evlist__config (/home/acme/bin/perf) 0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4 syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) 0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) 0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) 0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) 0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # The probe one called from perf_event_attr__set_max_precise_ip() works the first time, with attr.precise_ip = 3, wit hthe next ones being the per cpu ones for the cycles:ppp event. And here is the text from a report and alternative proposed patch by Thomas-Mich Richter: --- On s390 the counter and sampling facility do not support a precise IP skid level and sometimes returns EOPNOTSUPP when structure member precise_ip in struct perf_event_attr is not set to zero. On s390 commnd 'perf record -- true' fails with error EOPNOTSUPP. This happens only when no events are specified on command line. The functions called are ... --> perf_evlist__add_default --> perf_evsel__new_cycles --> perf_event_attr__set_max_precise_ip The last function determines the value of structure member precise_ip by invoking the perf_event_open() system call and checking the return code. The first successful open is the value for precise_ip. However the value is determined without setting member sample_period and indicates no sampling. On s390 the counter facility and sampling facility are different. The above procedure determines a precise_ip value of 3 using the counter facility. Later it uses the sampling facility with a value of 3 and fails with EOPNOTSUPP. --- v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members of unnamed union members in the container struct initialization, so move from: struct perf_event_attr attr = { ... .sample_period = 1, }; to right after it as: struct perf_event_attr attr = { ... }; attr.sample_period = 1; v3: We need to reset .sample_period to 0 to let the users of perf_evsel__new_cycles() to properly setup attr.sample_period or attr.sample_freq. Reported by Ingo Molnar. Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip") Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-13Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: "The largest feature of this series is shrinking and simplification, with the following diffstat summary: 79 files changed, 1496 insertions(+), 4211 deletions(-) In other words, this series represents a net reduction of more than 2700 lines of code." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-08rcu: Remove RCU CPU stall warnings from Tiny RCUPaul E. McKenney
Tiny RCU's job is to be tiny, so this commit removes its RCU CPU stall warning code. After this, there is no longer any need for rcu_sched_ctrlblk and rcu_bh_ctrlblk to be in tiny_plugin.h, so this commit also moves them to tiny.c. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcu: Eliminate NOCBs CPU-state Kconfig optionsPaul E. McKenney
The CONFIG_RCU_NOCB_CPU_ALL, CONFIG_RCU_NOCB_CPU_NONE, and CONFIG_RCU_NOCB_CPU_ZERO Kconfig options are used only in testing and are redundant with the rcu_nocbs= boot parameter. This commit therefore removes these three Kconfig options and adjusts the rcutorture scripts to use the boot parameter instead. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcu: Remove debugfs tracingPaul E. McKenney
RCU's debugfs tracing used to be the only reasonable low-level debug information available, but ftrace and event tracing has since surpassed the RCU debugfs level of usefulness. This commit therefore removes RCU's debugfs tracing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08srcu: Remove Classic SRCUPaul E. McKenney
Classic SRCU was only ever intended to be a fallback in case of issues with Tree/Tiny SRCU, and the latter two are doing quite well in testing. This commit therefore removes Classic SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcu: Remove SPARSE_RCU_POINTER Kconfig optionPaul E. McKenney
The sparse-based checking for non-RCU accesses to RCU-protected pointers has been around for a very long time, and it is now the only type of sparse-based checking that is optional. This commit therefore makes it unconditional. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Fengguang Wu <fengguang.wu@intel.com>
2017-06-08rcu: Remove the now-obsolete PROVE_RCU_REPEATEDLY Kconfig optionPaul E. McKenney
The PROVE_RCU_REPEATEDLY Kconfig option was initially added due to the volume of messages from PROVE_RCU: Doing just one per boot would have required excessive numbers of boots to locate them all. However, PROVE_RCU messages are now relatively rare, so there is no longer any reason to need more than one such message per boot. This commit therefore removes the PROVE_RCU_REPEATEDLY Kconfig option. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org>
2017-06-08rcu: Remove nohz_full full-system-idle state machinePaul E. McKenney
The NO_HZ_FULL_SYSIDLE full-system-idle capability was added in 2013 by commit 0edd1b1784cb ("nohz_full: Add full-system-idle state machine"), but has not been used. This commit therefore removes it. If it turns out to be needed later, this commit can always be reverted. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-08rcu: Remove the RCU_KTHREAD_PRIO Kconfig optionPaul E. McKenney
Anything that can be done with the RCU_KTHREAD_PRIO Kconfig option can also be done with the rcutree.kthread_prio kernel boot parameter. This commit therefore removes this Kconfig option. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rik van Riel <riel@redhat.com>
2017-06-08rcu: Remove *_SLOW_* Kconfig optionsPaul E. McKenney
The RCU_TORTURE_TEST_SLOW_PREINIT, RCU_TORTURE_TEST_SLOW_PREINIT_DELAY, RCU_TORTURE_TEST_SLOW_PREINIT_DELAY, RCU_TORTURE_TEST_SLOW_INIT, RCU_TORTURE_TEST_SLOW_INIT_DELAY, RCU_TORTURE_TEST_SLOW_CLEANUP, and RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY Kconfig options are only useful for torture testing, and there are the rcutree.gp_cleanup_delay, rcutree.gp_init_delay, and rcutree.gp_preinit_delay kernel boot parameters that rcutorture can use instead. The effect of these parameters is to artificially slow down grace period initialization and cleanup in order to make some types of race conditions happen more often. This commit therefore simplifies Tree RCU a bit by removing the Kconfig options and adding the corresponding kernel parameters to rcutorture's .boot files instead. However, this commit also leaves out the kernel parameters for TREE02, TREE04, and TREE07 in order to have about the same number of tests slowed as not slowed. TREE01, TREE03, TREE05, and TREE06 are slowed, and the rest are not slowed. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08bpf, tests: fix endianness selectionDaniel Borkmann
I noticed that test_l4lb was failing in selftests: # ./test_progs test_pkt_access:PASS:ipv4 77 nsec test_pkt_access:PASS:ipv6 44 nsec test_xdp:PASS:ipv4 2933 nsec test_xdp:PASS:ipv6 1500 nsec test_l4lb:PASS:ipv4 377 nsec test_l4lb:PASS:ipv6 544 nsec test_l4lb:FAIL:stats 6297600000 200000 test_tcp_estats:PASS: 0 nsec Summary: 7 PASSED, 1 FAILED Tracking down the issue actually revealed that endianness selection in bpf_endian.h is broken when compiled with clang with bpf target. test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as __BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len) and bpf_ntohs(iph->tot_len), and compares them against a defined value and given a wrong endianness, the test outcome is different, of course. Turns out that there are actually two bugs: i) when we do __BYTE_ORDER comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the include order we see different outcomes. Reason is that __BYTE_ORDER is undefined due to missing endian.h include. Before we include the asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals __LITTLE_ENDIAN since both are undefined, after the include which correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN is defined, but given __BYTE_ORDER is still undefined, we match on __BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also undefined at that point, sigh. ii) But even that would be wrong, since when compiling the test cases with clang, one can select between bpfeb and bpfel targets for cross compilation. Hence, we can also not rely on what the system's endian.h provides, but we need to look at the compiler's defined endianness. The compiler defines __BYTE_ORDER__, and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, which also reflects targets bpf (native), bpfel, bpfeb correctly, thus really only rely on that. After patch: # ./test_progs test_pkt_access:PASS:ipv4 74 nsec test_pkt_access:PASS:ipv6 42 nsec test_xdp:PASS:ipv4 2340 nsec test_xdp:PASS:ipv6 1461 nsec test_l4lb:PASS:ipv4 400 nsec test_l4lb:PASS:ipv6 530 nsec test_tcp_estats:PASS: 0 nsec Summary: 7 PASSED, 0 FAILED Fixes: 43bcf707ccdc ("bpf: fix _htons occurences in test_progs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08perf symbols: Kill dso__build_id_is_kmod()Namhyung Kim
The commit e7ee40475760 ("perf symbols: Fix symbols searching for module in buildid-cache") added the function to check kernel modules reside in the build-id cache. This was because there's no way to identify a DSO which is actually a kernel module. So it searched linkname of the file and find ".ko" suffix. But this does not work for compressed kernel modules and now such DSOs hCcave correct symtab_type now. So no need to check it anymore. This patch essentially reverts the commit. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf symbols: Keep DSO->symtab_type after decompressNamhyung Kim
The symsrc__init() overwrites dso->symtab_type as symsrc->type in dso__load_sym(). But for compressed kernel modules in the build-id cache, it should have original symtab type to be decompressed as needed. This fixes perf annotate to show disassembly of the function properly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tests: Decompress kernel module before objdumpNamhyung Kim
If a kernel modules is compressed, it should be decompressed before running objdump to parse binary data correctly. This fixes a failure of object code reading test for me. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Consolidate error path in __open_dso()Namhyung Kim
On failure, it should free the 'name', so clean up the error path using goto. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Decompress kernel module when reading DSO dataNamhyung Kim
Currently perf decompresses kernel modules when loading the symbol table but it missed to do it when reading raw data. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf annotate: Use dso__decompress_kmodule_path()Namhyung Kim
Convert open-coded decompress routine to use the function. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Introduce dso__decompress_kmodule_{fd,path}Namhyung Kim
Move decompress_kmodule() to util/dso.c and split it into two functions returning fd and (decompressed) file path. The existing user only wants the fd version but the path version will be used soon. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Fix a memory leak in __open_dso()Namhyung Kim
The 'name' variable should be freed on the error path. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf annotate: Fix symbolic link of build-id cacheNamhyung Kim
The commit 6ebd2547dd24 ("perf annotate: Fix a bug following symbolic link of a build-id file") changed to use dirname to follow the symlink. But it only considers new-style build-id cache names so old names fail on readlink() and force to use system path which might not available. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Fixes: 6ebd2547dd24 ("perf annotate: Fix a bug following symbolic link of a build-id file") Link: http://lkml.kernel.org/r/20170608073109.30699-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08rcutorture: Add "git diff" output to testid.txt filePaul E. McKenney
Currently, when running from a git archive, the testid.txt file contains only the branch name, the output of "git status", and the SHA-1 of the current HEAD. This is useful, but does not uniquely identify the source code that was built. This commit therefore adds the output of "git diff HEAD", which means that if two testid.txt files compare equal, they correspond to exactly the same source code. Give or take the possibility of SHA-1 collisions, that is. ;-) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08srcu-cbmc: Use /usr/bin/awk instead of /bin/awkPriyalee Kushwaha
Most OS distribution have awk in /usr/bin not in /bin Without this patch, kernel-devsrc fails to build as runtime dependency for srcu-cbmc script /bin/awk is not found. Signed-off-by: Kushwaha, Priyalee <priyalee.kushwaha@intel.com> Acked-by: Lance Roy <ldr709@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcutorture: Reduce CPUs dedicated to testing Classic SRCUPaul E. McKenney
Given that the plan is to retire Classic SRCU in the near future, this commit reduces the number of CPUs dedicated to testing Classic SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcuperf: Add the ability to test tiny RCU flavorsPaul E. McKenney
This commit adds a TINY rcuperf test scenario, which allows performance testing of Tiny RCU and Tiny SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcuperf: Add a Kconfig-fragment file for Classic SRCUPaul E. McKenney
This commit adds a Kconfig-fragment file for Classic SRCU to ease performance comparisons with Tree SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcuperf: Remove conflicting Kconfig optionsPaul E. McKenney
The TREE and TREE54 rcuperf scenarios' Kconfig fragment files specified conflicting values for CONFIG_RCU_TRACE. This commit therefore removes the =n line in favor of the =y line. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcutorture: Update test scenarios based on new Kconfig dependenciesPaul E. McKenney
A number of the rcutorture test scenarios were not using the desired Kconfig options because dependencies were preventing the selections in the Kconfig-fragment files from being honored. This commit therefore updates the Kconfig-fragment files to account for these changes in dependencies. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcutorture: Correctly handle CONFIG_RCU_TORTURE_TEST_* optionsPaul E. McKenney
The rcutorture scripting handles the CONFIG_*_TORTURE_TEST Kconfig options specially, and therefore greps them out of the Kconfig-fragment files. Unfortunately, a poor choice of grep pattern means that the CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP, CONFIG_RCU_TORTURE_TEST_SLOW_INIT, and CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT Kconfig options are also grepped out, preventing rcutorture from using them. This commit therefore fixes the offending grep pattern to focus only on the CONFIG_*_TORTURE_TEST Kconfig options. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08rcutorture: Add a scenario for Classic SRCUPaul E. McKenney
A robust combination of paranoia and cowardice has resulted in retaining Classic SRCU (CONFIG_CLASSIC_SRCU) as a backup for the shiny new Tiny and Tree SRCU implementations. If it is to be a viable backup, it of course needs to be tested. This commit therefore adds an rcutorture scenario named SRCU-C for Classic SRCU. This commit also adds this scenario to the set that are run by default. Once sufficient good experience has accumulated for Tiny and Tree SRCU, this test will be removed, along with the Classic SRCU implementation itself. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>