The get_current() and __my_cpu_offset() accessors evaluate to only a
single instruction emitted inline, but due to the size of the asm string
that is created for SMP+v6 configurations, the compiler assumes
otherwise, and may emit the functions out of line instead.
So use __always_inline to avoid this.
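A minimal sketch of the pattern (asm body reduced to the plain SMP read; the
real accessor's asm string also carries the SMP_ON_UP alternative, which is
what inflates its apparent size):

  #include <linux/compiler.h>	/* __always_inline */

  static __always_inline unsigned long __my_cpu_offset(void)
  {
  	unsigned long off;

  	/* a single mrc at run time, however long the asm string is */
  	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off));
  	return off;
  }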
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
Nathan reports that the group relocations go out of range in
pathological cases such as allyesconfig kernels, which have little
chance of actually booting but are still used in build validation.
So add a Kconfig symbol for this feature, and make it depend on
!COMPILE_TEST.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
Nathan reports that the new get_current() and per-CPU offset accessors
may cause problems at build time due to the use of a literal to hold the
address of the respective variables. This is because LLD before v14
does not support the PC-relative group relocations that are normally
used for this, and the fallback relies on literals but does not emit
the literal pools explicitly using the .ltorg directive.
./arch/arm/include/asm/current.h:53:6: error: out of range pc-relative fixup value
   asm(LOAD_SYM_ARMV6(%0, __current) : "=r"(cur));
       ^
./arch/arm/include/asm/insn.h:25:2: note: expanded from macro 'LOAD_SYM_ARMV6'
 "	ldr	" #reg ", =" #sym "		\n\t"
 ^
<inline asm>:1:3: note: instantiated into assembly here
	ldr	r0, =__current
	^
Since emitting a literal pool in this particular case is not possible,
let's avoid LOAD_SYM_ARMV6() entirely here and use an ordinary C
assignment instead.
As it turns out, there are other such cases, and here, using .ltorg to
emit the literal pool within range of the LDR instruction would be
possible due to the presence of an unconditional branch right after it.
Unfortunately, putting .ltorg directives in subsections appears to
confuse the Clang inline assembler, resulting in similar errors even
though the .ltorg is most definitely within range.
So let's fix this by emitting the literal explicitly rather than
relying on the assembler to figure it out. This means we have to move
the fallback out of the LOAD_SYM_ARMV6() macro and into the callers.
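A sketch of what emitting the literal explicitly looks like (label layout
illustrative, simplified from the actual change):

  	struct task_struct *cur;

  	/*
  	 * Place the address of __current as data right in the
  	 * instruction stream: the first LDR picks up the literal at
  	 * label 1 (trivially within its 4 KiB pc-relative range), the
  	 * second dereferences it, and the branch skips over the data
  	 * word so it is never executed.
  	 */
  	asm("	ldr	%0, 1f			\n\t"
  	    "	ldr	%0, [%0]		\n\t"
  	    "	b	2f			\n\t"
  	    "1:	.long	__current		\n\t"
  	    "2:"
  	    : "=r" (cur));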
Link: https://github.com/ClangBuiltLinux/linux/issues/1551
Fixes: 9c46929e7989 ("ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
Permit the use of the TPIDRPRW system register for carrying the per-CPU
offset in generic SMP configurations that also target non-SMP capable
ARMv6 cores. This uses the SMP_ON_UP code patching framework to turn all
TPIDRPRW accesses into reads/writes of entry #0 in the __per_cpu_offset
array.
While at it, switch over some existing direct TPIDRPRW accesses in asm
code to invocations of a new helper that is patched in the same way when
necessary.
Note that CPU_V6+SMP without SMP_ON_UP results in a kernel that does not
boot on v6 CPUs without SMP extensions, so add this dependency to
Kconfig as well.
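In outline, the patched accessor has roughly this shape (a sketch modelled
on the description above; alignment directives and the exact fallback
sequence are simplified):

  static inline unsigned long __my_cpu_offset(void)
  {
  	unsigned long off;

  	asm(/* SMP: TPIDRPRW carries the per-CPU offset */
  	    "0:	mrc	p15, 0, %0, c13, c0, 4		\n\t"
  	    "1:						\n\t"
  	    /* UP fallback, kept out of line in a subsection */
  	    "	.subsection 1				\n\t"
  	    "2:	ldr	%0, =__per_cpu_offset		\n\t"
  	    "	ldr	%0, [%0]			\n\t"
  	    "	b	1b				\n\t"
  	    "	.previous				\n\t"
  	    /* SMP_ON_UP record: patch the mrc into a branch to 2: */
  	    "	.pushsection \".alt.smp.init\", \"a\"	\n\t"
  	    "	.long	0b - .				\n\t"
  	    "	b	. + (2b - 0b)			\n\t"
  	    "	.popsection				\n\t"
  	    : "=r" (off));
  	return off;
  }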
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
|
In order to use <asm/percpu.h> in irqflags.h, we need to make sure
asm/percpu.h does not itself depend on irqflags.h.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.454517573@infradead.org
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 228 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
Using the global current_stack_pointer works on both clang and gcc.
current_stack_pointer is an unsigned long and needs to be cast to a
pointer type before it can be dereferenced.
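The construct in question, roughly (sketch; THREAD_SIZE-aligned stacks are
what make the masking work):

  /* a global named-register variable; reads the sp register directly */
  register unsigned long current_stack_pointer asm ("sp");

  static inline struct thread_info *current_thread_info(void)
  {
  	/* unsigned long, so cast before dereferencing */
  	return (struct thread_info *)
  		(current_stack_pointer & ~(THREAD_SIZE - 1));
  }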
Signed-off-by: Mark Charlebois <charlebm@gmail.com>
Signed-off-by: Behan Webster <behanw@converseincode.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
__my_cpu_offset is non-volatile, since we want its value to be cached
when we access several per-cpu variables in a row with preemption
disabled. This means that we rely on preempt_{en,dis}able to hazard
with the operation via the barrier() macro, so that we can't end up
migrating CPUs without reloading the per-cpu offset.
Unfortunately, GCC doesn't treat a "memory" clobber on a non-volatile
asm block as a side-effect, and will happily re-order it before other
memory clobbers (including those in preempt_disable()) and cache the
value. This has been observed to break the cmpxchg logic in the slub
allocator, leading to livelock in kmem_cache_alloc in mainline kernels.
This patch adds a dummy memory input operand to __my_cpu_offset,
forcing it to be ordered with respect to the barrier() macro.
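The fix takes roughly this shape (a sketch of the dummy-operand trick; the
stack slot read through the "Q" constraint is an arbitrary but always-valid
memory location):

  static inline unsigned long __my_cpu_offset(void)
  {
  	unsigned long off;
  	register unsigned long *sp asm ("sp");

  	/*
  	 * Deliberately not volatile, so the value may still be cached;
  	 * the fake stack read makes the asm hazard against barrier()
  	 * and keeps it ordered with respect to preempt_{en,dis}able().
  	 */
  	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
  	return off;
  }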
Cc: <stable@vger.kernel.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
Use the previously unused TPIDRPRW register to store percpu offsets.
TPIDRPRW is only accessible in PL1, so it can only be used in the kernel.
This replaces 2 loads with an mrc instruction for each percpu variable
access. With hackbench, the performance improvement is 1.4% on Cortex-A9
(highbank). Taking an average of 30 runs of "hackbench -l 1000" yields:
Before: 6.2191
After: 6.1348
Will Deacon reported similar delta on v6 with 11MPCore.
The asm "memory clobber" are needed here to ensure the percpu offset
gets reloaded. Testing by Will found that this would not happen in
__schedule() which is a bit of a special case as preemption is disabled
but the execution can move cores.
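In outline, the accessors this adds look like (sketch; TPIDRPRW is CP15
c13/c0/4):

  static inline void set_my_cpu_offset(unsigned long off)
  {
  	/* write TPIDRPRW, done once per CPU during boot */
  	asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (off) : "memory");
  }

  static inline unsigned long __my_cpu_offset(void)
  {
  	unsigned long off;

  	/* read TPIDRPRW; the "memory" clobber forces a reload */
  	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : : "memory");
  	return off;
  }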
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
With d8ecc5c (kbuild: asm-generic support, 2011-04-27) we can
remove a handful of asm-generic wrappers in ARM code. Since the
generic version of sizes.h doesn't contain SZ_48M, we replace
the 4 users of SZ_48M with the equivalent SZ_32M + SZ_16M.
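The arithmetic of the replacement, for reference (constants as defined in
the generic sizes.h; EXAMPLE_LEN is an illustrative user):

  #define SZ_16M	0x01000000
  #define SZ_32M	0x02000000

  /* generic sizes.h has no SZ_48M, so spell it as a sum */
  #define EXAMPLE_LEN	(SZ_32M + SZ_16M)	/* == 0x03000000, the old SZ_48M */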
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Imre Kaloz <kaloz@openwrt.org>
Acked-by: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>