summaryrefslogtreecommitdiff
path: root/arch/x86/xen
AgeCommit message (Collapse)Author
2023-04-28Merge tag 'objtool-core-2023-04-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did this inconsistently follow this new, common convention, and fix all the fallout that objtool can now detect statically - Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code - Generate ORC data for __pfx code - Add more __noreturn annotations to various kernel startup/shutdown and panic functions - Misc improvements & fixes * tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) x86/hyperv: Mark hv_ghcb_terminate() as noreturn scsi: message: fusion: Mark mpt_halt_firmware() __noreturn x86/cpu: Mark {hlt,resume}_play_dead() __noreturn btrfs: Mark btrfs_assertfail() __noreturn objtool: Include weak functions in global_noreturns check cpu: Mark nmi_panic_self_stop() __noreturn cpu: Mark panic_smp_self_stop() __noreturn arm64/cpu: Mark cpu_park_loop() and friends __noreturn x86/head: Mark *_start_kernel() __noreturn init: Mark start_kernel() __noreturn init: Mark [arch_call_]rest_init() __noreturn objtool: Generate ORC data for __pfx code x86/linkage: Fix padding for typed functions objtool: Separate prefix code from stack validation code objtool: Remove superfluous dead_end_function() check objtool: Add symbol iteration helpers objtool: Add WARN_INSN() scripts/objdump-func: Support multiple functions context_tracking: Fix KCSAN noinstr violation objtool: Add stackleak instrumentation to uaccess safe list ...
2023-04-25Merge tag 'x86-apic-2023-04-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC updates from Thomas Gleixner: - Fix the incorrect handling of atomic offset updates in reserve_eilvt_offset() The check for the return value of atomic_cmpxchg() is not compared against the old value, it is compared against the new value, which makes it two round on success. Convert it to atomic_try_cmpxchg() which does the right thing. - Handle IO/APIC less systems correctly When IO/APIC is not advertised by ACPI then the computation of the lower bound for dynamically allocated interrupts like MSI goes wrong. This lower bound is used to exclude the IO/APIC legacy GSI space as that must stay reserved for the legacy interrupts. In case that the system, e.g. VM, does not advertise an IO/APIC the lower bound stays at 0. 0 is an invalid interrupt number except for the legacy timer interrupt on x86. The return value is unchecked in the core code, so it ends up to allocate interrupt number 0 which is subsequently considered to be invalid by the caller, e.g. the MSI allocation code. A similar problem was already cured for device tree based systems years ago, but that missed - or did not envision - the zero IO/APIC case. Consolidate the zero check and return the provided "from" argument to the core code call site, which is guaranteed to be greater than 0. - Simplify the X2APIC cluster CPU mask logic for CPU hotplug Per cluster CPU masks are required for X2APIC in cluster mode to determine the correct cluster for a target CPU when calculating the destination for IPIs These masks are established when CPUs are borught up. The first CPU in a cluster must allocate a new cluster CPU mask. As this happens during the early startup of a CPU, where memory allocations cannot be done, the mask has to be allocated by the control CPU. The current implementation allocates a clustermask just in case and if the to be brought up CPU is the first in a cluster the CPU takes over this allocation from a global pointer. This works nicely in the fully serialized CPU bringup scenario which is used today, but would fail completely for parallel bringup of CPUs. The cluster association of a CPU can be computed from the APIC ID which is enumerated by ACPI/MADT. So the cluster CPU masks can be preallocated and associated upfront and the upcoming CPUs just need to set their corresponding bit. Aside of preparing for parallel bringup this is a valuable simplification on its own. - Remove global variables which control the early startup of secondary CPUs on 64-bit The only information which is needed by a starting CPU is the Linux CPU number. The CPU number allows it to retrieve the rest of the required data from already existing per CPU storage. So instead of initial_stack, early_gdt_desciptor and initial_gs provide a new variable smpboot_control which contains the Linux CPU number for now. The starting CPU can retrieve and compute all required information for startup from there. Aside of being a cleanup, this is also preparing for parallel CPU bringup, where starting CPUs will look up their Linux CPU number via the APIC ID, when smpboot_control has the corresponding control bit set. - Make cc_vendor globally accesible Subsequent parallel bringup changes require access to cc_vendor because confidental computing platforms need special treatment in the early startup phase vs. CPUID and APCI ID readouts. The change makes cc_vendor global and provides stub accessors in case that CONFIG_ARCH_HAS_CC_PLATFORM is not set. This was merged from the x86/cc branch in anticipation of further parallel bringup commits which require access to cc_vendor. Due to late discoveries of fundamental issue with those patches these commits never happened. The merge commit is unfortunately in the middle of the APIC commits so unraveling it would have required a rebase or revert. As the parallel bringup seems to be well on its way for 6.5 this would be just pointless churn. As the commit does not contain any functional change it's not a risk to keep it. * tag 'x86-apic-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound() x86/apic: Fix atomic update of offset in reserve_eilvt_offset() x86/coco: Export cc_vendor x86/smpboot: Reference count on smpboot_setup_warm_reset_vector() x86/smpboot: Remove initial_gs x86/smpboot: Remove early_gdt_descr on 64-bit x86/smpboot: Remove initial_stack on 64-bit x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel
2023-04-25Merge tag 'x86_paravirt_for_v6.4_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - Convert a couple of paravirt callbacks to asm to prevent '-fzero-call-used-regs' builds from zeroing live registers because paravirt hides the CALLs from the compiler so latter doesn't know there's a CALL in the first place - Merge two paravirt callbacks into one, as their functionality is identical * tag 'x86_paravirt_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Convert simple paravirt functions to asm x86/paravirt: Merge activate_mm() and dup_mmap() callbacks
2023-03-24Merge tag 'for-linus-6.3-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - fix build warning - avoid concurrent accesses to the Xen PV console ring page * tag 'for-linus-6.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/PVH: avoid 32-bit build warning when obtaining VGA console info hvc/xen: prevent concurrent accesses to the shared ring
2023-03-23x86,objtool: Split UNWIND_HINT_EMPTY in twoJosh Poimboeuf
Mark reported that the ORC unwinder incorrectly marks an unwind as reliable when the unwind terminates prematurely in the dark corners of return_to_handler() due to lack of information about the next frame. The problem is UNWIND_HINT_EMPTY is used in two different situations: 1) The end of the kernel stack unwind before hitting user entry, boot code, or fork entry 2) A blind spot in ORC coverage where the unwinder has to bail due to lack of information about the next frame The ORC unwinder has no way to tell the difference between the two. When it encounters an undefined stack state with 'end=1', it blindly marks the stack reliable, which can break the livepatch consistency model. Fix it by splitting UNWIND_HINT_EMPTY into UNWIND_HINT_UNDEFINED and UNWIND_HINT_END_OF_STACK. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/fd6212c8b450d3564b855e1cb48404d6277b4d9f.1677683419.git.jpoimboe@kernel.org
2023-03-22x86/PVH: avoid 32-bit build warning when obtaining VGA console infoJan Beulich
In the commit referenced below I failed to pay attention to this code also being buildable as 32-bit. Adjust the type of "ret" - there's no real need for it to be wider than 32 bits. Fixes: 934ef33ee75c ("x86/PVH: obtain VGA console info in Dom0") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/2d2193ff-670b-0a27-e12d-2c5c4c121c79@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-03-21x86/smpboot: Remove initial_stack on 64-bitBrian Gerst
In order to facilitate parallel startup, start to eliminate some of the global variables passing information to CPUs in the startup path. However, start by introducing one more: smpboot_control. For now this merely holds the CPU# of the CPU which is coming up. Each CPU can then find its own per-cpu data, and everything else it needs can be found from there, allowing the other global variables to be removed. First to be removed is initial_stack. Each CPU can load %rsp from its current_task->thread.sp instead. That is already set up with the correct idle thread for APs. Set up the .sp field in INIT_THREAD on x86 so that the BSP also finds a suitable stack pointer in the static per-cpu data when coming up on first boot. On resume from S3, the CPU needs a temporary stack because its idle task is already active. Instead of setting initial_stack, the sleep code can simply set its own current->thread.sp to point to the temporary stack. Nobody else cares about ->thread.sp for a thread which is currently on a CPU, because the true value is actually in the %rsp register. Which is restored with the rest of the CPU context in do_suspend_lowlevel(). Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Usama Arif <usama.arif@bytedance.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Usama Arif <usama.arif@bytedance.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20230316222109.1940300-7-usama.arif@bytedance.com
2023-03-17Merge tag 'for-linus-6.3-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - cleanup for xen time handling - enable the VGA console in a Xen PVH dom0 - cleanup in the xenfs driver * tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: remove unnecessary (void*) conversions x86/PVH: obtain VGA console info in Dom0 x86/xen/time: cleanup xen_tsc_safe_clocksource xen: update arch/x86/include/asm/xen/cpuid.h
2023-03-14x86/PVH: obtain VGA console info in Dom0Jan Beulich
A new platform-op was added to Xen to allow obtaining the same VGA console information PV Dom0 is handed. Invoke the new function and have the output data processed by xen_init_vga(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-03-06x86/paravirt: Merge activate_mm() and dup_mmap() callbacksJuergen Gross
The two paravirt callbacks .mmu.activate_mm() and .mmu.dup_mmap() are sharing the same implementations in all cases: for Xen PV guests they are pinning the PGD of the new mm_struct, and for all other cases they are a NOP. In the end, both callbacks are meant to register an address space with the underlying hypervisor, so there needs to be only a single callback for that purpose. So merge them to a common callback .mmu.enter_mmap() (in contrast to the corresponding already existing .mmu.exit_mmap()). As the first parameter of the old callbacks isn't used, drop it from the replacement one. [ bp: Remove last occurrence of paravirt_activate_mm() in asm/mmu_context.h ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Link: https://lore.kernel.org/r/20230207075902.7539-1-jgross@suse.com
2023-02-23x86/xen/time: cleanup xen_tsc_safe_clocksourceKrister Johansen
Modifies xen_tsc_safe_clocksource() to use newly defined constants from arch/x86/include/asm/xen/cpuid.h. This replaces a numeric value with XEN_CPUID_TSC_MODE_NEVER_EMULATE, and deletes a comment that is now self explanatory. There should be no change in the function's behavior. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/a69ca370fecf85d312d2db633d9438ace2af6e5b.1677038165.git.kjlx@templeofstupid.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-21Merge tag 'for-linus-6.3-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - help deprecate the /proc/xen files by making the related information available via sysfs - mark the Xen variants of play_dead "noreturn" - support a shared Xen platform interrupt - several small cleanups and fixes * tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: sysfs: make kobj_type structure constant x86/Xen: drop leftover VM-assist uses xen: Replace one-element array with flexible-array member xen/grant-dma-iommu: Implement a dummy probe_device() callback xen/pvcalls-back: fix permanently masked event channel xen: Allow platform PCI interrupt to be shared x86/xen/time: prefer tsc as clocksource when it is invariant x86/xen: mark xen_pv_play_dead() as __noreturn x86/xen: don't let xen_pv_play_dead() return drivers/xen/hypervisor: Expose Xen SIF flags to userspace
2023-02-21Merge tag 'x86_cpu_for_v6.3_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Cache the AMD debug registers in per-CPU variables to avoid MSR writes where possible, when supporting a debug registers swap feature for SEV-ES guests - Add support for AMD's version of eIBRS called Automatic IBRS which is a set-and-forget control of indirect branch restriction speculation resources on privilege change - Add support for a new x86 instruction - LKGS - Load kernel GS which is part of the FRED infrastructure - Reset SPEC_CTRL upon init to accomodate use cases like kexec which rediscover - Other smaller fixes and cleanups * tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd: Cache debug register values in percpu variables KVM: x86: Propagate the AMD Automatic IBRS feature to the guest x86/cpu: Support AMD Automatic IBRS x86/cpu, kvm: Add the SMM_CTL MSR not present feature x86/cpu, kvm: Add the Null Selector Clears Base feature x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code x86/cpu, kvm: Add support for CPUID_80000021_EAX x86/gsseg: Add the new <asm/gsseg.h> header to <asm/asm-prototypes.h> x86/gsseg: Use the LKGS instruction if available for load_gs_index() x86/gsseg: Move load_gs_index() to its own new header file x86/gsseg: Make asm_load_gs_index() take an u16 x86/opcode: Add the LKGS instruction to x86-opcode-map x86/cpufeature: Add the CPU feature bit for LKGS x86/bugs: Reset speculation control settings on init x86/cpu: Remove redundant extern x86_read_arch_cap_msr()
2023-02-18x86/Xen: drop leftover VM-assist usesJan Beulich
Both the 4Gb-segments and the PAE-extended-CR3 one are applicable to 32-bit guests only. The PAE-extended-CR3 use, furthermore, was redundant with the PAE_MODE ELF note anyway. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/215515af-cfb9-3237-03ba-3312e3fa0d34@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13x86/xen/time: prefer tsc as clocksource when it is invariantKrister Johansen
Kvm elects to use tsc instead of kvm-clock when it can detect that the TSC is invariant. (As of commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if invariant TSC is exposed")). Notable cloud vendors[1] and performance engineers[2] recommend that Xen users preferentially select tsc over xen-clocksource due the performance penalty incurred by the latter. These articles are persuasive and tailored to specific use cases. In order to understand the tradeoffs around this choice more fully, this author had to reference the documented[3] complexities around the Xen configuration, as well as the kernel's clocksource selection algorithm. Many users may not attempt this to correctly configure the right clock source in their guest. The approach taken in the kvm-clock module spares users this confusion, where possible. Both the Intel SDM[4] and the Xen tsc documentation explain that marking a tsc as invariant means that it should be considered stable by the OS and is elibile to be used as a wall clock source. In order to obtain better out-of-the-box performance, and reduce the need for user tuning, follow kvm's approach and decrease the xen clock rating so that tsc is preferable, if it is invariant, stable, and the tsc will never be emulated. [1] https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/ [2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html [3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html [4] Intel 64 and IA-32 Architectures Sofware Developer's Manual Volume 3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Code-reviewed-by: David Reaver <me@davidreaver.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221216162118.GB2633@templeofstupid.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13x86/xen: mark xen_pv_play_dead() as __noreturnJuergen Gross
Mark xen_pv_play_dead() and related to that xen_cpu_bringup_again() as "__noreturn". Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20221125063248.30256-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-13x86/xen: don't let xen_pv_play_dead() returnJuergen Gross
A function called via the paravirt play_dead() hook should not return to the caller. xen_pv_play_dead() has a problem in this regard, as it currently will return in case an offlined cpu is brought to life again. This can be changed only by doing basically a longjmp() to cpu_bringup_and_idle(), as the hypercall for bringing down the cpu will just return when the cpu is coming up again. Just re-initializing the cpu isn't possible, as the Xen hypervisor will deny that operation. So introduce xen_cpu_bringup_again() resetting the stack and calling cpu_bringup_and_idle(), which can be called after HYPERVISOR_vcpu_op() in xen_pv_play_dead(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20221125063248.30256-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-01-31sched/clock/x86: Mark sched_clock() noinstrPeter Zijlstra
In order to use sched_clock() from noinstr code, mark it and all it's implenentations noinstr. The whole pvclock thing (used by KVM/Xen) is a bit of a pain, since it calls out to watchdogs, create a pvclock_clocksource_read_nowd() variant doesn't do that and can be noinstr. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230126151323.702003578@infradead.org
2023-01-31Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar
Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-13cpuidle, xenpv: Make more PARAVIRT_XXL noinstr cleanPeter Zijlstra
objtool found a few cases where this code called out into instrumented code: vmlinux.o: warning: objtool: acpi_idle_enter_s2idle+0xde: call to wbinvd() leaves .noinstr.text section vmlinux.o: warning: objtool: default_idle+0x4: call to arch_safe_halt() leaves .noinstr.text section vmlinux.o: warning: objtool: xen_safe_halt+0xa: call to HYPERVISOR_sched_op.constprop.0() leaves .noinstr.text section Solve this by: - marking arch_safe_halt(), wbinvd(), native_wbinvd() and HYPERVISOR_sched_op() as __always_inline(). - Explicitly uninlining xen_safe_halt() and pv_native_wbinvd() [they were already uninlined by the compiler on use as function pointers] and annotating them as 'noinstr'. - Annotating pv_native_safe_halt() as 'noinstr'. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195541.171918174@infradead.org
2023-01-13x86/gsseg: Use the LKGS instruction if available for load_gs_index()H. Peter Anvin (Intel)
The LKGS instruction atomically loads a segment descriptor into the %gs descriptor registers, *except* that %gs.base is unchanged, and the base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly what we want this function to do. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230112072032.35626-6-xin3.li@intel.com Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2023-01-12Merge tag 'for-linus-6.2-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - two cleanup patches - a fix of a memory leak in the Xen pvfront driver - a fix of a locking issue in the Xen hypervisor console driver * tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvcalls: free active map buffer on pvcalls_front_free_map hvc/xen: lock console list traversal x86/xen: Remove the unused function p2m_index() xen: make remove callback of xen driver void returned
2023-01-09x86/xen: Remove the unused function p2m_index()Jiapeng Chong
The function p2m_index is defined in the p2m.c file, but not called elsewhere, so remove this unused function. arch/x86/xen/p2m.c:137:24: warning: unused function 'p2m_index'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3557 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230105090141.36248-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-14Merge tag 'x86_core_for_v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Borislav Petkov: - Add the call depth tracking mitigation for Retbleed which has been long in the making. It is a lighterweight software-only fix for Skylake-based cores where enabling IBRS is a big hammer and causes a significant performance impact. What it basically does is, it aligns all kernel functions to 16 bytes boundary and adds a 16-byte padding before the function, objtool collects all functions' locations and when the mitigation gets applied, it patches a call accounting thunk which is used to track the call depth of the stack at any time. When that call depth reaches a magical, microarchitecture-specific value for the Return Stack Buffer, the code stuffs that RSB and avoids its underflow which could otherwise lead to the Intel variant of Retbleed. This software-only solution brings a lot of the lost performance back, as benchmarks suggest: https://lore.kernel.org/all/20220915111039.092790446@infradead.org/ That page above also contains a lot more detailed explanation of the whole mechanism - Implement a new control flow integrity scheme called FineIBT which is based on the software kCFI implementation and uses hardware IBT support where present to annotate and track indirect branches using a hash to validate them - Other misc fixes and cleanups * tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits) x86/paravirt: Use common macro for creating simple asm paravirt functions x86/paravirt: Remove clobber bitmask from .parainstructions x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit x86/Kconfig: Enable kernel IBT by default x86,pm: Force out-of-line memcpy() objtool: Fix weak hole vs prefix symbol objtool: Optimize elf_dirty_reloc_sym() x86/cfi: Add boot time hash randomization x86/cfi: Boot time selection of CFI scheme x86/ibt: Implement FineIBT objtool: Add --cfi to generate the .cfi_sites section x86: Add prefix symbols for function padding objtool: Add option to generate prefix symbols objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf objtool: Slice up elf_create_section_symbol() kallsyms: Revert "Take callthunks into account" x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces x86/retpoline: Fix crash printing warning x86/paravirt: Fix a !PARAVIRT build warning ...
2022-12-13Merge tag 'x86_boot_for_v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Borislav Petkov: "A of early boot cleanups and fixes. - Do some spring cleaning to the compressed boot code by moving the EFI mixed-mode code to a separate compilation unit, the AMD memory encryption early code where it belongs and fixing up build dependencies. Make the deprecated EFI handover protocol optional with the goal of removing it at some point (Ard Biesheuvel) - Skip realmode init code on Xen PV guests as it is not needed there - Remove an old 32-bit PIC code compiler workaround" * tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Remove x86_32 PIC using %ebx workaround x86/boot: Skip realmode init code when running as Xen PV guest x86/efi: Make the deprecated EFI handover protocol optional x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=y x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit() x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.S x86/boot/compressed: Move startup32_check_sev_cbit() into .text x86/boot/compressed: Move startup32_load_idt() out of head_64.S x86/boot/compressed: Move startup32_load_idt() into .text section x86/boot/compressed: Pull global variable reference into startup32_load_idt() x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry() x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunk x86/boot/compressed, efi: Merge multiple definitions of image_offset into one x86/boot/compressed: Move efi32_pe_entry() out of head_64.S x86/boot/compressed: Move efi32_entry out of head_64.S x86/boot/compressed: Move efi32_pe_entry into .text section x86/boot/compressed: Move bootargs parsing out of 32-bit startup code x86/boot/compressed: Move 32-bit entrypoint code into .text section x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S
2022-12-12Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...
2022-12-12Merge tag 'x86-misc-2022-12-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Thomas Gleixner: "Updates for miscellaneous x86 areas: - Reserve a new boot loader type for barebox which is usally used on ARM and MIPS, but can also be utilized as EFI payload on x86 to provide watchdog-supervised boot up. - Consolidate the native and compat 32bit signal handling code and split the 64bit version out into a separate source file - Switch the ESPFIX random usage to get_random_long()" * tag 'x86-misc-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/espfix: Use get_random_long() rather than archrandom x86/signal/64: Move 64-bit signal code to its own file x86/signal/32: Merge native and compat 32-bit signal code x86/signal: Add ABI prefixes to frame setup functions x86/signal: Merge get_sigframe() x86: Remove __USER32_DS signal/compat: Remove compat_sigset_t override x86/signal: Remove sigset_t parameter from frame setup functions x86/signal: Remove sig parameter from frame setup functions Documentation/x86/boot: Reserve type_of_loader=13 for barebox
2022-12-05x86/xen: Fix memory leak in xen_init_lock_cpu()Xiu Jianfeng
In xen_init_lock_cpu(), the @name has allocated new string by kasprintf(), if bind_ipi_to_irqhandler() fails, it should be freed, otherwise may lead to a memory leak issue, fix it. Fixes: 2d9e1e2f58b5 ("xen: implement Xen-specific spinlocks") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123155858.11382-3-xiujianfeng@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05x86/xen: Fix memory leak in xen_smp_intr_init{_pv}()Xiu Jianfeng
These local variables @{resched|pmu|callfunc...}_name saves the new string allocated by kasprintf(), and when bind_{v}ipi_to_irqhandler() fails, it goes to the @fail tag, and calls xen_smp_intr_free{_pv}() to free resource, however the new string is not saved, which cause a memory leak issue. fix it. Fixes: 9702785a747a ("i386: move xen") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123155858.11382-2-xiujianfeng@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-25x86/boot: Skip realmode init code when running as Xen PV guestJuergen Gross
When running as a Xen PV guest there is no need for setting up the realmode trampoline, as realmode isn't supported in this environment. Trying to setup the trampoline has been proven to be problematic in some cases, especially when trying to debug early boot problems with Xen requiring to keep the EFI boot-services memory mapped (some firmware variants seem to claim basically all memory below 1Mb for boot services). Introduce new x86_platform_ops operations for that purpose, which can be set to a NOP by the Xen PV specific kernel boot code. [ bp: s/call_init_real_mode/do_init_real_mode/ ] Fixes: 084ee1c641a0 ("x86, realmode: Relocator for realmode code") Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221123114523.3467-1-jgross@suse.com
2022-11-21Merge tag 'v6.1-rc6' into x86/core, to resolve conflictsIngo Molnar
Resolve conflicts between these commits in arch/x86/kernel/asm-offsets.c: # upstream: debc5a1ec0d1 ("KVM: x86: use a separate asm-offsets.c file") # retbleed work in x86/core: 5d8213864ade ("x86/retbleed: Add SKL return thunk") ... and these commits in include/linux/bpf.h: # upstram: 18acb7fac22f ("bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")") # x86/core commits: 931ab63664f0 ("x86/ibt: Implement FineIBT") bea75b33895f ("x86/Kconfig: Introduce function padding") The latter two modify BPF_DISPATCHER_ATTRIBUTES(), which was removed upstream. Conflicts: arch/x86/kernel/asm-offsets.c include/linux/bpf.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-11-18stackprotector: move get_random_canary() into stackprotector.hJason A. Donenfeld
This has nothing to do with random.c and everything to do with stack protectors. Yes, it uses randomness. But many things use randomness. random.h and random.c are concerned with the generation of randomness, not with each and every use. So move this function into the more specific stackprotector.h file where it belongs. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-16Merge tag 'for-linus-6.1-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two trivial cleanups, and three simple fixes" * tag 'for-linus-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/platform-pci: use define instead of literal number xen/platform-pci: add missing free_irq() in error path xen-pciback: Allow setting PCI_MSIX_FLAGS_MASKALL too xen/pcpu: fix possible memory leak in register_pcpu() x86/xen: Use kstrtobool() instead of strtobool()
2022-11-14x86/xen: Use kstrtobool() instead of strtobool()Christophe JAILLET
strtobool() is the same as kstrtobool(). However, the latter is more used within the kernel. In order to remove strtobool() and slightly simplify kstrtox.h, switch to the other function name. While at it, include the corresponding header file (<linux/kstrtox.h>) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/e91af3c8708af38b1c57e0a2d7eb9765dda0e963.1667336095.git.christophe.jaillet@wanadoo.fr Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-06Merge tag 'for-linus-6.1-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "One fix for silencing a smatch warning, and a small cleanup patch" * tag 'for-linus-6.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: simplify sysenter and syscall setup x86/xen: silence smatch warning in pmu_msr_chk_emulated()
2022-11-03x86/xen: simplify sysenter and syscall setupJuergen Gross
xen_enable_sysenter() and xen_enable_syscall() can be simplified a lot. While at it, switch to use cpu_feature_enabled() instead of boot_cpu_has(). Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-03x86/xen: silence smatch warning in pmu_msr_chk_emulated()Juergen Gross
Commit 8714f7bcd3c2 ("xen/pv: add fault recovery control to pmu msr accesses") introduced code resulting in a warning issued by the smatch static checker, claiming to use an uninitialized variable. This is a false positive, but work around the warning nevertheless. Fixes: 8714f7bcd3c2 ("xen/pv: add fault recovery control to pmu msr accesses") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-19x86: Remove __USER32_DSBrian Gerst
Replace all users with the equivalent __USER_DS, which will make merging native and compat code simpler. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/r/20220606203802.158958-5-brgerst@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-10-17x86/cpu: Get rid of redundant switch_to_new_gdt() invocationsThomas Gleixner
The only place where switch_to_new_gdt() is required is early boot to switch from the early GDT to the direct GDT. Any other invocation is completely redundant because it does not change anything. Secondary CPUs come out of the ASM code with GDT and GSBASE correctly set up. The same is true for XEN_PV. Remove all the voodoo invocations which are left overs from the ancient past, rename the function to switch_gdt_and_percpu_base() and mark it init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111143.198076128@infradead.org
2022-10-12Merge tag 'for-linus-6.1-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - Some minor typo fixes - A fix of the Xen pcifront driver for supporting the device model to run in a Linux stub domain - A cleanup of the pcifront driver - A series to enable grant-based virtio with Xen on x86 - A cleanup of Xen PV guests to distinguish between safe and faulting MSR accesses - Two fixes of the Xen gntdev driver - Two fixes of the new xen grant DMA driver * tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum" xen/pv: support selecting safe/unsafe msr accesses xen/pv: refactor msr access functions to support safe and unsafe accesses xen/pv: fix vendor checks for pmu emulation xen/pv: add fault recovery control to pmu msr accesses xen/virtio: enable grant based virtio on x86 xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT xen/virtio: restructure xen grant dma setup xen/pcifront: move xenstore config scanning into sub-function xen/gntdev: Accommodate VMA splitting xen/gntdev: Prevent leaking grants xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page() xen/xenbus: Fix spelling mistake "hardward" -> "hardware" xen-pcifront: Handle missed Connected state
2022-10-11xen/pv: support selecting safe/unsafe msr accessesJuergen Gross
Instead of always doing the safe variants for reading and writing MSRs in Xen PV guests, make the behavior controllable via Kconfig option and a boot parameter. The default will be the current behavior, which is to always use the safe variant. Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-11xen/pv: refactor msr access functions to support safe and unsafe accessesJuergen Gross
Refactor and rename xen_read_msr_safe() and xen_write_msr_safe() to support both cases of MSR accesses, safe ones and potentially GP-fault generating ones. This will prepare to no longer swallow GPs silently in xen_read_msr() and xen_write_msr(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-11xen/pv: fix vendor checks for pmu emulationJuergen Gross
The CPU vendor checks for pmu emulation are rather limited today, as the assumption seems to be that only Intel and AMD are existing and/or supported vendors. Fix that by handling Centaur and Zhaoxin CPUs the same way as Intel, and Hygon the same way as AMD. While at it fix the return type of is_intel_pmu_msr(). Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-11xen/pv: add fault recovery control to pmu msr accessesJuergen Gross
Today pmu_msr_read() and pmu_msr_write() fall back to the safe variants of read/write MSR in case the MSR access isn't emulated via Xen. Allow the caller to select that faults should not be recovered from by passing NULL for the error pointer. Restructure the code to make it more readable. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-10Merge tag 'bitmap-6.1-rc1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES (Phil Auld) - cleanup nr_cpu_ids vs nr_cpumask_bits mess (me) This series cleans that mess and adds new config FORCE_NR_CPUS that allows to optimize cpumask subsystem if the number of CPUs is known at compile-time. - optimize find_bit() functions (me) Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT() macros. - add find_nth_bit() (me) Adds find_nth_bit(), which is ~70 times faster than bitcounting with for_each() loop: for_each_set_bit(bit, mask, size) if (n-- == 0) return bit; Also adds bitmap_weight_and() to let people replace this pattern: tmp = bitmap_alloc(nbits); bitmap_and(tmp, map1, map2, nbits); weight = bitmap_weight(tmp, nbits); bitmap_free(tmp); with a single bitmap_weight_and() call. - repair cpumask_check() (me) After switching cpumask to use nr_cpu_ids, cpumask_check() started generating many false-positive warnings. This series fixes it. - Add for_each_cpu_andnot() and for_each_cpu_andnot() (Valentin Schneider) Extends the API with one more function and applies it in sched/core. * tag 'bitmap-6.1-rc1' of https://github.com/norov/linux: (28 commits) sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot() lib/test_cpumask: Add for_each_cpu_and(not) tests cpumask: Introduce for_each_cpu_andnot() lib/find_bit: Introduce find_next_andnot_bit() cpumask: fix checking valid cpu range lib/bitmap: add tests for for_each() loops lib/find: optimize for_each() macros lib/bitmap: introduce for_each_set_bit_wrap() macro lib/find_bit: add find_next{,_and}_bit_wrap cpumask: switch for_each_cpu{,_not} to use for_each_bit() net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and} cpumask: add cpumask_nth_{,and,andnot} lib/bitmap: remove bitmap_ord_to_pos lib/bitmap: add tests for find_nth_bit() lib: add find_nth{,_and,_andnot}_bit() lib/bitmap: add bitmap_weight_and() lib/bitmap: don't call __bitmap_weight() in kernel code tools: sync find_bit() implementation lib/find_bit: optimize find_next_bit() functions lib/find_bit: create find_first_zero_bit_le() ...
2022-10-10xen/virtio: enable grant based virtio on x86Juergen Gross
Use an x86-specific virtio_check_mem_acc_cb() for Xen in order to setup the correct DMA ops. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # common code Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-09-26x86/entry: Work around Clang __bdos() bugKees Cook
Clang produces a false positive when building with CONFIG_FORTIFY_SOURCE=y and CONFIG_UBSAN_BOUNDS=y when operating on an array with a dynamic offset. Work around this by using a direct assignment of an empty instance. Avoids this warning: ../include/linux/fortify-string.h:309:4: warning: call to __write_overflow_field declared with 'warn ing' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wat tribute-warning] __write_overflow_field(p_size_field, size); ^ which was isolated to the memset() call in xen_load_idt(). Note that this looks very much like another bug that was worked around: https://github.com/ClangBuiltLinux/linux/issues/1592 Cc: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: xen-devel@lists.xenproject.org Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/lkml/41527d69-e8ab-3f86-ff37-6b298c01d5bc@oracle.com Signed-off-by: Kees Cook <keescook@chromium.org>
2022-09-19smp: add set_nr_cpu_ids()Yury Norov
In preparation to support compile-time nr_cpu_ids, add a setter for the variable. This is a no-op for all arches. Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-08-12x86/xen: Add support for HVMOP_set_evtchn_upcall_vectorJane Malalane
Implement support for the HVMOP_set_evtchn_upcall_vector hypercall in order to set the per-vCPU event channel vector callback on Linux and use it in preference of HVM_PARAM_CALLBACK_IRQ. If the per-VCPU vector setup is successful on BSP, use this method for the APs. If not, fallback to the global vector-type callback. Also register callback_irq at per-vCPU event channel setup to trick toolstack to think the domain is enlightened. Suggested-by: "Roger Pau Monné" <roger.pau@citrix.com> Signed-off-by: Jane Malalane <jane.malalane@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20220729070416.23306-1-jane.malalane@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-01xen: don't require virtio with grants for non-PV guestsJuergen Gross
Commit fa1f57421e0b ("xen/virtio: Enable restricted memory access using Xen grant mappings") introduced a new requirement for using virtio devices: the backend now needs to support the VIRTIO_F_ACCESS_PLATFORM feature. This is an undue requirement for non-PV guests, as those can be operated with existing backends without any problem, as long as those backends are running in dom0. Per default allow virtio devices without grant support for non-PV guests. On Arm require VIRTIO_F_ACCESS_PLATFORM for devices having been listed in the device tree to use grants. Add a new config item to always force use of grants for virtio. Fixes: fa1f57421e0b ("xen/virtio: Enable restricted memory access using Xen grant mappings") Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 guest using Xen Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20220622063838.8854-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>