summaryrefslogtreecommitdiff
path: root/drivers/firmware/efi/efi.c
AgeCommit message (Collapse)Author
2023-11-01Merge tag 'asm-generic-6.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull ia64 removal and asm-generic updates from Arnd Bergmann: - The ia64 architecture gets its well-earned retirement as planned, now that there is one last (mostly) working release that will be maintained as an LTS kernel. - The architecture specific system call tables are updated for the added map_shadow_stack() syscall and to remove references to the long-gone sys_lookup_dcookie() syscall. * tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: hexagon: Remove unusable symbols from the ptrace.h uapi asm-generic: Fix spelling of architecture arch: Reserve map_shadow_stack() syscall number for all architectures syscalls: Cleanup references to sys_lookup_dcookie() Documentation: Drop or replace remaining mentions of IA64 lib/raid6: Drop IA64 support Documentation: Drop IA64 from feature descriptions kernel: Drop IA64 support from sig_fault handlers arch: Remove Itanium (IA-64) architecture
2023-10-20Merge 3rd batch of EFI fixes into efi/urgentArd Biesheuvel
2023-10-13efi: fix memory leak in krealloc failure handlingKuan-Wei Chiu
In the previous code, there was a memory leak issue where the previously allocated memory was not freed upon a failed krealloc operation. This patch addresses the problem by releasing the old memory before setting the pointer to NULL in case of a krealloc failure. This ensures that memory is properly managed and avoids potential memory leaks. Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-19efi/unaccepted: Make sure unaccepted table is mappedKirill A. Shutemov
Unaccepted table is now allocated from EFI_ACPI_RECLAIM_MEMORY. It translates into E820_TYPE_ACPI, which is not added to memblock and therefore not mapped in the direct mapping. This causes a crash on the first touch of the table. Use memblock_add() to make sure that the table is mapped in direct mapping. Align the range to the nearest page borders. Ranges smaller than page size are not mapped. Fixes: e7761d827e99 ("efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table") Reported-by: Hongyu Ning <hongyu.ning@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11arch: Remove Itanium (IA-64) architectureArd Biesheuvel
The Itanium architecture is obsolete, and an informal survey [0] reveals that any residual use of Itanium hardware in production is mostly HP-UX or OpenVMS based. The use of Linux on Itanium appears to be limited to enthusiasts that occasionally boot a fresh Linux kernel to see whether things are still working as intended, and perhaps to churn out some distro packages that are rarely used in practice. None of the original companies behind Itanium still produce or support any hardware or software for the architecture, and it is listed as 'Orphaned' in the MAINTAINERS file, as apparently, none of the engineers that contributed on behalf of those companies (nor anyone else, for that matter) have been willing to support or maintain the architecture upstream or even be responsible for applying the odd fix. The Intel firmware team removed all IA-64 support from the Tianocore/EDK2 reference implementation of EFI in 2018. (Itanium is the original architecture for which EFI was developed, and the way Linux supports it deviates significantly from other architectures.) Some distros, such as Debian and Gentoo, still maintain [unofficial] ia64 ports, but many have dropped support years ago. While the argument is being made [1] that there is a 'for the common good' angle to being able to build and run existing projects such as the Grid Community Toolkit [2] on Itanium for interoperability testing, the fact remains that none of those projects are known to be deployed on Linux/ia64, and very few people actually have access to such a system in the first place. Even if there were ways imaginable in which Linux/ia64 could be put to good use today, what matters is whether anyone is actually doing that, and this does not appear to be the case. There are no emulators widely available, and so boot testing Itanium is generally infeasible for ordinary contributors. GCC still supports IA-64 but its compile farm [3] no longer has any IA-64 machines. GLIBC would like to get rid of IA-64 [4] too because it would permit some overdue code cleanups. In summary, the benefits to the ecosystem of having IA-64 be part of it are mostly theoretical, whereas the maintenance overhead of keeping it supported is real. So let's rip off the band aid, and remove the IA-64 arch code entirely. This follows the timeline proposed by the Debian/ia64 maintainer [5], which removes support in a controlled manner, leaving IA-64 in a known good state in the most recent LTS release. Other projects will follow once the kernel support is removed. [0] https://lore.kernel.org/all/CAMj1kXFCMh_578jniKpUtx_j8ByHnt=s7S+yQ+vGbKt9ud7+kQ@mail.gmail.com/ [1] https://lore.kernel.org/all/0075883c-7c51-00f5-2c2d-5119c1820410@web.de/ [2] https://gridcf.org/gct-docs/latest/index.html [3] https://cfarm.tetaneutral.net/machines/list/ [4] https://lore.kernel.org/all/87bkiilpc4.fsf@mid.deneb.enyo.de/ [5] https://lore.kernel.org/all/ff58a3e76e5102c94bb5946d99187b358def688a.camel@physik.fu-berlin.de/ Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-06-30Merge tag 'efi-next-for-v6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "Although some more stuff is brewing, the EFI changes that are ready for mainline are few this cycle: - improve the PCI DMA paranoia logic in the EFI stub - some constification changes - add statfs support to efivarfs - allow user space to enumerate updatable firmware resources without CAP_SYS_ADMIN" * tag 'efi-next-for-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi/libstub: Disable PCI DMA before grabbing the EFI memory map efi/esrt: Allow ESRT access without CAP_SYS_ADMIN efivarfs: expose used and total size efi: make kobj_type structure constant efi: x86: make kobj_type structure constant
2023-06-26Merge tag 'x86_cc_for_v6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 confidential computing update from Borislav Petkov: - Add support for unaccepted memory as specified in the UEFI spec v2.9. The gist of it all is that Intel TDX and AMD SEV-SNP confidential computing guests define the notion of accepting memory before using it and thus preventing a whole set of attacks against such guests like memory replay and the like. There are a couple of strategies of how memory should be accepted - the current implementation does an on-demand way of accepting. * tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt: sevguest: Add CONFIG_CRYPTO dependency x86/efi: Safely enable unaccepted memory in UEFI x86/sev: Add SNP-specific unaccepted memory support x86/sev: Use large PSC requests if applicable x86/sev: Allow for use of the early boot GHCB for PSC requests x86/sev: Put PSC struct on the stack in prep for unaccepted memory support x86/sev: Fix calculation of end address based on number of pages x86/tdx: Add unaccepted memory support x86/tdx: Refactor try_accept_one() x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub efi/unaccepted: Avoid load_unaligned_zeropad() stepping into unaccepted memory efi: Add unaccepted memory support x86/boot/compressed: Handle unaccepted memory efi/libstub: Implement support for unaccepted memory efi/x86: Get full memory map in allocate_e820() mm: Add support for unaccepted memory
2023-06-21Revert "efi: random: refresh non-volatile random seed when RNG is initialized"Linus Torvalds
This reverts commit e7b813b32a42a3a6281a4fd9ae7700a0257c1d50 (and the subsequent fix for it: 41a15855c1ee "efi: random: fix NULL-deref when refreshing seed"). It turns otu to cause non-deterministic boot stalls on at least a HP 6730b laptop. Reported-and-bisected-by: Sami Korkalainen <sami.korkalainen@proton.me> Link: https://lore.kernel.org/all/GQUnKz2al3yke5mB2i1kp3SzNHjK8vi6KJEh7rnLrOQ24OrlljeCyeWveLW9pICEmB9Qc8PKdNt3w1t_g3-Uvxq1l8Wj67PpoMeWDoH8PKk=@proton.me/ Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-06efi: Add unaccepted memory supportKirill A. Shutemov
efi_config_parse_tables() reserves memory that holds unaccepted memory configuration table so it won't be reused by page allocator. Core-mm requires few helpers to support unaccepted memory: - accept_memory() checks the range of addresses against the bitmap and accept memory if needed. - range_contains_unaccepted_memory() checks if anything within the range requires acceptance. Architectural code has to provide efi_get_unaccepted_table() that returns pointer to the unaccepted memory configuration table. arch_accept_memory() handles arch-specific part of memory acceptance. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20230606142637.5171-6-kirill.shutemov@linux.intel.com
2023-06-06efi/libstub: Implement support for unaccepted memoryKirill A. Shutemov
UEFI Specification version 2.9 introduces the concept of memory acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, requiring memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific for the Virtual Machine platform. Accepting memory is costly and it makes VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until memory is needed. It lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted. Firmware communicates this information via memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory. Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 (or whatever the arch uses) has to be modified on every page acceptance. It leads to table fragmentation and there's a limited number of entries in the e820 table. Another option is to mark such memory as usable in e820 and track if the range has been accepted in a bitmap. One bit in the bitmap represents a naturally aligned power-2-sized region of address space -- unit. For x86, unit size is 2MiB: 4k of the bitmap is enough to track 64GiB or physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- It needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to unit_size gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via EFI configuration table. allocate_e820() allocates the bitmap if unaccepted memory is present, according to the size of unaccepted region. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230606142637.5171-4-kirill.shutemov@linux.intel.com
2023-05-17efivarfs: expose used and total sizeAnisse Astier
When writing EFI variables, one might get errors with no other message on why it fails. Being able to see how much is used by EFI variables helps analyzing such issues. Since this is not a conventional filesystem, block size is intentionally set to 1 instead of PAGE_SIZE. x86 quirks of reserved size are taken into account; so that available and free size can be different, further helping debugging space issues. With this patch, one can see the remaining space in EFI variable storage via efivarfs, like this: $ df -h /sys/firmware/efi/efivars/ Filesystem Size Used Avail Use% Mounted on efivarfs 176K 106K 66K 62% /sys/firmware/efi/efivars Signed-off-by: Anisse Astier <an.astier@criteo.com> [ardb: - rename efi_reserved_space() to efivar_reserved_space() - whitespace/coding style tweaks] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-23Merge tag 'efi-next-for-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "A healthy mix of EFI contributions this time: - Performance tweaks for efifb earlycon (Andy) - Preparatory refactoring and cleanup work in the efivar layer, which is needed to accommodate the Snapdragon arm64 laptops that expose their EFI variable store via a TEE secure world API (Johan) - Enhancements to the EFI memory map handling so that Xen dom0 can safely access EFI configuration tables (Demi Marie) - Wire up the newly introduced IBT/BTI flag in the EFI memory attributes table, so that firmware that is generated with ENDBR/BTI landing pads will be mapped with enforcement enabled - Clean up how we check and print the EFI revision exposed by the firmware - Incorporate EFI memory attributes protocol definition and wire it up in the EFI zboot code (Evgeniy) This ensures that these images can execute under new and stricter rules regarding the default memory permissions for EFI page allocations (More work is in progress here) - CPER header cleanup (Dan Williams) - Use a raw spinlock to protect the EFI runtime services stack on arm64 to ensure the correct semantics under -rt (Pierre) - EFI framebuffer quirk for Lenovo Ideapad (Darrell)" * tag 'efi-next-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits) firmware/efi sysfb_efi: Add quirk for Lenovo IdeaPad Duet 3 arm64: efi: Make efi_rt_lock a raw_spinlock efi: Add mixed-mode thunk recipe for GetMemoryAttributes efi: x86: Wire up IBT annotation in memory attributes table efi: arm64: Wire up BTI annotation in memory attributes table efi: Discover BTI support in runtime services regions efi/cper, cxl: Remove cxl_err.h efi: Use standard format for printing the EFI revision efi: Drop minimum EFI version check at boot efi: zboot: Use EFI protocol to remap code/data with the right attributes efi/libstub: Add memory attribute protocol definitions efi: efivars: prevent double registration efi: verify that variable services are supported efivarfs: always register filesystem efi: efivars: add efivars printk prefix efi: Warn if trying to reserve memory under Xen efi: Actually enable the ESRT under Xen efi: Apply allowlist to EFI configuration tables when running under Xen efi: xen: Implement memory descriptor lookup based on hypercall efi: memmap: Disregard bogus entries instead of returning them ...
2023-02-03efi: Use standard format for printing the EFI revisionArd Biesheuvel
The UEFI spec section 4.2.1 describes the way the human readable EFI revision should be constructed from the 32-bit revision field in the system table: The upper 16 bits of this field contain the major revision value, and the lower 16 bits contain the minor revision value. The minor revision values are binary coded decimals and are limited to the range of 00..99. When printed or displayed UEFI spec revision is referred as (Major revision).(Minor revision upper decimal).(Minor revision lower decimal) or (Major revision).(Minor revision upper decimal) in case Minor revision lower decimal is set to 0. Let's adhere to this when logging the EFI revision to the kernel log. Note that the bit about binary coded decimals is bogus, and the minor revision lower decimal is simply the minor revision modulo 10, given the symbolic definitions provided by the spec itself: #define EFI_2_40_SYSTEM_TABLE_REVISION ((2<<16) | (40)) #define EFI_2_31_SYSTEM_TABLE_REVISION ((2<<16) | (31)) #define EFI_2_30_SYSTEM_TABLE_REVISION ((2<<16) | (30)) Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03efi: Drop minimum EFI version check at bootArd Biesheuvel
We currently pass a minimum major version to the generic EFI helper that checks the system table magic and version, and refuse to boot if the value is lower. The motivation for this check is unknown, and even the code that uses major version 2 as the minimum (ARM, arm64 and RISC-V) should make it past this check without problems, and boot to a point where we have access to a console or some other means to inform the user that the firmware's major revision number made us unhappy. (Revision 2.0 of the UEFI specification was released in January 2006, whereas ARM, arm64 and RISC-V support where added in 2009, 2013 and 2017, respectively, so checking for major version 2 or higher is completely arbitrary) So just drop the check. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-02-03efi: fix potential NULL deref in efi_mem_reserve_persistentAnton Gusev
When iterating on a linked list, a result of memremap is dereferenced without checking it for NULL. This patch adds a check that falls back on allocating a new page in case memremap doesn't succeed. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 18df7577adae ("efi/memreserve: deal with memreserve entries in unmapped memory") Signed-off-by: Anton Gusev <aagusev@ispras.ru> [ardb: return -ENOMEM instead of breaking out of the loop] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-26efi: verify that variable services are supportedJohan Hovold
Current Qualcomm UEFI firmware does not implement the variable services but not all revisions clear the corresponding bits in the RT_PROP table services mask and instead the corresponding calls return EFI_UNSUPPORTED. This leads to efi core registering the generic efivar ops even when the variable services are not supported or when they are accessed through some other interface (e.g. Google SMI or the upcoming Qualcomm SCM implementation). Instead of playing games with init call levels to make sure that the custom implementations are registered after the generic one, make sure that get_next_variable() is actually supported before registering the generic ops. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-23efi: Warn if trying to reserve memory under XenDemi Marie Obenour
Doing so cannot work and should never happen. Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-23efi: Apply allowlist to EFI configuration tables when running under XenDemi Marie Obenour
As it turns out, Xen does not guarantee that EFI boot services data regions in memory are preserved, which means that EFI configuration tables pointing into such memory regions may be corrupted before the dom0 OS has had a chance to inspect them. This is causing problems for Qubes OS when it attempts to perform system firmware updates, which requires that the contents of the EFI System Resource Table are valid when the fwupd userspace program runs. However, other configuration tables such as the memory attributes table or the runtime properties table are equally affected, and so we need a comprehensive workaround that works for any table type. So when running under Xen, check the EFI memory descriptor covering the start of the table, and disregard the table if it does not reside in memory that is preserved by Xen. Co-developed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-22efi: xen: Implement memory descriptor lookup based on hypercallDemi Marie Obenour
Xen on x86 boots dom0 in EFI mode but without providing a memory map. This means that some consistency checks we would like to perform on configuration tables or other data structures in memory are not currently possible. Xen does, however, expose EFI memory descriptor info via a Xen hypercall, so let's wire that up instead. It turns out that the returned information is not identical to what Linux's efi_mem_desc_lookup would return: the address returned is the address passed to the hypercall, and the size returned is the number of bytes remaining in the configuration table. However, none of the callers of efi_mem_desc_lookup() currently care about this. In the future, Xen may gain a hypercall that returns the actual start address, which can be used instead. Co-developed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-22efi: memmap: Disregard bogus entries instead of returning themDemi Marie Obenour
The ESRT code currently contains two consistency checks on the memory descriptor it obtains, but one of them is both incomplete and can only trigger on invalid descriptors. So let's drop these checks, and instead disregard descriptors entirely if the start address is misaligned, or if the number of pages reaches to or beyond the end of the address space. Note that the memory map as a whole could still be inconsistent: multiple entries might cover the same area, or the address could be outside of the addressable PA space, but validating that goes beyond the scope of these helpers. Also note that since the physical address space is never 64-bits wide, a descriptor that includes the last page of memory is not valid. This is fortunate, since it means that a valid physical address will never be an error pointer and that the length of a memory descriptor in bytes will fit in a 64-bit unsigned integer. Co-developed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-17efi: efivars: drop kobject from efivars_register()Johan Hovold
Since commit 0f5b2c69a4cb ("efi: vars: Remove deprecated 'efivars' sysfs interface") and the removal of the sysfs interface there are no users of the efivars kobject. Drop the kobject argument from efivars_register() and add a new efivar_is_available() helper in favour of the old efivars_kobject(). Note that the new helper uses the prefix 'efivar' (i.e. without an 's') for consistency with efivar_supports_writes() and the rest of the interface (except the registration functions). For the benefit of drivers with optional EFI support, also provide a dummy implementation of efivar_is_available(). Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-03efi: fix NULL-deref in init error pathJohan Hovold
In cases where runtime services are not supported or have been disabled, the runtime services workqueue will never have been allocated. Do not try to destroy the workqueue unconditionally in the unlikely event that EFI initialisation fails to avoid dereferencing a NULL pointer. Fixes: 98086df8b70c ("efi: add missed destroy_workqueue when efisubsys_init fails") Cc: stable@vger.kernel.org Cc: Li Heng <liheng40@huawei.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-20efi: random: fix NULL-deref when refreshing seedJohan Hovold
Do not try to refresh the RNG seed in case the firmware does not support setting variables. This is specifically needed to prevent a NULL-pointer dereference on the Lenovo X13s with some firmware revisions, or more generally, whenever the runtime services have been disabled (e.g. efi=noruntime or with PREEMPT_RT). Fixes: e7b813b32a42 ("efi: random: refresh non-volatile random seed when RNG is initialized") Reported-by: Steev Klimaszewski <steev@kali.org> Reported-by: Bjorn Andersson <andersson@kernel.org> Tested-by: Steev Klimaszewski <steev@kali.org> Tested-by: Andrew Halaney <ahalaney@redhat.com> # sc8280xp-lenovo-thinkpad-x13s Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-12-13Merge tag 'efi-next-for-v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "Another fairly sizable pull request, by EFI subsystem standards. Most of the work was done by me, some of it in collaboration with the distro and bootloader folks (GRUB, systemd-boot), where the main focus has been on removing pointless per-arch differences in the way EFI boots a Linux kernel. - Refactor the zboot code so that it incorporates all the EFI stub logic, rather than calling the decompressed kernel as a EFI app. - Add support for initrd= command line option to x86 mixed mode. - Allow initrd= to be used with arbitrary EFI accessible file systems instead of just the one the kernel itself was loaded from. - Move some x86-only handling and manipulation of the EFI memory map into arch/x86, as it is not used anywhere else. - More flexible handling of any random seeds provided by the boot environment (i.e., systemd-boot) so that it becomes available much earlier during the boot. - Allow improved arch-agnostic EFI support in loaders, by setting a uniform baseline of supported features, and adding a generic magic number to the DOS/PE header. This should allow loaders such as GRUB or systemd-boot to reduce the amount of arch-specific handling substantially. - (arm64) Run EFI runtime services from a dedicated stack, and use it to recover from synchronous exceptions that might occur in the firmware code. - (arm64) Ensure that we don't allocate memory outside of the 48-bit addressable physical range. - Make EFI pstore record size configurable - Add support for decoding CXL specific CPER records" * tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (43 commits) arm64: efi: Recover from synchronous exceptions occurring in firmware arm64: efi: Execute runtime services from a dedicated stack arm64: efi: Limit allocations to 48-bit addressable physical region efi: Put Linux specific magic number in the DOS header efi: libstub: Always enable initrd command line loader and bump version efi: stub: use random seed from EFI variable efi: vars: prohibit reading random seed variables efi: random: combine bootloader provided RNG seed with RNG protocol output efi/cper, cxl: Decode CXL Error Log efi/cper, cxl: Decode CXL Protocol Error Section efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment efi: x86: Move EFI runtime map sysfs code to arch/x86 efi: runtime-maps: Clarify purpose and enable by default for kexec efi: pstore: Add module parameter for setting the record size efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures efi: memmap: Move manipulation routines into x86 arch tree efi: memmap: Move EFI fake memmap support into x86 arch tree efi: libstub: Undeprecate the command line initrd loader efi: libstub: Add mixed mode support to command line initrd loader efi: libstub: Permit mixed mode return types other than efi_status_t ...
2022-11-22efi: random: refresh non-volatile random seed when RNG is initializedJason A. Donenfeld
EFI has a rather unique benefit that it has access to some limited non-volatile storage, where the kernel can store a random seed. Register a notification for when the RNG is initialized, and at that point, store a new random seed. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18efi: random: combine bootloader provided RNG seed with RNG protocol outputArd Biesheuvel
Instead of blindly creating the EFI random seed configuration table if the RNG protocol is implemented and works, check whether such a EFI configuration table was provided by an earlier boot stage and if so, concatenate the existing and the new seeds, leaving it up to the core code to mix it in and credit it the way it sees fit. This can be used for, e.g., systemd-boot, to pass an additional seed to Linux in a way that can be consumed by the kernel very early. In that case, the following definitions should be used to pass the seed to the EFI stub: struct linux_efi_random_seed { u32 size; // of the 'seed' array in bytes u8 seed[]; }; The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY pool memory, and the address of the struct in memory should be installed as a EFI configuration table using the following GUID: LINUX_EFI_RANDOM_SEED_TABLE_GUID 1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b Note that doing so is safe even on kernels that were built without this patch applied, but the seed will simply be overwritten with a seed derived from the EFI RNG protocol, if available. The recommended seed size is 32 bytes, and seeds larger than 512 bytes are considered corrupted and ignored entirely. In order to preserve forward secrecy, seeds from previous bootloaders are memzero'd out, and in order to preserve memory, those older seeds are also freed from memory. Freeing from memory without first memzeroing is not safe to do, as it's possible that nothing else will ever overwrite those pages used by EFI. Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> [ardb: incorporate Jason's followup changes to extend the maximum seed size on the consumer end, memzero() it and drop a needless printk] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18efi: x86: Move EFI runtime map sysfs code to arch/x86Ard Biesheuvel
The EFI runtime map code is only wired up on x86, which is the only architecture that has a need for it in its implementation of kexec. So let's move this code under arch/x86 and drop all references to it from generic code. To ensure that the efi_runtime_map_init() is invoked at the appropriate time use a 'sync' subsys_initcall() that will be called right after the EFI initcall made from generic code where the original invocation of efi_runtime_map_init() resided. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Dave Young <dyoung@redhat.com>
2022-11-09efi: libstub: Move screen_info handling to common codeArd Biesheuvel
Currently, arm64, RISC-V and LoongArch rely on the fact that struct screen_info can be accessed directly, due to the fact that the EFI stub and the core kernel are part of the same image. This will change after a future patch, so let's ensure that the screen_info handling is able to deal with this, by adopting the arm32 approach of passing it as a configuration table. While at it, switch to ACPI reclaim memory to hold the screen_info data, which is more appropriate for this kind of allocation. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-24efi: random: reduce seed size to 32 bytesArd Biesheuvel
We no longer need at least 64 bytes of random seed to permit the early crng init to complete. The RNG is now based on Blake2s, so reduce the EFI seed size to the Blake2s hash size, which is sufficient for our purposes. While at it, drop the READ_ONCE(), which was supposed to prevent size from being evaluated after seed was unmapped. However, this cannot actually happen, so READ_ONCE() is unnecessary here. Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
2022-10-21efi: ssdt: Don't free memory if ACPI table was loaded successfullyArd Biesheuvel
Amadeusz reports KASAN use-after-free errors introduced by commit 3881ee0b1edc ("efi: avoid efivars layer when loading SSDTs from variables"). The problem appears to be that the memory that holds the new ACPI table is now freed unconditionally, instead of only when the ACPI core reported a failure to load the table. So let's fix this, by omitting the kfree() on success. Cc: <stable@vger.kernel.org> # v6.0 Link: https://lore.kernel.org/all/a101a10a-4fbb-5fae-2e3c-76cf96ed8fbd@linux.intel.com/ Fixes: 3881ee0b1edc ("efi: avoid efivars layer when loading SSDTs from variables") Reported-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-10Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-09-27efi: libstub: unify initrd loading between architecturesArd Biesheuvel
Use a EFI configuration table to pass the initrd to the core kernel, instead of per-arch methods. This cleans up the code considerably, and should make it easier for architectures to get rid of their reliance on DT for doing EFI boot in the future. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-26mm: remove rb tree.Liam R. Howlett
Remove the RB tree and start using the maple tree for vm_area_struct tracking. Drop validate_mm() calls in expand_upwards() and expand_downwards() as the lock is not held. Link: https://lkml.kernel.org/r/20220906194824.2110408-18-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm: start tracking VMAs with maple treeLiam R. Howlett
Start tracking the VMAs with the new maple tree structure in parallel with the rb_tree. Add debug and trace events for maple tree operations and duplicate the rb_tree that is created on forks into the maple tree. The maple tree is added to the mm_struct including the mm_init struct, added support in required mm/mmap functions, added tracking in kernel/fork for process forking, and used to find the unmapped_area and checked against what the rbtree finds. This also moves the mmap_lock() in exit_mmap() since the oom reaper call does walk the VMAs. Otherwise lockdep will be unhappy if oom happens. When splitting a vma fails due to allocations of the maple tree nodes, the error path in __split_vma() calls new->vm_ops->close(new). The page accounting for hugetlb is actually in the close() operation, so it accounts for the removal of 1/2 of the VMA which was not adjusted. This results in a negative exit value. To avoid the negative charge, set vm_start = vm_end and vm_pgoff = 0. There is also a potential accounting issue in special mappings from insert_vm_struct() failing to allocate, so reverse the charge there in the failure scenario. Link: https://lkml.kernel.org/r/20220906194824.2110408-9-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-24efi: vars: Move efivar caching layer into efivarfsArd Biesheuvel
Move the fiddly bits of the efivar layer into its only remaining user, efivarfs, and confine its use to that particular module. All other uses of the EFI variable store have no need for this additional layer of complexity, given that they either only read variables, or read and write variables into a separate GUIDed namespace, and cannot be used to manipulate EFI variables that are covered by the EFI spec and/or affect the boot flow. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-20efi: avoid efivars layer when loading SSDTs from variablesArd Biesheuvel
The efivars intermediate variable access layer provides an abstraction that permits the EFI variable store to be replaced by something else that implements a compatible interface, and caches all variables in the variable store for fast access via the efivarfs pseudo-filesystem. The SSDT override feature does not take advantage of either feature, as it is only used when the generic EFI implementation of efivars is used, and it traverses all variables only once to find the ones it is interested in, and frees all data structures that the efivars layer keeps right after. So in this case, let's just call EFI's code directly, using the function pointers in struct efi. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-15efi: Make code to find mirrored memory ranges genericMa Wupeng
Commit b05b9f5f9dcf ("x86, mirror: x86 enabling - find mirrored memory ranges") introduce the efi_find_mirror() function on x86. In order to reuse the API we make it public. Arm64 can support mirrored memory too, so function efi_find_mirror() is added to efi_init() to this support for arm64. Since efi_init() is shared by ARM, arm64 and riscv, this patch will bring mirror memory support for these architectures, but this support is only tested in arm64. Signed-off-by: Ma Wupeng <mawupeng1@huawei.com> Link: https://lore.kernel.org/r/20220614092156.1972846-2-mawupeng1@huawei.com [ardb: fix subject to better reflect the payload] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-04-13efi: Register efi_secret platform device if EFI secret area is declaredDov Murik
During efi initialization, check if coco_secret is defined in the EFI configuration table; in such case, register platform device "efi_secret". This allows udev to automatically load the efi_secret module (platform driver), which in turn will populate the <securityfs>/secrets/coco directory in guests into which secrets were injected. Note that a declared address of an EFI secret area doesn't mean that secrets where indeed injected to that area; if the secret area is not populated, the driver will not load (but the platform device will still be registered). Signed-off-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Link: https://lore.kernel.org/r/20220412212127.154182-4-dovmurik@linux.ibm.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-04-13efi: Save location of EFI confidential computing areaDov Murik
Confidential computing (coco) hardware such as AMD SEV (Secure Encrypted Virtualization) allows a guest owner to inject secrets into the VMs memory without the host/hypervisor being able to read them. Firmware support for secret injection is available in OVMF, which reserves a memory area for secret injection and includes a pointer to it the in EFI config table entry LINUX_EFI_COCO_SECRET_TABLE_GUID. If EFI exposes such a table entry, uefi_init() will keep a pointer to the EFI config table entry in efi.coco_secret, so it can be used later by the kernel (specifically drivers/virt/coco/efi_secret). It will also appear in the kernel log as "CocoSecret=ADDRESS"; for example: [ 0.000000] efi: EFI v2.70 by EDK II [ 0.000000] efi: CocoSecret=0x7f22e680 SMBIOS=0x7f541000 ACPI=0x7f77e000 ACPI 2.0=0x7f77e014 MEMATTR=0x7ea0c018 The new functionality can be enabled with CONFIG_EFI_COCO_SECRET=y. Signed-off-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Link: https://lore.kernel.org/r/20220412212127.154182-2-dovmurik@linux.ibm.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-04-13efi: Allow to enable EFI runtime services by default on RTJavier Martinez Canillas
Commit d9f283ae71af ("efi: Disable runtime services on RT") disabled EFI runtime services by default when the CONFIG_PREEMPT_RT option is enabled. The rationale for that commit is that some EFI calls could take too much time, leading to large latencies which is an issue for Real-Time kernels. But a side effect of that change was that now is not possible anymore to enable the EFI runtime services by default when CONFIG_PREEMPT_RT is set, without passing an efi=runtime command line parameter to the kernel. Instead, let's add a new EFI_DISABLE_RUNTIME boolean Kconfig option, that would be set to n by default but to y if CONFIG_PREEMPT_RT is enabled. That way, the current behaviour is preserved but gives users a mechanism to enable the EFI runtimes services in their kernels if that is required. For example, if the firmware could guarantee bounded time for EFI calls. Also, having a separate boolean config could allow users to disable the EFI runtime services by default even when CONFIG_PREEMPT_RT is not set. Reported-by: Alexander Larsson <alexl@redhat.com> Fixes: d9f283ae71af ("efi: Disable runtime services on RT") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/r/20220331151654.184433-1-javierm@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-03-01efi: fix return value of __setup handlersRandy Dunlap
When "dump_apple_properties" is used on the kernel boot command line, it causes an Unknown parameter message and the string is added to init's argument strings: Unknown kernel command line parameters "dump_apple_properties BOOT_IMAGE=/boot/bzImage-517rc6 efivar_ssdt=newcpu_ssdt", will be passed to user space. Run /sbin/init as init process with arguments: /sbin/init dump_apple_properties with environment: HOME=/ TERM=linux BOOT_IMAGE=/boot/bzImage-517rc6 efivar_ssdt=newcpu_ssdt Similarly when "efivar_ssdt=somestring" is used, it is added to the Unknown parameter message and to init's environment strings, polluting them (see examples above). Change the return value of the __setup functions to 1 to indicate that the __setup options have been handled. Fixes: 58c5475aba67 ("x86/efi: Retrieve and assign Apple device properties") Fixes: 475fb4e8b2f4 ("efi / ACPI: load SSTDs from EFI variables") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru> Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Cc: Ard Biesheuvel <ardb@kernel.org> Cc: linux-efi@vger.kernel.org Cc: Lukas Wunner <lukas@wunner.de> Cc: Octavian Purdila <octavian.purdila@intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Link: https://lore.kernel.org/r/20220301041851.12459-1-rdunlap@infradead.org Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-01-23efi: runtime: avoid EFIv2 runtime services on Apple x86 machinesArd Biesheuvel
Aditya reports [0] that his recent MacbookPro crashes in the firmware when using the variable services at runtime. The culprit appears to be a call to QueryVariableInfo(), which we did not use to call on Apple x86 machines in the past as they only upgraded from EFI v1.10 to EFI v2.40 firmware fairly recently, and QueryVariableInfo() (along with UpdateCapsule() et al) was added in EFI v2.00. The only runtime service introduced in EFI v2.00 that we actually use in Linux is QueryVariableInfo(), as the capsule based ones are optional, generally not used at runtime (all the LVFS/fwupd firmware update infrastructure uses helper EFI programs that invoke capsule update at boot time, not runtime), and not implemented by Apple machines in the first place. QueryVariableInfo() is used to 'safely' set variables, i.e., only when there is enough space. This prevents machines with buggy firmwares from corrupting their NVRAMs when they run out of space. Given that Apple machines have been using EFI v1.10 services only for the longest time (the EFI v2.0 spec was released in 2006, and Linux support for the newly introduced runtime services was added in 2011, but the MacbookPro12,1 released in 2015 still claims to be EFI v1.10 only), let's avoid the EFI v2.0 ones on all Apple x86 machines. [0] https://lore.kernel.org/all/6D757C75-65B1-468B-842D-10410081A8E4@live.com/ Cc: <stable@vger.kernel.org> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Reported-by: Aditya Garg <gargaditya08@live.com> Tested-by: Orlando Chamberlain <redecorating@protonmail.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Aditya Garg <gargaditya08@live.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215277
2021-09-28efi: Allow efi=runtimeSebastian Andrzej Siewior
In case the command line option "efi=noruntime" is default at built-time, the user could overwrite its state by `efi=runtime' and allow it again. This is useful on PREEMPT_RT where "efi=noruntime" is default and the user might need to alter the boot order for instance. Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2021-09-28efi: Disable runtime services on RTSebastian Andrzej Siewior
Based on measurements the EFI functions get_variable / get_next_variable take up to 2us which looks okay. The functions get_time, set_time take around 10ms. These 10ms are too much. Even one ms would be too much. Ard mentioned that SetVariable might even trigger larger latencies if the firmware will erase flash blocks on NOR. The time-functions are used by efi-rtc and can be triggered during run-time (either via explicit read/write or ntp sync). The variable write could be used by pstore. These functions can be disabled without much of a loss. The poweroff / reboot hooks may be provided by PSCI. Disable EFI's runtime wrappers on PREEMPT_RT. This was observed on "EFI v2.60 by SoftIron Overdrive 1000". Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2021-07-16firmware/efi: Tell memblock about EFI iomem reservationsMarc Zyngier
kexec_load_file() relies on the memblock infrastructure to avoid stamping over regions of memory that are essential to the survival of the system. However, nobody seems to agree how to flag these regions as reserved, and (for example) EFI only publishes its reservations in /proc/iomem for the benefit of the traditional, userspace based kexec tool. On arm64 platforms with GICv3, this can result in the payload being placed at the location of the LPI tables. Shock, horror! Let's augment the EFI reservation code with a memblock_reserve() call, protecting our dear tables from the secondary kernel invasion. Reported-by: Moritz Fischer <mdf@kernel.org> Tested-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2021-03-19firmware/efi: Fix a use after bug in efi_mem_reserve_persistentLv Yunlong
In the for loop in efi_mem_reserve_persistent(), prsv = rsv->next use the unmapped rsv. Use the unmapped pages will cause segment fault. Fixes: 18df7577adae6 ("efi/memreserve: deal with memreserve entries in unmapped memory") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-12-15mm/gup: prevent gup_fast from racing with COW during forkJason Gunthorpe
Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes") pages under a FOLL_PIN will not be write protected during COW for fork. This means that pages returned from pin_user_pages(FOLL_WRITE) should not become write protected while the pin is active. However, there is a small race where get_user_pages_fast(FOLL_PIN) can establish a FOLL_PIN at the same time copy_present_page() is write protecting it: CPU 0 CPU 1 get_user_pages_fast() internal_get_user_pages_fast() copy_page_range() pte_alloc_map_lock() copy_present_page() atomic_read(has_pinned) == 0 page_maybe_dma_pinned() == false atomic_set(has_pinned, 1); gup_pgd_range() gup_pte_range() pte_t pte = gup_get_pte(ptep) pte_access_permitted(pte) try_grab_compound_head() pte = pte_wrprotect(pte) set_pte_at(); pte_unmap_unlock() // GUP now returns with a write protected page The first attempt to resolve this by using the write protect caused problems (and was missing a barrrier), see commit f3c64eda3e50 ("mm: avoid early COW write protect games during fork()") Instead wrap copy_p4d_range() with the write side of a seqcount and check the read side around gup_pgd_range(). If there is a collision then get_user_pages_fast() fails and falls back to slow GUP. Slow GUP is safe against this race because copy_page_range() is only called while holding the exclusive side of the mmap_lock on the src mm_struct. [akpm@linux-foundation.org: coding style fixes] Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: "Ahmed S. Darwish" <a.darwish@linutronix.de> [seqcount_t parts] Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-25efi/efivars: Set generic ops before loading SSDTAmadeusz Sławiński
Efivars allows for overriding of SSDT tables, however starting with commit bf67fad19e493b ("efi: Use more granular check for availability for variable services") this use case is broken. When loading SSDT generic ops should be set first, however mentioned commit reversed order of operations. Fix this by restoring original order of operations. Fixes: bf67fad19e493b ("efi: Use more granular check for availability for variable services") Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Link: https://lore.kernel.org/r/20201123172817.124146-1-amadeuszx.slawinski@linux.intel.com Tested-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-10-12Merge branch 'efi/urgent' into efi/core, to pick up fixesIngo Molnar
These fixes missed the v5.9 merge window, pick them up for early v5.10 merge. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-09-25efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report itArd Biesheuvel
Incorporate the definition of EFI_MEMORY_CPU_CRYPTO from the UEFI specification v2.8, and wire it into our memory map dumping routine as well. To make a bit of space in the output buffer, which is provided by the various callers, shorten the descriptive names of the memory types. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>