Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- cpuset isolation improvements
- cpuset cgroup1 support is split into its own file behind the new
config option CONFIG_CPUSET_V1. This makes it the second controller
which makes cgroup1 support optional after memcg
- Handling of unavailable v1 controller handling improved during
cgroup1 mount operations
- union_find applied to cpuset. It makes code simpler and more
efficient
- Reduce spurious events in pids.events
- Cleanups and other misc changes
- Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes
that further changes build upon
* tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (34 commits)
cgroup: Do not report unavailable v1 controllers in /proc/cgroups
cgroup: Disallow mounting v1 hierarchies without controller implementation
cgroup/cpuset: Expose cpuset filesystem with cpuset v1 only
cgroup/cpuset: Move cpu.h include to cpuset-internal.h
cgroup/cpuset: add sefltest for cpuset v1
cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1
cgroup/cpuset: rename functions shared between v1 and v2
cgroup/cpuset: move v1 interfaces to cpuset-v1.c
cgroup/cpuset: move validate_change_legacy to cpuset-v1.c
cgroup/cpuset: move legacy hotplug update to cpuset-v1.c
cgroup/cpuset: add callback_lock helper
cgroup/cpuset: move memory_spread to cpuset-v1.c
cgroup/cpuset: move relax_domain_level to cpuset-v1.c
cgroup/cpuset: move memory_pressure to cpuset-v1.c
cgroup/cpuset: move common code to cpuset-internal.h
cgroup/cpuset: introduce cpuset-v1.c
selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs
cgroup/cpuset: Account for boot time isolated CPUs
cgroup/cpuset: remove use_parent_ecpus of cpuset
cgroup/cpuset: remove fetch_xcpus
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"A fairly big update at this time, both in core and driver sides.
The core received rewrites in PCM buffer allocation handling and
locking optimizations, PCM rate updates followed by lots of cleanups.
In ASoC side, the legacy Intel drivers have been deprecated by AVS
drivers which leaded to the significant amount of code reduction.
SoundWire driver updates and other cleanups contributed more code
reduction, too.
USB-audio driver received a large cleanup of its big quirk table, and
the old snd_print*() API usages in many legacy drivers are replaced
with the standard print API.
Here are some highlights:
Core:
- More optimized locking in ALSA control code
- Rewrites of memalloc helpers for better DMA API usage
- Drop of obsoleted vmalloc PCM buffer helper API
- Continued MIDI2 UMP updates
- Support of a new user-space driven timer instance
- Update for more PCM support rates and cleanups
- Xrun counter report in the proc files
ASoC:
- Continued simplification and cleanup works for ASoC
- Extensive cleanups and refactoring of the Soundwire drivers
- Removal of Intel machine support obsoleted by the AVS driver
- Lots of DT schema conversions
- Machine support for many AMD and Intel x86 platforms
- Support for AMD ACP 7.1, Mediatek MT6367 and MT8365, Realtek
RTL1320 SoundWire and rev C, and Texas Instruments TAS2563
USB-audio:
- Add support of multiple control interfaces
- A large rewrite of quirk table with macros
- Support for RME Digiface USB
HD-audio:
- Cleanup of quirk code for Samsung Galaxy laptops
- Clean up of detection of Cirrus codecs
- C-Media CM9825 HD-audio codec support
Others:
- Rewrites to standard print API in a lot of legacy drivers"
* tag 'sound-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (410 commits)
ASoC: topology: Fix redundant logical jump
ASoC: tas2781: Add Calibration Kcontrols for Chromebook
ASoC: amd: acp: refactor SoundWire machine driver code
ASoC: sdw_utils/intel: move soundwire endpoint parsing helper functions
ASoC: sdw_util/intel: move soundwire endpoint and dai link structures
ASoC: intel: sof_sdw: rename soundwire parsing helper functions
ASoC: intel: sof_sdw: rename soundwire endpoint and dailink structures
ASoC: atmel: mchp-pdmc: Retain Non-Runtime Controls
ALSA: hda/realtek: Add support for Galaxy Book2 Pro (NP950XEE)
ASoC: mediatek: mt7986-afe-pcm: Remove redundant error message
ALSA: memalloc: Use proper DMA mapping API for x86 S/G buffer allocations
ALSA: memalloc: Use proper DMA mapping API for x86 WC buffer allocations
ALSA: usb-audio: Add logitech Audio profile quirk
ASoc: mediatek: mt8365: Remove unneeded assignment
ASoC: Intel: ARL: Add entry for HDMI-In capture support to non-I2S codec boards.
ASoC: Intel: sof_rt5682: Add HDMI-In capture with rt5682 support for ARL.
ASoC: SOF: Intel: hda: remove common_hdmi_codec_drv
ASoC: Intel: sof_pcm512x: do not check common_hdmi_codec_drv
ASoC: Intel: ehl_rt5660: do not check common_hdmi_codec_drv
ASoC: Intel: skl_hda_dsp_generic: use common module for DAI links
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer updates from Thomas Gleixner:
- Use the topology information of number of packages for making the
decision about TSC trust instead of using the number of online nodes
which is not reflecting the real topology.
- Stop the PIT timer 0 when its not in use as to stop pointless
emulation in the VMM.
- Fix the PIT timer stop sequence for timer 0 so it truly stops both
real hardware and buggy VMM emulations.
* tag 'x86-timers-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Check for sockets instead of CPUs to make code match comment
clockevents/drivers/i8253: Fix stop sequence for timer 0
x86/i8253: Disable PIT timer 0 when not in use
x86/tsc: Use topology_max_packages() to get package number
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Thomas Gleixner:
- Rework kcpuid to handle the the autogenerated CSV file correctly and
update the CSV file to cover the whole zoo of CPUID.
- Avoid memcpy() for ia32 syscall_get_arguments() and use direct
assignments as fortified memcpy() is unhappy about writing/reading
beyond the end of the addresses destination/source struct member
- A few new PCI IDs for AMD
- Update MAINTAINERS to cover x86 specific selftests
* tag 'x86-misc-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add selftests/x86 entry
x86/amd_nb: Add new PCI IDs for AMD family 1Ah model 60h-70h
x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments()
MAINTAINERS: Add x86 cpuid database entry
tools/x86/kcpuid: Introduce a complete cpuid bitfields CSV file
tools/x86/kcpuid: Parse subleaf ranges if provided
tools/x86/kcpuid: Recognize all leaves with subleaves
tools/x86/kcpuid: Strip bitfield names leading/trailing whitespace
tools/x86/kcpuid: Protect against faulty "max subleaf" values
tools/x86/kcpuid: Set max possible subleaves count to 64
tools/x86/kcpuid: Properly align long-description columns
tools/x86/kcpuid: Remove unused variable
x86/amd_nb: Add new PCI IDs for AMD family 1Ah model 60h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 memory management updates from Thomas Gleixner:
- Make LAM enablement safe vs. kernel threads using a process mm
temporarily as switching back to the process would not update CR3 and
therefore not enable LAM causing faults in user space when using
tagged pointers. Cure it by synchronizing LAM enablement via IPIs to
all CPUs which use the related mm.
- Cure a LAM harmless inconsistency between CR3 and the state during
context switch. It's both confusing and prone to lead to real bugs
- Handle alt stack handling for threads which run with a non-zero
protection key. The non-zero key prevents the kernel to access the
alternate stack. Cure it by temporarily enabling all protection keys
for the alternate stack setup/restore operations.
- Provide a EFI config table identity mapping for kexec kernel to
prevent kexec fails because the new kernel cannot access the config
table array
- Use GB pages only when a full GB is mapped in the identity map as
otherwise the CPU can speculate into reserved areas after the end of
memory which causes malfunction on UV systems.
- Remove the noisy and pointless SRAT table dump during boot
- Use is_ioremap_addr() for iounmap() address range checks instead of
high_memory. is_ioremap_addr() is more precise.
* tag 'x86-mm-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioremap: Improve iounmap() address range checks
x86/mm: Remove duplicate check from build_cr3()
x86/mm: Remove unused NX related declarations
x86/mm: Remove unused CR3_HW_ASID_BITS
x86/mm: Don't print out SRAT table information
x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
x86/kexec: Add EFI config table identity mapping for kexec kernel
selftests/mm: Add new testcases for pkeys
x86/pkeys: Restore altstack access in sigreturn()
x86/pkeys: Update PKRU to enable all pkeys before XSAVE
x86/pkeys: Add helper functions to update PKRU on the sigframe
x86/pkeys: Add PKRU as a parameter in signal handling functions
x86/mm: Cleanup prctl_enable_tagged_addr() nr_bits error checking
x86/mm: Fix LAM inconsistency during context switch
x86/mm: Use IPIs to synchronize LAM enablement
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core update from Thomas Gleixner:
"Enable UBSAN traps for x86, which provides better reporting through
metadata encodeded into UD1"
* tag 'x86-core-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/traps: Enable UBSAN traps on x86
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
"The driver updates seem larger this time around, with changes is many
of the SoC specific drivers, both the custom drivers/soc ones and the
closely related subsystems (memory, bus, firmware, reset, ...).
The at91 platform gains support for sam9x7 chips in the soc and power
management code. This is the latest variant of one of the oldest still
supported SoC families, using the ARM9 (ARMv5) core.
As usual, the qualcomm snapdragon platform gets a ton of updates in
many of their drivers to add more features and additional SoC support.
Most of these are somewhat firmware related as the platform has a
number of firmware based interfaces to the kernel. A notable addition
here is the inclusion of trace events to two of these drivers.
Herve Codina and Christophe Leroy are now sending updates for
drivers/soc/fsl/ code through the SoC tree, this contains both PowerPC
and Arm specific platforms and has previously been problematic to
maintain. The first update here contains support for newer PowerPC
variants and some cleanups.
The turris mox firmware driver has a number of updates, mostly
cleanups.
The Arm SCMI firmware driver gets a major rework to modularize the
existing code into separately loadable drivers for the various
transports, the addition of custom NXP i.MX9 interfaces and a number
of smaller updates.
The Arm FF-A firmware driver gets a feature update to support the v1.2
version of the specification.
The reset controller drivers have some smaller cleanups and a newly
added driver for the Intel/Mobileye EyeQ5/EyeQ6 MIPS SoCs.
The memory controller drivers get some cleanups and refactoring for
Tegra, TI, Freescale/NXP and a couple more platforms.
Finally there are lots of minor updates to firmware (raspberry pi,
tegra, imx), bus (sunxi, omap, tegra) and soc (rockchips, tegra,
amlogic, mediatek) drivers and their DT bindings"
* tag 'soc-drivers-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (212 commits)
firmware: imx: remove duplicate scmi_imx_misc_ctrl_get()
platform: cznic: turris-omnia-mcu: Fix error check in omnia_mcu_register_trng()
bus: sunxi-rsb: Simplify code with dev_err_probe()
soc: fsl: qe: ucc: Export ucc_mux_set_grant_tsa_bkpt
soc: fsl: cpm1: qmc: Fix dependency on fsl_soc.h
dt-bindings: arm: rockchip: Add rk3576 compatible string to pmu.yaml
soc: fsl: qbman: Remove redundant warnings
soc: fsl: qbman: Use iommu_paging_domain_alloc()
MAINTAINERS: Add QE files related to the Freescale QMC controller
soc: fsl: cpm1: qmc: Handle QUICC Engine (QE) soft-qmc firmware
soc: fsl: cpm1: qmc: Add support for QUICC Engine (QE) implementation
soc: fsl: qe: Add missing PUSHSCHED command
soc: fsl: qe: Add resource-managed muram allocators
soc: fsl: cpm1: qmc: Introduce qmc_version
soc: fsl: cpm1: qmc: Rename SCC_GSMRL_MODE_QMC
soc: fsl: cpm1: qmc: Handle RPACK initialization
soc: fsl: cpm1: qmc: Rename qmc_chan_command()
soc: fsl: cpm1: qmc: Introduce qmc_{init,exit}_xcc() and their CPM1 version
soc: fsl: cpm1: qmc: Introduce qmc_init_resource() and its CPM1 version
soc: fsl: cpm1: qmc: Re-order probe() operations
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"This is quite a quiet release for SPI. The one new core feature here
is support for configuring the state of the MOSI pin when the bus is
idle, there are some devices which are very fragile in this regard
even when the chip select signal is not asserted. Otherwise we have
some new driver support, a bunch of small fixes and some general
cleanup work.
- Support for configuring the state of the MOSI pin when the the bus
is idle
- Add the Elgin JG0309-01 in spidev
- Support for Marvell xSPI, Mediatek MTK7981, Microchip PIC64GX, NXP
i.MX8ULP, and Rockchip RK3576 controllers
I also accidentally pulled in an IIO DT bindings update due to a typo
when applying the MOSI idle state patches"
* tag 'spi-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (65 commits)
spi: geni-qcom: Use devm functions to simplify code
spi: remove spi_controller_is_slave() and spi_slave_abort()
platform/olpc: olpc-xo175-ec: switch to use spi_target_abort().
spi: slave-mt27xx: switch to use target_abort
spi: spidev: switch to use spi_target_abort()
spi: slave-system-control: switch to use spi_target_abort()
spi: slave-time: switch to use spi_target_abort()
spi: switch to use spi_controller_is_target()
spi: fspi: add support for imx8ulp
spi: fspi: involve lut_num for struct nxp_fspi_devtype_data
dt-bindings: spi: nxp-fspi: add imx8ulp support
spi: spidev_fdx: Fix the wrong format specifier
spi: mxs: Switch to RUNTIME/SYSTEM_SLEEP_PM_OPS()
spi: dt-bindings: Add rockchip,rk3576-spi compatible
spi: Revert "spi: Insert the missing pci_dev_put()before return"
spi: zynq-qspi: Replace kzalloc with kmalloc for buffer allocation
spi: ppc4xx: Sort headers
spi: ppc4xx: Revert "handle irq_of_parse_and_map() errors"
spi: zynqmp-gqspi: Simplify with dev_err_probe()
spi: zynqmp-gqspi: Use devm_spi_alloc_host()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"This release is almost all cleanup work of various kinds, while the
diffstat for the core is quite large this is almost all cleanups and
documentation improvments with some small fixes rather than any new
feature work. We do have support for a couple of new devices but these
are small additions to existing drivers rather than new drivers.
- Removal of the SM5703 driver which does not have it's dependencies
available.
- Support for Allwinner AXP717, and Qualcomm WCN6855.
The Allwinner support shares some commits with the MFD tree"
* tag 'regulator-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (66 commits)
regulator: sm5703: Remove because it is unused and fails to build
regulator: Split up _regulator_get()
regulator: update some comments ([gs]et_voltage_vsel vs [gs]et_voltage_sel)
regulator: max8973: Use irq_get_trigger_type() helper
regulator: core: fix the broken behavior of regulator_dev_lookup()
regulator: max77650: Use container_of and constify static data
regulator: hi6421v530: Use container_of and constify static data
regulator: hi6421v530: Drop unused 'eco_microamp'
regulator: qcom-refgen: Constify static data
regulator: pfuze100: Constify static data
regulator: pcap: Constify static data
regulator: mtk-dvfsrc: Constify static data
regulator: max77826: Constify static data
regulator: max77826: Drop unused 'rdesc' in 'struct max77826_regulator_info'
regulator: tps65023: Constify static data
regulator: hi6421v600: Constify static data
regulator: hi6421: Constify static data
regulator: da9121: Constify static data
regulator: da9063: Constify static data
regulator: da9055: Constify static data
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The main update here is Matti's work allowing regmap irqdomains to be
given custom names (allowing multiple interrupt controllers associatd
with a single struct device), this pulls in some commits from Thomas'
tree which it depends on.
Otherwise there's a bit of work on improving handling of regmaps
protected with spinlocks when used with complex cache types, fixing
some valid but harmless lockdep reports seen with some new driver
work"
* tag 'regmap-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: kunit: Add coverage of spinlocked regmaps
regcache: use map->alloc_flags also for allocating cache
regmap: Use locking during kunit tests
regmap: Hold the regmap lock when allocating and freeing the cache
regmap: Allow setting IRQ domain name suffix
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
"This is the "last" part of the support for the new nbcon consoles.
Where "nbcon" stays for "No Big console lock CONsoles" aka not under
the console_lock.
New callbacks are added to struct console:
- write_thread() for flushing nbcon consoles in task context.
- write_atomic() for flushing nbcon consoles in atomic context,
including NMI.
- con->device_lock() and device_unlock() for taking the driver
specific lock, for example, port->lock.
New printk-specific kthreads are created:
- per-console kthreads which get responsible for flushing normal
priority messages on nbcon consoles.
- thread which gets responsible for flushing normal priority messages
on all consoles when CONFIG_RT enabled.
The new callbacks are called under a special per-console lock which
has already been added back in v6.7. It allows to distinguish three
severities: normal, emergency, and panic. A context with a higher
priority could take over the ownership when it is safe even in the
middle of handling a record. The panic context could do it even when
it is not safe. But it is allowed only for the final desperate flush
before entering the infinite loop.
The new lock helps to flush the messages directly in emergency and
panic contexts. But it is not enough in all situations:
- console_lock() is still need for synchronization against boot
consoles.
- con->device_lock() is need for synchronization against other
operations on the same HW, e.g. serial port speed setting,
non-printk related read/write.
The dependency on con->device_lock() is mutual. Any code taking the
driver specific lock has to acquire the related nbcon console context
as well. For example, see the new uart_port_lock() API. It provides
the necessary synchronization against emergency and panic contexts
where the messages are flushed only under the new per-console lock.
Maybe surprisingly, a quite tricky part is the decision how to flush
the consoles in various situations. It has to take into account:
- message priority: normal, emergency, panic
- scheduling context: task, atomic, deferred_legacy
- registered consoles: boot, legacy, nbcon
- threads are running: early boot, suspend, shutdown, panic
- caller: printk(), pr_flush(), printk_flush_in_panic(),
console_unlock(), console_start(), ...
The primary decision is made in printk_get_console_flush_type(). It
creates a hint what the caller should do:
- flush nbcon consoles directly or via the kthread
- call the legacy loop (console_unlock()) directly or via irq_work
The existing behavior is preserved for the legacy consoles. The only
exception is that they are not longer flushed directly from printk()
in panic() before CPUs are stopped. But this blocking happens only
when at least one nbcon console is registered. The motivation is to
increase a chance to produce the crash dump. They legacy consoles
might create a deadlock in compare with nbcon consoles. The nbcon
console should allow to see the messages even when the crash dump
fails.
There are three possible ways how nbcon consoles are flushed:
- The per-nbcon-console kthread is responsible for flushing messages
added with the normal priority. This is the default mode.
- The legacy loop, aka console_unlock(), is used when there is still
a boot console registered. There is no easy way how to match an
early console driver with a nbcon console driver. And the
console_lock() provides the only reliable serialization at the
moment.
The legacy loop uses either con->write_atomic() or
con->write_thread() callbacks depending on whether it is allowed to
schedule. The atomic variant has to be used from printk().
- In other situations, the messages are flushed directly using
write_atomic() which can be called in any context, including NMI.
It is primary needed during early boot or shutdown, in emergency
situations, and panic.
The emergency priority is used by a code called within
nbcon_cpu_emergency_enter()/exit(). At the moment, it is used in four
situations: WARN(), Oops, lockdep, and RCU stall reports.
Finally, there is no nbcon console at the moment. It means that the
changes should _not_ modify the existing behavior. The only exception
is CONFIG_RT which would force offloading the legacy loop, for normal
priority context, into the dedicated kthread"
* tag 'printk-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (54 commits)
printk: Avoid false positive lockdep report for legacy printing
printk: nbcon: Assign nice -20 for printing threads
printk: Implement legacy printer kthread for PREEMPT_RT
tty: sysfs: Add nbcon support for 'active'
proc: Add nbcon support for /proc/consoles
proc: consoles: Add notation to c_start/c_stop
printk: nbcon: Show replay message on takeover
printk: Provide helper for message prepending
printk: nbcon: Rely on kthreads for normal operation
printk: nbcon: Use thread callback if in task context for legacy
printk: nbcon: Relocate nbcon_atomic_emit_one()
printk: nbcon: Introduce printer kthreads
printk: nbcon: Init @nbcon_seq to highest possible
printk: nbcon: Add context to usable() and emit()
printk: Flush console on unregister_console()
printk: Fail pr_flush() if before SYSTEM_SCHEDULING
printk: nbcon: Add function for printers to reacquire ownership
printk: nbcon: Use raw_cpu_ptr() instead of open coding
printk: Use the BITS_PER_LONG macro
lockdep: Mark emergency sections in lockdep splats
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Core:
- Overhaul of posix-timers in preparation of removing the workaround
for periodic timers which have signal delivery ignored.
- Remove the historical extra jiffie in msleep()
msleep() adds an extra jiffie to the timeout value to ensure
minimal sleep time. The timer wheel ensures minimal sleep time
since the large rewrite to a non-cascading wheel, but the extra
jiffie in msleep() remained unnoticed. Remove it.
- Make the timer slack handling correct for realtime tasks.
The procfs interface is inconsistent and does neither reflect
reality nor conforms to the man page. Show the correct 0 slack for
real time tasks and enforce it at the core level instead of having
inconsistent individual checks in various timer setup functions.
- The usual set of updates and enhancements all over the place.
Drivers:
- Allow the ACPI PM timer to be turned off during suspend
- No new drivers
- The usual updates and enhancements in various drivers"
* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
ntp: Make sure RTC is synchronized when time goes backwards
treewide: Fix wrong singular form of jiffies in comments
cpu: Use already existing usleep_range()
timers: Rename next_expiry_recalc() to be unique
platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
clocksource/drivers/jcore: Use request_percpu_irq()
clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
clocksource: acpi_pm: Add external callback for suspend/resume
clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
dt-bindings: timer: rockchip: Add rk3576 compatible
timers: Annotate possible non critical data race of next_expiry
timers: Remove historical extra jiffie for timeout in msleep()
hrtimer: Use and report correct timerslack values for realtime tasks
hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
timers: Add sparse annotation for timer_sync_wait_running().
signal: Replace BUG_ON()s
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Core:
- Remove a global lock in the affinity setting code
The lock protects a cpumask for intermediate results and the lock
causes a bottleneck on simultaneous start of multiple virtual
machines. Replace the lock and the static cpumask with a per CPU
cpumask which is nicely serialized by raw spinlock held when
executing this code.
- Provide support for giving a suffix to interrupt domain names.
That's required to support devices with subfunctions so that the
domain names are distinct even if they originate from the same
device node.
- The usual set of cleanups and enhancements all over the place
Drivers:
- Support for longarch AVEC interrupt chip
- Refurbishment of the Armada driver so it can be extended for new
variants.
- The usual set of cleanups and enhancements all over the place"
* tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
genirq: Use cpumask_intersects()
genirq/cpuhotplug: Use cpumask_intersects()
irqchip/apple-aic: Only access system registers on SoCs which provide them
irqchip/apple-aic: Add a new "Global fast IPIs only" feature level
irqchip/apple-aic: Skip unnecessary enabling of use_fast_ipi
dt-bindings: apple,aic: Document A7-A11 compatibles
irqdomain: Use IS_ERR_OR_NULL() in irq_domain_trim_hierarchy()
genirq/msi: Use kmemdup_array() instead of kmemdup()
genirq/proc: Change the return value for set affinity permission error
genirq/proc: Use irq_move_pending() in show_irq_affinity()
genirq/proc: Correctly set file permissions for affinity control files
genirq: Get rid of global lock in irq_do_set_affinity()
genirq: Fix typo in struct comment
irqchip/loongarch-avec: Add AVEC irqchip support
irqchip/loongson-pch-msi: Prepare get_pch_msi_handle() for AVECINTC
irqchip/loongson-eiointc: Rename CPUHP_AP_IRQ_LOONGARCH_STARTING
LoongArch: Architectural preparation for AVEC irqchip
LoongArch: Move irqchip function prototypes to irq-loongson.h
irqchip/loongson-pch-msi: Switch to MSI parent domains
softirq: Remove unused 'action' parameter from action callback
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
- Prepare the core for supporting parallel hotplug on loongarch
- A small set of cleanups and enhancements
* tag 'smp-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp: Mark smp_prepare_boot_cpu() __init
cpu: Fix W=1 build kernel-doc warning
cpu/hotplug: Provide weak fallback for arch_cpuhp_init_parallel_bringup()
cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Move the LSM framework to static calls
This transitions the vast majority of the LSM callbacks into static
calls. Those callbacks which haven't been converted were left as-is
due to the general ugliness of the changes required to support the
static call conversion; we can revisit those callbacks at a future
date.
- Add the Integrity Policy Enforcement (IPE) LSM
This adds a new LSM, Integrity Policy Enforcement (IPE). There is
plenty of documentation about IPE in this patches, so I'll refrain
from going into too much detail here, but the basic motivation behind
IPE is to provide a mechanism such that administrators can restrict
execution to only those binaries which come from integrity protected
storage, e.g. a dm-verity protected filesystem. You will notice that
IPE requires additional LSM hooks in the initramfs, dm-verity, and
fs-verity code, with the associated patches carrying ACK/review tags
from the associated maintainers. We couldn't find an obvious
maintainer for the initramfs code, but the IPE patchset has been
widely posted over several years.
Both Deven Bowers and Fan Wu have contributed to IPE's development
over the past several years, with Fan Wu agreeing to serve as the IPE
maintainer moving forward. Once IPE is accepted into your tree, I'll
start working with Fan to ensure he has the necessary accounts, keys,
etc. so that he can start submitting IPE pull requests to you
directly during the next merge window.
- Move the lifecycle management of the LSM blobs to the LSM framework
Management of the LSM blobs (the LSM state buffers attached to
various kernel structs, typically via a void pointer named "security"
or similar) has been mixed, some blobs were allocated/managed by
individual LSMs, others were managed by the LSM framework itself.
Starting with this pull we move management of all the LSM blobs,
minus the XFRM blob, into the framework itself, improving consistency
across LSMs, and reducing the amount of duplicated code across LSMs.
Due to some additional work required to migrate the XFRM blob, it has
been left as a todo item for a later date; from a practical
standpoint this omission should have little impact as only SELinux
provides a XFRM LSM implementation.
- Fix problems with the LSM's handling of F_SETOWN
The LSM hook for the fcntl(F_SETOWN) operation had a couple of
problems: it was racy with itself, and it was disconnected from the
associated DAC related logic in such a way that the LSM state could
be updated in cases where the DAC state would not. We fix both of
these problems by moving the security_file_set_fowner() hook into the
same section of code where the DAC attributes are updated. Not only
does this resolve the DAC/LSM synchronization issue, but as that code
block is protected by a lock, it also resolve the race condition.
- Fix potential problems with the security_inode_free() LSM hook
Due to use of RCU to protect inodes and the placement of the LSM hook
associated with freeing the inode, there is a bit of a challenge when
it comes to managing any LSM state associated with an inode. The VFS
folks are not open to relocating the LSM hook so we have to get
creative when it comes to releasing an inode's LSM state.
Traditionally we have used a single LSM callback within the hook that
is triggered when the inode is "marked for death", but not actually
released due to RCU.
Unfortunately, this causes problems for LSMs which want to take an
action when the inode's associated LSM state is actually released; so
we add an additional LSM callback, inode_free_security_rcu(), that is
called when the inode's LSM state is released in the RCU free
callback.
- Refactor two LSM hooks to better fit the LSM return value patterns
The vast majority of the LSM hooks follow the "return 0 on success,
negative values on failure" pattern, however, there are a small
handful that have unique return value behaviors which has caused
confusion in the past and makes it difficult for the BPF verifier to
properly vet BPF LSM programs. This includes patches to
convert two of these"special" LSM hooks to the common 0/-ERRNO pattern.
- Various cleanups and improvements
A handful of patches to remove redundant code, better leverage the
IS_ERR_OR_NULL() helper, add missing "static" markings, and do some
minor style fixups.
* tag 'lsm-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (40 commits)
security: Update file_set_fowner documentation
fs: Fix file_set_fowner LSM hook inconsistencies
lsm: Use IS_ERR_OR_NULL() helper function
lsm: remove LSM_COUNT and LSM_CONFIG_COUNT
ipe: Remove duplicated include in ipe.c
lsm: replace indirect LSM hook calls with static calls
lsm: count the LSMs enabled at compile time
kernel: Add helper macros for loop unrolling
init/main.c: Initialize early LSMs after arch code, static keys and calls.
MAINTAINERS: add IPE entry with Fan Wu as maintainer
documentation: add IPE documentation
ipe: kunit test for parser
scripts: add boot policy generation program
ipe: enable support for fs-verity as a trust provider
fsverity: expose verified fsverity built-in signatures to LSMs
lsm: add security_inode_setintegrity() hook
ipe: add support for dm-verity as a trust provider
dm-verity: expose root hash digest and signature data to LSMs
block,lsm: add LSM blob and new LSM hooks for block devices
ipe: add permissive toggle
...
|
|
Pull io_uring async discard support from Jens Axboe:
"Sitting on top of both the 6.12 block and io_uring core branches,
here's support for async discard through io_uring.
This allows applications to issue async discards, rather than rely on
the blocking sync ioctl discards we already have. The sync support is
difficult to use outside of idle/cleanup periods.
On a real (but slow) device, testing shows the following results when
compared to sync discard:
qd64 sync discard: 21K IOPS, lat avg 3 msec (max 21 msec)
qd64 async discard: 76K IOPS, lat avg 845 usec (max 2.2 msec)
qd64 sync discard: 14K IOPS, lat avg 5 msec (max 25 msec)
qd64 async discard: 56K IOPS, lat avg 1153 usec (max 3.6 msec)
and synthetic null_blk testing with the same queue depth and block
size settings as above shows:
Type Trim size IOPS Lat avg (usec) Lat Max (usec)
==============================================================
sync 4k 144K 444 20314
async 4k 1353K 47 595
sync 1M 56K 1136 21031
async 1M 94K 680 760"
* tag 'for-6.12/io_uring-discard-20240913' of git://git.kernel.dk/linux:
block: implement async io_uring discard cmd
block: introduce blk_validate_byte_range()
filemap: introduce filemap_invalidate_pages
io_uring/cmd: give inline space in request to cmds
io_uring/cmd: expose iowq to cmds
|
|
Pull block updates from Jens Axboe:
- MD changes via Song:
- md-bitmap refactoring (Yu Kuai)
- raid5 performance optimization (Artur Paszkiewicz)
- Other small fixes (Yu Kuai, Chen Ni)
- Add a sysfs entry 'new_level' (Xiao Ni)
- Improve information reported in /proc/mdstat (Mateusz Kusiak)
- NVMe changes via Keith:
- Asynchronous namespace scanning (Stuart)
- TCP TLS updates (Hannes)
- RDMA queue controller validation (Niklas)
- Align field names to the spec (Anuj)
- Metadata support validation (Puranjay)
- A syntax cleanup (Shen)
- Fix a Kconfig linking error (Arnd)
- New queue-depth quirk (Keith)
- Add missing unplug trace event (Keith)
- blk-iocost fixes (Colin, Konstantin)
- t10-pi modular removal and fixes (Alexey)
- Fix for potential BLKSECDISCARD overflow (Alexey)
- bio splitting cleanups and fixes (Christoph)
- Deal with folios rather than rather than pages, speeding up how the
block layer handles bigger IOs (Kundan)
- Use spinlocks rather than bit spinlocks in zram (Sebastian, Mike)
- Reduce zoned device overhead in ublk (Ming)
- Add and use sendpages_ok() for drbd and nvme-tcp (Ofir)
- Fix regression in partition error pointer checking (Riyan)
- Add support for write zeroes and rotational status in nbd (Wouter)
- Add Yu Kuai as new BFQ maintainer. The scheduler has been
unmaintained for quite a while.
- Various sets of fixes for BFQ (Yu Kuai)
- Misc fixes and cleanups (Alvaro, Christophe, Li, Md Haris, Mikhail,
Yang)
* tag 'for-6.12/block-20240913' of git://git.kernel.dk/linux: (120 commits)
nvme-pci: qdepth 1 quirk
block: fix potential invalid pointer dereference in blk_add_partition
blk_iocost: make read-only static array vrate_adj_pct const
block: unpin user pages belonging to a folio at once
mm: release number of pages of a folio
block: introduce folio awareness and add a bigger size from folio
block: Added folio-ized version of bio_add_hw_page()
block, bfq: factor out a helper to split bfqq in bfq_init_rq()
block, bfq: remove local variable 'bfqq_already_existing' in bfq_init_rq()
block, bfq: remove local variable 'split' in bfq_init_rq()
block, bfq: remove bfq_log_bfqg()
block, bfq: merge bfq_release_process_ref() into bfq_put_cooperator()
block, bfq: fix procress reference leakage for bfqq in merge chain
block, bfq: fix uaf for accessing waker_bfqq after splitting
blk-throttle: support prioritized processing of metadata
blk-throttle: remove last_low_overflow_time
drbd: Add NULL check for net_conf to prevent dereference in state validation
nvme-tcp: fix link failure for TCP auth
blk-mq: add missing unplug trace event
mtip32xx: Remove redundant null pointer checks in mtip_hw_debugfs_init()
...
|
|
Pull io_uring updates from Jens Axboe:
- NAPI fixes and cleanups (Pavel, Olivier)
- Add support for absolute timeouts (Pavel)
- Fixes for io-wq/sqpoll affinities (Felix)
- Efficiency improvements for dealing with huge pages (Chenliang)
- Support for a minwait mode, where the application essentially has two
timouts - one smaller one that defines the batch timeout, and the
overall large one similar to what we had before. This enables
efficient use of batching based on count + timeout, while still
working well with periods of less intensive workloads
- Use ITER_UBUF for single segment sends
- Add support for incremental buffer consumption. Right now each
operation will always consume a full buffer. With incremental
consumption, a recv/read operation only consumes the part of the
buffer that it needs to satisfy the operation
- Add support for GCOV for io_uring, to help retain a high coverage of
test to code ratio
- Fix regression with ocfs2, where an odd -EOPNOTSUPP wasn't correctly
converted to a blocking retry
- Add support for cloning registered buffers from one ring to another
- Misc cleanups (Anuj, me)
* tag 'for-6.12/io_uring-20240913' of git://git.kernel.dk/linux: (35 commits)
io_uring: add IORING_REGISTER_COPY_BUFFERS method
io_uring/register: provide helper to get io_ring_ctx from 'fd'
io_uring/rsrc: add reference count to struct io_mapped_ubuf
io_uring/rsrc: clear 'slot' entry upfront
io_uring/io-wq: inherit cpuset of cgroup in io worker
io_uring/io-wq: do not allow pinning outside of cpuset
io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()
io_uring/rw: treat -EOPNOTSUPP for IOCB_NOWAIT like -EAGAIN
io_uring/sqpoll: do not allow pinning outside of cpuset
io_uring/eventfd: move refs to refcount_t
io_uring: remove unused rsrc_put_fn
io_uring: add new line after variable declaration
io_uring: add GCOV_PROFILE_URING Kconfig option
io_uring/kbuf: add support for incremental buffer consumption
io_uring/kbuf: pass in 'len' argument for buffer commit
Revert "io_uring: Require zeroed sqe->len on provided-buffers send"
io_uring/kbuf: move io_ring_head_to_buf() to kbuf.h
io_uring/kbuf: add io_kbuf_commit() helper
io_uring/kbuf: shrink nr_iovs/mode in struct buf_sel_arg
io_uring: wire up min batch wake timeout
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull netfs updates from Christian Brauner:
"This contains the work to improve read/write performance for the new
netfs library.
The main performance enhancing changes are:
- Define a structure, struct folio_queue, and a new iterator type,
ITER_FOLIOQ, to hold a buffer as a replacement for ITER_XARRAY. See
that patch for questions about naming and form.
ITER_FOLIOQ is provided as a replacement for ITER_XARRAY. The
problem with an xarray is that accessing it requires the use of a
lock (typically the RCU read lock) - and this means that we can't
supply iterate_and_advance() with a step function that might sleep
(crypto for example) without having to drop the lock between pages.
ITER_FOLIOQ is the iterator for a chain of folio_queue structs,
where each folio_queue holds a small list of folios. A folio_queue
struct is a simpler structure than xarray and is not subject to
concurrent manipulation by the VM. folio_queue is used rather than
a bvec[] as it can form lists of indefinite size, adding to one end
and removing from the other on the fly.
- Provide a copy_folio_from_iter() wrapper.
- Make cifs RDMA support ITER_FOLIOQ.
- Use folio queues in the write-side helpers instead of xarrays.
- Add a function to reset the iterator in a subrequest.
- Simplify the write-side helpers to use sheaves to skip gaps rather
than trying to work out where gaps are.
- In afs, make the read subrequests asynchronous, putting them into
work items to allow the next patch to do progressive
unlocking/reading.
- Overhaul the read-side helpers to improve performance.
- Fix the caching of a partial block at the end of a file.
- Allow a store to be cancelled.
Then some changes for cifs to make it use folio queues instead of
xarrays for crypto bufferage:
- Use raw iteration functions rather than manually coding iteration
when hashing data.
- Switch to using folio_queue for crypto buffers.
- Remove the xarray bits.
Make some adjustments to the /proc/fs/netfs/stats file such that:
- All the netfs stats lines begin 'Netfs:' but change this to
something a bit more useful.
- Add a couple of stats counters to track the numbers of skips and
waits on the per-inode writeback serialisation lock to make it
easier to check for this as a source of performance loss.
Miscellaneous work:
- Ensure that the sb_writers lock is taken around
vfs_{set,remove}xattr() in the cachefiles code.
- Reduce the number of conditional branches in netfs_perform_write().
- Move the CIFS_INO_MODIFIED_ATTR flag to the netfs_inode struct and
remove cifs_post_modify().
- Move the max_len/max_nr_segs members from netfs_io_subrequest to
netfs_io_request as they're only needed for one subreq at a time.
- Add an 'unknown' source value for tracing purposes.
- Remove NETFS_COPY_TO_CACHE as it's no longer used.
- Set the request work function up front at allocation time.
- Use bh-disabling spinlocks for rreq->lock as cachefiles completion
may be run from block-filesystem DIO completion in softirq context.
- Remove fs/netfs/io.c"
* tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
docs: filesystems: corrected grammar of netfs page
cifs: Don't support ITER_XARRAY
cifs: Switch crypto buffer to use a folio_queue rather than an xarray
cifs: Use iterate_and_advance*() routines directly for hashing
netfs: Cancel dirty folios that have no storage destination
cachefiles, netfs: Fix write to partial block at EOF
netfs: Remove fs/netfs/io.c
netfs: Speed up buffered reading
afs: Make read subreqs async
netfs: Simplify the writeback code
netfs: Provide an iterator-reset function
netfs: Use new folio_queue data type and iterator instead of xarray iter
cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs
iov_iter: Provide copy_folio_from_iter()
mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
netfs: Use bh-disabling spinlocks for rreq->lock
netfs: Set the request work function upon allocation
netfs: Remove NETFS_COPY_TO_CACHE
netfs: Reserve netfs_sreq_source 0 as unset/unknown
netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_stream
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
"Recently, we added the ability to list mounts in other mount
namespaces and the ability to retrieve namespace file descriptors
without having to go through procfs by deriving them from pidfds.
This extends nsfs in two ways:
(1) Add the ability to retrieve information about a mount namespace
via NS_MNT_GET_INFO.
This will return the mount namespace id and the number of mounts
currently in the mount namespace. The number of mounts can be
used to size the buffer that needs to be used for listmount() and
is in general useful without having to actually iterate through
all the mounts.
The structure is extensible.
(2) Add the ability to iterate through all mount namespaces over
which the caller holds privilege returning the file descriptor
for the next or previous mount namespace.
To retrieve a mount namespace the caller must be privileged wrt
to it's owning user namespace. This means that PID 1 on the host
can list all mounts in all mount namespaces or that a container
can list all mounts of its nested containers.
Optionally pass a structure for NS_MNT_GET_INFO with
NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount
namespace in one go.
(1) and (2) can be implemented for other namespace types easily.
Together with recent api additions this means one can iterate through
all mounts in all mount namespaces without ever touching procfs.
The commit message in 49224a345c48 ('Merge patch series "nsfs: iterate
through mount namespaces"') contains example code how to do this"
* tag 'vfs-6.12.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nsfs: iterate through mount namespaces
file: add fput() cleanup helper
fs: add put_mnt_ns() cleanup helper
fs: allow mount namespace fd
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fallocate updates from Christian Brauner:
"This contains work to try and cleanup some the fallocate mode
handling. Currently, it confusingly mixes operation modes and an
optional flag.
The work here tries to better define operation modes and optional
flags allowing the core and filesystem code to use switch statements
to switch on the operation mode"
* tag 'vfs-6.12.fallocate' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
xfs: refactor xfs_file_fallocate
xfs: move the xfs_is_always_cow_inode check into xfs_alloc_file_space
xfs: call xfs_flush_unmap_range from xfs_free_file_space
fs: sort out the fallocate mode vs flag mess
ext4: remove tracing for FALLOC_FL_NO_HIDE_STALE
block: remove checks for FALLOC_FL_NO_HIDE_STALE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs file updates from Christian Brauner:
"This is the work to cleanup and shrink struct file significantly.
Right now, (focusing on x86) struct file is 232 bytes. After this
series struct file will be 184 bytes aka 3 cacheline and a spare 8
bytes for future extensions at the end of the struct.
With struct file being as ubiquitous as it is this should make a
difference for file heavy workloads and allow further optimizations in
the future.
- struct fown_struct was embedded into struct file letting it take up
32 bytes in total when really it shouldn't even be embedded in
struct file in the first place. Instead, actual users of struct
fown_struct now allocate the struct on demand. This frees up 24
bytes.
- Move struct file_ra_state into the union containg the cleanup hooks
and move f_iocb_flags out of the union. This closes a 4 byte hole
we created earlier and brings struct file to 192 bytes. Which means
struct file is 3 cachelines and we managed to shrink it by 40
bytes.
- Reorder struct file so that nothing crosses a cacheline.
I suspect that in the future we will end up reordering some members
to mitigate false sharing issues or just because someone does
actually provide really good perf data.
- Shrinking struct file to 192 bytes is only part of the work.
Files use a slab that is SLAB_TYPESAFE_BY_RCU and when a kmem cache
is created with SLAB_TYPESAFE_BY_RCU the free pointer must be
located outside of the object because the cache doesn't know what
part of the memory can safely be overwritten as it may be needed to
prevent object recycling.
That has the consequence that SLAB_TYPESAFE_BY_RCU may end up
adding a new cacheline.
So this also contains work to add a new kmem_cache_create_rcu()
function that allows the caller to specify an offset where the
freelist pointer is supposed to be placed. Thus avoiding the
implicit addition of a fourth cacheline.
- And finally this removes the f_version member in struct file.
The f_version member isn't particularly well-defined. It is mainly
used as a cookie to detect concurrent seeks when iterating
directories. But it is also abused by some subsystems for
completely unrelated things.
It is mostly a directory and filesystem specific thing that doesn't
really need to live in struct file and with its wonky semantics it
really lacks a specific function.
For pipes, f_version is (ab)used to defer poll notifications until
a write has happened. And struct pipe_inode_info is used by
multiple struct files in their ->private_data so there's no chance
of pushing that down into file->private_data without introducing
another pointer indirection.
But pipes don't rely on f_pos_lock so this adds a union into struct
file encompassing f_pos_lock and a pipe specific f_pipe member that
pipes can use. This union of course can be extended to other file
types and is similar to what we do in struct inode already"
* tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (26 commits)
fs: remove f_version
pipe: use f_pipe
fs: add f_pipe
ubifs: store cookie in private data
ufs: store cookie in private data
udf: store cookie in private data
proc: store cookie in private data
ocfs2: store cookie in private data
input: remove f_version abuse
ext4: store cookie in private data
ext2: store cookie in private data
affs: store cookie in private data
fs: add generic_llseek_cookie()
fs: use must_set_pos()
fs: add must_set_pos()
fs: add vfs_setpos_cookie()
s390: remove unused f_version
ceph: remove unused f_version
adi: remove unused f_version
mm: Removed @freeptr_offset to prevent doc warning
...
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs folio updates from Christian Brauner:
"This contains work to port write_begin and write_end to rely on folios
for various filesystems.
This converts ocfs2, vboxfs, orangefs, jffs2, hostfs, fuse, f2fs,
ecryptfs, ntfs3, nilfs2, reiserfs, minixfs, qnx6, sysv, ufs, and
squashfs.
After this series lands a bunch of the filesystems in this list do not
mention struct page anymore"
* tag 'vfs-6.12.folio' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (61 commits)
Squashfs: Ensure all readahead pages have been used
Squashfs: Rewrite and update squashfs_readahead_fragment() to not use page->index
Squashfs: Update squashfs_readpage_block() to not use page->index
Squashfs: Update squashfs_readahead() to not use page->index
Squashfs: Update page_actor to not use page->index
jffs2: Use a folio in jffs2_garbage_collect_dnode()
jffs2: Convert jffs2_do_readpage_nolock to take a folio
buffer: Convert __block_write_begin() to take a folio
ocfs2: Convert ocfs2_write_zero_page to use a folio
fs: Convert aops->write_begin to take a folio
fs: Convert aops->write_end to take a folio
vboxsf: Use a folio in vboxsf_write_end()
orangefs: Convert orangefs_write_begin() to use a folio
orangefs: Convert orangefs_write_end() to use a folio
jffs2: Convert jffs2_write_begin() to use a folio
jffs2: Convert jffs2_write_end() to use a folio
hostfs: Convert hostfs_write_end() to use a folio
fuse: Convert fuse_write_begin() to use a folio
fuse: Convert fuse_write_end() to use a folio
f2fs: Convert f2fs_write_begin() to use a folio
...
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains the usual pile of misc updates:
Features:
- Add F_CREATED_QUERY fcntl() that allows userspace to query whether
a file was actually created. Often userspace wants to know whether
an O_CREATE request did actually create a file without using
O_EXCL. The current logic is that to first attempts to open the
file without O_CREAT | O_EXCL and if ENOENT is returned userspace
tries again with both flags. If that succeeds all is well. If it
now reports EEXIST it retries.
That works fairly well but some corner cases make this more
involved. If this operates on a dangling symlink the first openat()
without O_CREAT | O_EXCL will return ENOENT but the second openat()
with O_CREAT | O_EXCL will fail with EEXIST.
The reason is that openat() without O_CREAT | O_EXCL follows the
symlink while O_CREAT | O_EXCL doesn't for security reasons. So
it's not something we can really change unless we add an explicit
opt-in via O_FOLLOW which seems really ugly.
All available workarounds are really nasty (fanotify, bpf lsm etc)
so add a simple fcntl().
- Try an opportunistic lookup for O_CREAT. Today, when opening a file
we'll typically do a fast lookup, but if O_CREAT is set, the kernel
always takes the exclusive inode lock. This was likely done with
the expectation that O_CREAT means that we always expect to do the
create, but that's often not the case. Many programs set O_CREAT
even in scenarios where the file already exists (see related
F_CREATED_QUERY patch motivation above).
The series contained in the pr rearranges the pathwalk-for-open
code to also attempt a fast_lookup in certain O_CREAT cases. If a
positive dentry is found, the inode_lock can be avoided altogether
and it can stay in rcuwalk mode for the last step_into.
- Expose the 64 bit mount id via name_to_handle_at()
Now that we provide a unique 64-bit mount ID interface in statx(2),
we can now provide a race-free way for name_to_handle_at(2) to
provide a file handle and corresponding mount without needing to
worry about racing with /proc/mountinfo parsing or having to open a
file just to do statx(2).
While this is not necessary if you are using AT_EMPTY_PATH and
don't care about an extra statx(2) call, users that pass full paths
into name_to_handle_at(2) need to know which mount the file handle
comes from (to make sure they don't try to open_by_handle_at a file
handle from a different filesystem) and switching to AT_EMPTY_PATH
would require allocating a file for every name_to_handle_at(2) call
- Add a per dentry expire timeout to autofs
There are two fairly well known automounter map formats, the autofs
format and the amd format (more or less System V and Berkley).
Some time ago Linux autofs added an amd map format parser that
implemented a fair amount of the amd functionality. This was done
within the autofs infrastructure and some functionality wasn't
implemented because it either didn't make sense or required extra
kernel changes. The idea was to restrict changes to be within the
existing autofs functionality as much as possible and leave changes
with a wider scope to be considered later.
One of these changes is implementing the amd options:
1) "unmount", expire this mount according to a timeout (same as
the current autofs default).
2) "nounmount", don't expire this mount (same as setting the
autofs timeout to 0 except only for this specific mount) .
3) "utimeout=<seconds>", expire this mount using the specified
timeout (again same as setting the autofs timeout but only for
this mount)
To implement these options per-dentry expire timeouts need to be
implemented for autofs indirect mounts. This is because all map
keys (mounts) for autofs indirect mounts use an expire timeout
stored in the autofs mount super block info. structure and all
indirect mounts use the same expire timeout.
Fixes:
- Fix missing fput for FSCONFIG_SET_FD in autofs
- Use param->file for FSCONFIG_SET_FD in coda
- Delete the 'fs/netfs' proc subtreee when netfs module exits
- Make sure that struct uid_gid_map fits into a single cacheline
- Don't flush in-flight wb switches for superblocks without cgroup
writeback
- Correcting the idmapping mount example in the idmapping
documentation
- Fix a race between evice_inodes() and find_inode() and iput()
- Refine the show_inode_state() macro definition in writeback code
- Prevent dump_mapping() from accessing invalid dentry.d_name.name
- Show actual source for debugfs in /proc/mounts
- Annotate data-race of busy_poll_usecs in eventpoll
- Don't WARN for racy path_noexec check in exec code
- Handle OOM on mnt_warn_timestamp_expiry()
- Fix some spelling in the iomap design documentation
- Fix typo in procfs comment
- Fix typo in fs/namespace.c comment
Cleanups:
- Add the VFS git tree to the MAINTAINERS file
- Move FMODE_UNSIGNED_OFFSET to fop_flags freeing up another f_mode
bit in struct file bringing us to 5 free f_mode bits
- Remove the __I_DIO_WAKEUP bit from i_state flags as we can simplify
the wait mechanism
- Remove the unused path_put_init() helper
- Replace a __u32 with u32 for s_fsnotify_mask as __u32 is uapi
specific
- Replace the unsigned long i_state member with a u32 i_state member
in struct inode freeing up 4 bytes in struct inode. Instead of
using the bit based wait apis we're now using the var event apis
and using the individual bytes of the i_state member to wait on
state changes
- Explain how per-syscall AT_* flags should be allocated
- Use in_group_or_capable() helper to simplify the posix acl mode
update code
- Switch to LIST_HEAD() in fsync_buffers_list() to simplify the code
- Removed comment about d_rcu_to_refcount() as that function doesn't
exist anymore
- Add kernel documentation for lookup_fast()
- Don't re-zero evenpoll fields
- Remove outdated comment after close_fd()
- Fix imprecise wording in comment about the pipe filesystem
- Drop GFP_NOFAIL mode from alloc_page_buffers
- Missing blank line warnings and struct declaration improved in
file_table
- Annotate struct poll_list with __counted_by()
- Remove the unused read parameter in percpu-rwsem
- Remove linux/prefetch.h include from direct-io code
- Use kmemdup_array instead of kmemdup for multiple allocation in
mnt_idmapping code
- Remove unused mnt_cursor_del() declaration
Performance tweaks:
- Dodge smp_mb in break_lease and break_deleg in the common case
- Only read fops once in fops_{get,put}()
- Use RCU in ilookup()
- Elide smp_mb in iversion handling in the common case
- Drop one lock trip in evict()"
* tag 'vfs-6.12.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (58 commits)
uidgid: make sure we fit into one cacheline
proc: Fix typo in the comment
fs/pipe: Correct imprecise wording in comment
fhandle: expose u64 mount id to name_to_handle_at(2)
uapi: explain how per-syscall AT_* flags should be allocated
fs: drop GFP_NOFAIL mode from alloc_page_buffers
writeback: Refine the show_inode_state() macro definition
fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name
mnt_idmapping: Use kmemdup_array instead of kmemdup for multiple allocation
netfs: Delete subtree of 'fs/netfs' when netfs module exits
fs: use LIST_HEAD() to simplify code
inode: make i_state a u32
inode: port __I_LRU_ISOLATING to var event
vfs: fix race between evice_inodes() and find_inode()&iput()
inode: port __I_NEW to var event
inode: port __I_SYNC to var event
fs: reorder i_state bits
fs: add i_state helpers
MAINTAINERS: add the VFS git tree
fs: s/__u32/u32/ for s_fsnotify_mask
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These mostly continue to rework the thermal core and the thermal zone
driver interface to make the code more straightforward and reduce
bloat
The most significant piece of this work is a change of the code
related to binding cooling devices to thermal zones which, among other
things, replaces two previously existing thermal zone operations with
one allowing driver implementations to be much simpler
There is also a new thermal core testing module allowing mock thermal
zones to be created and controlled via debugfs in order to exercise
the thermal core functionality. It is expected to be used for
implementing thermal core self tests in the future
Apart from the above, there are assorted thermal driver updates
Specifics:
- Update some thermal drivers to eliminate thermal_zone_get_trip()
calls from them and get rid of that function (Rafael Wysocki)
- Update the thermal sysfs code to store trip point attributes in
trip descriptors and get to trip points via attribute pointers
(Rafael Wysocki)
- Move the computation of the low and high boundaries for
thermal_zone_set_trips() to __thermal_zone_device_update() (Daniel
Lezcano)
- Introduce a debugfs-based facility for thermal core testing (Rafael
Wysocki)
- Replace the thermal zone .bind() and .unbind() callbacks for
binding cooling devices to thermal zones with one .should_bind()
callback used for deciding whether or not a given cooling devices
should be bound to a given trip point in a given thermal zone
(Rafael Wysocki)
- Eliminate code that has no more users after the other changes, drop
some redundant checks from the thermal core and clean it up (Rafael
Wysocki)
- Fix rounding of delay jiffies in the thermal core (Rafael Wysocki)
- Refuse to accept trip point temperature or hysteresis that would
lead to an invalid threshold value when setting them via sysfs
(Rafael Wysocki)
- Adjust states of all uninitialized instances in the .manage()
callback of the Bang-bang thermal governor (Rafael Wysocki)
- Drop a couple of redundant checks along with the code depending on
them from the thermal core (Rafael Wysocki)
- Rearrange the thermal core to avoid redundant checks and simplify
control flow in a couple of code paths (Rafael Wysocki)
- Add power domain DT bindings for new Amlogic SoCs (Georges Stark)
- Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr() in the ST
driver and add a Kconfig dependency on THERMAL_OF subsystem for the
STi driver (Raphael Gallais-Pou)
- Simplify the error code path in the probe functions in the brcmstb
driver with the helo of dev_err_probe() (Yan Zhen)
- Make imx_sc_thermal use dev_err_probe() (Alexander Stein)
- Remove trailing space after \n newline in the Renesas driver (Colin
Ian King)
- Add DT binding compatible string for the SA8255p to the tsens
thermal driver (Nikunj Kela)
- Use the devm_clk_get_enabled() helpers to simplify the init routine
in the sprd thermal driver (Huan Yang)
- Remove __maybe_unused notations for the functions by using the new
RUNTIME_PM_OPS() and SYSTEM_SLEEP_PM_OPS() macros on the IMx and
Qoriq drivers (Fabio Estevam)
- Remove unused declarations from the ti-soc-thermal driver's header
file as the functions in question were removed previously (Zhang
Zekun)"
* tag 'thermal-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (48 commits)
thermal: core: Drop thermal_zone_device_is_enabled()
thermal: core: Check passive delay in monitor_thermal_zone()
thermal: core: Drop dead code from monitor_thermal_zone()
thermal: core: Drop redundant lockdep_assert_held()
thermal: gov_bang_bang: Adjust states of all uninitialized instances
thermal: sysfs: Add sanity checks for trip temperature and hysteresis
thermal/drivers/imx_sc_thermal: Use dev_err_probe
thermal/drivers/ti-soc-thermal: Remove unused declarations
thermal/drivers/imx: Remove __maybe_unused notations
thermal/drivers/qoriq: Remove __maybe_unused notations
thermal/drivers/sprd: Use devm_clk_get_enabled() helpers
dt-bindings: thermal: tsens: document support on SA8255p
thermal/drivers/renesas: Remove trailing space after \n newline
thermal/drivers/brcmstb_thermal: Simplify with dev_err_probe()
thermal/drivers/sti: Depend on THERMAL_OF subsystem
thermal/drivers/st: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
dt-bindings: thermal: amlogic,thermal: add optional power-domains
thermal: core: Drop tz field from struct thermal_instance
thermal: core: Drop redundant checks from thermal_bind_cdev_to_trip()
thermal: core: Rename cdev-to-thermal-zone bind/unbind functions
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"By the number of new lines of code, the most visible change here is
the addition of hybrid CPU capacity scaling support to the
intel_pstate driver. Next are the amd-pstate driver changes related to
the calculation of the AMD boost numerator and preferred core
detection.
As far as new hardware support is concerned, the intel_idle driver
will now handle Granite Rapids Xeon processors natively, the
intel_rapl power capping driver will recognize family 1Ah of AMD
processors and Intel ArrowLake-U chipos, and intel_pstate will handle
Granite Rapids and Sierra Forest chips in the out-of-band (OOB) mode.
Apart from the above, there is a usual collection of assorted fixes
and code cleanups in many places and there are tooling updates.
Specifics:
- Remove LATENCY_MULTIPLIER from cpufreq (Qais Yousef)
- Add support for Granite Rapids and Sierra Forest in OOB mode to the
intel_pstate cpufreq driver (Srinivas Pandruvada)
- Add basic support for CPU capacity scaling on x86 and make the
intel_pstate driver set asymmetric CPU capacity on hybrid systems
without SMT (Rafael Wysocki)
- Add missing MODULE_DESCRIPTION() macros to the powerpc cpufreq
driver (Jeff Johnson)
- Several OF related cleanups in cpufreq drivers (Rob Herring)
- Enable COMPILE_TEST for ARM drivers (Rob Herrring)
- Introduce quirks for syscon failures and use socinfo to get
revision for TI cpufreq driver (Dhruva Gole, Nishanth Menon)
- Minor cleanups in amd-pstate driver (Anastasia Belova, Dhananjay
Ugwekar)
- Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
(Danila Tikhonov, Huacai Chen, and Liu Jing)
- Make amd-pstate validate return of any attempt to update EPP
limits, which fixes the masking hardware problems (Mario
Limonciello)
- Move the calculation of the AMD boost numerator outside of
amd-pstate, correcting acpi-cpufreq on systems with preferred cores
(Mario Limonciello)
- Harden preferred core detection in amd-pstate to avoid potential
false positives (Mario Limonciello)
- Add extra unit test coverage for mode state machine (Mario
Limonciello)
- Fix an "Uninitialized variables" issue in amd-pstste (Qianqiang
Liu)
- Add Granite Rapids Xeon support to intel_idle (Artem Bityutskiy)
- Disable promotion to C1E on Jasper Lake and Elkhart Lake in
intel_idle (Kai-Heng Feng)
- Use scoped device node handling to fix missing of_node_put() and
simplify walking OF children in the riscv-sbi cpuidle driver
(Krzysztof Kozlowski)
- Remove dead code from cpuidle_enter_state() (Dhruva Gole)
- Change an error pointer to NULL to fix error handling in the
intel_rapl power capping driver (Dan Carpenter)
- Fix off by one in get_rpi() in the intel_rapl power capping driver
(Dan Carpenter)
- Add support for ArrowLake-U to the intel_rapl power capping driver
(Sumeet Pawnikar)
- Fix the energy-pkg event for AMD CPUs in the intel_rapl power
capping driver (Dhananjay Ugwekar)
- Add support for AMD family 1Ah processors to the intel_rapl power
capping driver (Dhananjay Ugwekar)
- Remove unused stub for saveable_highmem_page() and remove
deprecated macros from power management documentation (Andy
Shevchenko)
- Use ysfs_emit() and sysfs_emit_at() in "show" functions in the PM
sysfs interface (Xueqin Luo)
- Update the maintainers information for the
operating-points-v2-ti-cpu DT binding (Dhruva Gole)
- Drop unnecessary of_match_ptr() from ti-opp-supply (Rob Herring)
- Add missing MODULE_DESCRIPTION() macros to devfreq governors (Jeff
Johnson)
- Use devm_clk_get_enabled() in the exynos-bus devfreq driver (Anand
Moon)
- Use of_property_present() instead of of_get_property() in the
imx-bus devfreq driver (Rob Herring)
- Update directory handling and installation process in the pm-graph
Makefile and add .gitignore to ignore sleepgraph.py artifacts to
pm-graph (Amit Vadhavana, Yo-Jung Lin)
- Make cpupower display residency value in idle-info (Aboorva
Devarajan)
- Add missing powercap_set_enabled() stub function to cpupower (John
B. Wyatt IV)
- Add SWIG support to cpupower (John B. Wyatt IV)"
* tag 'pm-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (62 commits)
cpufreq/amd-pstate-ut: Fix an "Uninitialized variables" issue
cpufreq/amd-pstate-ut: Add test case for mode switches
cpufreq/amd-pstate: Export symbols for changing modes
amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`
cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`
cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
x86/amd: Move amd_get_highest_perf() out of amd-pstate
ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
ACPI: CPPC: Drop check for non zero perf ratio
x86/amd: Rename amd_get_highest_perf() to amd_get_boost_ratio_numerator()
ACPI: CPPC: Adjust return code for inline functions in !CONFIG_ACPI_CPPC_LIB
x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
PM: hibernate: Remove unused stub for saveable_highmem_page()
pm:cpupower: Add error warning when SWIG is not installed
MAINTAINERS: Add Maintainers for SWIG Python bindings
pm:cpupower: Include test_raw_pylibcpupower.py
pm:cpupower: Add SWIG bindings files for libcpupower
pm:cpupower: Add missing powercap_set_enabled() stub function
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to upstream version
20240827, add support for ACPI-based enumeration of interrupt
controllers on RISC-V along with some related irqchip updates, clean
up the ACPI device object sysfs interface, add some quirks for
backlight handling and IRQ overrides, fix assorted issues and clean up
code.
Specifics:
- Check return value in acpi_db_convert_to_package() (Pei Xiao)
- Detect FACS and allow setting the waking vector on reduced-hardware
ACPI platforms (Jiaqing Zhao)
- Allow ACPICA to represent semaphores as integers (Adrien Destugues)
- Complete CXL 3.0 CXIMS structures support in ACPICA (Zhang Rui)
- Make ACPICA support SPCR version 4 and add RISC-V SBI Subtype to
DBG2 (Sia Jee Heng)
- Implement the Dword_PCC Resource Descriptor Macro in ACPICA (Jose
Marinho)
- Correct the typo in struct acpi_mpam_msc_node member (Punit
Agrawal)
- Implement ACPI_WARNING_ONCE() and ACPI_ERROR_ONCE() and use them to
prevent a Stall() violation warning from being printed every time
this takes place (Vasily Khoruzhick)
- Allow PCC Data Type in MCTP resource (Adam Young)
- Fix memory leaks on acpi_ps_get_next_namepath() and
acpi_ps_get_next_field() failures (Armin Wolf)
- Add support for supressing leading zeros in hex strings when
converting them to integers and update integer-to-hex-string
conversions in ACPICA (Armin Wolf)
- Add support for Windows 11 22H2 _OSI string (Armin Wolf)
- Avoid warning for Dump Functions in ACPICA (Adam Lackorzynski)
- Add extended linear address mode to HMAT MSCIS in ACPICA (Dave
Jiang)
- Handle empty connection_node in iasl (Aleksandrs Vinarskis)
- Allow for more flexibility in _DSM args (Saket Dumbre)
- Setup for ACPICA release 20240827 (Saket Dumbre)
- Add ACPI device enumeration support for interrupt controller
probing including taking dependencies into account (Sunil V L)
- Implement ACPI-based interrupt controller probing on RISC-V
(Sunil V L)
- Add ACPI support for AIA in riscv-intc and add ACPI support to
riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L)
- Do not release locks during operation region accesses in the ACPI
EC driver (Rafael Wysocki)
- Fix up the _STR handling in the ACPI device object sysfs interface,
make it represent the device object attributes as an attribute
group and make it rely on driver core functionality for sysfs
attrubute management (Thomas Weißschuh)
- Extend error messages printed to the kernel log when
acpi_evaluate_dsm() fails to include revision and function number
(David Wang)
- Add a new AMDI0015 platform device ID to the ACPi APD driver for
AMD SoCs (Shyam Sundar S K)
- Use the driver core for the async probing management in the ACPI
battery driver (Thomas Weißschuh)
- Remove redundant initalizations of a local variable to NULL from
the ACPI battery driver (Ilpo Järvinen)
- Remove unneeded check in tps68470_pmic_opregion_probe() (Aleksandr
Mishin)
- Add support for setting the EPP register through the ACPI CPPC
sysfs interface if it is in FFH (Mario Limonciello)
- Fix MASK_VAL() usage in the ACPI CPPC library (Clément Léger)
- Reduce the log level of a per-CPU message about idle states in the
ACPI processor driver (Li RongQing)
- Fix crash in exit_round_robin() in the ACPI processor aggregator
device (PAD) driver (Seiji Nishikawa)
- Add force_vendor quirk for Panasonic Toughbook CF-18 in the ACPI
backlight driver (Hans de Goede)
- Make the DMI checks related to backlight handling on Lenovo Yoga
Tab 3 X90F less strict (Hans de Goede)
- Enforce native backlight handling on Apple MacbookPro9,2 (Esther
Shimanovich)
- Add IRQ override quirks for Asus Vivobook Go E1404GAB and MECHREV
GM7XG0M, and refine the TongFang GMxXGxx quirk (Li Chen, Tamim
Khan, Werner Sembach)
- Quirk ASUS ROG M16 to default to S3 sleep (Luke D. Jones)
- Define and use symbols for device and class name lengths in the
ACPI bus type code and make the code use strscpy() instead of
strcpy() in several places (Muhammad Qasim Abdul Majeed)"
* tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (70 commits)
ACPI: resource: Add another DMI match for the TongFang GMxXGxx
ACPI: CPPC: Add support for setting EPP register in FFH
ACPI: PM: Quirk ASUS ROG M16 to default to S3 sleep
ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
ACPI: battery: use driver core managed async probing
ACPI: button: Use strscpy() instead of strcpy()
ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
ACPI: CPPC: Fix MASK_VAL() usage
irqchip/sifive-plic: Add ACPI support
ACPICA: Setup for ACPICA release 20240827
ACPICA: Allow for more flexibility in _DSM args
ACPICA: iasl: handle empty connection_node
ACPICA: HMAT: Add extended linear address mode to MSCIS
ACPICA: Avoid warning for Dump Functions
ACPICA: Add support for Windows 11 22H2 _OSI string
ACPICA: Update integer-to-hex-string conversions
ACPICA: Add support for supressing leading zeros in hex strings
ACPICA: Allow for supressing leading zeros when using acpi_ex_convert_to_ascii()
ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The highlights are support for Arm's "Permission Overlay Extension"
using memory protection keys, support for running as a protected guest
on Android as well as perf support for a bunch of new interconnect
PMUs.
Summary:
ACPI:
- Enable PMCG erratum workaround for HiSilicon HIP10 and 11
platforms.
- Ensure arm64-specific IORT header is covered by MAINTAINERS.
CPU Errata:
- Enable workaround for hardware access/dirty issue on Ampere-1A
cores.
Memory management:
- Define PHYSMEM_END to fix a crash in the amdgpu driver.
- Avoid tripping over invalid kernel mappings on the kexec() path.
- Userspace support for the Permission Overlay Extension (POE) using
protection keys.
Perf and PMUs:
- Add support for the "fixed instruction counter" extension in the
CPU PMU architecture.
- Extend and fix the event encodings for Apple's M1 CPU PMU.
- Allow LSM hooks to decide on SPE permissions for physical
profiling.
- Add support for the CMN S3 and NI-700 PMUs.
Confidential Computing:
- Add support for booting an arm64 kernel as a protected guest under
Android's "Protected KVM" (pKVM) hypervisor.
Selftests:
- Fix vector length issues in the SVE/SME sigreturn tests
- Fix build warning in the ptrace tests.
Timers:
- Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
non-determinism arising from the architected counter.
Miscellaneous:
- Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
don't succeed.
- Minor fixes and cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
perf: arm-ni: Fix an NULL vs IS_ERR() bug
arm64: hibernate: Fix warning for cast from restricted gfp_t
arm64: esr: Define ESR_ELx_EC_* constants as UL
arm64: pkeys: remove redundant WARN
perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
MAINTAINERS: List Arm interconnect PMUs as supported
perf: Add driver for Arm NI-700 interconnect PMU
dt-bindings/perf: Add Arm NI-700 PMU
perf/arm-cmn: Improve format attr printing
perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
arm64/mm: use lm_alias() with addresses passed to memblock_free()
mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
arm64: Expose the end of the linear map in PHYSMEM_END
arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
arm64/mm: Delete __init region from memblock.reserved
perf/arm-cmn: Support CMN S3
dt-bindings: perf: arm-cmn: Add CMN S3
perf/arm-cmn: Refactor DTC PMU register access
perf/arm-cmn: Make cycle counts less surprising
perf/arm-cmn: Improve build-time assertion
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Drop a now obsolete ppc4xx_edac driver
- Fix conversion to physical memory addresses on Intel's Elkhart Lake
and Ice Lake hardware when the system address is above the
(Top-Of-Memory) TOM address
- Pay attention to the memory hole on Zynq UltraScale+ MPSoC DDR
controllers when injecting errors for testing purposes
- Add support for translating normalized error addresses reported by an
AMD memory controller into system physical addresses using an UEFI
mechanism called platform runtime mechanism (PRM).
- The usual cleanups and fixes
* tag 'edac_updates_for_v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC: Drop obsolete PPC4xx driver
EDAC/sb_edac: Fix the compile warning of large frame size
EDAC/{skx_common,i10nm}: Remove the AMAP register for determing DDR5
EDAC/{skx_common,skx,i10nm}: Move the common debug code to skx_common
EDAC/igen6: Fix conversion of system address to physical memory address
EDAC/synopsys: Fix error injection on Zynq UltraScale+
RAS/AMD/ATL: Translate normalized to system physical addresses using PRM
ACPI: PRM: Add PRM handler direct call support
|
|
Pull ARM updates from Russell King:
- clean up TTBCR magic numbers and use u32 for this register
- fix clang issue in VFP code leading to kernel oops, caused by
compiler instruction scheduling.
- switch 32-bit Arm to use GENERIC_CPU_DEVICES and use the
arch_cpu_is_hotpluggable() hook.
- pass struct device to arm_iommu_create_mapping() and move over to use
iommu_paging_domain_alloc() rather than iommu_domain_alloc()
- make amba_bustype constant
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9418/1: dma-mapping: Use iommu_paging_domain_alloc()
ARM: 9417/1: dma-mapping: Pass device to arm_iommu_create_mapping()
ARM: 9416/1: amba: make amba_bustype constant
ARM: 9412/1: Convert to arch_cpu_is_hotpluggable()
ARM: 9411/1: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu()
ARM: 9410/1: vfp: Use asm volatile in fmrx/fmxr macros
ARM: 9409/1: mmu: Do not use magic number for TTBCR settings
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu"
"API:
- Make self-test asynchronous
Algorithms:
- Remove MPI functions added for SM3
- Add allocation error checks to remaining MPI functions (introduced
for SM3)
- Set default Jitter RNG OSR to 3
Drivers:
- Add hwrng driver for Rockchip RK3568 SoC
- Allow disabling SR-IOV VFs through sysfs in qat
- Fix device reset bugs in hisilicon
- Fix authenc key parsing by using generic helper in octeontx*
Others:
- Fix xor benchmarking on parisc"
* tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (96 commits)
crypto: n2 - Set err to EINVAL if snprintf fails for hmac
crypto: camm/qi - Use ERR_CAST() to return error-valued pointer
crypto: mips/crc32 - Clean up useless assignment operations
crypto: qcom-rng - rename *_of_data to *_match_data
crypto: qcom-rng - fix support for ACPI-based systems
dt-bindings: crypto: qcom,prng: document support for SA8255p
crypto: aegis128 - Fix indentation issue in crypto_aegis128_process_crypt()
crypto: octeontx* - Select CRYPTO_AUTHENC
crypto: testmgr - Hide ENOENT errors
crypto: qat - Remove trailing space after \n newline
crypto: hisilicon/sec - Remove trailing space after \n newline
crypto: algboss - Pass instance creation error up
crypto: api - Fix generic algorithm self-test races
crypto: hisilicon/qm - inject error before stopping queue
crypto: hisilicon/hpre - mask cluster timeout error
crypto: hisilicon/qm - reset device before enabling it
crypto: hisilicon/trng - modifying the order of header files
crypto: hisilicon - add a lock for the qp send operation
crypto: hisilicon - fix missed error branch
crypto: ccp - do not request interrupt on cmd completion when irqs disabled
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next
ASoC: Updates for v6.12
This is a very large set of changes, almost all in drivers rather than
the core. Even with the addition of several quite large drivers the
overall diffstat is negative thanks to the removal of some old Intel
board support which has been obsoleted by the AVS driver, helped a bit
by some factoring out into helpers (especially around the Soundwire
machine drivers for x86).
Highlights include:
- More simplifications and cleanups throughout the subsystem from
Morimoto-san.
- Extensive cleanups and refactoring of the Soundwire drivers to make
better use of helpers.
- Removal of Intel machine support obsoleted by the AVS driver.
- Lots of DT schema conversions.
- Machine support for many AMD and Intel x86 platforms.
- Support for AMD ACP 7.1, Mediatek MT6367 and MT8365, Realtek RTL1320
SoundWire and rev C, and Texas Instruments TAS2563
|
|
Most of the drivers which used this header have been deleted, most
of these code is obsoleted, move the only defines that are actually
used into arch/powerpc/platforms/chrp/pegasos_eth.c and delete the
file completely.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://patch.msgid.link/20240912011949.2726928-1-cuigaosheng1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new command status MLX5_CMD_STAT_NOT_READY to handle cases
where the firmware is not ready.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20240911201757.1505453-14-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New devices with new FW can support sync reset for firmware activate
using hot reset. Add capability for supporting it and add MFRL field to
query from FW which type of PCI reset method to use while handling sync
reset events.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-10-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As preparation for HW Steering support, where the function
get_root_namespace() is needed to get root FDB, make it an API function
and rename it to mlx5_get_root_namespace().
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20240911201757.1505453-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts (sort of) and no adjacent changes.
This merge reverts commit b3c9e65eb227 ("net: hsr: remove seqnr_lock")
from net, as it was superseded by
commit 430d67bdcb04 ("net: hsr: Use the seqnr lock for frames received via interlink port.")
in net-next.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
There is a recently notified BT regression with no fix yet. I do not
think a fix will land in the next week.
Current release - regressions:
- core: tighten bad gso csum offset check in virtio_net_hdr
- netfilter: move nf flowtable bpf initialization in
nf_flow_table_module_init()
- eth: ice: stop calling pci_disable_device() as we use pcim
- eth: fou: fix null-ptr-deref in GRO.
Current release - new code bugs:
- hsr: prevent NULL pointer dereference in hsr_proxy_announce()
Previous releases - regressions:
- hsr: remove seqnr_lock
- netfilter: nft_socket: fix sk refcount leaks
- mptcp: pm: fix uaf in __timer_delete_sync
- phy: dp83822: fix NULL pointer dereference on DP83825 devices
- eth: revert "virtio_net: rx enable premapped mode by default"
- eth: octeontx2-af: Modify SMQ flush sequence to drop packets
Previous releases - always broken:
- eth: mlx5: fix bridge mode operations when there are no VFs
- eth: igb: Always call igb_xdp_ring_update_tail() under Tx lock"
* tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
net: tighten bad gso csum offset check in virtio_net_hdr
netlink: specs: mptcp: fix port endianness
net: dpaa: Pad packets to ETH_ZLEN
mptcp: pm: Fix uaf in __timer_delete_sync
net: libwx: fix number of Rx and Tx descriptors
net: dsa: felix: ignore pending status of TAS module when it's disabled
net: hsr: prevent NULL pointer dereference in hsr_proxy_announce()
selftests: mptcp: include net_helper.sh file
selftests: mptcp: include lib.sh file
selftests: mptcp: join: restrict fullmesh endp on 1st sf
netfilter: nft_socket: make cgroupsv2 matching work with namespaces
netfilter: nft_socket: fix sk refcount leaks
MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
dt-bindings: net: tja11xx: fix the broken binding
selftests: net: csum: Fix checksums for packets with non-zero padding
net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
virtio_net: disable premapped mode by default
Revert "virtio_net: big mode skip the unmap check"
Revert "virtio_net: rx remove premapped failover code"
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- asus-wmi: Disable OOBE that interferes with backlight control
- panasonic-laptop: Two fixes to SINF array handling
* tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-wmi: Disable OOBE experience on Zenbook S 16
platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf array
platform/x86: panasonic-laptop: Fix SINF array out of bounds accesses
|
|
* for-next/poe: (31 commits)
arm64: pkeys: remove redundant WARN
kselftest/arm64: Add test case for POR_EL0 signal frame records
kselftest/arm64: parse POE_MAGIC in a signal frame
kselftest/arm64: add HWCAP test for FEAT_S1POE
selftests: mm: make protection_keys test work on arm64
selftests: mm: move fpregs printing
kselftest/arm64: move get_header()
arm64: add Permission Overlay Extension Kconfig
arm64: enable PKEY support for CPUs with S1POE
arm64: enable POE and PIE to coexist
arm64/ptrace: add support for FEAT_POE
arm64: add POE signal support
arm64: implement PKEYS support
arm64: add pte_access_permitted_no_overlay()
arm64: handle PKEY/POE faults
arm64: mask out POIndex when modifying a PTE
arm64: convert protection key into vm_flags and pgprot values
arm64: add POIndex defines
arm64: re-order MTE VM_ flags
arm64: enable the Permission Overlay Extension for EL0
...
|
|
* for-next/pkvm-guest:
arm64: smccc: Reserve block of KVM "vendor" services for pKVM hypercalls
drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercall
arm64: mm: Add confidential computing hook to ioremap_prot()
drivers/virt: pkvm: Hook up mem_encrypt API using pKVM hypercalls
arm64: mm: Add top-level dispatcher for internal mem_encrypt API
drivers/virt: pkvm: Add initial support for running as a protected guest
firmware/smccc: Call arch-specific hook on discovering KVM services
|
|
A patch for Qualcomm depends on some fixes.
|
|
Replace the bespoke cifs iterators of ITER_BVEC and ITER_KVEC to do hashing
with iterate_and_advance_kernel() - a variant on iterate_and_advance() that
only supports kernel-internal ITER_* types and not UBUF/IOVEC types.
The bespoke ITER_XARRAY is left because we don't really want to be calling
crypto_shash_update() under the RCU read lock for large amounts of data;
besides, ITER_XARRAY is going to be phased out.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Tom Talpey <tom@talpey.com>
cc: Enzo Matsumiya <ematsumiya@suse.de>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-24-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Because it uses DIO writes, cachefiles is unable to make a write to the
backing file if that write is not aligned to and sized according to the
backing file's DIO block alignment. This makes it tricky to handle a write
to the cache where the EOF on the network file is not correctly aligned.
To get around this, netfslib attempts to tell the driver it is calling how
much more data there is available beyond the EOF that it can use to pad the
write (netfslib preclears the part of the folio above the EOF). However,
it tries to tell the cache what the maximum length is, but doesn't
calculate this correctly; and, in any case, cachefiles actually ignores the
value and just skips the block.
Fix this by:
(1) Change the value passed to indicate the amount of extra data that can
be added to the operation (now ->submit_extendable_to). This is much
simpler to calculate as it's just the end of the folio minus the top
of the data within the folio - rather than having to account for data
spread over multiple folios.
(2) Make cachefiles add some of this data if the subrequest it is given
ends at the network file's i_size if the extra data is sufficient to
pad out to a whole block.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-22-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Improve the efficiency of buffered reads in a number of ways:
(1) Overhaul the algorithm in general so that it's a lot more compact and
split the read submission code between buffered and unbuffered
versions. The unbuffered version can be vastly simplified.
(2) Read-result collection is handed off to a work queue rather than being
done in the I/O thread. Multiple subrequests can be processes
simultaneously.
(3) When a subrequest is collected, any folios it fully spans are
collected and "spare" data on either side is donated to either the
previous or the next subrequest in the sequence.
Notes:
(*) Readahead expansion is massively slows down fio, presumably because it
causes a load of extra allocations, both folio and xarray, up front
before RPC requests can be transmitted.
(*) RDMA with cifs does appear to work, both with SIW and RXE.
(*) PG_private_2-based reading and copy-to-cache is split out into its own
file and altered to use folio_queue. Note that the copy to the cache
now creates a new write transaction against the cache and adds the
folios to be copied into it. This allows it to use part of the
writeback I/O code.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-20-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Use the new folio_queue structures to simplify the writeback code. The
problem with referring to the i_pages xarray directly is that we may have
gaps in the sequence of folios we're writing from that we need to skip when
we're removing the writeback mark from the folios we're writing back from.
At the moment the code tries to deal with this by carefully tracking the
gaps in each writeback stream (eg. write to server and write to cache) and
divining when there's a gap that spans folios (something that's not helped
by folios not being a consistent size).
Instead, the folio_queue buffer contains pointers only the folios we're
dealing with, has them in ascending order and indicates a gap by placing
non-consequitive folios next to each other. This makes it possible to
track where we need to clean up to by just keeping track of where we've
processed to on each stream and taking the minimum.
Note that the I/O iterator is always rounded up to the end of the folio,
even if that is beyond the EOF position, so that the cache can do DIO from
the page. The excess space is cleared, though mmapped writes clobber it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-18-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Make the netfs write-side routines use the new folio_queue struct to hold a
rolling buffer of folios, with the issuer adding folios at the tail and the
collector removing them from the head as they're processed instead of using
an xarray.
This will allow a subsequent patch to simplify the write collector.
The primary mark (as tested by folioq_is_marked()) is used to note if the
corresponding folio needs putting.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-16-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Provide a copy_folio_from_iter() wrapper.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20240814203850.2240469-14-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Define a data structure, struct folio_queue, to represent a sequence of
folios and a kernel-internal I/O iterator type, ITER_FOLIOQ, to allow a
list of folio_queue structures to be used to provide a buffer to
iov_iter-taking functions, such as sendmsg and recvmsg.
The folio_queue structure looks like:
struct folio_queue {
struct folio_batch vec;
u8 orders[PAGEVEC_SIZE];
struct folio_queue *next;
struct folio_queue *prev;
unsigned long marks;
unsigned long marks2;
};
It does not use a list_head so that next and/or prev can be set to NULL at
the ends of the list, allowing iov_iter-handling routines to determine that
they *are* the ends without needing to store a head pointer in the iov_iter
struct.
A folio_batch struct is used to hold the folio pointers which allows the
batch to be passed to batch handling functions. Two mark bits are
available per slot. The intention is to use at least one of them to mark
folios that need putting, but that might not be ultimately necessary.
Accessor functions are used to access the slots to do the masking and an
additional accessor function is used to indicate the size of the array.
The order of each folio is also stored in the structure to avoid the need
for iov_iter_advance() and iov_iter_revert() to have to query each folio to
find its size.
With careful barriering, this can be used as an extending buffer with new
folios inserted and new folio_queue structs added without the need for a
lock. Further, provided we always keep at least one struct in the buffer,
we can also remove consumed folios and consumed structs from the head end
as we without the need for locks.
[Questions/thoughts]
(1) To manage this, I need a head pointer, a tail pointer, a tail slot
number (assuming insertion happens at the tail end and the next
pointers point from head to tail). Should I put these into a struct
of their own, say "folio_queue_head" or "rolling_buffer"?
I will end up with two of these in netfs_io_request eventually, one
keeping track of the pagecache I'm dealing with for buffered I/O and
the other to hold a bounce buffer when we need one.
(2) Should I make the slots {folio,off,len} or bio_vec?
(3) This is intended to replace ITER_XARRAY eventually. Using an xarray
in I/O iteration requires the taking of the RCU read lock, doing
copying under the RCU read lock, walking the xarray (which may change
under us), handling retries and dealing with special values.
The advantage of ITER_XARRAY is that when we're dealing with the
pagecache directly, we don't need any allocation - but if we're doing
encrypted comms, there's a good chance we'd be using a bounce buffer
anyway.
This will require afs, erofs, cifs, orangefs and fscache to be
converted to not use this. afs still uses it for dirs and symlinks;
some of erofs usages should be easy to change, but there's one which
won't be so easy; ceph's use via fscache can be fixed by porting ceph
to netfslib; cifs is using xarray as a bounce buffer - that can be
moved to use sheaves instead; and orangefs has a similar problem to
erofs - maybe orangefs could use netfslib?
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Gao Xiang <xiang@kernel.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: linux-afs@lists.infradead.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-erofs@lists.ozlabs.org
cc: devel@lists.orangefs.org
Link: https://lore.kernel.org/r/20240814203850.2240469-13-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
When I expanded uidgid mappings I intended for a struct uid_gid_map to
fit into a single cacheline on x86 as they tend to be pretty
performance sensitive (idmapped mounts etc). But a 4 byte hole was added
that brought it over 64 bytes. Fix that by adding the static extent
array and the extent counter into a substruct. C's type punning for
unions guarantees that we can access ->nr_extents even if the last
written to member wasn't within the same object. This is also what we
rely on in struct_group() and friends. This of course relies on
non-strict aliasing which we don't do.
99) If the member used to read the contents of a union object is not the
same as the member last used to store a value in the object, the
appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type as
described in 6.2.6 (a process sometimes called "type punning").
Link: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf
Link: https://lore.kernel.org/r/20240910-work-uid_gid_map-v1-1-e6bc761363ed@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|