Age | Commit message (Collapse) | Author |
|
Pull filesystem folio updates from Matthew Wilcox:
"Primarily this series converts some of the address_space operations to
take a folio instead of a page.
Notably:
- a_ops->is_partially_uptodate() takes a folio instead of a page and
changes the type of the 'from' and 'count' arguments to make it
obvious they're bytes.
- a_ops->invalidatepage() becomes ->invalidate_folio() and has a
similar type change.
- a_ops->launder_page() becomes ->launder_folio()
- a_ops->set_page_dirty() becomes ->dirty_folio() and adds the
address_space as an argument.
There are a couple of other misc changes up front that weren't worth
separating into their own pull request"
* tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache: (53 commits)
fs: Remove aops ->set_page_dirty
fb_defio: Use noop_dirty_folio()
fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio
fs: Convert __set_page_dirty_buffers to block_dirty_folio
nilfs: Convert nilfs_set_page_dirty() to nilfs_dirty_folio()
mm: Convert swap_set_page_dirty() to swap_dirty_folio()
ubifs: Convert ubifs_set_page_dirty to ubifs_dirty_folio
f2fs: Convert f2fs_set_node_page_dirty to f2fs_dirty_node_folio
f2fs: Convert f2fs_set_data_page_dirty to f2fs_dirty_data_folio
f2fs: Convert f2fs_set_meta_page_dirty to f2fs_dirty_meta_folio
afs: Convert afs_dir_set_page_dirty() to afs_dir_dirty_folio()
btrfs: Convert extent_range_redirty_for_io() to use folios
fs: Convert trivial uses of __set_page_dirty_nobuffers to filemap_dirty_folio
btrfs: Convert from set_page_dirty to dirty_folio
fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio()
fs: Add aops->dirty_folio
fs: Remove aops->launder_page
orangefs: Convert launder_page to launder_folio
nfs: Convert from launder_page to launder_folio
fuse: Convert from launder_page to launder_folio
...
|
|
Pull folio updates from Matthew Wilcox:
- Rewrite how munlock works to massively reduce the contention on
i_mmap_rwsem (Hugh Dickins):
https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
- Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
Hellwig):
https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/
- Convert GUP to use folios and make pincount available for order-1
pages. (Matthew Wilcox)
- Convert a few more truncation functions to use folios (Matthew
Wilcox)
- Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
Wilcox)
- Convert rmap_walk to use folios (Matthew Wilcox)
- Convert most of shrink_page_list() to use a folio (Matthew Wilcox)
- Add support for creating large folios in readahead (Matthew Wilcox)
* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
mm/damon: minor cleanup for damon_pa_young
selftests/vm/transhuge-stress: Support file-backed PMD folios
mm/filemap: Support VM_HUGEPAGE for file mappings
mm/readahead: Switch to page_cache_ra_order
mm/readahead: Align file mappings for non-DAX
mm/readahead: Add large folio readahead
mm: Support arbitrary THP sizes
mm: Make large folios depend on THP
mm: Fix READ_ONLY_THP warning
mm/filemap: Allow large folios to be added to the page cache
mm: Turn can_split_huge_page() into can_split_folio()
mm/vmscan: Convert pageout() to take a folio
mm/vmscan: Turn page_check_references() into folio_check_references()
mm/vmscan: Account large folios correctly
mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
mm/vmscan: Free non-shmem folios without splitting them
mm/rmap: Constify the rmap_walk_control argument
mm/rmap: Convert rmap_walk() to take a folio
mm: Turn page_anon_vma() into folio_anon_vma()
mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
...
|
|
Merge updates from Andrew Morton:
- A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs
- Most the MM patches which precede the patches in Willy's tree: kasan,
pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb,
userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp,
cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap,
zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits)
mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release()
Docs/ABI/testing: add DAMON sysfs interface ABI document
Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface
selftests/damon: add a test for DAMON sysfs interface
mm/damon/sysfs: support DAMOS stats
mm/damon/sysfs: support DAMOS watermarks
mm/damon/sysfs: support schemes prioritization
mm/damon/sysfs: support DAMOS quotas
mm/damon/sysfs: support DAMON-based Operation Schemes
mm/damon/sysfs: support the physical address space monitoring
mm/damon/sysfs: link DAMON for virtual address spaces monitoring
mm/damon: implement a minimal stub for sysfs-based DAMON interface
mm/damon/core: add number of each enum type values
mm/damon/core: allow non-exclusive DAMON start/stop
Docs/damon: update outdated term 'regions update interval'
Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling
Docs/vm/damon: call low level monitoring primitives the operations
mm/damon: remove unnecessary CONFIG_DAMON option
mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}()
mm/damon/dbgfs-test: fix is_target_id() change
...
|
|
Let's make it clearer at which places we actually add and remove memory
blocks -- streamlining the terminology -- and highlight which memory block
start out online and which start out as offline.
* rename add_memory_block -> add_boot_memory_block
* rename init_memory_block -> add_memory_block
* rename unregister_memory -> remove_memory_block
* rename register_memory -> __add_memory_block
* add add_hotplug_memory_block
* mark add_boot_memory_block with __init (suggested by Oscar)
__add_memory_block() is a pure helper for add_memory_block(), remove
the somewhat obvious comment.
Link: https://lkml.kernel.org/r/20220221154531.11382-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
test_pages_in_a_zone() is just another nasty PFN walker that can easily
stumble over ZONE_DEVICE memory ranges falling into the same memory block
as ordinary system RAM: the memmap of parts of these ranges might possibly
be uninitialized. In fact, we observed (on an older kernel) with UBSAN:
UBSAN: Undefined behaviour in ./include/linux/mm.h:1133:50
index 7 is out of range for type 'zone [5]'
CPU: 121 PID: 35603 Comm: read_all Kdump: loaded Tainted: [...]
Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.12.2 11/15/2019
Call Trace:
dump_stack+0x9a/0xf0
ubsan_epilogue+0x9/0x7a
__ubsan_handle_out_of_bounds+0x13a/0x181
test_pages_in_a_zone+0x3c4/0x500
show_valid_zones+0x1fa/0x380
dev_attr_show+0x43/0xb0
sysfs_kf_seq_show+0x1c5/0x440
seq_read+0x49d/0x1190
vfs_read+0xff/0x300
ksys_read+0xb8/0x170
do_syscall_64+0xa5/0x4b0
entry_SYSCALL_64_after_hwframe+0x6a/0xdf
RIP: 0033:0x7f01f4439b52
We seem to stumble over a memmap that contains a garbage zone id. While
we could try inserting pfn_to_online_page() calls, it will just make
memory offlining slower, because we use test_pages_in_a_zone() to make
sure we're offlining pages that all belong to the same zone.
Let's just get rid of this PFN walker and determine the single zone of a
memory block -- if any -- for early memory blocks during boot. For memory
onlining, we know the single zone already. Let's avoid any additional
memmap scanning and just rely on the zone information available during
boot.
For memory hot(un)plug, we only really care about memory blocks that:
* span a single zone (and, thereby, a single node)
* are completely System RAM (IOW, no holes, no ZONE_DEVICE)
If one of these conditions is not met, we reject memory offlining.
Hotplugged memory blocks (starting out offline), always meet both
conditions.
There are three scenarios to handle:
(1) Memory hot(un)plug
A memory block with zone == NULL cannot be offlined, corresponding to
our previous test_pages_in_a_zone() check.
After successful memory onlining/offlining, we simply set the zone
accordingly.
* Memory onlining: set the zone we just used for onlining
* Memory offlining: set zone = NULL
So a hotplugged memory block starts with zone = NULL. Once memory
onlining is done, we set the proper zone.
(2) Boot memory with !CONFIG_NUMA
We know that there is just a single pgdat, so we simply scan all zones
of that pgdat for an intersection with our memory block PFN range when
adding the memory block. If more than one zone intersects (e.g., DMA and
DMA32 on x86 for the first memory block) we set zone = NULL and
consequently mimic what test_pages_in_a_zone() used to do.
(3) Boot memory with CONFIG_NUMA
At the point in time we create the memory block devices during boot, we
don't know yet which nodes *actually* span a memory block. While we could
scan all zones of all nodes for intersections, overlapping nodes complicate
the situation and scanning all nodes is possibly expensive. But that
problem has already been solved by the code that sets the node of a memory
block and creates the link in the sysfs --
do_register_memory_block_under_node().
So, we hook into the code that sets the node id for a memory block. If
we already have a different node id set for the memory block, we know
that multiple nodes *actually* have PFNs falling into our memory block:
we set zone = NULL and consequently mimic what test_pages_in_a_zone() used
to do. If there is no node id set, we do the same as (2) for the given
node.
Note that the call order in driver_init() is:
-> memory_dev_init(): create memory block devices
-> node_dev_init(): link memory block devices to the node and set the
node id
So in summary, we detect if there is a single zone responsible for this
memory block and we consequently store the zone in that case in the
memory block, updating it during memory onlining/offlining.
Link: https://lkml.kernel.org/r/20220210184359.235565-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Rafael Parra <rparrazo@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rafael Parra <rparrazo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
register_memory_block_under_node()
Patch series "drivers/base/memory: determine and store zone for single-zone memory blocks", v2.
I remember talking to Michal in the past about removing
test_pages_in_a_zone(), which we use for:
* verifying that a memory block we intend to offline is really only managed
by a single zone. We don't support offlining of memory blocks that are
managed by multiple zones (e.g., multiple nodes, DMA and DMA32)
* exposing that zone to user space via
/sys/devices/system/memory/memory*/valid_zones
Now that I identified some more cases where test_pages_in_a_zone() might
go wrong, and we received an UBSAN report (see patch #3), let's get rid of
this PFN walker.
So instead of detecting the zone at runtime with test_pages_in_a_zone() by
scanning the memmap, let's determine and remember for each memory block if
it's managed by a single zone. The stored zone can then be used for the
above two cases, avoiding a manual lookup using test_pages_in_a_zone().
This avoids eventually stumbling over uninitialized memmaps in corner
cases, especially when ZONE_DEVICE ranges partly fall into memory block
(that are responsible for managing System RAM).
Handling memory onlining is easy, because we online to exactly one zone.
Handling boot memory is more tricky, because we want to avoid scanning all
zones of all nodes to detect possible zones that overlap with the physical
memory region of interest. Fortunately, we already have code that
determines the applicable nodes for a memory block, to create sysfs links
-- we'll hook into that.
Patch #1 is a simple cleanup I had laying around for a longer time.
Patch #2 contains the main logic to remove test_pages_in_a_zone() and
further details.
[1] https://lkml.kernel.org/r/20220128144540.153902-1-david@redhat.com
[2] https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com
This patch (of 2):
Let's adjust the stale terminology, making it match
unregister_memory_block_under_nodes() and
do_register_memory_block_under_node(). We're dealing with memory block
devices, which span 1..X memory sections.
Link: https://lkml.kernel.org/r/20220210184359.235565-1-david@redhat.com
Link: https://lkml.kernel.org/r/20220210184359.235565-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rafael Parra <rparrazo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
node_dev_init()
... and call node_dev_init() after memory_dev_init() from driver_init(),
so before any of the existing arch/subsys calls. All online nodes should
be known at that point: early during boot, arch code determines node and
zone ranges and sets the relevant nodes online; usually this happens in
setup_arch().
This is in line with memory_dev_init(), which initializes the memory
device subsystem and creates all memory block devices.
Similar to memory_dev_init(), panic() if anything goes wrong, we don't
want to continue with such basic initialization errors.
The important part is that node_dev_init() gets called after
memory_dev_init() and after cpu_dev_init(), but before any of the relevant
archs call register_cpu() to register the new cpu device under the node
device. The latter should be the case for the current users of
topology_init().
Link: https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Tested-by: Anatoly Pugachev <matorola@gmail.com> (sparc64)
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
succeeded
If register_memory() fails, we freed the memory block but already added
the memory block to the group list, not good. Let's defer adding the
block to the memory group to after registering the memory block device.
We do handle it properly during unregister_memory(), but that's not
called when the registration fails.
Link: https://lkml.kernel.org/r/20220128144540.153902-1-david@redhat.com
Fixes: 028fc57a1c36 ("drivers/base/memory: introduce "memory groups" to logically group memory blocks")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When the hwpoison page meets the filter conditions, it should not be
regarded as successful memory_failure() processing for mce handler, but
should return a distinct value, otherwise mce handler regards the error
page has been identified and isolated, which may lead to calling
set_mce_nospec() to change page attribute, etc.
Here memory_failure() return -EOPNOTSUPP to indicate that the error
event is filtered, mce handler should not take any action for this
situation and hwpoison injector should treat as correct.
Link: https://lkml.kernel.org/r/20220223082135.2769649-1-luofei@unicloud.com
Signed-off-by: luofei <luofei@unicloud.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some places in the kernel don't really expect pageblock_order >=
MAX_ORDER, and it looks like this is only possible in corner cases:
1) CONFIG_DEFERRED_STRUCT_PAGE_INIT we'll end up freeing pageblock_order
pages via __free_pages_core(), which cannot possibly work.
2) find_zone_movable_pfns_for_nodes() will roundup the ZONE_MOVABLE
start PFN to MAX_ORDER_NR_PAGES. Consequently with a bigger
pageblock_order, we could have a single pageblock partially managed by
two zones.
3) compaction code runs into __fragmentation_index() with order
>= MAX_ORDER, when checking WARN_ON_ONCE(order >= MAX_ORDER). [1]
4) mm/page_reporting.c won't be reporting any pages with default
page_reporting_order == pageblock_order, as we'll be skipping the
reporting loop inside page_reporting_process_zone().
5) __rmqueue_fallback() will never be able to steal with
ALLOC_NOFRAGMENT.
pageblock_order >= MAX_ORDER is weird either way: it's a pure
optimization for making alloc_contig_range(), as used for allcoation of
gigantic pages, a little more reliable to succeed. However, if there is
demand for somewhat reliable allocation of gigantic pages, affected
setups should be using CMA or boottime allocations instead.
So let's make sure that pageblock_order < MAX_ORDER and simplify.
[1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com
Link: https://lkml.kernel.org/r/20220214174132.219303-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: John Garry via iommu <iommu@lists.linux-foundation.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm: enforce pageblock_order < MAX_ORDER".
Having pageblock_order >= MAX_ORDER seems to be able to happen in corner
cases and some parts of the kernel are not prepared for it.
For example, Aneesh has shown [1] that such kernels can be compiled on
ppc64 with 64k base pages by setting FORCE_MAX_ZONEORDER=8, which will
run into a WARN_ON_ONCE(order >= MAX_ORDER) in comapction code right
during boot.
We can get pageblock_order >= MAX_ORDER when the default hugetlb size is
bigger than the maximum allocation granularity of the buddy, in which
case we are no longer talking about huge pages but instead gigantic
pages.
Having pageblock_order >= MAX_ORDER can only make alloc_contig_range()
of such gigantic pages more likely to succeed.
Reliable use of gigantic pages either requires boot time allcoation or
CMA, no need to overcomplicate some places in the kernel to optimize for
corner cases that are broken in other areas of the kernel.
This patch (of 2):
Let's enforce pageblock_order < MAX_ORDER and simplify.
Especially patch #1 can be regarded a cleanup before:
[PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range
alignment. [2]
[1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com
[2] https://lkml.kernel.org/r/20220211164135.1803616-1-zi.yan@sent.com
Link: https://lkml.kernel.org/r/20220214174132.219303-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: John Garry via iommu <iommu@lists.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
At each login the user forces the kernel to create a new terminal and
allocate up to ~1Kb memory for the tty-related structures.
By default it's allowed to create up to 4096 ptys with 1024 reserve for
initial mount namespace only and the settings are controlled by host
admin.
Though this default is not enough for hosters with thousands of
containers per node. Host admin can be forced to increase it up to
NR_UNIX98_PTY_MAX = 1<<20.
By default container is restricted by pty mount_opt.max = 1024, but
admin inside container can change it via remount. As a result, one
container can consume almost all allowed ptys and allocate up to 1Gb of
unaccounted memory.
It is not enough per-se to trigger OOM on host, however anyway, it
allows to significantly exceed the assigned memcg limit and leads to
troubles on the over-committed node.
It makes sense to account for them to restrict the host's memory
consumption from inside the memcg-limited container.
Link: https://lkml.kernel.org/r/5d4bca06-7d4f-a905-e518-12981ebca1b3@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The inode allocation is supposed to use alloc_inode_sb(), so convert
kmem_cache_alloc() of all filesystems to alloc_inode_sb().
Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4]
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Alex Shi <alexs@kernel.org>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kari Argillander <kari.argillander@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
These functions are no longer useful as no BDIs report congestions any
more.
Removing the test on bdi_write_contested() in current_may_throttle()
could cause a small change in behaviour, but only when PF_LOCAL_THROTTLE
is set.
So replace the calls by 'false' and simplify the code - and remove the
functions.
[akpm@linux-foundation.org: fix build]
Link: https://lkml.kernel.org/r/164549983742.9187.2570198746005819592.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs]
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Cleanups for SCHED_DEADLINE
- Tracing updates/fixes
- CPU Accounting fixes
- First wave of changes to optimize the overhead of the scheduler
build, from the fast-headers tree - including placeholder *_api.h
headers for later header split-ups.
- Preempt-dynamic using static_branch() for ARM64
- Isolation housekeeping mask rework; preperatory for further changes
- NUMA-balancing: deal with CPU-less nodes
- NUMA-balancing: tune systems that have multiple LLC cache domains per
node (eg. AMD)
- Updates to RSEQ UAPI in preparation for glibc usage
- Lots of RSEQ/selftests, for same
- Add Suren as PSI co-maintainer
* tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
sched/headers: ARM needs asm/paravirt_api_clock.h too
sched/numa: Fix boot crash on arm64 systems
headers/prep: Fix header to build standalone: <linux/psi.h>
sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y
cgroup: Fix suspicious rcu_dereference_check() usage warning
sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers
sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains
sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity()
sched/deadline,rt: Remove unused functions for !CONFIG_SMP
sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently
sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file
sched/deadline: Remove unused def_dl_bandwidth
sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE
sched/tracing: Don't re-read p->state when emitting sched_switch event
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
sched/cpuacct: Remove redundant RCU read lock
sched/cpuacct: Optimize away RCU read lock
sched/cpuacct: Fix charge percpu cpuusage
sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies
...
|
|
This reverts commit 6f98a4bfee72c22f50aedb39fb761567969865fe.
It turns out we still can't do this. Way too many platforms that don't
have any real source of randomness at boot and no jitter entropy because
they don't even have a cycle counter.
As reported by Guenter Roeck:
"This causes a large number of qemu boot test failures for various
architectures (arm, m68k, microblaze, sparc32, xtensa are the ones I
observed).
Common denominator is that boot hangs at 'Saving random seed:'"
This isn't hugely unexpected - we tried it, it failed, so now we'll
revert it.
Link: https://lore.kernel.org/all/20220322155820.GA1745955@roeck-us.net/
Reported-and-bisected-by: Guenter Roeck <linux@roeck-us.net>
Cc: Jason Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull bounds fixes from Kees Cook:
"These are a handful of buffer and array bounds fixes that I've been
carrying in preparation for the coming memcpy improvements and the
enabling of '-Warray-bounds' globally.
There are additional similar fixes in other maintainer's trees, but
these ended up getting carried by me. :)"
* tag 'bounds-fixes-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
media: omap3isp: Use struct_group() for memcpy() region
tpm: vtpm_proxy: Check length to avoid compiler warning
alpha: Silence -Warray-bounds warnings
m68k: cmpxchg: Dereference matching size
intel_th: msu: Use memset_startat() for clearing hw header
KVM: x86: Replace memset() "optimization" with normal per-field writes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
- Don't use semaphores in always-atomic-context code (Jann Horn)
- Add "ECC:" prefix to ECC messages (Vincent Whitchurch)
* tag 'pstore-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: Don't use semaphores in always-atomic-context code
pstore: Add prefix to ECC messages
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"The overwhelming bulk of this pull request is a change from Uwe
Kleine-König which changes the return type of the remove() function to
void as part of some wider work he's doing to do this for all bus
types, causing updates to most SPI device drivers. The branch with
that on has been cross merged with a couple of other trees which added
new SPI drivers this cycle, I'm not expecting any build issues
resulting from the change.
Otherwise it's been a relatively quiet release with some new device
support, a few minor features and the welcome completion of the
conversion of the subsystem to use GPIO descriptors rather than
numbers:
- Change return type of remove() to void.
- Completion of the conversion of SPI controller drivers to use GPIO
descriptors rather than numbers.
- Quite a few DT schema conversions.
- Support for multiple SPI devices on a bus in ACPI systems.
- Big overhaul of the PXA2xx SPI driver.
- Support for AMD AMDI0062, Intel Raptor Lake, Mediatek MT7986 and
MT8186, nVidia Tegra210 and Tegra234, Renesas RZ/V2L, Tesla FSD and
Sunplus SP7021"
[ And this is obviously where that spi change that snuck into the
regulator tree _should_ have been :^]
* tag 'spi-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (124 commits)
spi: fsi: Implement a timeout for polling status
spi: Fix erroneous sgs value with min_t()
spi: tegra20: Use of_device_get_match_data()
spi: mediatek: add ipm design support for MT7986
spi: Add compatible for MT7986
spi: sun4i: fix typos in comments
spi: mediatek: support tick_delay without enhance_timing
spi: Update clock-names property for arm pl022
spi: rockchip-sfc: fix platform_get_irq.cocci warning
spi: s3c64xx: Add spi port configuration for Tesla FSD SoC
spi: dt-bindings: samsung: Add fsd spi compatible
spi: topcliff-pch: Prevent usage of potentially stale DMA device
spi: tegra210-quad: combined sequence mode
spi: tegra210-quad: add acpi support
spi: npcm-fiu: Fix typo ("npxm")
spi: Fix Tegra QSPI example
spi: qup: replace spin_lock_irqsave by spin_lock in hard IRQ
spi: cadence: fix platform_get_irq.cocci warning
spi: Update NXP Flexspi maintainer details
dt-bindings: mfd: maxim,max77802: Convert to dtschema
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"Quite a quiet release for the regulator API, mainly a few new drivers
plus a lot of fixes for the Raspberry Pi panel driver.
There's also a SPI commit in here which I managed to apply to the
wrong tree and then didn't notice until there were too many commits on
top of it, sorry about that.
- Make it easier to use the virtual consumer test driver with DT
systems.
- Substantial overhaul providing various fixes and robustness
improvements for the Raspberry Pi panel driver.
- Support for Qualcomm PMX65 and SDX65, Richtek RT5190A, and Texas
Instruments TPS62864x"
* tag 'regulator-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (26 commits)
regulator: qcom-rpmh: Add support for SDX65
regulator: dt-bindings: Add PMX65 compatibles
regulator: vctrl: Use min() instead of doing it manually
regulator: rt5190a: Add support for Richtek RT5190A PMIC
regulator: Add bindings for Richtek RT5190A PMIC
regulator: Convert TPS62360 binding to json-schema
regulator: cleanup comments
regulator: virtual: add devicetree support
regulator: virtual: warn against production use
regulator: virtual: use dev_err_probe()
regulator: tps62864: Fix bindings for SW property
regulator: Add support for TPS6286x
regulator: Add bindings for TPS62864x
regulator/rpi-panel-attiny: Use two transactions for I2C read
regulator/rpi-panel-attiny: Use the regmap cache
regulator: rpi-panel: Remove get_brightness hook
regulator: rpi-panel: Add GPIO control for panel and touch resets
regulator: rpi-panel: Convert to drive lines directly
regulator: rpi-panel: Ensure the backlight is off during probe.
regulator: rpi-panel: Serialise operations.
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A couple of small fixes, plus some new features that enable us to
handle devices that reformat register addresses depending on the bus
used to handle the control interface more gracefully"
* tag 'regmap-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: allow a defined reg_base to be added to every address
regmap: add configurable downshift for addresses
regmap: irq: cleanup comments
regmap-irq: Fix typo in comment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
"New drivers:
- Texas Instruments TMP464 and TMP468 driver
- Vicor PLI1209BC Digital Supervisor driver
- ASUS EC driver
Improvements to existing drivers:
- adt7x10:
- Convert to use regmap
- convert to use with_info API
- use hwmon_notify_event
- other cleanup
- aquacomputer_d5next:
- Add support for Aquacomputer Farbwerk 360
- asus_wmi_sensors:
- Add ASUS ROG STRIX B450-F GAMING II
- asus_wmi_ec_sensors:
- Support T_Sensor on Prime X570-Pro
- Deprecate driver (replaced by new driver)
- axi-fan-control:
- Use hwmon_notify_event
- dell-smm:
- Clean up CONFIG_I8K
- disable fan type support for Inspiron 3505
- various other cleanup
- hwmon core:
- Report attribute name with udev events
- Add "label" attribute to ABI,
- Add support for pwm auto channels attribute
- max6639:
- Add regulator support
- lm70:
- Add support for TI TMP125
- lm83:
- Cleanup, convert to use with_info API
- mlxreg-fan:
- Use pwm attribute for setting fan speed low limit
- nct6775:
- Add board ID's for ASUS ROG STRIX Z390/Z490/X570-* / PRIME
X570-P, PRIME B550-PLUS, ASUS Pro B550M-C/PRIME B550M-A
- Add support for TSI temperature registers
- occ:
- Add various new sysfs attributes
- pmbus core:
- Handle VIN unit off status
- Add regulator supply into macro
- Add get_error_flags support to regulator ops
- pmbus/adm1275:
- Allow setting sample averaging
- pmbus/lm25066:
- Add regulator support
- pmbus/xdpe12284:
- Add support for xdpe11280
- register as regulator
- powr1220:
- Convert to with_info API
- Add support for Lattice's POWR1014 power manager IC
- sch56xx:
- Cleanup and minor improvements
- sch5627:
- Add pwmX_auto_channels_temp support
- tc654:
- Add thermal_cooling device support"
* tag 'hwmon-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (86 commits)
hwmon: (dell-smm) Add Inspiron 3505 to fan type blacklist
hwmon: (pmbus) Add Vin unit off handling
hwmon: (scpi-hwmon): Use of_device_get_match_data()
hwmon: (axi-fan-control) Use hwmon_notify_event
hwmon: (vexpress-hwmon) Use of_device_get_match_data()
hwmon: Add driver for Texas Instruments TMP464 and TMP468
dt-bindings: hwmon: add tmp464.yaml
dt-bindings: hwmon: Add sample averaging properties for ADM1275
hwmon: (adm1275) Allow setting sample averaging
hwmon: (xdpe12284) Add regulator support
hwmon: (xdpe12284) Add support for xdpe11280
dt-bindings: trivial-devices: Add xdpe11280
hwmon: (aquacomputer_d5next) Add support for Aquacomputer Farbwerk 360
hwmon: (sch5627) Add pwmX_auto_channels_temp support
hwmon: (core) Add support for pwm auto channels attribute
hwmon: (lm70) Add ti,tmp125 support
dt-bindings: Add ti,tmp125 temperature sensor binding
hwmon: (pmbus/pli1209bc) Add regulator support
hwmon: (pmbus) Add support for pli1209bc
dt-bindings:trivial-devices: Add pli1209bc
...
|
|
Pull block driver updates from Jens Axboe:
- NVMe updates via Christoph:
- add vectored-io support for user-passthrough (Kanchan Joshi)
- add verbose error logging (Alan Adamson)
- support buffered I/O on block devices in nvmet (Chaitanya
Kulkarni)
- central discovery controller support (Martin Belanger)
- fix and extended the globally unique idenfier validation
(Christoph)
- move away from the deprecated IDA APIs (Sagi Grimberg)
- misc code cleanup (Keith Busch, Max Gurtovoy, Qinghua Jin,
Chaitanya Kulkarni)
- add lockdep annotations for in-kernel sockets (Chris Leech)
- use vmalloc for ANA log buffer (Hannes Reinecke)
- kerneldoc fixes (Chaitanya Kulkarni)
- cleanups (Guoqing Jiang, Chaitanya Kulkarni, Christoph)
- warn about shared namespaces without multipathing (Christoph)
- MD updates via Song with a set of cleanups (Christoph, Mariusz, Paul,
Erik, Dirk)
- loop cleanups and queue depth configuration (Chaitanya)
- null_blk cleanups and fixes (Chaitanya)
- Use descriptive init/exit names in virtio_blk (Randy)
- Use bvec_kmap_local() in drivers (Christoph)
- bcache fixes (Mingzhe)
- xen blk-front persistent grant speedups (Juergen)
- rnbd fix and cleanup (Gioh)
- Misc fixes (Christophe, Colin)
* tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-block: (76 commits)
virtio_blk: eliminate anonymous module_init & module_exit
nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH
nvme: remove nvme_alloc_request and nvme_alloc_request_qid
nvme: cleanup how disk->disk_name is assigned
nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate
nvmet: use snprintf() with PAGE_SIZE in configfs
nvmet: don't fold lines
nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal
nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport
nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport
nvme-tcp: lockdep: annotate in-kernel sockets
nvme-tcp: don't fold the line
nvme-tcp: don't initialize ret variable
nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio
nvme-multipath: use vmalloc for ANA log buffer
xen/blkfront: speed up purge_persistent_grants()
raid5: initialize the stripe_head embeeded bios as needed
raid5-cache: statically allocate the recovery ra bio
raid5-cache: fully initialize flush_bio when needed
raid5-ppl: fully initialize the bio in ppl_new_iounit
...
|
|
Pull block updates from Jens Axboe:
- BFQ cleanups and fixes (Yu, Zhang, Yahu, Paolo)
- blk-rq-qos completion fix (Tejun)
- blk-cgroup merge fix (Tejun)
- Add offline error return value to distinguish it from an IO error on
the device (Song)
- IO stats fixes (Zhang, Christoph)
- blkcg refcount fixes (Ming, Yu)
- Fix for indefinite dispatch loop softlockup (Shin'ichiro)
- blk-mq hardware queue management improvements (Ming)
- sbitmap dead code removal (Ming, John)
- Plugging merge improvements (me)
- Show blk-crypto capabilities in sysfs (Eric)
- Multiple delayed queue run improvement (David)
- Block throttling fixes (Ming)
- Start deprecating auto module loading based on dev_t (Christoph)
- bio allocation improvements (Christoph, Chaitanya)
- Get rid of bio_devname (Christoph)
- bio clone improvements (Christoph)
- Block plugging improvements (Christoph)
- Get rid of genhd.h header (Christoph)
- Ensure drivers use appropriate flush helpers (Christoph)
- Refcounting improvements (Christoph)
- Queue initialization and teardown improvements (Ming, Christoph)
- Misc fixes/improvements (Barry, Chaitanya, Colin, Dan, Jiapeng,
Lukas, Nian, Yang, Eric, Chengming)
* tag 'for-5.18/block-2022-03-18' of git://git.kernel.dk/linux-block: (127 commits)
block: cancel all throttled bios in del_gendisk()
block: let blkcg_gq grab request queue's refcnt
block: avoid use-after-free on throttle data
block: limit request dispatch loop duration
block/bfq-iosched: Fix spelling mistake "tenative" -> "tentative"
sr: simplify the local variable initialization in sr_block_open()
block: don't merge across cgroup boundaries if blkcg is enabled
block: fix rq-qos breakage from skipping rq_qos_done_bio()
block: flush plug based on hardware and software queue order
block: ensure plug merging checks the correct queue at least once
block: move rq_qos_exit() into disk_release()
block: do more work in elevator_exit
block: move blk_exit_queue into disk_release
block: move q_usage_counter release into blk_queue_release
block: don't remove hctx debugfs dir from blk_mq_exit_queue
block: move blkcg initialization/destroy into disk allocation/release handler
sr: implement ->free_disk to simplify refcounting
sd: implement ->free_disk to simplify refcounting
sd: delay calling free_opal_dev
sd: call sd_zbc_release_disk before releasing the scsi_device reference
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- hwrng core now credits for low-quality RNG devices.
Algorithms:
- Optimisations for neon aes on arm/arm64.
- Add accelerated crc32_be on arm64.
- Add ffdheXYZ(dh) templates.
- Disallow hmac keys < 112 bits in FIPS mode.
- Add AVX assembly implementation for sm3 on x86.
Drivers:
- Add missing local_bh_disable calls for crypto_engine callback.
- Ensure BH is disabled in crypto_engine callback path.
- Fix zero length DMA mappings in ccree.
- Add synchronization between mailbox accesses in octeontx2.
- Add Xilinx SHA3 driver.
- Add support for the TDES IP available on sama7g5 SoC in atmel"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST
MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list
crypto: dh - Remove the unused function dh_safe_prime_dh_alg()
hwrng: nomadik - Change clk_disable to clk_disable_unprepare
crypto: arm64 - cleanup comments
crypto: qat - fix initialization of pfvf rts_map_msg structures
crypto: qat - fix initialization of pfvf cap_msg structures
crypto: qat - remove unneeded assignment
crypto: qat - disable registration of algorithms
crypto: hisilicon/qm - fix memset during queues clearing
crypto: xilinx: prevent probing on non-xilinx hardware
crypto: marvell/octeontx - Use swap() instead of open coding it
crypto: ccree - Fix use after free in cc_cipher_exit()
crypto: ccp - ccp_dmaengine_unregister release dma channels
crypto: octeontx2 - fix missing unlock
hwrng: cavium - fix NULL but dereferenced coccicheck error
crypto: cavium/nitrox - don't cast parameter in bit operations
crypto: vmx - add missing dependencies
MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver
crypto: xilinx - Add Xilinx SHA3 driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"There have been a few important changes to the RNG's crypto, but the
intent for 5.18 has been to shore up the existing design as much as
possible with modern cryptographic functions and proven constructions,
rather than actually changing up anything fundamental to the RNG's
design.
So it's still the same old RNG at its core as before: it still counts
entropy bits, and collects from the various sources with the same
heuristics as before, and so forth. However, the cryptographic
algorithms that transform that entropic data into safe random numbers
have been modernized.
Just as important, if not more, is that the code has been cleaned up
and re-documented. As one of the first drivers in Linux, going back to
1.3.30, its general style and organization was showing its age and
becoming both a maintenance burden and an auditability impediment.
Hopefully this provides a more solid foundation to build on for the
future. I encourage you to open up the file in full, and maybe you'll
remark, "oh, that's what it's doing," and enjoy reading it. That, at
least, is the eventual goal, which this pull begins working toward.
Here's a summary of the various patches in this pull:
- /dev/urandom and /dev/random now do the same thing, per the patch
we discussed on the list. I think this is worth trying out. If it
does appear problematic, I've made sure to keep it standalone and
revertible without any conflicts.
- Fixes and cleanups for numerous integer type problems, locking
issues, and general code quality concerns.
- The input pool's LFSR has been replaced with a cryptographically
secure hash function, which has security and performance benefits
alike, and consequently allows us to count entropy bits linearly.
- The pre-init injection now uses a real hash function too, instead
of an LFSR or vanilla xor.
- The interrupt handler's fast_mix() function now uses one round of
SipHash, rather than the fake crypto that was there before.
- All additions of RDRAND and RDSEED now go through the input pool's
hash function, in part to mitigate ridiculous hypothetical CPU
backdoors, but more so to have a consistent interface for ingesting
entropy that's easy to analyze, making everything happen one way,
instead of a potpourri of different ways.
- The crng now works on per-cpu data, while also being in accordance
with the actual "fast key erasure RNG" design. This allows us to
fix several boot-time race complications associated with the prior
dynamically allocated model, eliminates much locking, and makes our
backtrack protection more robust.
- Batched entropy now erases doled out values so that it's backtrack
resistant.
- Working closely with Sebastian, the interrupt handler no longer
needs to take any locks at all, as we punt the
synchronized/expensive operations to a workqueue. This is
especially nice for PREEMPT_RT, where taking spinlocks in irq
context is problematic. It also makes the handler faster for the
rest of us.
- Also working with Sebastian, we now do the right thing on CPU
hotplug, so that we don't use stale entropy or fail to accumulate
new entropy when CPUs come back online.
- We handle virtual machines that fork / clone / snapshot, using the
"vmgenid" ACPI specification for retrieving a unique new RNG seed,
which we can use to also make WireGuard (and in the future, other
things) safe across VM forks.
- Around boot time, we now try to reseed more often if enough entropy
is available, before settling on the usual 5 minute schedule.
- Last, but certainly not least, the documentation in the file has
been updated considerably"
* tag 'random-5.18-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (60 commits)
random: check for signal and try earlier when generating entropy
random: reseed more often immediately after booting
random: make consistent usage of crng_ready()
random: use SipHash as interrupt entropy accumulator
wireguard: device: clear keys on VM fork
random: provide notifier for VM fork
random: replace custom notifier chain with standard one
random: do not export add_vmfork_randomness() unless needed
virt: vmgenid: notify RNG of VM fork and supply generation ID
ACPI: allow longer device IDs
random: add mechanism for VM forks to reinitialize crng
random: don't let 644 read-only sysctls be written to
random: give sysctl_random_min_urandom_seed a more sensible value
random: block in /dev/urandom
random: do crng pre-init loading in worker rather than irq
random: unify cycles_t and jiffies usage and types
random: cleanup UUID handling
random: only wake up writers after zap if threshold was passed
random: round-robin registers as ulong, not u32
random: clear fast pool, crng, and batches in cpuhp bring up
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull PnP update from Rafael Wysocki:
"Replace acpi_bus_get_device() in the PNP code with
acpi_fetch_acpi_dev() which is better"
* tag 'pnp-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PNP: Replace acpi_bus_get_device()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"As far as new functionality is concerned, there is a new thermal
driver for the Intel Hardware Feedback Interface (HFI) along with some
intel-speed-select utility changes to support it. There are also new
DT compatible strings for a couple of platforms, and thermal zones on
some platforms will be registered as HWmon sensors now.
Apart from the above, some drivers are updated (fixes mostly) and
there is a new piece of documentation for the Intel DPTF (Dynamic
Power and Thermal Framework) sysfs interface.
Specifics:
- Add a new thermal driver for the Intel Hardware Feedback Interface
(HFI) including the HFI initialization, HFI notification interrupt
handling and sending CPU capabilities change messages to user space
via the thermal netlink interface (Ricardo Neri, Srinivas
Pandruvada, Nathan Chancellor, Randy Dunlap).
- Extend the intel-speed-select utility to handle out-of-band CPU
configuration changes and add support for the CPU capabilities
change messages sent over the thermal netlink interface by the new
HFI thermal driver to it (Srinivas Pandruvada).
- Convert the DT bindings to yaml format for the Exynos platform and
fix and update the MAINTAINERS file for this driver (Krzysztof
Kozlowski).
- Register the thermal zones as HWmon sensors for the QCom's Tsens
driver and TI thermal platforms (Dmitry Baryshkov, Romain Naour).
- Add the msm8953 compatible documentation in the bindings (Luca
Weiss).
- Add the sm8150 platform support to the QCom LMh driver's DT binding
(Thara Gopinath).
- Check the command result from the IPC command to the BPMP in the
Tegra driver (Mikko Perttunen).
- Silence the error for normal configuration where the interrupt is
optionnal in the Broadcom thermal driver (Florian Fainelli).
- Remove remaining dead code from the TI thermal driver (Yue
Haibing).
- Don't use bitmap_weight() in end_power_clamp() in the powerclamp
driver (Yury Norov).
- Update the OS policy capabilities handshake in the int340x thermal
driver (Srinivas Pandruvada).
- Increase the policies bitmap size in int340x (Srinivas Pandruvada).
- Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
int340x thermal driver (Rafael Wysocki).
- Check for NULL after calling kmemdup() in int340x (Jiasheng Jiang).
- Add Intel Dynamic Power and Thermal Framework (DPTF) kernel
interface documentation (Srinivas Pandruvada).
- Fix bullet list warning in the thermal documentation (Randy
Dunlap)"
* tag 'thermal-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (30 commits)
thermal: int340x: Update OS policy capability handshake
thermal: int340x: Increase bitmap size
Documentation: thermal: DPTF Documentation
MAINTAINERS: thermal: samsung: update Krzysztof Kozlowski's email
thermal/drivers/ti-soc-thermal: Remove unused function ti_thermal_get_temp()
thermal/drivers/brcmstb_thermal: Interrupt is optional
thermal: tegra-bpmp: Handle errors in BPMP response
drivers/thermal/ti-soc-thermal: Add hwmon support
dt-bindings: thermal: tsens: Add msm8953 compatible
dt-bindings: thermal: Add sm8150 compatible string for LMh
thermal/drivers/qcom/lmh: Add support for sm8150
thermal/drivers/tsens: register thermal zones as hwmon sensors
MAINTAINERS: thermal: samsung: Drop obsolete properties
dt-bindings: thermal: samsung: Convert to dtschema
tools/power/x86/intel-speed-select: v1.12 release
tools/power/x86/intel-speed-select: HFI support
tools/power/x86/intel-speed-select: OOB daemon mode
thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET
thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub
thermal: Replace acpi_bus_get_device()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly fixes and cleanups all over the code and a new piece
of documentation for Intel uncore frequency scaling.
Functionality-wise, the intel_idle driver will support Sapphire Rapids
Xeons natively now (with some extra facilities for controlling
C-states more precisely on those systems), virtual guests will take
the ACPI S4 hardware signature into account by default, the
intel_pstate driver will take the defualt EPP value from the firmware,
cpupower utility will support the AMD P-state driver added in the
previous cycle, and there is a new tracer utility for that driver.
Specifics:
- Allow device_pm_check_callbacks() to be called from interrupt
context without issues (Dmitry Baryshkov).
- Modify devm_pm_runtime_enable() to automatically handle
pm_runtime_dont_use_autosuspend() at driver exit time (Douglas
Anderson).
- Make the schedutil cpufreq governor use to_gov_attr_set() instead
of open coding it (Kevin Hao).
- Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
cpufreq longhaul driver (Rafael Wysocki).
- Unify show() and store() naming in cpufreq and make it use
__ATTR_XX (Lianjie Zhang).
- Make the intel_pstate driver use the EPP value set by the firmware
by default (Srinivas Pandruvada).
- Re-order the init checks in the powernow-k8 cpufreq driver (Mario
Limonciello).
- Make the ACPI processor idle driver check for architectural support
for LPI to avoid using it on x86 by mistake (Mario Limonciello).
- Add Sapphire Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add 'preferred_cstates' module argument to the intel_idle driver to
work around C1 and C1E handling issue on Sapphire Rapids (Artem
Bityutskiy).
- Add core C6 optimization on Sapphire Rapids to the intel_idle
driver (Artem Bityutskiy).
- Optimize the haltpoll cpuidle driver a bit (Li RongQing).
- Remove leftover text from intel_idle() kerneldoc comment and fix up
white space in intel_idle (Rafael Wysocki).
- Fix load_image_and_restore() error path (Ye Bin).
- Fix typos in comments in the system wakeup hadling code (Tom Rix).
- Clean up non-kernel-doc comments in hibernation code (Jiapeng
Chong).
- Fix __setup handler error handling in system-wide suspend and
hibernation core code (Randy Dunlap).
- Add device name to suspend_report_result() (Youngjin Jang).
- Make virtual guests honour ACPI S4 hardware signature by default
(David Woodhouse).
- Block power off of a parent PM domain unless child is in deepest
state (Ulf Hansson).
- Use dev_err_probe() to simplify error handling for generic PM
domains (Ahmad Fatoum).
- Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).
- Document Intel uncore frequency scaling (Srinivas Pandruvada).
- Add DTPM hierarchy description (Daniel Lezcano).
- Change the locking scheme in DTPM (Daniel Lezcano).
- Fix dtpm_cpu cleanup at exit time and missing virtual DTPM pointer
release (Daniel Lezcano).
- Make dtpm_node_callback[] static (kernel test robot).
- Fix spelling mistake "initialze" -> "initialize" in
dtpm_create_hierarchy() (Colin Ian King).
- Add tracer tool for the amd-pstate driver (Jinzhou Su).
- Fix PC6 displaying in turbostat on some systems (Artem Bityutskiy).
- Add AMD P-State support to the cpupower utility (Huang Rui)"
* tag 'pm-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (58 commits)
cpufreq: powernow-k8: Re-order the init checks
cpuidle: intel_idle: Drop redundant backslash at line end
cpuidle: intel_idle: Update intel_idle() kerneldoc comment
PM: hibernate: Honour ACPI hardware signature by default for virtual guests
cpufreq: intel_pstate: Use firmware default EPP
cpufreq: unify show() and store() naming and use __ATTR_XX
PM: core: keep irq flags in device_pm_check_callbacks()
cpuidle: haltpoll: Call cpuidle_poll_state_init() later
Documentation: amd-pstate: add tracer tool introduction
tools/power/x86/amd_pstate_tracer: Add tracer tool for AMD P-state
tools/power/x86/intel_pstate_tracer: make tracer as a module
cpufreq: amd-pstate: Add more tracepoint for AMD P-State module
PM: sleep: Add device name to suspend_report_result()
turbostat: fix PC6 displaying on some systems
intel_idle: add core C6 optimization for SPR
intel_idle: add 'preferred_cstates' module argument
intel_idle: add SPR support
PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()
ACPI: processor idle: Check for architectural support for LPI
cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"From the new functionality perspective, the most significant items
here are the new driver for the 'ARM Generic Diagnostic Dump and
Reset' device, the extension of fine grain fan control in the ACPI fan
driver, and the change making it possible to use CPPC information to
obtain CPU capacity.
There are also a few new quirks, a bunch of fixes, including the
platform-level _OSC handling change to make it actually take the
platform firmware response into account, some code and documentation
cleanups, and a notable update of the ACPI device enumeration
documentation.
Specifics:
- Use uintptr_t and offsetof() in the ACPICA code to avoid compiler
warnings regarding NULL pointer arithmetic (Rafael Wysocki).
- Fix possible NULL pointer dereference in acpi_ns_walk_namespace()
when passed "acpi=off" in the command line (Rafael Wysocki).
- Fix and clean up acpi_os_read/write_port() (Rafael Wysocki).
- Introduce acpi_bus_for_each_dev() and use it for walking all ACPI
device objects in the Type C code (Rafael Wysocki).
- Fix the _OSC platform capabilities negotioation and prevent CPPC
from being used if the platform firmware indicates that it not
supported via _OSC (Rafael Wysocki).
- Use ida_alloc() instead of ida_simple_get() for ACPI enumeration of
devices (Rafael Wysocki).
- Add AGDI and CEDT to the list of known ACPI table signatures (Ilkka
Koskinen, Robert Kiraly).
- Add power management debug messages related to suspend-to-idle in
two places (Rafael Wysocki).
- Fix __acpi_node_get_property_reference() return value and clean up
that function (Andy Shevchenko, Sakari Ailus).
- Fix return value of the __setup handler in the ACPI PM timer clock
source driver (Randy Dunlap).
- Clean up double words in two comments (Tom Rix).
- Add "skip i2c clients" quirks for Lenovo Yoga Tablet 1050F/L and
Nextbook Ares 8 (Hans de Goede).
- Clean up frequency invariance handling on x86 in the ACPI CPPC
library (Huang Rui).
- Work around broken XSDT on the Advantech DAC-BJ01 board (Mark
Cilissen).
- Make wakeup events checks in the ACPI EC driver more
straightforward and clean up acpi_ec_submit_event() (Rafael
Wysocki).
- Make it possible to obtain the CPU capacity with the help of CPPC
information (Ionela Voinescu).
- Improve fine grained fan control in the ACPI fan driver and
document it (Srinivas Pandruvada).
- Add device HID and quirk for Microsoft Surface Go 3 to the ACPI
battery driver (Maximilian Luz).
- Make the ACPI driver for Intel SoCs (LPSS) let the SPI driver know
the exact type of the controller (Andy Shevchenko).
- Force native backlight mode on Clevo NL5xRU and NL5xNU (Werner
Sembach).
- Fix return value of __setup handlers in the APEI code (Randy
Dunlap).
- Add Arm Generic Diagnostic Dump and Reset device driver (Ilkka
Koskinen).
- Limit printable size of BERT table data (Darren Hart).
- Fix up HEST and GHES initialization (Shuai Xue).
- Update the ACPI device enumeration documentation and unify the ASL
style in GPIO-related examples (Andy Shevchenko)"
* tag 'acpi-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
clocksource: acpi_pm: fix return value of __setup handler
ACPI: bus: Avoid using CPPC if not supported by firmware
Revert "ACPI: Pass the same capabilities to the _OSC regardless of the query flag"
ACPI: video: Force backlight native for Clevo NL5xRU and NL5xNU
arm64, topology: enable use of init_cpu_capacity_cppc()
arch_topology: obtain cpu capacity using information from CPPC
x86, ACPI: rename init_freq_invariance_cppc() to arch_init_invariance_cppc()
ACPI: AGDI: Add driver for Arm Generic Diagnostic Dump and Reset device
ACPI: tables: Add AGDI to the list of known table signatures
ACPI/APEI: Limit printable size of BERT table data
ACPI: docs: gpio-properties: Unify ASL style for GPIO examples
ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board
ACPI: APEI: fix return value of __setup handlers
x86/ACPI: CPPC: Move init_freq_invariance_cppc() into x86 CPPC
x86: Expose init_freq_invariance() to topology header
x86/ACPI: CPPC: Move AMD maximum frequency ratio setting function into x86 CPPC
x86/ACPI: CPPC: Rename cppc_msr.c to cppc.c
ACPI / x86: Add skip i2c clients quirk for Lenovo Yoga Tablet 1050F/L
ACPI / x86: Add skip i2c clients quirk for Nextbook Ares 8
ACPICA: Avoid walking the ACPI Namespace if it is not there
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture updates from Helge Deller:
- add vDSO support (allows us to use non-executable stacks)
- many TLB and cache flush optimizations (by Dave Anglin)
- fix handling of probe non-access faults (by Dave Anglin)
- fix invalidate/flush vmap routines (by Dave Anglin)
- avoid using hardware single-step in kprobes
- enable ARCH_HAS_DEBUG_VM_PGTABLE
- many cleanups in unaligned handlers, e.g. rewrite of existing
assembly code
- always use the self-extracting kernel feature
- big refacturing and code reductions regarding space-register usage in
get_user() and put_user()
- add fillrect() support to stifb graphics driver
* tag 'for-5.18/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (23 commits)
parisc: Fix invalidate/flush vmap routines
parisc: Avoid flushing cache on cache-less machines
parisc: Avoid using hardware single-step in kprobes
parisc: Improve CPU socket and core bootup info text
parisc: Enable ARCH_HAS_DEBUG_VM_PGTABLE
parisc: Avoid calling SMP cache flush functions on cache-less machines
parisc: Increase parisc_cache_flush_threshold setting
parisc/unaligned: Enhance user-space visible output
parisc/unaligned: Rewrite 32-bit inline assembly of emulate_sth()
parisc/unaligned: Rewrite 32-bit inline assembly of emulate_ldd()
parisc/unaligned: Rewrite inline assembly of emulate_ldw()
parisc/unaligned: Rewrite inline assembly of emulate_ldh()
parisc/unaligned: Use EFAULT fixup handler in unaligned handlers
parisc: Reduce code size by optimizing get_current() function calls
parisc: Use constants to encode the space registers like SR_KERNEL
parisc: Use SR_USER and SR_KERNEL in get_user() and put_user()
parisc: Add defines for various space register
parisc: Always use the self-extracting kernel feature
video/fbdev/stifb: Implement the stifb_fillrect() function
parisc: Add vDSO support
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt updates from Thomas Gleixner:
"Core code:
- Provide generic_handle_irq_safe() which can be invoked from any
context (hard interrupt or threaded). This allows to remove ugly
workarounds in drivers all over the place.
- Use generic_handle_irq_safe() in the affected drivers.
- The usual cleanups and improvements.
Interrupt chip drivers:
- Support for new interrupt chips or not yet supported variants:
STM32MP14, Meson GPIO, Apple M1 PMU, Apple M1 AICv2, Qualcomm MPM
- Convert the Xilinx driver to generic interrupt domains
- Cleanup the irq_chip::name handling
- The usual cleanups and improvements all over the place"
* tag 'irq-core-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
irqchip: Add Qualcomm MPM controller driver
dt-bindings: interrupt-controller: Add Qualcomm MPM support
irqchip/apple-aic: Add support for AICv2
irqchip/apple-aic: Support multiple dies
irqchip/apple-aic: Dynamically compute register offsets
irqchip/apple-aic: Switch to irq_domain_create_tree and sparse hwirqs
irqchip/apple-aic: Add Fast IPI support
dt-bindings: interrupt-controller: apple,aic2: New binding for AICv2
PCI: apple: Change MSI handling to handle 4-cell AIC fwspec form
irqchip/apple-aic: Fix cpumask allocation for FIQs
irqchip/meson-gpio: Add support for meson s4 SoCs
irqchip/meson-gpio: add select trigger type callback
irqchip/meson-gpio: support more than 8 channels gpio irq
dt-bindings: interrupt-controller: New binding for Meson-S4 SoCs
irqchip/xilinx: Switch to GENERIC_IRQ_MULTI_HANDLER
staging: greybus: gpio: Use generic_handle_irq_safe().
net: usb: lan78xx: Use generic_handle_irq_safe().
mfd: ezx-pcap: Use generic_handle_irq_safe().
misc: hi6421-spmi-pmic: Use generic_handle_irq_safe().
irqchip/sifive-plic: Disable S-mode IRQs if running in M-mode
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and timekeeping updates from Thomas Gleixner:
"Core code:
- Make the NOHZ handling of the timekeeping/tick core more robust to
prevent a rare jiffies update stall.
- Handle softirqs in the NOHZ/idle case correctly
Drivers:
- Add support for event stream scaling of the 1GHz counter on ARM(64)
- Correct an error code check in the timer-of layer
- The usual cleanups and improvements all over the place"
* tag 'timers-core-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
lib/irq_poll: Declare IRQ_POLL softirq vector as ksoftirqd-parking safe
tick/rcu: Stop allowing RCU_SOFTIRQ in idle
tick/rcu: Remove obsolete rcu_needs_cpu() parameters
tick: Detect and fix jiffies update stall
clocksource/drivers/timer-of: Check return value of of_iomap in timer_of_base_init()
clocksource/drivers/timer-microchip-pit64b: Use 5MHz for clockevent
clocksource/drivers/timer-microchip-pit64b: Use notrace
clocksource/drivers/timer-microchip-pit64b: Remove mmio selection
dt-bindings: timer: Tegra: Convert text bindings to yaml
clocksource/drivers/imx-tpm: Move tpm_read_sched_clock() under CONFIG_ARM
clocksource/drivers/arm_arch_timer: Use event stream scaling when available
clocksource/drivers/exynos_mct: Increase the size of name array
clocksource/drivers/exynos_mct: Bump up mct max irq number
clocksource/drivers/exynos_mct: Remove mct interrupt index enum
clocksource/drivers/exynos_mct: Handle DTS with higher number of interrupts
clocksource/drivers/timer-ti-dm: Fix regression from errata i940 fix
clocksource/drivers/imx-tpm: Exclude sched clock for ARM64
clocksource: Add a Kconfig option for WATCHDOG_MAX_SKEW
clocksource/drivers/imx-tpm: Update name of clkevt
clocksource/drivers/imx-tpm: Add CLOCK_EVT_FEAT_DYNIRQ
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PASID support from Thomas Gleixner:
"Reenable ENQCMD/PASID support:
- Simplify the PASID handling to allocate the PASID once, associate
it to the mm of a process and free it on mm_exit().
The previous attempt of refcounted PASIDs and dynamic
alloc()/free() turned out to be error prone and too complex. The
PASID space is 20bits, so the case of resource exhaustion is a pure
academic concern.
- Populate the PASID MSR on demand via #GP to avoid racy updates via
IPIs.
- Reenable ENQCMD and let objtool check for the forbidden usage of
ENQCMD in the kernel.
- Update the documentation for Shared Virtual Addressing accordingly"
* tag 'x86-pasid-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/x86: Update documentation for SVA (Shared Virtual Addressing)
tools/objtool: Check for use of the ENQCMD instruction in the kernel
x86/cpufeatures: Re-enable ENQCMD
x86/traps: Demand-populate PASID MSR via #GP
sched: Define and initialize a flag to identify valid PASID in the task
x86/fpu: Clear PASID when copying fpstate
iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
kernel/fork: Initialize mm's PASID
iommu/ioasid: Introduce a helper to check for valid PASIDs
mm: Change CONFIG option for mm->pasid field
iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu feature updates from Borislav Petkov:
- Merge the AMD and Intel PPIN code into a shared one by both vendors.
Add the PPIN number to sysfs so that sockets can be identified when
replacement is needed
- Minor fixes and cleanups
* tag 'x86_cpu_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Clear SME feature flag when not in use
x86/cpufeatures: Put the AMX macros in the word 18 block
topology/sysfs: Add PPIN in sysfs under cpu topology
topology/sysfs: Add format parameter to macro defining "show" functions for proc
x86/cpu: Read/save PPIN MSR during initialization
x86/cpu: X86_FEATURE_INTEL_PPIN finally has a CPUID bit
x86/cpu: Merge Intel and AMD ppin_init() functions
x86/CPU/AMD: Use default_groups in kobj_type
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Add support for newer AMD family 0x19, models 0x10-... CPUs to
amd64_edac
- The usual amount of improvements and fixes
* tag 'edac_updates_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/altera: Add SDRAM ECC check for U-Boot
EDAC/amd64: Add new register offset support and related changes
EDAC/amd64: Set memory type per DIMM
EDAC/mc: Remove unnecessary cast to char * in edac_align_ptr()
EDAC: Use default_groups in kobj_type
EDAC: Use proper list of struct attribute for attributes
|
|
Pull ARM updates from Russell King:
- amba bus cleanups
- conversion to use reserve_initrd_mem()
- remove -nostdlib from vdso link
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9181/1: vdso: remove -nostdlib compiler flag
ARM: 9175/1: Convert to reserve_initrd_mem()
ARM: 9174/1: amba: Move EXPORT_SYMBOL() closer to definition
ARM: 9173/1: amba: kill amba_find_match()
ARM: 9172/1: amba: Cleanup amba pclk operation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
- Support for including MTE tags in ELF coredumps
- Instruction encoder updates, including fixes to 64-bit immediate
generation and support for the LSE atomic instructions
- Improvements to kselftests for MTE and fpsimd
- Symbol aliasing and linker script cleanups
- Reduce instruction cache maintenance performed for user mappings
created using contiguous PTEs
- Support for the new "asymmetric" MTE mode, where stores are checked
asynchronously but loads are checked synchronously
- Support for the latest pointer authentication algorithm ("QARMA3")
- Support for the DDR PMU present in the Marvell CN10K platform
- Support for the CPU PMU present in the Apple M1 platform
- Use the RNDR instruction for arch_get_random_{int,long}()
- Update our copy of the Arm optimised string routines for str{n}cmp()
- Fix signal frame generation for CPUs which have foolishly elected to
avoid building in support for the fpsimd instructions
- Workaround for Marvell GICv3 erratum #38545
- Clarification to our Documentation (booting reqs. and MTE prctl())
- Miscellanous cleanups and minor fixes
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (90 commits)
docs: sysfs-devices-system-cpu: document "asymm" value for mte_tcf_preferred
arm64/mte: Remove asymmetric mode from the prctl() interface
arm64: Add cavium_erratum_23154_cpus missing sentinel
perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driver
arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition
Documentation: vmcoreinfo: Fix htmldocs warning
kasan: fix a missing header include of static_keys.h
drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
drivers/perf: arm_pmu: Handle 47 bit counters
arm64: perf: Consistently make all event numbers as 16-bits
arm64: perf: Expose some Armv9 common events under sysfs
perf/marvell: cn10k DDR perf event core ownership
perf/marvell: cn10k DDR perfmon event overflow handling
perf/marvell: CN10k DDR performance monitor support
dt-bindings: perf: marvell: cn10k ddr performance monitor
arm64: clean up tools Makefile
perf/arm-cmn: Update watchpoint format
perf/arm-cmn: Hide XP PUB events for CMN-600
arm64: drop unused includes of <linux/personality.h>
arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"In order to split the work a bit we've aligned with David Howells more
or less that I take more hardware/firmware aligned keyring patches,
and he takes care more of the framework aligned patches.
For TPM the patches worth of highlighting are the fixes for
refcounting provided by Lino Sanfilippo and James Bottomley.
Eric B. has done a bunch obvious (but important) fixes but there's one
a bit controversial: removal of asym_tpm. It was added in 2018 when
TPM1 was already declared as insecure and world had moved on to TPM2.
I don't know how this has passed all the filters but I did not have a
chance to see the patches when they were out. I simply cannot commit
to maintaining this because it was from all angles just wrong to take
it in the first place to the mainline kernel. Nobody should use this
module really for anything.
Finally, there is a new keyring '.machine' to hold MOK keys ('Machine
Owner Keys'). In the mok side MokListTrustedRT UEFI variable can be
set, from which kernel knows that MOK keys are kernel trusted keys and
they are populated to the machine keyring. This keyring linked to the
secondary trusted keyring, which means that can be used like any
kernel trusted keys. This keyring of course can be used to hold other
MOK'ish keys in other platforms in future"
* tag 'tpmdd-next-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (24 commits)
tpm: use try_get_ops() in tpm-space.c
KEYS: asymmetric: properly validate hash_algo and encoding
KEYS: asymmetric: enforce that sig algo matches key algo
KEYS: remove support for asym_tpm keys
tpm: fix reference counting for struct tpm_chip
integrity: Only use machine keyring when uefi_check_trust_mok_keys is true
integrity: Trust MOK keys if MokListTrustedRT found
efi/mokvar: move up init order
KEYS: Introduce link restriction for machine keys
KEYS: store reference to machine keyring
integrity: add new keyring handler for mok keys
integrity: Introduce a Linux keyring called machine
integrity: Fix warning about missing prototypes
KEYS: trusted: Avoid calling null function trusted_key_exit
KEYS: trusted: Fix trusted key backends when building as module
tpm: xen-tpmfront: Use struct_size() helper
KEYS: x509: remove dead code that set ->unsupported_sig
KEYS: x509: remove never-set ->unsupported_key flag
KEYS: x509: remove unused fields
KEYS: x509: clearly distinguish between key and signature algorithms
...
|
|
These functions are page cache functionality and don't need to be
declared in fs.h.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
|
|
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Two driver fixes:
- a fix for zinitix touchscreen to properly report contacts
- a fix for aiptek tablet driver to be more resilient to devices with
incorrect descriptors"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: aiptek - properly check endpoint type
Input: zinitix - do not report shadow fingers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small(ish) fixes, both in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: fnic: Finish scsi_cmnd before dropping the spinlock
scsi: mpt3sas: Page fault in reply q processing
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fix from Greg KH:
"Here is a single driver fix for 5.17-final that has been submitted
many times but I somehow missed it in my patch queue:
- fix for counter sysfs code for reported problem
This has been in linux-next all week with no reported issues"
* tag 'char-misc-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
counter: Stop using dev_get_drvdata() to get the counter device
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small remaining USB fixes for 5.17-final.
They include:
- two USB gadget driver fixes for reported problems
- usbtmc driver fix for syzbot found issues
- musb patch partial revert to resolve a reported regression.
All of these have been in linux-next this week with no reported
problems"
* tag 'usb-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: Fix use-after-free bug by not setting udc->dev.driver
usb: usbtmc: Fix bug in pipe direction for control transfers
partially Revert "usb: musb: Set the DT node on the child device"
usb: gadget: rndis: prevent integer overflow in rndis_set_response()
|
|
Sadly, while firmware 1.5 fixed temperature labels on my
Inspiron 3505, it also caused fan type calls to take
ca. 4 seconds with the fan being at full speed.
Fix the resulting delays by adding the model to the
blacklist.
Tested on a Dell Inspiron 3505.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20220318183408.13286-1-W_Armin@gmx.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Pull block fixes from Jens Axboe:
- Revert of a nvme target feature (Hannes)
- Fix a memory leak with rq-qos (Ming)
* tag 'block-5.17-2022-03-18' of git://git.kernel.dk/linux-block:
nvmet: revert "nvmet: make discovery NQN configurable"
block: release rq qos structures for queue without disk
|
|
Pull drm fixes from Dave Airlie:
"A few minor changes to finish things off, one mgag200 regression, imx
fix and couple of panel changes.
imx:
- Don't test bus flags in atomic check
mgag200:
- Fix PLL setup on some models
panel:
- Fix bpp settings on Innolux G070Y2-L01
- Fix DRM_PANEL_EDP Kconfig dependencies"
* tag 'drm-fixes-2022-03-18' of git://anongit.freedesktop.org/drm/drm:
drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS
drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings
drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()
drm/mgag200: Fix PLL setup for g200wb and g200ew
|
|
Merge Intel Hardware Feedback Interface (HFI) thermal driver for
5.18-rc1 and update the intel-speed-select utility to support that
driver.
* thermal-hfi:
tools/power/x86/intel-speed-select: v1.12 release
tools/power/x86/intel-speed-select: HFI support
tools/power/x86/intel-speed-select: OOB daemon mode
thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET
thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub
thermal: intel: hfi: Notify user space for HFI events
thermal: netlink: Add a new event to notify CPU capabilities change
thermal: intel: hfi: Enable notification interrupt
thermal: intel: hfi: Handle CPU hotplug events
thermal: intel: hfi: Minimally initialize the Hardware Feedback Interface
x86/cpu: Add definitions for the Intel Hardware Feedback Interface
x86/Documentation: Describe the Intel Hardware Feedback Interface
|
|
Merge powerclamp thermal driver changes, int340x thermal driver changes
and thermal documentation changes for 5.18-rc1:
- Don't use bitmap_weight() in end_power_clamp() in the powerclamp
driver (Yury Norov).
- Update the OS policy capabilities handshake in the int340x thermal
driver (Srinivas Pandruvada).
- Increase the policies bitmap size in int340x (Srinivas Pandruvada).
- Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
int340x thermal driver (Rafael Wysocki).
- Check for NULL after calling kmemdup() in int340x (Jiasheng Jiang).
- Add Intel Dynamic Power and Thermal Framework (DPTF) kernel interface
documentation (Srinivas Pandruvada).
- Fix bullet list warning in the thermal documentation (Randy Dunlap).
* thermal-powerclamp:
thermal: intel_powerclamp: don't use bitmap_weight() in end_power_clamp()
* thermal-int340x:
thermal: int340x: Update OS policy capability handshake
thermal: int340x: Increase bitmap size
thermal: Replace acpi_bus_get_device()
thermal: int340x: Check for NULL after calling kmemdup()
* thermal-docs:
Documentation: thermal: DPTF Documentation
thermal: fix Documentation bullet list warning
|