Age | Commit message (Collapse) | Author |
|
This reverts commits 4bad58ebc8bc4f20d89cff95417c9b4674769709 (and
399f8dd9a866e107639eabd3c1979cd526ca3a98, which tried to fix it).
I do not believe these are correct, and I'm about to release 5.13, so am
reverting them out of an abundance of caution.
The locking is odd, and appears broken.
On the allocation side (in __sigqueue_alloc()), the locking is somewhat
straightforward: it depends on sighand->siglock. Since one caller
doesn't hold that lock, it further then tests 'sigqueue_flags' to avoid
the case with no locks held.
On the freeing side (in sigqueue_cache_or_free()), there is no locking
at all, and the logic instead depends on 'current' being a single
thread, and not able to race with itself.
To make things more exciting, there's also the data race between freeing
a signal and allocating one, which is handled by using WRITE_ONCE() and
READ_ONCE(), and being mutually exclusive wrt the initial state (ie
freeing will only free if the old state was NULL, while allocating will
obviously only use the value if it was non-NULL, so only one or the
other will actually act on the value).
However, while the free->alloc paths do seem mutually exclusive thanks
to just the data value dependency, it's not clear what the memory
ordering constraints are on it. Could writes from the previous
allocation possibly be delayed and seen by the new allocation later,
causing logical inconsistencies?
So it's all very exciting and unusual.
And in particular, it seems that the freeing side is incorrect in
depending on "current" being single-threaded. Yes, 'current' is a
single thread, but in the presense of asynchronous events even a single
thread can have data races.
And such asynchronous events can and do happen, with interrupts causing
signals to be flushed and thus free'd (for example - sending a
SIGCONT/SIGSTOP can happen from interrupt context, and can flush
previously queued process control signals).
So regardless of all the other questions about the memory ordering and
locking for this new cached allocation, the sigqueue_cache_or_free()
assumptions seem to be fundamentally incorrect.
It may be that people will show me the errors of my ways, and tell me
why this is all safe after all. We can reinstate it if so. But my
current belief is that the WRITE_ONCE() that sets the cached entry needs
to be a smp_store_release(), and the READ_ONCE() that finds a cached
entry needs to be a smp_load_acquire() to handle memory ordering
correctly.
And the sequence in sigqueue_cache_or_free() would need to either use a
lock or at least be interrupt-safe some way (perhaps by using something
like the percpu 'cmpxchg': it doesn't need to be SMP-safe, but like the
percpu operations it needs to be interrupt-safe).
Fixes: 399f8dd9a866 ("signal: Prevent sigqueue caching after task got released")
Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Merge misc fixes from Andrew Morton:
"24 patches, based on 4a09d388f2ab382f217a764e6a152b3f614246f6.
Subsystems affected by this patch series: mm (thp, vmalloc, hugetlb,
memory-failure, and pagealloc), nilfs2, kthread, MAINTAINERS, and
mailmap"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
mailmap: add Marek's other e-mail address and identity without diacritics
MAINTAINERS: fix Marek's identity again
mm/page_alloc: do bulk array bounds check after checking populated elements
mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array
mm/hwpoison: do not lock page again when me_huge_page() successfully recovers
mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned
mm/memory-failure: use a mutex to avoid memory_failure() races
mm, futex: fix shared futex pgoff on shmem huge page
kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
kthread_worker: split code for canceling the delayed work timer
mm/vmalloc: unbreak kasan vmalloc support
KVM: s390: prepare for hugepage vmalloc
mm/vmalloc: add vmalloc_no_huge
nilfs2: fix memory leak in nilfs_sysfs_delete_device_group
mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
mm: page_vma_mapped_walk(): get vma_address_end() earlier
mm: page_vma_mapped_walk(): use goto instead of while (1)
mm: page_vma_mapped_walk(): add a level of indentation
mm: page_vma_mapped_walk(): crossing page table boundary
...
|
|
Pull ceph fixes from Ilya Dryomov:
"Two regression fixes from the merge window: one in the auth code
affecting old clusters and one in the filesystem for proper
propagation of MDS request errors.
Also included a locking fix for async creates, marked for stable"
* tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client:
libceph: set global_id as soon as we get an auth ticket
libceph: don't pass result into ac->ops->handle_reply()
ceph: fix error handling in ceph_atomic_open and ceph_lookup
ceph: must hold snap_rwsem when filling inode for async create
|
|
If more than one futex is placed on a shmem huge page, it can happen
that waking the second wakes the first instead, and leaves the second
waiting: the key's shared.pgoff is wrong.
When 3.11 commit 13d60f4b6ab5 ("futex: Take hugepages into account when
generating futex_key"), the only shared huge pages came from hugetlbfs,
and the code added to deal with its exceptional page->index was put into
hugetlb source. Then that was missed when 4.8 added shmem huge pages.
page_to_pgoff() is what others use for this nowadays: except that, as
currently written, it gives the right answer on hugetlbfs head, but
nonsense on hugetlbfs tails. Fix that by calling hugetlbfs-specific
hugetlb_basepage_index() on PageHuge tails as well as on head.
Yes, it's unconventional to declare hugetlb_basepage_index() there in
pagemap.h, rather than in hugetlb.h; but I do not expect anything but
page_to_pgoff() ever to need it.
[akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope]
Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Reported-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Zhang Yi <wetpzy@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm: add vmalloc_no_huge and use it", v4.
Add vmalloc_no_huge() and export it, so modules can allocate memory with
small pages.
Use the newly added vmalloc_no_huge() in KVM on s390 to get around a
hardware limitation.
This patch (of 2):
Commit 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") added
support for hugepage vmalloc mappings, it also added the flag
VM_NO_HUGE_VMAP for __vmalloc_node_range to request the allocation to be
performed with 0-order non-huge pages.
This flag is not accessible when calling vmalloc, the only option is to
call directly __vmalloc_node_range, which is not exported.
This means that a module can't vmalloc memory with small pages.
Case in point: KVM on s390x needs to vmalloc a large area, and it needs
to be mapped with non-huge pages, because of a hardware limitation.
This patch adds the function vmalloc_no_huge, which works like vmalloc,
but it is guaranteed to always back the mapping using small pages. This
new function is exported, therefore it is usable by modules.
[akpm@linux-foundation.org: whitespace fixes, per Christoph]
Link: https://lkml.kernel.org/r/20210614132357.10202-1-imbrenda@linux.ibm.com
Link: https://lkml.kernel.org/r/20210614132357.10202-2-imbrenda@linux.ibm.com
Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 61ca49a9105f ("libceph: don't set global_id until we get an
auth ticket") delayed the setting of global_id too much. It is set
only after all tickets are received, but in pre-nautilus clusters an
auth ticket and the service tickets are obtained in separate steps
(for a total of three MAuth replies). When the service tickets are
requested, global_id is used to build an authorizer; if global_id is
still 0 we never get them and fail to establish the session.
Moving the setting of global_id into protocol implementations. This
way global_id can be set exactly when an auth ticket is received, not
sooner nor later.
Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
|
There is no result to pass in msgr2 case because authentication
failures are reported through auth_bad_method frame and in MAuth
case an error is returned immediately.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
|
Better handle the failure paths.
vmlinux.o: warning: objtool: debug_locks_off()+0x23: call to console_verbose() leaves .noinstr.text section
vmlinux.o: warning: objtool: debug_locks_off()+0x19: call to __kasan_check_write() leaves .noinstr.text section
debug_locks_off+0x19/0x40:
instrument_atomic_write at include/linux/instrumented.h:86
(inlined by) __debug_locks_off at include/linux/debug_locks.h:17
(inlined by) debug_locks_off at lib/debug_locks.c:41
Fixes: 6eebad1ad303 ("lockdep: __always_inline more for noinstr")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210621120120.784404944@infradead.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.13-rc7, including fixes from wireless, bpf,
bluetooth, netfilter and can.
Current release - regressions:
- mlxsw: spectrum_qdisc: Pass handle, not band number to find_class()
to fix modifying offloaded qdiscs
- lantiq: net: fix duplicated skb in rx descriptor ring
- rtnetlink: fix regression in bridge VLAN configuration, empty info
is not an error, bot-generated "fix" was not needed
- libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix umem
creation
Current release - new code bugs:
- ethtool: fix NULL pointer dereference during module EEPROM dump via
the new netlink API
- mlx5e: don't update netdev RQs with PTP-RQ, the special purpose
queue should not be visible to the stack
- mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs
- mlx5e: verify dev is present in get devlink port ndo, avoid a panic
Previous releases - regressions:
- neighbour: allow NUD_NOARP entries to be force GCed
- further fixes for fallout from reorg of WiFi locking (staging:
rtl8723bs, mac80211, cfg80211)
- skbuff: fix incorrect msg_zerocopy copy notifications
- mac80211: fix NULL ptr deref for injected rate info
- Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs
Previous releases - always broken:
- bpf: more speculative execution fixes
- netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local
- udp: fix race between close() and udp_abort() resulting in a panic
- fix out of bounds when parsing TCP options before packets are
validated (in netfilter: synproxy, tc: sch_cake and mptcp)
- mptcp: improve operation under memory pressure, add missing
wake-ups
- mptcp: fix double-lock/soft lookup in subflow_error_report()
- bridge: fix races (null pointer deref and UAF) in vlan tunnel
egress
- ena: fix DMA mapping function issues in XDP
- rds: fix memory leak in rds_recvmsg
Misc:
- vrf: allow larger MTUs
- icmp: don't send out ICMP messages with a source address of 0.0.0.0
- cdc_ncm: switch to eth%d interface naming"
* tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (139 commits)
net: ethernet: fix potential use-after-free in ec_bhf_remove
selftests/net: Add icmp.sh for testing ICMP dummy address responses
icmp: don't send out ICMP messages with a source address of 0.0.0.0
net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY
net: ll_temac: Fix TX BD buffer overwrite
net: ll_temac: Add memory-barriers for TX BD access
net: ll_temac: Make sure to free skb when it is completely used
MAINTAINERS: add Guvenc as SMC maintainer
bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path
bnxt_en: Fix TQM fastpath ring backing store computation
bnxt_en: Rediscover PHY capabilities after firmware reset
cxgb4: fix wrong shift.
mac80211: handle various extensible elements correctly
mac80211: reset profile_periodicity/ema_ap
cfg80211: avoid double free of PMSR request
cfg80211: make certificate generation more robust
mac80211: minstrel_ht: fix sample time check
net: qed: Fix memcpy() overflow of qed_dcbx_params()
net: cdc_eem: fix tx fixup skb leak
net: hamradio: fix memory leak in mkiss_close
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Remove recently added frequency invariance support from the CPPC
cpufreq driver, because it has turned out to be problematic and it
cannot be fixed properly on time for 5.13 (Viresh Kumar)"
* tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "cpufreq: CPPC: Add support for frequency invariance"
|
|
Running devlink reload command for port in switchdev mode cause
resources to corrupt: driver can't release allocated EQ and reclaim
memory pages, because "rdma" auxiliary device had add CQs which blocks
EQ from deletion.
Erroneous sequence happens during reload-down phase, and is following:
1. detach device - suspends auxiliary devices which support it, destroys
others. During this step "eth-rep" and "rdma-rep" are destroyed,
"eth" - suspended.
2. disable SRIOV - moves device to legacy mode; as part of disablement -
rescans drivers. This step adds "rdma" auxiliary device.
3. destroy EQ table - <failure>.
Driver shouldn't create any device during unload flows. To handle that
implement MLX5_PRIV_FLAGS_DETACH flag, set it on device detach and unset
on device attach. If flag is set do no-op on drivers rescan.
Fixes: a925b5e309c9 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
There is a race between THP unmapping and truncation, when truncate sees
pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared
it, but before its page_remove_rmap() gets to decrement
compound_mapcount: generating false "BUG: Bad page cache" reports that
the page is still mapped when deleted. This commit fixes that, but not
in the way I hoped.
The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
instead of unmap_mapping_range() in truncate_cleanup_page(): it has
often been an annoyance that we usually call unmap_mapping_range() with
no pages locked, but there apply it to a single locked page.
try_to_unmap() looks more suitable for a single locked page.
However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
it is used to insert THP migration entries, but not used to unmap THPs.
Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
needs are different, I'm too ignorant of the DAX cases, and couldn't
decide how far to go for anon+swap. Set that aside.
The second attempt took a different tack: make no change in truncate.c,
but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
clearing it initially, then pmd_clear() between page_remove_rmap() and
unlocking at the end. Nice. But powerpc blows that approach out of the
water, with its serialize_against_pte_lookup(), and interesting pgtable
usage. It would need serious help to get working on powerpc (with a
minor optimization issue on s390 too). Set that aside.
Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
that's likely to reduce or eliminate the number of incidents, it would
give less assurance of whether we had identified the problem correctly.
This successful iteration introduces "unmap_mapping_page(page)" instead
of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
with an addition to details. Then zap_pmd_range() watches for this
case, and does spin_unlock(pmd_lock) if so - just like
page_vma_mapped_walk() now does in the PVMW_SYNC case. Not pretty, but
safe.
Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
assert its interface; but currently that's only used to make sure that
page->mapping is stable, and zap_pmd_range() doesn't care if the page is
locked or not. Along these lines, in invalidate_inode_pages2_range()
move the initial unmap_mapping_range() out from under page lock, before
then calling unmap_mapping_page() under page lock if still mapped.
Link: https://lkml.kernel.org/r/a2a4a148-cdd8-942c-4ef8-51b77f643dbe@google.com
Fixes: fc127da085c2 ("truncate: handle file thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Stressing huge tmpfs often crashed on unmap_page()'s VM_BUG_ON_PAGE
(!unmap_success): with dump_page() showing mapcount:1, but then its raw
struct page output showing _mapcount ffffffff i.e. mapcount 0.
And even if that particular VM_BUG_ON_PAGE(!unmap_success) is removed,
it is immediately followed by a VM_BUG_ON_PAGE(compound_mapcount(head)),
and further down an IS_ENABLED(CONFIG_DEBUG_VM) total_mapcount BUG():
all indicative of some mapcount difficulty in development here perhaps.
But the !CONFIG_DEBUG_VM path handles the failures correctly and
silently.
I believe the problem is that once a racing unmap has cleared pte or
pmd, try_to_unmap_one() may skip taking the page table lock, and emerge
from try_to_unmap() before the racing task has reached decrementing
mapcount.
Instead of abandoning the unsafe VM_BUG_ON_PAGE(), and the ones that
follow, use PVMW_SYNC in try_to_unmap_one() in this case: adding
TTU_SYNC to the options, and passing that from unmap_page().
When CONFIG_DEBUG_VM, or for non-debug too? Consensus is to do the same
for both: the slight overhead added should rarely matter, except perhaps
if splitting sparsely-populated multiply-mapped shmem. Once confident
that bugs are fixed, TTU_SYNC here can be removed, and the race
tolerated.
Link: https://lkml.kernel.org/r/c1e95853-8bcd-d8fd-55fa-e7f2488e78f@google.com
Fixes: fec89c109f3a ("thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Most callers of is_huge_zero_pmd() supply a pmd already verified
present; but a few (notably zap_huge_pmd()) do not - it might be a pmd
migration entry, in which the pfn is encoded differently from a present
pmd: which might pass the is_huge_zero_pmd() test (though not on x86,
since L1TF forced us to protect against that); or perhaps even crash in
pmd_page() applied to a swap-like entry.
Make it safe by adding pmd_present() check into is_huge_zero_pmd()
itself; and make it quicker by saving huge_zero_pfn, so that
is_huge_zero_pmd() will not need to do that pmd_page() lookup each time.
__split_huge_pmd_locked() checked pmd_trans_huge() before: that worked,
but is unnecessary now that is_huge_zero_pmd() checks present.
Link: https://lkml.kernel.org/r/21ea9ca-a1f5-8b90-5e88-95fb1c49bbfa@google.com
Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The routine restore_reserve_on_error is called to restore reservation
information when an error occurs after page allocation. The routine
alloc_huge_page modifies the mapping reserve map and potentially the
reserve count during allocation. If code calling alloc_huge_page
encounters an error after allocation and needs to free the page, the
reservation information needs to be adjusted.
Currently, restore_reserve_on_error only takes action on pages for which
the reserve count was adjusted(HPageRestoreReserve flag). There is
nothing wrong with these adjustments. However, alloc_huge_page ALWAYS
modifies the reserve map during allocation even if the reserve count is
not adjusted. This can cause issues as observed during development of
this patch [1].
One specific series of operations causing an issue is:
- Create a shared hugetlb mapping
Reservations for all pages created by default
- Fault in a page in the mapping
Reservation exists so reservation count is decremented
- Punch a hole in the file/mapping at index previously faulted
Reservation and any associated pages will be removed
- Allocate a page to fill the hole
No reservation entry, so reserve count unmodified
Reservation entry added to map by alloc_huge_page
- Error after allocation and before instantiating the page
Reservation entry remains in map
- Allocate a page to fill the hole
Reservation entry exists, so decrement reservation count
This will cause a reservation count underflow as the reservation count
was decremented twice for the same index.
A user would observe a very large number for HugePages_Rsvd in
/proc/meminfo. This would also likely cause subsequent allocations of
hugetlb pages to fail as it would 'appear' that all pages are reserved.
This sequence of operations is unlikely to happen, however they were
easily reproduced and observed using hacked up code as described in [1].
Address the issue by having the routine restore_reserve_on_error take
action on pages where HPageRestoreReserve is not set. In this case, we
need to remove any reserve map entry created by alloc_huge_page. A new
helper routine vma_del_reservation assists with this operation.
There are three callers of alloc_huge_page which do not currently call
restore_reserve_on error before freeing a page on error paths. Add
those missing calls.
[1] https://lore.kernel.org/linux-mm/20210528005029.88088-1-almasrymina@google.com/
Link: https://lkml.kernel.org/r/20210607204510.22617-1-mike.kravetz@oracle.com
Fixes: 96b96a96ddee ("mm/hugetlb: fix huge page reservation leak in private mapping error paths"
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I found it by pure code review, that pte_same_as_swp() of unuse_vma()
didn't take uffd-wp bit into account when comparing ptes.
pte_same_as_swp() returning false negative could cause failure to
swapoff swap ptes that was wr-protected by userfaultfd.
Link: https://lkml.kernel.org/r/20210603180546.9083-1-peterx@redhat.com
Fixes: f45ec5ff16a7 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [5.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When hugetlb page fault (under overcommitting situation) and
memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following
race:
CPU0: CPU1:
gather_surplus_pages()
page = alloc_surplus_huge_page()
memory_failure_hugetlb()
get_hwpoison_page(page)
__get_hwpoison_page(page)
get_page_unless_zero(page)
zero = put_page_testzero(page)
VM_BUG_ON_PAGE(!zero, page)
enqueue_huge_page(h, page)
put_page(page)
__get_hwpoison_page() only checks the page refcount before taking an
additional one for memory error handling, which is not enough because
there's a time window where compound pages have non-zero refcount during
hugetlb page initialization.
So make __get_hwpoison_page() check page status a bit more for hugetlb
pages with get_hwpoison_huge_page(). Checking hugetlb-specific flags
under hugetlb_lock makes sure that the hugetlb page is not transitive.
It's notable that another new function, HWPoisonHandlable(), is helpful
to prevent a race against other transitive page states (like a generic
compound page just before PageHuge becomes true).
Link: https://lkml.kernel.org/r/20210603233632.2964832-2-nao.horiguchi@gmail.com
Fixes: ead07f6a867b ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Scaled PPM conversion to PPB may (on 64bit systems) result
in a value larger than s32 can hold (freq/scaled_ppm is a long).
This means the kernel will not correctly reject unreasonably
high ->freq values (e.g. > 4294967295ppb, 281474976645 scaled PPM).
The conversion is equivalent to a division by ~66 (65.536),
so the value of ppb is always smaller than ppm, but not small
enough to assume narrowing the type from long -> s32 is okay.
Note that reasonable user space (e.g. ptp4l) will not use such
high values, anyway, 4289046510ppb ~= 4.3x, so the fix is
somewhat pedantic.
Fixes: d39a743511cd ("ptp: validate the requested frequency adjustment.")
Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 4c38f2df71c8e33c0b64865992d693f5022eeaad.
There are few races in the frequency invariance support for CPPC driver,
namely the driver doesn't stop the kthread_work and irq_work on policy
exit during suspend/resume or CPU hotplug.
A proper fix won't be possible for the 5.13-rc, as it requires a lot of
changes. Lets revert the patch instead for now.
Fixes: 4c38f2df71c8 ("cpufreq: CPPC: Add support for frequency invariance")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
0day robot reported a 9.2% regression for will-it-scale mmap1 test
case[1], caused by commit 57efa1fe5957 ("mm/gup: prevent gup_fast from
racing with COW during fork").
Further debug shows the regression is due to that commit changes the
offset of hot fields 'mmap_lock' inside structure 'mm_struct', thus some
cache alignment changes.
From the perf data, the contention for 'mmap_lock' is very severe and
takes around 95% cpu cycles, and it is a rw_semaphore
struct rw_semaphore {
atomic_long_t count; /* 8 bytes */
atomic_long_t owner; /* 8 bytes */
struct optimistic_spin_queue osq; /* spinner MCS lock */
...
Before commit 57efa1fe5957 adds the 'write_protect_seq', it happens to
have a very optimal cache alignment layout, as Linus explained:
"and before the addition of the 'write_protect_seq' field, the
mmap_sem was at offset 120 in 'struct mm_struct'.
Which meant that count and owner were in two different cachelines,
and then when you have contention and spend time in
rwsem_down_write_slowpath(), this is probably *exactly* the kind
of layout you want.
Because first the rwsem_write_trylock() will do a cmpxchg on the
first cacheline (for the optimistic fast-path), and then in the
case of contention, rwsem_down_write_slowpath() will just access
the second cacheline.
Which is probably just optimal for a load that spends a lot of
time contended - new waiters touch that first cacheline, and then
they queue themselves up on the second cacheline."
After the commit, the rw_semaphore is at offset 128, which means the
'count' and 'owner' fields are now in the same cacheline, and causes
more cache bouncing.
Currently there are 3 "#ifdef CONFIG_XXX" before 'mmap_lock' which will
affect its offset:
CONFIG_MMU
CONFIG_MEMBARRIER
CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
The layout above is on 64 bits system with 0day's default kernel config
(similar to RHEL-8.3's config), in which all these 3 options are 'y'.
And the layout can vary with different kernel configs.
Relayouting a structure is usually a double-edged sword, as sometimes it
can helps one case, but hurt other cases. For this case, one solution
is, as the newly added 'write_protect_seq' is a 4 bytes long seqcount_t
(when CONFIG_DEBUG_LOCK_ALLOC=n), placing it into an existing 4 bytes
hole in 'mm_struct' will not change other fields' alignment, while
restoring the regression.
Link: https://lore.kernel.org/lkml/20210525031636.GB7744@xsang-OptiPlex-9020/ [1]
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is a panic in socket ioctl cmd SIOCGSKNS when NET_NS is not enabled.
The reason is that nsfs tries to access ns->ops but the proc_ns_operations
is not implemented in this case.
[7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[7.670268] pgd = 32b54000
[7.670544] [00000010] *pgd=00000000
[7.671861] Internal error: Oops: 5 [#1] SMP ARM
[7.672315] Modules linked in:
[7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
[7.673309] Hardware name: Generic DT based system
[7.673642] PC is at nsfs_evict+0x24/0x30
[7.674486] LR is at clear_inode+0x20/0x9c
The same to tun SIOCGSKNS command.
To fix this problem, we make get_net_ns() return -EINVAL when NET_NS is
disabled. Meanwhile move it to right place net/core/net_namespace.c.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: c62cce2caee5 ("net: add an ioctl to get a socket network namespace")
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of tiny USB fixes for 5.13-rc6.
There are more than I would normally like, but there's been a bunch of
people banging on the gadget and dwc3 and typec code recently for I
think an Android release, which has resulted in a number of small
fixes. It's nice to see companies send fixes upstream for this type of
work, a notable change from years ago.
Anyway, fixes in here are:
- usb-serial device id updates
- usb-serial cp210x driver fixes for broken firmware versions
- typec fixes for crazy charging devices and other reported problems
- dwc3 fixes for reported problems found
- gadget fixes for reported problems
- tiny xhci fixes
- other small fixes for reported issues.
- revert of a problem fix found by linux-next testing
All of these have passed 0-day and linux-next testing with no reported
problems (the revert for the found linux-next build problem included)"
* tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (44 commits)
Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs"
usb: typec: mux: Fix copy-paste mistake in typec_mux_match
usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
usb: gadget: fsl: Re-enable driver for ARM SoCs
usb: typec: wcove: Use LE to CPU conversion when accessing msg->header
USB: serial: cp210x: fix CP2102N-A01 modem control
USB: serial: cp210x: fix alternate function for CP2102N QFN20
usb: misc: brcmstb-usb-pinmap: check return value after calling platform_get_resource()
usb: dwc3: ep0: fix NULL pointer exception
usb: gadget: eem: fix wrong eem header operation
usb: typec: intel_pmc_mux: Put ACPI device using acpi_dev_put()
usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
usb: typec: tcpm: Do not finish VDM AMS for retrying Responses
usb: fix various gadget panics on 10gbps cabling
usb: fix various gadgets null ptr deref on 10gbps cabling.
usb: pci-quirks: disable D3cold on xhci suspend for s2idle on AMD Renoir
usb: f_ncm: only first packet of aggregate needs to start timer
USB: f_ncm: ncm_bitrate (speed) is unsigned
MAINTAINERS: usb: add entry for isp1760
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small misc driver fixes for 5.13-rc6 that fix some
reported problems:
- Tiny phy driver fixes for reported issues
- rtsx regression for when the device suspended
- mhi driver fix for a use-after-free
All of these have been in linux-next for a few days with no reported
issues"
* tag 'char-misc-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG
bus: mhi: pci-generic: Fix hibernation
bus: mhi: pci_generic: Fix possible use-after-free in mhi_pci_remove()
bus: mhi: pci_generic: T99W175: update channel name from AT to DUN
phy: Sparx5 Eth SerDes: check return value after calling platform_get_resource()
phy: ralink: phy-mt7621-pci: drop 'of_match_ptr' to fix -Wunused-const-variable
phy: ti: Fix an error code in wiz_probe()
phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init()
phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
phy: usb: Fix misuse of IS_ENABLED
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Misc fixes:
- Fix performance regression caused by lack of intended batching of
RCU callbacks by over-eager NOHZ-full code.
- Fix cgroups related corruption of load_avg and load_sum metrics.
- Three fixes to fix blocked load, util_sum/runnable_sum and util_est
tracking bugs"
* tag 'sched-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling
sched/pelt: Ensure that *_sum is always synced with *_avg
tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed
sched/fair: Make sure to update tg contrib for blocked load
sched/fair: Keep load_avg and load_sum synced
|
|
Pull rdma fixes from Jason Gunthorpe:
"A mixture of small bug fixes and a small security issue:
- WARN_ON when IPoIB is automatically moved between namespaces
- Long standing bug where mlx5 would use the wrong page for the
doorbell recovery memory if fork is used
- Security fix for mlx4 that disables the timestamp feature
- Several crashers for mlx5
- Plug a recent mlx5 memory leak for the sig_mr"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/mlx5: Fix initializing CQ fragments buffer
RDMA/mlx5: Delete right entry from MR signature database
RDMA: Verify port when creating flow rule
RDMA/mlx5: Block FDB rules when not in switchdev mode
RDMA/mlx4: Do not map the core_clock page to user space unless enabled
RDMA/mlx5: Use different doorbell memory for different processes
RDMA/ipoib: Fix warning caused by destroying non-initial netns
|
|
When adding a hairpin flow, a firmware-side send queue is created for
the peer net device, which claims some host memory pages for its
internal ring buffer. If the peer net device is removed/unbound before
the hairpin flow is deleted, then the send queue is not destroyed which
leads to a stack trace on pci device remove:
[ 748.005230] mlx5_core 0000:08:00.2: wait_func:1094:(pid 12985): MANAGE_PAGES(0x108) timeout. Will cause a leak of a command resource
[ 748.005231] mlx5_core 0000:08:00.2: reclaim_pages:514:(pid 12985): failed reclaiming pages: err -110
[ 748.001835] mlx5_core 0000:08:00.2: mlx5_reclaim_root_pages:653:(pid 12985): failed reclaiming pages (-110) for func id 0x0
[ 748.002171] ------------[ cut here ]------------
[ 748.001177] FW pages counter is 4 after reclaiming all pages
[ 748.001186] WARNING: CPU: 1 PID: 12985 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:685 mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core] [ +0.002771] Modules linked in: cls_flower mlx5_ib mlx5_core ptp pps_core act_mirred sch_ingress openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay fuse [last unloaded: pps_core]
[ 748.007225] CPU: 1 PID: 12985 Comm: tee Not tainted 5.12.0+ #1
[ 748.001376] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 748.002315] RIP: 0010:mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core]
[ 748.001679] Code: 28 00 00 00 0f 85 22 01 00 00 48 81 c4 b0 00 00 00 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 c7 c7 40 cc 19 a1 e8 9f 71 0e e2 <0f> 0b e9 30 ff ff ff 48 c7 c7 a0 cc 19 a1 e8 8c 71 0e e2 0f 0b e9
[ 748.003781] RSP: 0018:ffff88815220faf8 EFLAGS: 00010286
[ 748.001149] RAX: 0000000000000000 RBX: ffff8881b4900280 RCX: 0000000000000000
[ 748.001445] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102a441f51
[ 748.001614] RBP: 00000000000032b9 R08: 0000000000000001 R09: ffffed1054a15ee8
[ 748.001446] R10: ffff8882a50af73b R11: ffffed1054a15ee7 R12: fffffbfff07c1e30
[ 748.001447] R13: dffffc0000000000 R14: ffff8881b492cba8 R15: 0000000000000000
[ 748.001429] FS: 00007f58bd08b580(0000) GS:ffff8882a5080000(0000) knlGS:0000000000000000
[ 748.001695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 748.001309] CR2: 000055a026351740 CR3: 00000001d3b48006 CR4: 0000000000370ea0
[ 748.001506] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 748.001483] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 748.001654] Call Trace:
[ 748.000576] ? mlx5_satisfy_startup_pages+0x290/0x290 [mlx5_core]
[ 748.001416] ? mlx5_cmd_teardown_hca+0xa2/0xd0 [mlx5_core]
[ 748.001354] ? mlx5_cmd_init_hca+0x280/0x280 [mlx5_core]
[ 748.001203] mlx5_function_teardown+0x30/0x60 [mlx5_core]
[ 748.001275] mlx5_uninit_one+0xa7/0xc0 [mlx5_core]
[ 748.001200] remove_one+0x5f/0xc0 [mlx5_core]
[ 748.001075] pci_device_remove+0x9f/0x1d0
[ 748.000833] device_release_driver_internal+0x1e0/0x490
[ 748.001207] unbind_store+0x19f/0x200
[ 748.000942] ? sysfs_file_ops+0x170/0x170
[ 748.001000] kernfs_fop_write_iter+0x2bc/0x450
[ 748.000970] new_sync_write+0x373/0x610
[ 748.001124] ? new_sync_read+0x600/0x600
[ 748.001057] ? lock_acquire+0x4d6/0x700
[ 748.000908] ? lockdep_hardirqs_on_prepare+0x400/0x400
[ 748.001126] ? fd_install+0x1c9/0x4d0
[ 748.000951] vfs_write+0x4d0/0x800
[ 748.000804] ksys_write+0xf9/0x1d0
[ 748.000868] ? __x64_sys_read+0xb0/0xb0
[ 748.000811] ? filp_open+0x50/0x50
[ 748.000919] ? syscall_enter_from_user_mode+0x1d/0x50
[ 748.001223] do_syscall_64+0x3f/0x80
[ 748.000892] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 748.001026] RIP: 0033:0x7f58bcfb22f7
[ 748.000944] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 748.003925] RSP: 002b:00007fffd7f2aaa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 748.001732] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f58bcfb22f7
[ 748.001426] RDX: 000000000000000d RSI: 00007fffd7f2abc0 RDI: 0000000000000003
[ 748.001746] RBP: 00007fffd7f2abc0 R08: 0000000000000000 R09: 0000000000000001
[ 748.001631] R10: 00000000000001b6 R11: 0000000000000246 R12: 000000000000000d
[ 748.001537] R13: 00005597ac2c24a0 R14: 000000000000000d R15: 00007f58bd084700
[ 748.001564] irq event stamp: 0
[ 748.000787] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 748.001399] hardirqs last disabled at (0): [<ffffffff813132cf>] copy_process+0x146f/0x5eb0
[ 748.001854] softirqs last enabled at (0): [<ffffffff8131330e>] copy_process+0x14ae/0x5eb0
[ 748.013431] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 748.001492] ---[ end trace a6fabd773d1c51ae ]---
Fix by destroying the send queue of a hairpin peer net device that is
being removed/unbound, which returns the allocated ring buffer pages to
the host.
Fixes: 4d8fcf216c90 ("net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
git://github.com/ojeda/linux
Pull compiler attribute update from Miguel Ojeda:
"A trivial update to the compiler attributes: Add 'continue' keyword to
documentation in comment (from Wei Ming Chen)"
* tag 'compiler-attributes-for-linus-v5.13-rc6' of git://github.com/ojeda/linux:
Compiler Attributes: Add continue in comment
|
|
Pull kvm fixes from Paolo Bonzini:
"Bugfixes, including a TLB flush fix that affects processors without
nested page tables"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: fix previous commit for 32-bit builds
kvm: avoid speculation-based attacks from out-of-range memslot accesses
KVM: x86: Unload MMU on guest TLB flush if TDP disabled to force MMU sync
KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message
selftests: kvm: Add support for customized slot0 memory size
KVM: selftests: introduce P47V64 for s390x
KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior
KVM: X86: MMU: Use the correct inherited permissions to get shadow page
KVM: LAPIC: Write 0 to TMICT should also cancel vmx-preemption timer
KVM: SVM: Fix SEV SEND_START session length & SEND_UPDATE_DATA query length after commit 238eca821cee
|
|
aspm (Active State Power Management)
rtsx_comm_set_aspm: this function is for driver to make sure
not enter power saving when processing of init and card_detcct
ASPM_MODE_CFG: 8411 5209 5227 5229 5249 5250
Change back to use original way to control aspm
ASPM_MODE_REG: 5227A 524A 5250A 5260 5261 5228
Keep the new way to control aspm
Fixes: 121e9c6b5c4c ("misc: rtsx: modify and fix init_hw function")
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Tested-by: Gordon Lack <gordon.lack@dsl.pipex.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Ricky Wu <ricky_wu@realtek.com>
Link: https://lore.kernel.org/r/20210607101634.4948-1-ricky_wu@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
array_index_nospec does not work for uint64_t on 32-bit builds.
However, the size of a memory slot must be less than 20 bits wide
on those system, since the memory slot must fit in the user
address space. So just store it in an unsigned long.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM's mechanism for accessing guest memory translates a guest physical
address (gpa) to a host virtual address using the right-shifted gpa
(also known as gfn) and a struct kvm_memory_slot. The translation is
performed in __gfn_to_hva_memslot using the following formula:
hva = slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE
It is expected that gfn falls within the boundaries of the guest's
physical memory. However, a guest can access invalid physical addresses
in such a way that the gfn is invalid.
__gfn_to_hva_memslot is called from kvm_vcpu_gfn_to_hva_prot, which first
retrieves a memslot through __gfn_to_memslot. While __gfn_to_memslot
does check that the gfn falls within the boundaries of the guest's
physical memory or not, a CPU can speculate the result of the check and
continue execution speculatively using an illegal gfn. The speculation
can result in calculating an out-of-bounds hva. If the resulting host
virtual address is used to load another guest physical address, this
is effectively a Spectre gadget consisting of two consecutive reads,
the second of which is data dependent on the first.
Right now it's not clear if there are any cases in which this is
exploitable. One interesting case was reported by the original author
of this patch, and involves visiting guest page tables on x86. Right
now these are not vulnerable because the hva read goes through get_user(),
which contains an LFENCE speculation barrier. However, there are
patches in progress for x86 uaccess.h to mask kernel addresses instead of
using LFENCE; once these land, a guest could use speculation to read
from the VMM's ring 3 address space. Other architectures such as ARM
already use the address masking method, and would be susceptible to
this same kind of data-dependent access gadgets. Therefore, this patch
proactively protects from these attacks by masking out-of-bounds gfns
in __gfn_to_hva_memslot, which blocks speculation of invalid hvas.
Sean Christopherson noted that this patch does not cover
kvm_read_guest_offset_cached. This however is limited to a few bytes
past the end of the cache, and therefore it is unlikely to be useful in
the context of building a chain of data dependent accesses.
Reported-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Co-developed-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A collection of fixes for the regulator API that have come up since
the merge window, including a big batch of fixes from Axel Lin's usual
careful and detailed review.
The one stand out fix here is Dmitry Baryshkov's fix for an issue
where we fail to power on the parents of always on regulators during
system startup if they weren't already powered on"
* tag 'regulator-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (21 commits)
regulator: rt4801: Fix NULL pointer dereference if priv->enable_gpios is NULL
regulator: hi6421v600: Fix .vsel_mask setting
regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837
regulator: atc260x: Fix n_voltages and min_sel for pickable linear ranges
regulator: rtmv20: Fix to make regcache value first reading back from HW
regulator: mt6315: Fix function prototype for mt6315_map_mode
regulator: rtmv20: Add Richtek to Kconfig text
regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks
regulator: hisilicon: use the correct HiSilicon copyright
regulator: bd71828: Fix .n_voltages settings
regulator: bd70528: Fix off-by-one for buck123 .n_voltages setting
regulator: max77620: Silence deferred probe error
regulator: max77620: Use device_set_of_node_from_dev()
regulator: scmi: Fix off-by-one for linear regulators .n_voltages setting
regulator: core: resolve supply for boot-on/always-on regulators
regulator: fixed: Ensure enable_counter is correct if reg_domain_disable fails
regulator: Check ramp_delay_table for regulator_set_ramp_delay_regmap
regulator: fan53880: Fix missing n_voltages setting
regulator: da9121: Return REGULATOR_MODE_INVALID for invalid mode
regulator: fan53555: fix TCS4525 voltage calulation
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"A set of fixes that have been coming in over the last few weeks, the
usual mix of fixes:
- DT fixups for TI K3
- SATA drive detection fix for TI DRA7
- Power management fixes and a few build warning removals for OMAP
- OP-TEE fix to use standard API for UUID exporting
- DT fixes for a handful of i.MX boards
And a few other smaller items"
* tag 'arm-soc-fixes-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits)
arm64: meson: select COMMON_CLK
soc: amlogic: meson-clk-measure: remove redundant dev_err call in meson_msr_probe()
ARM: OMAP1: ams-delta: remove unused function ams_delta_camera_power
bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act
ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells
ARM: dts: imx7d-pico: Fix the 'tuning-step' property
ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property
arm64: dts: freescale: sl28: var1: fix RGMII clock and voltage
arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage
ARM: imx: pm-imx27: Include "common.h"
arm64: dts: zii-ultra: fix 12V_MAIN voltage
arm64: dts: zii-ultra: remove second GEN_3V3 regulator instance
arm64: dts: ls1028a: fix memory node
bus: ti-sysc: Fix am335x resume hang for usb otg module
ARM: OMAP2+: Fix build warning when mmc_omap is not built
ARM: OMAP1: isp1301-omap: Add missing gpiod_add_lookup_table function
ARM: OMAP1: Fix use of possibly uninitialized irq variable
optee: use export_uuid() to copy client UUID
arm64: dts: ti: k3*: Introduce reg definition for interrupt routers
arm64: dts: ti: k3-am65|j721e|am64: Map the dma / navigator subsystem via explicit ranges
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes
PM and build warning fixes for omaps
While chasing system suspend related regressions, I noticed few other
issues related to PM would be good to have fixed:
- UART idling does not always work for hardware autoidle features
- am335x resume works only the first time unless musb module is loaded
Then there are three patches for omap1 related warnings caused by the gpio
changes, and one build warning fix for legacy mmc platform code when mmc
is built as a loadable module.
These can all be merged whenever suitable naturally. I've sent the more
urgent SATA regression fix separately although it appears in this pull
request too because of the branches merged.
* tag 'omap-for-v5.13/fixes-pm' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP1: ams-delta: remove unused function ams_delta_camera_power
bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act
bus: ti-sysc: Fix am335x resume hang for usb otg module
ARM: OMAP2+: Fix build warning when mmc_omap is not built
ARM: OMAP1: isp1301-omap: Add missing gpiod_add_lookup_table function
ARM: OMAP1: Fix use of possibly uninitialized irq variable
Link: https://lore.kernel.org/r/pull-1622614772-543196@atomide.com
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Merge misc fixes from Andrew Morton:
"13 patches.
Subsystems affected by this patch series: mips, mm (kfence, debug,
pagealloc, memory-hotplug, hugetlb, kasan, and hugetlb), init, proc,
lib, ocfs2, and mailmap"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mailmap: use private address for Michel Lespinasse
ocfs2: fix data corruption by fallocate
lib: crc64: fix kernel-doc warning
mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY
mm/kasan/init.c: fix doc warning
proc: add .gitignore for proc-subset-pid selftest
hugetlb: pass head page to remove_hugetlb_page()
drivers/base/memory: fix trying offlining memory blocks with memory holes on aarch64
mm/page_alloc: fix counting of free pages after take off from buddy
mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests()
pid: take a reference when initializing `cad_pid`
kfence: use TASK_IDLE when awaiting allocation
Revert "MIPS: make userspace mapping young by default"
|
|
This reverts commit f685a533a7fab35c5d069dcd663f59c8e4171a75.
The MIPS cache flush logic needs to know whether the mapping was already
established to decide how to flush caches. This is done by checking the
valid bit in the PTE. The commit above breaks this logic by setting the
valid in the PTE in new mappings, which causes kernel crashes.
Link: https://lkml.kernel.org/r/20210526094335.92948-1-tsbogend@alpha.franken.de
Fixes: f685a533a7f ("MIPS: make userspace mapping young by default")
Reported-by: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Huang Pei <huangpei@loongson.cn>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from bpf, wireless, netfilter and
wireguard trees.
The bpf vs lockdown+audit fix is the most notable.
Things haven't slowed down just yet, both in terms of regressions in
current release and largish fixes for older code, but we usually see a
slowdown only after -rc5.
Current release - regressions:
- virtio-net: fix page faults and crashes when XDP is enabled
- mlx5e: fix HW timestamping with CQE compression, and make sure they
are only allowed to coexist with capable devices
- stmmac:
- fix kernel panic due to NULL pointer dereference of
mdio_bus_data
- fix double clk unprepare when no PHY device is connected
Current release - new code bugs:
- mt76: a few fixes for the recent MT7921 devices and runtime power
management
Previous releases - regressions:
- ice:
- track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx
- fix allowing VF to request more/less queues via virtchnl
- correct supported and advertised autoneg by using PHY
capabilities
- allow all LLDP packets from PF to Tx
- kbuild: quote OBJCOPY var to avoid a pahole call break the build
Previous releases - always broken:
- bpf, lockdown, audit: fix buggy SELinux lockdown permission checks
- mt76: address the recent FragAttack vulnerabilities not covered by
generic fixes
- ipv6: fix KASAN: slab-out-of-bounds Read in
fib6_nh_flush_exceptions
- Bluetooth:
- fix the erroneous flush_work() order, to avoid double free
- use correct lock to prevent UAF of hdev object
- nfc: fix NULL ptr dereference in llcp_sock_getname() after failed
connect
- ieee802154: multiple fixes to error checking and return values
- igb: fix XDP with PTP enabled
- intel: add correct exception tracing for XDP
- tls: fix use-after-free when TLS offload device goes down and back
up
- ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
- netfilter: nft_ct: skip expectations for confirmed conntrack
- mptcp: fix falling back to TCP in presence of out of order packets
early in connection lifetime
- wireguard: switch from O(n) to a O(1) algorithm for maintaining
peers, fixing stalls and a large memory leak in the process
Misc:
- devlink: correct VIRTUAL port to not have phys_port attributes
- Bluetooth: fix VIRTIO_ID_BT assigned number
- net: return the correct errno code ENOBUF -> ENOMEM
- wireguard:
- peer: allocate in kmem_cache saving 25% on peer memory
- do not use -O3"
* tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
cxgb4: avoid link re-train during TC-MQPRIO configuration
sch_htb: fix refcount leak in htb_parent_to_leaf_offload
wireguard: allowedips: free empty intermediate nodes when removing single node
wireguard: allowedips: allocate nodes in kmem_cache
wireguard: allowedips: remove nodes in O(1)
wireguard: allowedips: initialize list head in selftest
wireguard: peer: allocate in kmem_cache
wireguard: use synchronize_net rather than synchronize_rcu
wireguard: do not use -O3
wireguard: selftests: make sure rp_filter is disabled on vethc
wireguard: selftests: remove old conntrack kconfig value
virtchnl: Add missing padding to virtchnl_proto_hdrs
ice: Allow all LLDP packets from PF to Tx
ice: report supported and advertised autoneg using PHY capabilities
ice: handle the VF VSI rebuild failure
ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared
ice: Fix allowing VF to request more/less queues via virtchnl
virtio-net: fix for skb_over_panic inside big mode
ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions
fib: Return the correct errno code
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Fix MSIs for platforms with "msi-map" device-tree property, which we
broke in v5.13-rc1 (Jean-Philippe Brucker)
- Add Krzysztof Wilczyński as PCI reviewer (Lorenzo Pieralisi)
* tag 'pci-v5.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/MSI: Fix MSIs for generic hosts that use device-tree's "msi-map"
MAINTAINERS: Add Krzysztof as PCI host/endpoint controllers reviewer
|
|
Pull drm fixes from Dave Airlie:
"Two big regression reverts in here, one for fbdev and one i915.
Otherwise it's mostly amdgpu display fixes, and tegra fixes.
fb:
- revert broken fb_defio patch
amdgpu:
- Display fixes
- FRU EEPROM error handling fix
- RAS fix
- PSP fix
- Releasing pinned BO fix
i915:
- Revert conversion to io_mapping_map_user() which lead to BUG_ON()
- Fix check for error valued returns in a selftest
tegra:
- SOR power domain race condition fix
- build warning fix
- runtime pm ref leak fix
- modifier fix"
* tag 'drm-fixes-2021-06-04-1' of git://anongit.freedesktop.org/drm/drm:
amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomic
drm/amdgpu: make sure we unpin the UVD BO
drm/amd/amdgpu:save psp ring wptr to avoid attack
drm/amd/display: Fix potential memory leak in DMUB hw_init
drm/amdgpu: Don't query CE and UE errors
drm/amd/display: Fix overlay validation by considering cursors
drm/amdgpu: refine amdgpu_fru_get_product_info
drm/amdgpu: add judgement for dc support
drm/amd/display: Fix GPU scaling regression by FS video support
drm/amd/display: Allow bandwidth validation for 0 streams.
Revert "i915: use io_mapping_map_user"
drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest()
Revert "fb_defio: Remove custom address_space_operations"
drm/tegra: Correct DRM_FORMAT_MOD_NVIDIA_SECTOR_LAYOUT
drm/tegra: sor: Fix AUX device reference leak
drm/tegra: Get ref for DP AUX channel, not its ddc adapter
drm/tegra: Fix shift overflow in tegra_shared_plane_atomic_update
drm/tegra: sor: Fully initialize SOR before registration
gpu: host1x: Split up client initalization and registration
drm/tegra: sor: Do not leak runtime PM reference
|
|
On m68k (Coldfire M547x):
CC drivers/net/ethernet/intel/i40e/i40e_main.o
In file included from drivers/net/ethernet/intel/i40e/i40e_prototype.h:9,
from drivers/net/ethernet/intel/i40e/i40e.h:41,
from drivers/net/ethernet/intel/i40e/i40e_main.c:12:
include/linux/avf/virtchnl.h:153:36: warning: division by zero [-Wdiv-by-zero]
153 | { virtchnl_static_assert_##X = (n)/((sizeof(struct X) == (n)) ? 1 : 0) }
| ^
include/linux/avf/virtchnl.h:844:1: note: in expansion of macro ‘VIRTCHNL_CHECK_STRUCT_LEN’
844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/avf/virtchnl.h:844:33: error: enumerator value for ‘virtchnl_static_assert_virtchnl_proto_hdrs’ is not an integer constant
844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs);
| ^~~~~~~~~~~~~~~~~~~
On m68k, integers are aligned on addresses that are multiples of two,
not four, bytes. Hence the size of a structure containing integers may
not be divisible by 4.
Fix this by adding explicit padding.
Fixes: 1f7ea1cd6a374842 ("ice: Enable FDIR Configure for AVF")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When receiving Alert Message, if it is not unexpected but is
unsupported for some reason, the port should return Not_Supported
Message response.
Also, according to PD3.0 Spec 6.5.2.1.4 Event Flags Field, the
OTP/OVP/OCP flags in the Event Flags field in Status Message no longer
require Get_PPS_Status Message to clear them. Thus remove it when
receiving Status Message with those flags being set.
In addition, add the missing AMS operations for Status Message.
Fixes: 64f7c494a3c0 ("typec: tcpm: Add support for sink PPS related messages")
Fixes: 0908c5aca31e ("usb: typec: tcpm: AMS and Collision Avoidance")
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210531164928.2368606-1-kyletso@google.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
ssh://git.freedesktop.org/git/tegra/linux into drm-fixes
drm/tegra: Fixes for v5.13-rc5
The most important change here fixes a race condition that causes either
HDA or (more frequently) display to malfunction because they race for
enabling the SOR power domain at probe time.
Other than that, there's a couple of build warnings for issues
introduced in v5.13 as well as some minor fixes, such as reference leak
plugs.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603144624.788861-1-thierry.reding@gmail.com
|
|
Currently when mlx4 maps the hca_core_clock page to the user space there
are read-modifiable registers, one of which is semaphore, on this page as
well as the clock counter. If user reads the wrong offset, it can modify
the semaphore and hang the device.
Do not map the hca_core_clock page to the user space unless the device has
been put in a backwards compatibility mode to support this feature.
After this patch, mlx4 core_clock won't be mapped to user space on the
majority of existing devices and the uverbs device time feature in
ibv_query_rt_values_ex() will be disabled.
Fixes: 52033cfb5aab ("IB/mlx4: Add mmap call to map the hardware clock")
Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The util_est internal UTIL_AVG_UNCHANGED flag which is used to prevent
unnecessary util_est updates uses the LSB of util_est.enqueued. It is
exposed via _task_util_est() (and task_util_est()).
Commit 92a801e5d5b7 ("sched/fair: Mask UTIL_AVG_UNCHANGED usages")
mentions that the LSB is lost for util_est resolution but
find_energy_efficient_cpu() checks if task_util_est() returns 0 to
return prev_cpu early.
_task_util_est() returns the max value of util_est.ewma and
util_est.enqueued or'ed w/ UTIL_AVG_UNCHANGED.
So task_util_est() returning the max of task_util() and
_task_util_est() will never return 0 under the default
SCHED_FEAT(UTIL_EST, true).
To fix this use the MSB of util_est.enqueued instead and keep the flag
util_est internal, i.e. don't export it via _task_util_est().
The maximal possible util_avg value for a task is 1024 so the MSB of
'unsigned int util_est.enqueued' isn't used to store a util value.
As a caveat the code behind the util_est_se trace point has to filter
UTIL_AVG_UNCHANGED to see the real util_est.enqueued value which should
be easy to do.
This also fixes an issue report by Xuewen Yan that util_est_update()
only used UTIL_AVG_UNCHANGED for the subtrahend of the equation:
last_enqueued_diff = ue.enqueued - (task_util() | UTIL_AVG_UNCHANGED)
Fixes: b89997aa88f0b sched/pelt: Fix task util_est update filtering
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-by: Vincent Donnefort <vincent.donnefort@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20210602145808.1562603-1-dietmar.eggemann@arm.com
|
|
Current timer PD_T_SINK_WAIT_CAP is set to 240ms which will violate the
SinkWaitCapTimer (tTypeCSinkWaitCap 310 - 620 ms) defined in the PD
Spec if the port is faster enough when running the state machine. Set it
to the lower bound 310ms to ensure the timeout is in Spec.
Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210528081613.730661-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- memory leak fix in usbhid from Anirudh Rayabharam
- additions for a few new recognized generic key IDs from Dmitry
Torokhov
- Asus T101HA and Dell K15A quirks from Hans de Goede
- memory leak fix in amd_sfh from Basavaraj Natikar
- Win8 compatibility and Stylus fixes in multitouch driver from
Ahelenia Ziemiańska
- NULL pointer dereference fix in hid-magicmouse from Johan Hovold
- assorted other small fixes and device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (33 commits)
HID: asus: Cleanup Asus T101HA keyboard-dock handling
HID: magicmouse: fix NULL-deref on disconnect
HID: intel-ish-hid: ipc: Add Alder Lake device IDs
HID: i2c-hid: fix format string mismatch
HID: amd_sfh: Fix memory leak in amd_sfh_work
HID: amd_sfh: Use devm_kzalloc() instead of kzalloc()
HID: ft260: improve error handling of ft260_hid_feature_report_get()
HID: magicmouse: fix crash when disconnecting Magic Trackpad 2
HID: gt683r: add missing MODULE_DEVICE_TABLE
HID: pidff: fix error return code in hid_pidff_init()
HID: logitech-hidpp: initialize level variable
HID: multitouch: Disable event reporting on suspend on the Asus T101HA touchpad
HID: core: Remove extraneous empty line before EXPORT_SYMBOL_GPL(hid_check_keys_pressed)
HID: hid-sensor-custom: Process failure of sensor_hub_set_feature()
HID: i2c-hid: Skip ELAN power-on command after reset
HID: usbhid: fix info leak in hid_submit_ctrl
HID: Add BUS_VIRTUAL to hid_connect logging
HID: multitouch: set Stylus suffix for Stylus-application devices, too
HID: multitouch: require Finger field to mark Win8 reports as MT
HID: remove the unnecessary redefinition of a macro
...
|
|
Flow table that contains flow pointing to multiple flow tables or multiple
TIRs must have a level lower than 64. In our case it applies to muli-
destination flow table.
Fix the level of the created table to comply with HW Spec definitions, and
still make sure that its level lower than SW-owned tables, so that it
would be possible to point from the multi-destination FW table to SW
tables.
Fixes: 34583beea4b7 ("net/mlx5: DR, Create multi-destination table for SW-steering use")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Commit ccf953d8f3d6 makes framebuffers which use deferred I/O stop
displaying updates after the first one. This is because the pages
handled by fb_defio no longer have a page_mapping(). That prevents
page_mkclean() from marking the PTEs as clean, and so writes are only
noticed the first time.
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/YLZEhv0cpZp8uVE3@casper.infradead.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fixes from Jan Kara:
"A fix for permission checking with fanotify unpriviledged groups.
Also there's a small update in MAINTAINERS file for fanotify"
* tag 'fsnotify_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: fix permission model of unprivileged group
MAINTAINERS: Add Matthew Bobrowski as a reviewer
|
|
Checking for and processing RCU-nocb deferred wakeup upon user/guest
entry is only relevant when nohz_full runs on the local CPU, otherwise
the periodic tick should take care of it.
Make sure we don't needlessly pollute these fast-paths as a -3%
performance regression on a will-it-scale.per_process_ops has been
reported so far.
Fixes: 47b8ff194c1f (entry: Explicitly flush pending rcuog wakeup before last rescheduling point)
Fixes: 4ae7dc97f726 (entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point)
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210527113441.465489-1-frederic@kernel.org
|