<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pm24.git/mm, branch v5.17</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.</subtitle>
<id>https://git.kobert.dev/pm24.git/atom/mm?h=v5.17</id>
<link rel='self' href='https://git.kobert.dev/pm24.git/atom/mm?h=v5.17'/>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/'/>
<updated>2022-03-17T18:02:13Z</updated>
<entry>
<title>mm: swap: get rid of livelock in swapin readahead</title>
<updated>2022-03-17T18:02:13Z</updated>
<author>
<name>Guo Ziliang</name>
<email>guo.ziliang@zte.com.cn</email>
</author>
<published>2022-03-16T23:15:03Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=029c4628b2eb2ca969e9bf979b05dc18d8d5575e'/>
<id>urn:sha1:029c4628b2eb2ca969e9bf979b05dc18d8d5575e</id>
<content type='text'>
In our testing, a livelock task was found.  Through sysrq printing, same
stack was found every time, as follows:

  __swap_duplicate+0x58/0x1a0
  swapcache_prepare+0x24/0x30
  __read_swap_cache_async+0xac/0x220
  read_swap_cache_async+0x58/0xa0
  swapin_readahead+0x24c/0x628
  do_swap_page+0x374/0x8a0
  __handle_mm_fault+0x598/0xd60
  handle_mm_fault+0x114/0x200
  do_page_fault+0x148/0x4d0
  do_translation_fault+0xb0/0xd4
  do_mem_abort+0x50/0xb0

The reason for the livelock is that swapcache_prepare() always returns
EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it
cannot jump out of the loop.  We suspect that the task that clears the
SWAP_HAS_CACHE flag never gets a chance to run.  We try to lower the
priority of the task stuck in a livelock so that the task that clears
the SWAP_HAS_CACHE flag will run.  The results show that the system
returns to normal after the priority is lowered.

In our testing, multiple real-time tasks are bound to the same core, and
the task in the livelock is the highest priority task of the core, so
the livelocked task cannot be preempted.

Although cond_resched() is used by __read_swap_cache_async, it is an
empty function in the preemptive system and cannot achieve the purpose
of releasing the CPU.  A high-priority task cannot release the CPU
unless preempted by a higher-priority task.  But when this task is
already the highest priority task on this core, other tasks will not be
able to be scheduled.  So we think we should replace cond_resched() with
schedule_timeout_uninterruptible(1), schedule_timeout_interruptible will
call set_current_state first to set the task state, so the task will be
removed from the running queue, so as to achieve the purpose of giving
up the CPU and prevent it from running in kernel mode for too long.

(akpm: ugly hack becomes uglier.  But it fixes the issue in a
backportable-to-stable fashion while we hopefully work on something
better)

Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
Signed-off-by: Guo Ziliang &lt;guo.ziliang@zte.com.cn&gt;
Reported-by: Zeal Robot &lt;zealci@zte.com.cn&gt;
Reviewed-by: Ran Xiaokai &lt;ran.xiaokai@zte.com.cn&gt;
Reviewed-by: Jiang Xuexin &lt;jiang.xuexin@zte.com.cn&gt;
Reviewed-by: Yang Yang &lt;yang.yang29@zte.com.cn&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Roger Quadros &lt;rogerq@kernel.org&gt;
Cc: Ziliang Guo &lt;guo.ziliang@zte.com.cn&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: gup: make fault_in_safe_writeable() use fixup_user_fault()</title>
<updated>2022-03-10T18:48:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-03-08T19:55:48Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=fe673d3f5bf1fc50cdc4b754831db91a2ec10126'/>
<id>urn:sha1:fe673d3f5bf1fc50cdc4b754831db91a2ec10126</id>
<content type='text'>
Instead of using GUP, make fault_in_safe_writeable() actually force a
'handle_mm_fault()' using the same fixup_user_fault() machinery that
futexes already use.

Using the GUP machinery meant that fault_in_safe_writeable() did not do
everything that a real fault would do, ranging from not auto-expanding
the stack segment, to not updating accessed or dirty flags in the page
tables (GUP sets those flags on the pages themselves).

The latter causes problems on architectures (like s390) that do accessed
bit handling in software, which meant that fault_in_safe_writeable()
didn't actually do all the fault handling it needed to, and trying to
access the user address afterwards would still cause faults.

Reported-and-tested-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Fixes: cdd591fc86e3 ("iov_iter: Introduce fault_in_iov_iter_writeable")
Link: https://lore.kernel.org/all/CAHc6FU5nP+nziNGG0JAF1FUx-GV7kKFvM7aZuU_XD2_1v4vnvg@mail.gmail.com/
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memfd: fix F_SEAL_WRITE after shmem huge page allocated</title>
<updated>2022-03-05T19:08:32Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2022-03-05T04:29:01Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=f2b277c4d1c63a85127e8aa2588e9cc3bd21cb99'/>
<id>urn:sha1:f2b277c4d1c63a85127e8aa2588e9cc3bd21cb99</id>
<content type='text'>
Wangyong reports: after enabling tmpfs filesystem to support transparent
hugepage with the following command:

  echo always &gt; /sys/kernel/mm/transparent_hugepage/shmem_enabled

the docker program tries to add F_SEAL_WRITE through the following
command, but it fails unexpectedly with errno EBUSY:

  fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1.

That is because memfd_tag_pins() and memfd_wait_for_pins() were never
updated for shmem huge pages: checking page_mapcount() against
page_count() is hopeless on THP subpages - they need to check
total_mapcount() against page_count() on THP heads only.

Make memfd_tag_pins() (compared &gt; 1) as strict as memfd_wait_for_pins()
(compared != 1): either can be justified, but given the non-atomic
total_mapcount() calculation, it is better now to be strict.  Bear in
mind that total_mapcount() itself scans all of the THP subpages, when
choosing to take an XA_CHECK_SCHED latency break.

Also fix the unlikely xa_is_value() case in memfd_wait_for_pins(): if a
page has been swapped out since memfd_tag_pins(), then its refcount must
have fallen, and so it can safely be untagged.

Link: https://lkml.kernel.org/r/a4f79248-df75-2c8c-3df-ba3317ccb5da@google.com
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Reported-by: Zeal Robot &lt;zealci@zte.com.cn&gt;
Reported-by: wangyong &lt;wang.yong12@zte.com.cn&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: CGEL ZTE &lt;cgel.zte@gmail.com&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Yang Yang &lt;yang.yang29@zte.com.cn&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix use-after-free when anon vma name is used after vma is freed</title>
<updated>2022-03-05T19:08:32Z</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-03-05T04:28:58Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=942341dcc5748d9c1fc7009a359fc1916bfe0ef0'/>
<id>urn:sha1:942341dcc5748d9c1fc7009a359fc1916bfe0ef0</id>
<content type='text'>
When adjacent vmas are being merged it can result in the vma that was
originally passed to madvise_update_vma being destroyed.  In the current
implementation, the name parameter passed to madvise_update_vma points
directly to vma-&gt;anon_name and it is used after the call to vma_merge.
In the cases when vma_merge merges the original vma and destroys it,
this might result in UAF.  For that the original vma would have to hold
the anon_vma_name with the last reference.  The following vma would need
to contain a different anon_vma_name object with the same string.  Such
scenario is shown below:

madvise_vma_behavior(vma)
  madvise_update_vma(vma, ..., anon_name == vma-&gt;anon_name)
    vma_merge(vma)
      __vma_adjust(vma) &lt;-- merges vma with adjacent one
        vm_area_free(vma) &lt;-- frees the original vma
    replace_vma_anon_name(anon_name) &lt;-- UAF of vma-&gt;anon_name

Fix this by raising the name refcount and stabilizing it.

Link: https://lkml.kernel.org/r/20220224231834.1481408-3-surenb@google.com
Link: https://lkml.kernel.org/r/20220223153613.835563-3-surenb@google.com
Fixes: 9a10064f5625 ("mm: add a field to store names for private anonymous memory")
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reported-by: syzbot+aa7b3d4b35f9dc46a366@syzkaller.appspotmail.com
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Chris Hyser &lt;chris.hyser@oracle.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Colin Cross &lt;ccross@google.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Peter Collingbourne &lt;pcc@google.com&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xiaofeng Cao &lt;caoxiaofeng@yulong.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: prevent vm_area_struct::anon_name refcount saturation</title>
<updated>2022-03-05T19:08:32Z</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-03-05T04:28:55Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=96403e11283def1d1c465c8279514c9a504d8630'/>
<id>urn:sha1:96403e11283def1d1c465c8279514c9a504d8630</id>
<content type='text'>
A deep process chain with many vmas could grow really high.  With
default sysctl_max_map_count (64k) and default pid_max (32k) the max
number of vmas in the system is 2147450880 and the refcounter has
headroom of 1073774592 before it reaches REFCOUNT_SATURATED
(3221225472).

Therefore it's unlikely that an anonymous name refcounter will overflow
with these defaults.  Currently the max for pid_max is PID_MAX_LIMIT
(4194304) and for sysctl_max_map_count it's INT_MAX (2147483647).  In
this configuration anon_vma_name refcount overflow becomes theoretically
possible (that still require heavy sharing of that anon_vma_name between
processes).

kref refcounting interface used in anon_vma_name structure will detect a
counter overflow when it reaches REFCOUNT_SATURATED value but will only
generate a warning and freeze the ref counter.  This would lead to the
refcounted object never being freed.  A determined attacker could leak
memory like that but it would be rather expensive and inefficient way to
do so.

To ensure anon_vma_name refcount does not overflow, stop anon_vma_name
sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still
leaves INT_MAX/2 (1073741823) values before the counter reaches
REFCOUNT_SATURATED.  This should provide enough headroom for raising the
refcounts temporarily.

Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Chris Hyser &lt;chris.hyser@oracle.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Colin Cross &lt;ccross@google.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Peter Collingbourne &lt;pcc@google.com&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xiaofeng Cao &lt;caoxiaofeng@yulong.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: refactor vm_area_struct::anon_vma_name usage code</title>
<updated>2022-03-05T19:08:32Z</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-03-05T04:28:51Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=5c26f6ac9416b63d093e29c30e79b3297e425472'/>
<id>urn:sha1:5c26f6ac9416b63d093e29c30e79b3297e425472</id>
<content type='text'>
Avoid mixing strings and their anon_vma_name referenced pointers by
using struct anon_vma_name whenever possible.  This simplifies the code
and allows easier sharing of anon_vma_name structures when they
represent the same name.

[surenb@google.com: fix comment]

Link: https://lkml.kernel.org/r/20220223153613.835563-1-surenb@google.com
Link: https://lkml.kernel.org/r/20220224231834.1481408-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Suggested-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Colin Cross &lt;ccross@google.com&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Cc: Chris Hyser &lt;chris.hyser@oracle.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Peter Collingbourne &lt;pcc@google.com&gt;
Cc: Xiaofeng Cao &lt;caoxiaofeng@yulong.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls</title>
<updated>2022-03-04T18:00:37Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2022-03-04T14:26:32Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=0708a0afe291bdfe1386d74d5ec1f0c27e8b9168'/>
<id>urn:sha1:0708a0afe291bdfe1386d74d5ec1f0c27e8b9168</id>
<content type='text'>
syzkaller was recently triggering an oversized kvmalloc() warning via
xdp_umem_create().

The triggered warning was added back in 7661809d493b ("mm: don't allow
oversized kvmalloc() calls"). The rationale for the warning for huge
kvmalloc sizes was as a reaction to a security bug where the size was
more than UINT_MAX but not everything was prepared to handle unsigned
long sizes.

Anyway, the AF_XDP related call trace from this syzkaller report was:

  kvmalloc include/linux/mm.h:806 [inline]
  kvmalloc_array include/linux/mm.h:824 [inline]
  kvcalloc include/linux/mm.h:829 [inline]
  xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
  xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
  xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
  xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
  __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
  __do_sys_setsockopt net/socket.c:2187 [inline]
  __se_sys_setsockopt net/socket.c:2184 [inline]
  __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Björn mentioned that requests for &gt;2GB allocation can still be valid:

  The structure that is being allocated is the page-pinning accounting.
  AF_XDP has an internal limit of U32_MAX pages, which is *a lot*, but
  still fewer than what memcg allows (PAGE_COUNTER_MAX is a LONG_MAX/
  PAGE_SIZE on 64 bit systems). [...]

  I could just change from U32_MAX to INT_MAX, but as I stated earlier
  that has a hacky feeling to it. [...] From my perspective, the code
  isn't broken, with the memcg limits in consideration. [...]

Linus says:

  [...] Pretty much every time this has come up, the kernel warning has
  shown that yes, the code was broken and there really wasn't a reason
  for doing allocations that big.

  Of course, some people would be perfectly fine with the allocation
  failing, they just don't want the warning. I didn't want __GFP_NOWARN
  to shut it up originally because I wanted people to see all those
  cases, but these days I think we can just say "yeah, people can shut
  it up explicitly by saying 'go ahead and fail this allocation, don't
  warn about it'".

  So enough time has passed that by now I'd certainly be ok with [it].

Thus allow call-sites to silence such userspace triggered splats if the
allocation requests have __GFP_NOWARN. For xdp_umem_pin_pages()'s call
to kvcalloc() this is already the case, so nothing else needed there.

Fixes: 7661809d493b ("mm: don't allow oversized kvmalloc() calls")
Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Tested-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com
Cc: Björn Töpel &lt;bjorn@kernel.org&gt;
Cc: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Link: https://lore.kernel.org/bpf/CAJ+HfNhyfsT5cS_U9EC213ducHs9k9zNxX9+abqC0kTrPbQ0gg@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org
Reviewed-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Ackd-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock</title>
<updated>2022-02-26T20:00:44Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-02-26T20:00:44Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=e41898d2ba51ef2e8e81fb905c1eaa958aec830a'/>
<id>urn:sha1:e41898d2ba51ef2e8e81fb905c1eaa958aec830a</id>
<content type='text'>
Pull memblock fix from Mike Rapoport:
 "Use kfree() to release kmalloced memblock regions

  memblock.{reserved,memory}.regions may be allocated using kmalloc()
  in memblock_double_array(). Use kfree() to release these kmalloced
  regions"

* tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: use kfree() to release kmalloced memblock regions
</content>
</entry>
<entry>
<title>mm: fix use-after-free bug when mm-&gt;mmap is reused after being freed</title>
<updated>2022-02-26T17:51:17Z</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-02-26T03:11:05Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=f798a1d4f94de9510e060d37b9b47721065a957c'/>
<id>urn:sha1:f798a1d4f94de9510e060d37b9b47721065a957c</id>
<content type='text'>
oom reaping (__oom_reap_task_mm) relies on a 2 way synchronization with
exit_mmap.  First it relies on the mmap_lock to exclude from unlock
path[1], page tables tear down (free_pgtables) and vma destruction.
This alone is not sufficient because mm-&gt;mmap is never reset.

For historical reasons[2] the lock is taken there is also MMF_OOM_SKIP
set for oom victims before.

The oom reaper only ever looks at oom victims so the whole scheme works
properly but process_mrelease can opearate on any task (with fatal
signals pending) which doesn't really imply oom victims.  That means
that the MMF_OOM_SKIP part of the synchronization doesn't work and it
can see a task after the whole address space has been demolished and
traverse an already released mm-&gt;mmap list.  This leads to use after
free as properly caught up by KASAN report.

Fix the issue by reseting mm-&gt;mmap so that MMF_OOM_SKIP synchronization
is not needed anymore.  The MMF_OOM_SKIP is not removed from exit_mmap
yet but it acts mostly as an optimization now.

[1] 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
[2] 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")

[mhocko@suse.com: changelog rewrite]

Link: https://lore.kernel.org/all/00000000000072ef2c05d7f81950@google.com/
Link: https://lkml.kernel.org/r/20220215201922.1908156-1-surenb@google.com
Fixes: 64591e8605d6 ("mm: protect free_pgtables with mmap_lock write lock in exit_mmap")
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reported-by: syzbot+2ccf63a4bd07cf39cab0@syzkaller.appspotmail.com
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Rik van Riel &lt;riel@surriel.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Florian Weimer &lt;fweimer@redhat.com&gt;
Cc: Jan Engelhardt &lt;jengelh@inai.de&gt;
Cc: Tim Murray &lt;timmurray@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlbfs: fix a truncation issue in hugepages parameter</title>
<updated>2022-02-26T17:51:17Z</updated>
<author>
<name>Liu Yuntao</name>
<email>liuyuntao10@huawei.com</email>
</author>
<published>2022-02-26T03:11:02Z</published>
<link rel='alternate' type='text/html' href='https://git.kobert.dev/pm24.git/commit/?id=e79ce9832316e09529b212a21278d68240ccbf1f'/>
<id>urn:sha1:e79ce9832316e09529b212a21278d68240ccbf1f</id>
<content type='text'>
When we specify a large number for node in hugepages parameter, it may
be parsed to another number due to truncation in this statement:

	node = tmp;

For example, add following parameter in command line:

	hugepagesz=1G hugepages=4294967297:5

and kernel will allocate 5 hugepages for node 1 instead of ignoring it.

I move the validation check earlier to fix this issue, and slightly
simplifies the condition here.

Link: https://lkml.kernel.org/r/20220209134018.8242-1-liuyuntao10@huawei.com
Fixes: b5389086ad7be0 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Liu Yuntao &lt;liuyuntao10@huawei.com&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
