summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-09-11mm: remove BUG_ON() in __isolate_free_page()Kefeng Wang
Drop unneed comment and blank, adjust the variable, and the most important is to delete BUG_ON(). The page passed is always buddy page into __isolate_free_page() from compaction, page_isolation and page_reporting, and the caller also check the return, BUG_ON() is a too drastic measure, remove it. Link: https://lkml.kernel.org/r/20220901015043.189276-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/kmemleak: make create_object return voidLiu Shixin
No caller cares about the return value of create_object(), so make it return void. Link: https://lkml.kernel.org/r/20220901023007.3471887-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11selftest: vm: remove deleted local_config.* from .gitignoreTarun Sahu
Commit d2d6cba5d6623 ("selftest: vm: remove orphaned references to local_config.{h,mk}") took care of removing orphaned references. This commit removes local_config from .gitignore. Parent patch commit 69007f156ba ("Kselftests: remove support of libhugetlbfs from kselftests") Link: https://lkml.kernel.org/r/20220901092315.33619-1-tsahu@linux.ibm.com Signed-off-by: Tarun Sahu <tsahu@linux.ibm.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: make hugetlb depends on SYSFS or SYSCTLMiaohe Lin
If CONFIG_SYSFS and CONFIG_SYSCTL are both undefined, hugetlb doesn't work now as there's no way to set max huge pages. Make sure at least one of the above configs is defined to make hugetlb works as expected. Link: https://lkml.kernel.org/r/20220901120030.63318-11-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: remove meaningless BUG_ON(huge_pte_none())Miaohe Lin
When code reaches here, invalid page would have been accessed if huge pte is none. So this BUG_ON(huge_pte_none()) is meaningless. Remove it. Link: https://lkml.kernel.org/r/20220901120030.63318-10-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: add comment for subtle SetHPageVmemmapOptimized()Miaohe Lin
The SetHPageVmemmapOptimized() called here seems unnecessary as it's assumed to be set when calling this function. But it's indeed cleared by above set_page_private(page, 0). Add a comment to avoid possible future confusion. Link: https://lkml.kernel.org/r/20220901120030.63318-9-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: kill hugetlbfs_pagecache_page()Miaohe Lin
Fold hugetlbfs_pagecache_page() into its sole caller to remove some duplicated code. No functional change intended. Link: https://lkml.kernel.org/r/20220901120030.63318-8-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: pass NULL to kobj_to_hstate() if nid is unusedMiaohe Lin
We can pass NULL to kobj_to_hstate() directly when nid is unused to simplify the code. No functional change intended. Link: https://lkml.kernel.org/r/20220901120030.63318-7-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: use helper {huge_pte|pmd}_lock()Miaohe Lin
Use helper huge_pte_lock and pmd_lock to simplify the code. No functional change intended. Link: https://lkml.kernel.org/r/20220901120030.63318-6-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: use sizeof() to get the array sizeMiaohe Lin
It's better to use sizeof() to get the array size instead of manual calculation. Minor readability improvement. Link: https://lkml.kernel.org/r/20220901120030.63318-5-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: use LIST_HEAD() to define a list headMiaohe Lin
Use LIST_HEAD() directly to define a list head to simplify the code. No functional change intended. Link: https://lkml.kernel.org/r/20220901120030.63318-4-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: Use helper macro SZ_1KMiaohe Lin
Use helper macro SZ_1K to do the size conversion to make code more consistent in this file. Minor readability improvement. Link: https://lkml.kernel.org/r/20220901120030.63318-3-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11hugetlb: make hugetlb_cma_check() staticMiaohe Lin
Patch series "A few cleanup patches for hugetlb", v2. This series contains a few cleanup patches to use helper functions to simplify the codes, remove unneeded nid parameter and so on. More details can be found in the respective changelogs. This patch (of 10): Make hugetlb_cma_check() static as it's only used inside mm/hugetlb.c. Link: https://lkml.kernel.org/r/20220901120030.63318-1-linmiaohe@huawei.com Link: https://lkml.kernel.org/r/20220901120030.63318-2-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11fs/buffer: remove bh_submit_read() helperZhang Yi
bh_submit_read() has no user anymore, just remove it. Link: https://lkml.kernel.org/r/20220901133505.2510834-15-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11ext2: replace bh_submit_read() helper with bh_read()Zhang Yi
bh_submit_read() and the uptodate check logic in bh_uptodate_or_lock() has been integrated in bh_read() helper, so switch to use it directly. Link: https://lkml.kernel.org/r/20220901133505.2510834-14-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11fs/buffer: remove ll_rw_block() helperZhang Yi
Now that all ll_rw_block() users has been replaced to new safe helpers, we just remove it here. Link: https://lkml.kernel.org/r/20220901133505.2510834-13-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11ufs: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in ufs. Link: https://lkml.kernel.org/r/20220901133505.2510834-12-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11udf: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block(). We also switch to new bh_readahead_batch() helper for the buffer array readahead path. Link: https://lkml.kernel.org/r/20220901133505.2510834-11-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11reiserfs: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read/write path because it cannot guarantee that submitting read/write IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() in read path if the buffer has been locked by others. So stop using ll_rw_block() in reiserfs. We also switch to new bh_readahead_batch() helper for the buffer array readahead path. Link: https://lkml.kernel.org/r/20220901133505.2510834-10-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11ocfs2: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in ocfs2. Link: https://lkml.kernel.org/r/20220901133505.2510834-9-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11ntfs3: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in ntfs_get_block_vbo(). Link: https://lkml.kernel.org/r/20220901133505.2510834-8-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11jbd2: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO after wait_on_buffer() if the buffer has been locked by others. So stop using ll_rw_block() in journal_get_superblock(). We also switch to new bh_readahead_batch() for the buffer array readahead path. Link: https://lkml.kernel.org/r/20220901133505.2510834-7-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11isofs: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that submitting read IO if the buffer has been locked. We could get false positive EIO return from zisofs_uncompress_block() if he buffer has been locked by others. So stop using ll_rw_block(), switch to sync helper instead. Link: https://lkml.kernel.org/r/20220901133505.2510834-6-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11gfs2: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync read path because it cannot guarantee that always submitting read IO if the buffer has been locked, so stop using it. We also switch to new bh_readahead() helper for the readahead path. Link: https://lkml.kernel.org/r/20220901133505.2510834-5-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11fs/buffer: replace ll_rw_block()Zhang Yi
ll_rw_block() is not safe for the sync IO path because it skip buffers which has been locked by others, it could lead to false positive EIO when submitting read IO. So stop using ll_rw_block(), switch to use new helpers which could guarantee buffer locked and submit IO if needed. Link: https://lkml.kernel.org/r/20220901133505.2510834-4-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11fs/buffer: add some new buffer read helpersZhang Yi
Current ll_rw_block() helper is fragile because it assumes that locked buffer means it's under IO which is submitted by some other who holds the lock, it skip buffer if it failed to get the lock, so it's only safe on the readahead path. Unfortunately, now that most filesystems still use this helper mistakenly on the sync metadata read path. There is no guarantee that the one who holds the buffer lock always submit IO (e.g. buffer_migrate_folio_norefs() after commit 88dbcbb3a484 ("blkdev: avoid migration stalls for blkdev pages"), it could lead to false positive -EIO when submitting reading IO. This patch add some friendly buffer read helpers to prepare replacing ll_rw_block() and similar calls. We can only call bh_readahead_[] helpers for the readahead paths. Link: https://lkml.kernel.org/r/20220901133505.2510834-3-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11fs/buffer: remove __breadahead_gfp()Zhang Yi
Patch series "fs/buffer: remove ll_rw_block()", v2. ll_rw_block() will skip locked buffer before submitting IO, it assumes that locked buffer means it is under IO. This assumption is not always true because we cannot guarantee every buffer lock path would submit IO. After commit 88dbcbb3a484 ("blkdev: avoid migration stalls for blkdev pages"), buffer_migrate_folio_norefs() becomes one exceptional case, and there may be others. So ll_rw_block() is not safe on the sync read path, we could get false positive EIO return value when filesystem reading metadata. It seems that it could be only used on the readahead path. Unfortunately, many filesystem misuse the ll_rw_block() on the sync read path. This patch set just remove ll_rw_block() and add new friendly helpers, which could prevent false positive EIO on the read metadata path. Thanks for the suggestion from Jan, the original discussion is at [1]. patch 1: remove unused helpers in fs/buffer.c patch 2: add new bh_read_[*] helpers patch 3-11: remove all ll_rw_block() calls in filesystems patch 12-14: do some leftover cleanups. [1]. https://lore.kernel.org/linux-mm/20220825080146.2021641-1-chengzhihao1@huawei.com/ This patch (of 14): No one use __breadahead_gfp() and sb_breadahead_unmovable() any more, remove them. Link: https://lkml.kernel.org/r/20220901133505.2510834-1-yi.zhang@huawei.com Link: https://lkml.kernel.org/r/20220901133505.2510834-2-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Heming Zhao <ocfs2-devel@oss.oracle.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/vmalloc: extend find_vmap_lowest_match_check with extra argumentsSong Liu
find_vmap_lowest_match() is now able to handle different roots. With DEBUG_AUGMENT_LOWEST_MATCH_CHECK enabled as: : --- a/mm/vmalloc.c : +++ b/mm/vmalloc.c : @@ -713,7 +713,7 @@ EXPORT_SYMBOL(vmalloc_to_pfn); : /*** Global kva allocator ***/ : : -#define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0 : +#define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 1 compilation failed as: mm/vmalloc.c: In function 'find_vmap_lowest_match_check': mm/vmalloc.c:1328:32: warning: passing argument 1 of 'find_vmap_lowest_match' makes pointer from integer without a cast [-Wint-conversion] 1328 | va_1 = find_vmap_lowest_match(size, align, vstart, false); | ^~~~ | | | long unsigned int mm/vmalloc.c:1236:40: note: expected 'struct rb_root *' but argument is of type 'long unsigned int' 1236 | find_vmap_lowest_match(struct rb_root *root, unsigned long size, | ~~~~~~~~~~~~~~~~^~~~ mm/vmalloc.c:1328:9: error: too few arguments to function 'find_vmap_lowest_match' 1328 | va_1 = find_vmap_lowest_match(size, align, vstart, false); | ^~~~~~~~~~~~~~~~~~~~~~ mm/vmalloc.c:1236:1: note: declared here 1236 | find_vmap_lowest_match(struct rb_root *root, unsigned long size, | ^~~~~~~~~~~~~~~~~~~~~~ Extend find_vmap_lowest_match_check() and find_vmap_lowest_linear_match() with extra arguments to fix this. Link: https://lkml.kernel.org/r/20220906060548.1127396-1-song@kernel.org Link: https://lkml.kernel.org/r/20220831052734.3423079-1-song@kernel.org Fixes: f9863be49312 ("mm/vmalloc: extend __alloc_vmap_area() with extra arguments") Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/migrate_device.c: fix a misleading and outdated commentAlistair Popple
Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed the way trylock_page() in migrate_vma_collect_pmd() works without updating the comment. Reword the comment to be less misleading and a better reflection of what happens. Link: https://lkml.kernel.org/r/20220830020138.497063-1-apopple@nvidia.com Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") Signed-off-by: Alistair Popple <apopple@nvidia.com> Reported-by: Peter Xu <peterx@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/page_alloc.c: delete a redundant parameter of rmqueue_pcplistzezuo
The gfp_flags parameter is not used in rmqueue_pcplist, so directly delete this parameter. Link: https://lkml.kernel.org/r/20220831013404.3360714-1-zuoze1@huawei.com Signed-off-by: zezuo <zuoze1@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/damon: get the hotness from damon_hot_score() in damon_pageout_score()Kaixu Xia
We can get the hotness value from damon_hot_score() directly in damon_pageout_score() function and improve the code readability. Link: https://lkml.kernel.org/r/1661766366-20998-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/thp: remove redundant CONFIG_TRANSPARENT_HUGEPAGELiu Shixin
Simplify code by removing redundant CONFIG_TRANSPARENT_HUGEPAGE judgment. No functional change. Link: https://lkml.kernel.org/r/20220829095125.3284567-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/thp: simplify has_transparent_hugepage by using IS_BUILTINLiu Shixin
Simplify code of has_transparent_hugepage define by using IS_BUILTIN. No functional change. Link: https://lkml.kernel.org/r/20220829095709.3287462-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/damon/vaddr: remove comparison between mm and last_mm when checking ↵Kaixu Xia
region accesses The damon regions that belong to the same damon target have the same 'struct mm_struct *mm', so it's unnecessary to compare the mm and last_mm objects among the damon regions in one damon target when checking accesses. But the check is necessary when the target changed in '__damon_va_check_accesses()', so we can simplify the whole operation by using the bool 'same_target' to indicate whether the target changed. Link: https://lkml.kernel.org/r/1661590971-20893-3-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm/damon: simplify the parameter passing for 'check_accesses'Kaixu Xia
Patch series "mm/damon: Simplify the damon regions access check", v2. This patchset simplifies the operations when checking the damon regions accesses. This patch (of 2): The parameter 'struct damon_ctx *ctx' isn't used in the functions __damon_{p,v}a_check_access(), so we can remove it and simplify the parameter passing. Link: https://lkml.kernel.org/r/1661590971-20893-1-git-send-email-kaixuxia@tencent.com Link: https://lkml.kernel.org/r/1661590971-20893-2-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm: fix null-ptr-deref in kswapd_is_running()Kefeng Wang
kswapd_run/stop() will set pgdat->kswapd to NULL, which could race with kswapd_is_running() in kcompactd(), kswapd_run/stop() kcompactd() kswapd_is_running() pgdat->kswapd // error or nomal ptr verify pgdat->kswapd // load non-NULL pgdat->kswapd pgdat->kswapd = NULL task_is_running(pgdat->kswapd) // Null pointer derefence KASAN reports the null-ptr-deref shown below, vmscan: Failed to start kswapd on node 0 ... BUG: KASAN: null-ptr-deref in kcompactd+0x440/0x504 Read of size 8 at addr 0000000000000024 by task kcompactd0/37 CPU: 0 PID: 37 Comm: kcompactd0 Kdump: loaded Tainted: G OE 5.10.60 #1 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 Call trace: dump_backtrace+0x0/0x394 show_stack+0x34/0x4c dump_stack+0x158/0x1e4 __kasan_report+0x138/0x140 kasan_report+0x44/0xdc __asan_load8+0x94/0xd0 kcompactd+0x440/0x504 kthread+0x1a4/0x1f0 ret_from_fork+0x10/0x18 At present kswapd/kcompactd_run() and kswapd/kcompactd_stop() are protected by mem_hotplug_begin/done(), but without kcompactd(). There is no need to involve memory hotplug lock in kcompactd(), so let's add a new mutex to protect pgdat->kswapd accesses. Also, because the kcompactd task will check the state of kswapd task, it's better to call kcompactd_stop() before kswapd_stop() to reduce lock conflicts. [akpm@linux-foundation.org: add comments] Link: https://lkml.kernel.org/r/20220827111959.186838-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm: kill is_memblock_offlined()Kefeng Wang
Directly check state of struct memory_block, no need a single function. Link: https://lkml.kernel.org/r/20220827112043.187028-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11filemap: remove find_get_pages_contig()Vishal Moola (Oracle)
All callers of find_get_pages_contig() have been removed, so it is no longer needed. Link: https://lkml.kernel.org/r/20220824004023.77310-8-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: David Sterba <dsterb@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11ramfs: convert ramfs_nommu_get_unmapped_area() to use ↵Vishal Moola (Oracle)
filemap_get_folios_contig() Convert to use folios throughout. This is in preparation for the removal for find_get_pages_contig(). Now also supports large folios. The initial version of this function set the page_address to be returned after finishing all the checks. Since folio_batches have a maximum of 15 folios, the function had to be modified to support getting and checking up to lpages, 15 pages at a time while still returning the initial page address. Now the function sets ret as soon as the first batch arrives, and updates it only if a check fails. The physical adjacency check utilizes the page frame numbers. The page frame number of each folio must be nr_pages away from the first folio. Link: https://lkml.kernel.org/r/20220824004023.77310-7-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: David Sterba <dsterb@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11nilfs2: convert nilfs_find_uncommited_extent() to use ↵Vishal Moola (Oracle)
filemap_get_folios_contig() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_contig(). Now also supports large folios. Also clean up an unnecessary if statement - pvec.pages[0]->index > index will always evaluate to false, and filemap_get_folios_contig() returns 0 if there is no folio found at index. Link: https://lkml.kernel.org/r/20220824004023.77310-6-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: David Sterba <dsterb@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11btrfs: convert process_page_range() to use filemap_get_folios_contig()Vishal Moola (Oracle)
Converted function to use folios throughout. This is in preparation for the removal of find_get_pages_contig(). Now also supports large folios. Since we may receive more than nr_pages pages, nr_pages may underflow. Since nr_pages > 0 is equivalent to index <= end_index, we replaced it with this check instead. Also minor comment renaming for consistency in subpage. Link: https://lkml.kernel.org/r/20220824004023.77310-5-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: David Sterba <dsterb@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11btrfs: convert end_compressed_writeback() to use filemap_get_folios()Vishal Moola (Oracle)
Converted function to use folios throughout. This is in preparation for the removal of find_get_pages_contig(). Now also supports large folios. Since we may receive more than nr_pages pages, nr_pages may underflow. Since nr_pages > 0 is equivalent to index <= end_index, we replaced it with this check instead. Also this function does not care about the pages being contiguous so we can just use filemap_get_folios() to be more efficient. Link: https://lkml.kernel.org/r/20220824004023.77310-4-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: David Sterba <dsterba@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterb@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11btrfs: convert __process_pages_contig() to use filemap_get_folios_contig()Vishal Moola (Oracle)
Convert to use folios throughout. This is in preparation for the removal of find_get_pages_contig(). Now also supports large folios. Since we may receive more than nr_pages pages, nr_pages may underflow. Since nr_pages > 0 is equivalent to index <= end_index, we replaced it with this check instead. Link: https://lkml.kernel.org/r/20220824004023.77310-3-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: David Sterba <dsterba@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: David Sterba <dsterb@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11filemap: add filemap_get_folios_contig()Vishal Moola (Oracle)
Patch series "Convert to filemap_get_folios_contig()", v3. This patch series replaces find_get_pages_contig() with filemap_get_folios_contig(). This patch (of 7): This function is meant to replace find_get_pages_contig(). Unlike find_get_pages_contig(), filemap_get_folios_contig() no longer takes in a target number of pages to find - It returns up to 15 contiguous folios. To be more consistent with filemap_get_folios(), filemap_get_folios_contig() now also updates the start index passed in, and takes an end index. Link: https://lkml.kernel.org/r/20220824004023.77310-1-vishal.moola@gmail.com Link: https://lkml.kernel.org/r/20220824004023.77310-2-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Sterba <dsterb@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11zram: don't retry compress incompressible pageAlexey Romanov
It doesn't make sense for us to retry to compress an uncompressible page (comp_len == PAGE_SIZE) in zsmalloc slowpath, because we will be storing it uncompressed anyway. We can avoid wasting time on another compression attempt. It is enough to take lock (zcomp_stream_get) and execute the code below. Link: https://lkml.kernel.org/r/20220824113117.78849-1-avromanov@sberdevices.ru Signed-off-by: Alexey Romanov <avromanov@sberdevices.ru> Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Alexey Romanov <avromanov@sberdevices.ru> Cc: Dmitry Rokosov <DDRokosov@sberdevices.ru> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm: backing-dev: Remove the unneeded result variableye xingchen
Return the value cgwb_bdi_init() directly instead of storing it in another redundant variable. Link: https://lkml.kernel.org/r/20220826071906.252419-1-ye.xingchen@zte.com.cn Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Reported-by: Zeal Robot <zealci@zte.com.cn> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11page_ext: introduce boot parameter 'early_page_ext'Li Zhe
In commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init""), we call page_ext_init() after page_alloc_init_late() to avoid some panic problem. It seems that we cannot track early page allocations in current kernel even if page structure has been initialized early. This patch introduces a new boot parameter 'early_page_ext' to resolve this problem. If we pass it to the kernel, page_ext_init() will be moved up and the feature 'deferred initialization of struct pages' will be disabled to initialize the page allocator early and prevent the panic problem above. It can help us to catch early page allocations. This is useful especially when we find that the free memory value is not the same right after different kernel booting. [akpm@linux-foundation.org: fix section issue by removing __meminitdata] Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.com Signed-off-by: Li Zhe <lizhe.67@bytedance.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11s390/hugetlb: switch to generic version of follow_huge_pud()Gerald Schaefer
When pud-sized hugepages were introduced for s390, the generic version of follow_huge_pud() was using pte_page() instead of pud_page(). This would be wrong for s390, see also commit 97534127012f ("mm/hugetlb: use pmd_page() in follow_huge_pmd()"). Therefore, and probably because not all archs were supporting pud_page() at that time, a private version of follow_huge_pud() was added for s390, correctly using pud_page(). Since commit 3a194f3f8ad01 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry"), the generic version of follow_huge_pud() is now also using pud_page(), and in general behaves similar to follow_huge_pmd(). Therefore we can now switch to the generic version and get rid of the s390-specific follow_huge_pud(). Link: https://lkml.kernel.org/r/20220818135717.609eef8a@thinkpad Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Haiyue Wang <haiyue.wang@intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11memcg: increase MEMCG_CHARGE_BATCH to 64Shakeel Butt
For several years, MEMCG_CHARGE_BATCH was kept at 32 but with bigger machines and the network intensive workloads requiring througput in Gbps, 32 is too small and makes the memcg charging path a bottleneck. For now, increase it to 64 for easy acceptance to 6.0. We will need to revisit this in future for ever increasing demand of higher performance. Please note that the memcg charge path drain the per-cpu memcg charge stock, so there should not be any oom behavior change. Though it does have impact on rstat flushing and high limit reclaim backoff. To evaluate the impact of this optimization, on a 72 CPUs machine, we ran the following workload in a three level of cgroup hierarchy. $ netserver -6 # 36 instances of netperf with following params $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K Results (average throughput of netperf): Without (6.0-rc1) 10482.7 Mbps With patch 17064.7 Mbps (62.7% improvement) With the patch, the throughput improved by 62.7%. Link: https://lkml.kernel.org/r/20220825000506.239406-4-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Reported-by: kernel test robot <oliver.sang@intel.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Feng Tang <feng.tang@intel.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Michal Koutný" <mkoutny@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11mm: page_counter: rearrange struct page_counter fieldsShakeel Butt
With memcg v2 enabled, memcg->memory.usage is a very hot member for the workloads doing memcg charging on multiple CPUs concurrently. Particularly the network intensive workloads. In addition, there is a false cache sharing between memory.usage and memory.high on the charge path. This patch moves the usage into a separate cacheline and move all the read most fields into separate cacheline. To evaluate the impact of this optimization, on a 72 CPUs machine, we ran the following workload in a three level of cgroup hierarchy. $ netserver -6 # 36 instances of netperf with following params $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K Results (average throughput of netperf): Without (6.0-rc1) 10482.7 Mbps With patch 12413.7 Mbps (18.4% improvement) With the patch, the throughput improved by 18.4%. One side-effect of this patch is the increase in the size of struct mem_cgroup. For example with this patch on 64 bit build, the size of struct mem_cgroup increased from 4032 bytes to 4416 bytes. However for the performance improvement, this additional size is worth it. In addition there are opportunities to reduce the size of struct mem_cgroup like deprecation of kmem and tcpmem page counters and better packing. Link: https://lkml.kernel.org/r/20220825000506.239406-3-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Feng Tang <feng.tang@intel.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>