summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorRam Tummala <rtummala@nvidia.com>2024-07-09 18:45:39 -0700
committerAndrew Morton <akpm@linux-foundation.org>2024-07-26 14:33:09 -0700
commit4cd7ba16a0afb36550eed7690e73d3e7a743fa96 (patch)
tree93afe3062918d2e0c9bbb61641a90ef7ad5b5c83
parent34e526f6182e12b71f6076d43760dc6e0ae175a3 (diff)
mm: fix old/young bit handling in the faulting path
Commit 3bd786f76de2 ("mm: convert do_set_pte() to set_pte_range()") replaced do_set_pte() with set_pte_range() and that introduced a regression in the following faulting path of non-anonymous vmas which caused the PTE for the faulting address to be marked as old instead of young. handle_pte_fault() do_pte_missing() do_fault() do_read_fault() || do_cow_fault() || do_shared_fault() finish_fault() set_pte_range() The polarity of prefault calculation is incorrect. This leads to prefault being incorrectly set for the faulting address. The following check will incorrectly mark the PTE old rather than young. On some architectures this will cause a double fault to mark it young when the access is retried. if (prefault && arch_wants_old_prefaulted_pte()) entry = pte_mkold(entry); On a subsequent fault on the same address, the faulting path will see a non NULL vmf->pte and instead of reaching the do_pte_missing() path, PTE will then be correctly marked young in handle_pte_fault() itself. Due to this bug, performance degradation in the fault handling path will be observed due to unnecessary double faulting. Link: https://lkml.kernel.org/r/20240710014539.746200-1-rtummala@nvidia.com Fixes: 3bd786f76de2 ("mm: convert do_set_pte() to set_pte_range()") Signed-off-by: Ram Tummala <rtummala@nvidia.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-rw-r--r--mm/memory.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/mm/memory.c b/mm/memory.c
index 1ff7b6f51ec1..34f8402d2046 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4780,7 +4780,7 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio,
{
struct vm_area_struct *vma = vmf->vma;
bool write = vmf->flags & FAULT_FLAG_WRITE;
- bool prefault = in_range(vmf->address, addr, nr * PAGE_SIZE);
+ bool prefault = !in_range(vmf->address, addr, nr * PAGE_SIZE);
pte_t entry;
flush_icache_pages(vma, page, nr);