pm24.git/arch/parisc/kernel/cache.c, branch v5.6

parisc: use pgtable-nopXd instead of 4level-fixup

2019-12-05T03:44:15Z

parisc has two or three levels of page tables and can use appropriate pgtable-nopXd and folding of the upper layers. Replace usage of include/asm-generic/4level-fixup.h and explicit definitions of __PAGETABLE_PxD_FOLDED in parisc with include/asm-generic/pgtable-nopmd.h for two-level configurations and with include/asm-generic/pgtable-nopud.h for three-lelve configurations and adjust page table manipulation macros and functions accordingly. Link: http://lkml.kernel.org/r/1572938135-31886-9-git-send-email-rppt@kernel.org Signed-off-by: Mike Rapoport Acked-by: Helge Deller Cc: Anatoly Pugachev Cc: Anton Ivanov Cc: Arnd Bergmann Cc: "David S. Miller" Cc: Geert Uytterhoeven Cc: Greentime Hu Cc: Greg Ungerer Cc: "James E.J. Bottomley" Cc: Jeff Dike Cc: "Kirill A. Shutemov" Cc: Mark Salter Cc: Matt Turner Cc: Michal Simek Cc: Peter Rosin Cc: Richard Weinberger Cc: Rolf Eike Beer Cc: Russell King Cc: Russell King Cc: Sam Creasey Cc: Vincent Chen Cc: Vineet Gupta Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

parisc: Avoid spurious inequivalent alias kernel error messages

2019-11-04T07:34:27Z

This patch changes flush_dcache_page() to only print inequivalent alias error messages on systems that require coherency. Inequivalent aliases can occur on systems that don't require coherency and this can cause spurious messages. Signed-off-by: John David Anglin Signed-off-by: Helge Deller

parisc: Use __ro_after_init in cache.c

2019-05-10T19:00:44Z

Signed-off-by: Helge Deller

parisc: Use per-pagetable spinlock

2019-05-03T21:47:41Z

PA-RISC uses a global spinlock to protect pagetable updates in the TLB fault handlers. When multiple cores are taking TLB faults simultaneously, the cache line containing the spinlock becomes a bottleneck. This patch embeds the spinlock in the top level page directory, so that every process has its own lock. It improves performance by 30% when doing parallel compilations. At least on the N class systems, only one PxTLB inter processor broadcast can be active at any one time on the Merced bus. If a Merced bus is found, this patch serializes the TLB flushes with the pa_tlb_flush_lock spinlock. v1: Initial patch by Mikulas v2: Added Merced detection by Helge v3: Revised TLB serialization by Dave & Helge Signed-off-by: Mikulas Patocka Signed-off-by: John David Anglin Signed-off-by: Helge Deller

parisc: Optimze cache flush algorithms

2018-10-20T19:10:26Z

The attached patch implements three optimizations: 1) Loops in flush_user_dcache_range_asm, flush_kernel_dcache_range_asm, purge_kernel_dcache_range_asm, flush_user_icache_range_asm, and flush_kernel_icache_range_asm are unrolled to reduce branch overhead. 2) The static branch prediction for cmpb instructions in pacache.S have been reviewed and the operand order adjusted where necessary. 3) For flush routines in cache.c, we purge rather flush when we have no context. The pdc instruction at level 0 is not required to write back dirty lines to memory. This provides a performance improvement over the fdc instruction if the feature is implemented. Version 2 adds alternative patching. The patch provides an average improvement of about 2%. Signed-off-by: John David Anglin Signed-off-by: Helge Deller

parisc: Add alternative coding infrastructure

2018-10-17T15:22:26Z

This patch adds the necessary code to patch a running kernel at runtime to improve performance. The current implementation offers a few optimizations variants: - When running a SMP kernel on a single UP processor, unwanted assembler statements like locking functions are overwritten with NOPs. When multiple instructions shall be skipped, one branch instruction is used instead of multiple nop instructions. - In the UP case, some pdtlb and pitlb instructions are patched to become pdtlb,l and pitlb,l which only flushes the CPU-local tlb entries instead of broadcasting the flush to other CPUs in the system and thus may improve performance. - fic and fdc instructions are skipped if no I- or D-caches are installed. This should speed up qemu emulation and cacheless systems. - If no cache coherence is needed for IO operations, the relevant fdc and sync instructions in the sba and ccio drivers are replaced by nops. - On systems which share I- and D-TLBs and thus don't have a seperate instruction TLB, the pitlb instruction is replaced by a nop. Live-patching is done early in the boot process, just after having run the system inventory. No drivers are running and thus no external interrupts should arrive. So the hope is that no TLB exceptions will occur during the patching. If this turns out to be wrong we will probably need to do the patching in real-mode. Signed-off-by: Helge Deller

parisc: Reorder TLB flush timing calculation

2018-10-17T06:18:00Z

On boot (mostly reboot), my c8000 sometimes crashes after it prints the TLB flush threshold. The lockup is hard. The front LED flashes red and the box must be unplugged to reset the error. I noticed that when the crash occurs the TLB flush threshold is about one quarter what it is on a successful boot. If I disabled the calculation, the crash didn't occur. There also seemed to be a timing dependency affecting the crash. I finally realized that the flush_tlb_all() timing test runs just after the secondary CPUs are started. There seems to be a problem with running flush_tlb_all() too soon after the CPUs are started. The timing for the range test always seemed okay. So, I reversed the order of the two timing tests and I haven't had a crash at this point so far. I added a couple of information messages which I have left to help with diagnosis if the problem should appear on another machine. This version reduces the minimum TLB flush threshold to 16 KiB. Signed-off-by: John David Anglin Signed-off-by: Helge Deller

parisc: Move cache flush functions into .text.hot section

2018-04-11T09:40:35Z

and move the disable_sr_hashing() C and assembly functions into the .init section. Signed-off-by: Helge Deller

mm: fix races between swapoff and flush dcache

2018-04-06T04:36:26Z

Thanks to commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks"), after swapoff the address_space associated with the swap device will be freed. So page_mapping() users which may touch the address_space need some kind of mechanism to prevent the address_space from being freed during accessing. The dcache flushing functions (flush_dcache_page(), etc) in architecture specific code may access the address_space of swap device for anonymous pages in swap cache via page_mapping() function. But in some cases there are no mechanisms to prevent the swap device from being swapoff, for example, CPU1 CPU2 __get_user_pages() swapoff() flush_dcache_page() mapping = page_mapping() ... exit_swap_address_space() ... kvfree(spaces) mapping_mapped(mapping) The address space may be accessed after being freed. But from cachetlb.txt and Russell King, flush_dcache_page() only care about file cache pages, for anonymous pages, flush_anon_page() should be used. The implementation of flush_dcache_page() in all architectures follows this too. They will check whether page_mapping() is NULL and whether mapping_mapped() is true to determine whether to flush the dcache immediately. And they will use interval tree (mapping->i_mmap) to find all user space mappings. While mapping_mapped() and mapping->i_mmap isn't used by anonymous pages in swap cache at all. So, to fix the race between swapoff and flush dcache, __page_mapping() is add to return the address_space for file cache pages and NULL otherwise. All page_mapping() invoking in flush dcache functions are replaced with page_mapping_file(). [akpm@linux-foundation.org: simplify page_mapping_file(), per Mike] Link: http://lkml.kernel.org/r/20180305083634.15174-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" Reviewed-by: Andrew Morton Cc: Minchan Kim Cc: Michal Hocko Cc: Johannes Weiner Cc: Mel Gorman Cc: Dave Hansen Cc: Chen Liqin Cc: Russell King Cc: Yoshinori Sato Cc: "James E.J. Bottomley" Cc: Guan Xuetao Cc: "David S. Miller" Cc: Chris Zankel Cc: Vineet Gupta Cc: Ley Foon Tan Cc: Ralf Baechle Cc: Andi Kleen Cc: Mike Rapoport Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

parisc: Handle case where flush_cache_range is called with no context

2018-03-17T10:49:39Z

Just when I had decided that flush_cache_range() was always called with a valid context, Helge reported two cases where the "BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd: kernel BUG at /mnt/sdb6/linux/linux-4.15.4/arch/parisc/kernel/cache.c:587! CPU: 1 PID: 3254 Comm: kworker/1:2 Tainted: G D 4.15.0-1-parisc64-smp #1 Debian 4.15.4-1+b1 Workqueue: events free_ioctx IAOQ[0]: flush_cache_range+0x164/0x168 IAOQ[1]: flush_cache_page+0x0/0x1c8 RP(r2): unmap_page_range+0xae8/0xb88 Backtrace: [<00000000404a6980>] unmap_page_range+0xae8/0xb88 [<00000000404a6ae0>] unmap_single_vma+0xc0/0x188 [<00000000404a6cdc>] zap_page_range_single+0x134/0x1f8 [<00000000404a702c>] unmap_mapping_range+0x1cc/0x208 [<0000000040461518>] truncate_pagecache+0x98/0x108 [<0000000040461624>] truncate_setsize+0x9c/0xb8 [<00000000405d7f30>] put_aio_ring_file+0x80/0x100 [<00000000405d803c>] aio_free_ring+0x8c/0x290 [<00000000405d82c0>] free_ioctx+0x80/0x180 [<0000000040284e6c>] process_one_work+0x21c/0x668 [<00000000402854c4>] worker_thread+0x20c/0x778 [<0000000040291d44>] kthread+0x2d4/0x2e0 [<0000000040204020>] end_fault_vector+0x20/0xc0 This indicates that we need to handle the no context case in flush_cache_range() as we do in flush_cache_mm(). In thinking about this, I realized that we don't need to flush the TLB when there is no context. So, I added context checks to the large flush cases in flush_cache_mm() and flush_cache_range(). The large flush case occurs frequently in flush_cache_mm() and the change should improve fork performance. The v2 version of this change removes the BUG_ON from flush_cache_page() by skipping the TLB flush when there is no context. I also added code to flush the TLB in flush_cache_mm() and flush_cache_range() when we have a context that's not current. Now all three routines handle TLB flushes in a similar manner. Signed-off-by: John David Anglin Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: Helge Deller