summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-08-09md/raid6: implement recovery using ARM NEON intrinsicsArd Biesheuvel
Provide a NEON accelerated implementation of the recovery algorithm, which supersedes the default byte-by-byte one. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09md/raid6: use faster multiplication for ARM NEON delta syndromeArd Biesheuvel
The P/Q left side optimization in the delta syndrome simply involves repeatedly multiplying a value by polynomial 'x' in GF(2^8). Given that 'x * x * x * x' equals 'x^4' even in the polynomial world, we can accelerate this substantially by performing up to 4 such operations at once, using the NEON instructions for polynomial multiplication. Results on a Cortex-A57 running in 64-bit mode: Before: ------- raid6: neonx1 xor() 1680 MB/s raid6: neonx2 xor() 2286 MB/s raid6: neonx4 xor() 3162 MB/s raid6: neonx8 xor() 3389 MB/s After: ------ raid6: neonx1 xor() 2281 MB/s raid6: neonx2 xor() 3362 MB/s raid6: neonx4 xor() 3787 MB/s raid6: neonx8 xor() 4239 MB/s While we're at it, simplify MASK() by using a signed shift rather than a vector compare involving a temp register. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09Merge tag 'arm64-iort-for-v4.14' of ↵Catalin Marinas
git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/linux into for-next/core Two patches for the 4.14 release merge window: - IORT PCI aliases detection improvement to cater for systems with complex PCI topologies that current code mishandles (R.Murphy) - IORT SMMUv3 proximity domain (ie NUMA) handling (respective ACPICA changes will be merged separately) (G. Kulkarni) * tag 'arm64-iort-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/linux: ACPI/IORT: numa: Add numa node mapping for smmuv3 devices ACPI/IORT: Handle PCI aliases properly for IOMMUs
2017-08-09Merge branch 'arm64/exception-stack' of ↵Catalin Marinas
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core * 'arm64/exception-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux: arm64: unwind: remove sp from struct stackframe arm64: unwind: reference pt_regs via embedded stack frame arm64: unwind: disregard frame.sp when validating frame pointer arm64: unwind: avoid percpu indirection for irq stack arm64: move non-entry code out of .entry.text arm64: consistently use bl for C exception entry arm64: Add ASM_BUG()
2017-08-09arm64: unwind: remove sp from struct stackframeArd Biesheuvel
The unwind code sets the sp member of struct stackframe to 'frame pointer + 0x10' unconditionally, without regard for whether doing so produces a legal value. So let's simply remove it now that we have stopped using it anyway. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-09arm64: unwind: reference pt_regs via embedded stack frameArd Biesheuvel
As it turns out, the unwind code is slightly broken, and probably has been for a while. The problem is in the dumping of the exception stack, which is intended to dump the contents of the pt_regs struct at each level in the call stack where an exception was taken and routed to a routine marked as __exception (which means its stack frame is right below the pt_regs struct on the stack). 'Right below the pt_regs struct' is ill defined, though: the unwind code assigns 'frame pointer + 0x10' to the .sp member of the stackframe struct at each level, and dump_backtrace() happily dereferences that as the pt_regs pointer when encountering an __exception routine. However, the actual size of the stack frame created by this routine (which could be one of many __exception routines we have in the kernel) is not known, and so frame.sp is pretty useless to figure out where struct pt_regs really is. So it seems the only way to ensure that we can find our struct pt_regs when walking the stack frames is to put it at a known fixed offset of the stack frame pointer that is passed to such __exception routines. The simplest way to do that is to put it inside pt_regs itself, which is the main change implemented by this patch. As a bonus, doing this allows us to get rid of a fair amount of cruft related to walking from one stack to the other, which is especially nice since we intend to introduce yet another stack for overflow handling once we add support for vmapped stacks. It also fixes an inconsistency where we only add a stack frame pointing to ELR_EL1 if we are executing from the IRQ stack but not when we are executing from the task stack. To consistly identify exceptions regs even in the presence of exceptions taken from entry code, we must check whether the next frame was created by entry text, rather than whether the current frame was crated by exception text. To avoid backtracing using PCs that fall in the idmap, or are controlled by userspace, we must explcitly zero the FP and LR in startup paths, and must ensure that the frame embedded in pt_regs is zeroed upon entry from EL0. To avoid these NULL entries showin in the backtrace, unwind_frame() is updated to avoid them. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [Mark: compare current frame against .entry.text, avoid bogus PCs] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-09arm64/vdso: Support mremap() for vDSODmitry Safonov
vDSO VMA address is saved in mm_context for the purpose of using restorer from vDSO page to return to userspace after signal handling. In Checkpoint Restore in Userspace (CRIU) project we place vDSO VMA on restore back to the place where it was on the dump. With the exception for x86 (where there is API to map vDSO with arch_prctl()), we move vDSO inherited from CRIU task to restoree position by mremap(). CRIU does support arm64 architecture, but kernel doesn't update context.vdso pointer after mremap(). Which results in translation fault after signal handling on restored application: https://github.com/xemul/criu/issues/288 Make vDSO code track the VMA address by supplying .mremap() fops the same way it's done for x86 and arm32 by: commit b059a453b1cf ("x86/vdso: Add mremap hook to vm_special_mapping") commit 280e87e98c09 ("ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO"). Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Christopher Covington <cov@codeaurora.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09arm64: uaccess: Implement *_flushcache variantsRobin Murphy
Implement the set of copy functions with guarantees of a clean cache upon completion necessary to support the pmem driver. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09arm64: Implement pmem API supportRobin Murphy
Add a clean-to-point-of-persistence cache maintenance helper, and wire up the basic architectural support for the pmem driver based on it. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c] [catalin.marinas@arm.com: change dmb(sy) to dmb(osh)] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09arm64: Handle trapped DC CVAPRobin Murphy
Cache clean to PoP is subject to the same access controls as to PoC, so if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we need to be prepared to handle it. To avoid getting into complicated fights with binutils about ARMv8.2 options, we'll just cheat and use the raw SYS instruction rather than the 'proper' DC alias. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09arm64: Expose DC CVAP to userspaceRobin Murphy
The ARMv8.2-DCPoP feature introduces persistent memory support to the architecture, by defining a point of persistence in the memory hierarchy, and a corresponding cache maintenance operation, DC CVAP. Expose the support via HWCAP and MRS emulation. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09arm64: Convert __inval_cache_range() to area-basedRobin Murphy
__inval_cache_range() is already the odd one out among our data cache maintenance routines as the only remaining range-based one; as we're going to want an invalidation routine to call from C code for the pmem API, let's tweak the prototype and name to bring it in line with the clean operations, and to make its relationship with __dma_inv_area() neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area(). The loop clearing the early page tables gets mildly massaged in the process for the sake of consistency. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-09arm64: mm: Fix set_memory_valid() declarationRobin Murphy
Clearly, set_memory_valid() has never been seen in the same room as its declaration... Whilst the type mismatch is such that kexec probably wasn't broken in practice, fix it to match the definition as it should. Fixes: 9b0aa14e3155 ("arm64: mm: add set_memory_valid()") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-08ACPI/IORT: numa: Add numa node mapping for smmuv3 devicesGanapatrao Kulkarni
ARM IORT specification(rev. C) has added provision to define proximity domain in SMMUv3 IORT table. Adding required code to parse Proximity domain and set numa_node of smmv3 platform devices. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> [lorenzo.pieralisi@arm.com: update pr_info()/commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2017-08-08arm64: unwind: disregard frame.sp when validating frame pointerArd Biesheuvel
Currently, when unwinding the call stack, we validate the frame pointer of each frame against frame.sp, whose value is not clearly defined, and which makes it more difficult to link stack frames together across different stacks. It is far better to simply check whether the frame pointer itself points into a valid stack. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08arm64: unwind: avoid percpu indirection for irq stackMark Rutland
Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument, used to generate a percpu address. In all cases, they are passed {raw_,}smp_processor_id(), so this parameter is redundant. Since {raw_,}smp_processor_id() use a percpu variable internally, this approach means we generate a percpu offset to find the current cpu, then use this to index an array of percpu offsets, which we then use to find the current CPU's IRQ stack pointer. Thus, most of the work is redundant. Instead, we can consistently use raw_cpu_ptr() to generate the CPU's irq_stack pointer by simply adding the percpu offset to the irq_stack address, which is simpler in both respects. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08arm64: move non-entry code out of .entry.textMark Rutland
Currently, cpu_switch_to and ret_from_fork both live in .entry.text, though neither form the critical path for an exception entry. In subsequent patches, we will require that code in .entry.text is part of the critical path for exception entry, for which we can assume certain properties (e.g. the presence of exception regs on the stack). Neither cpu_switch_to nor ret_from_fork will meet these requirements, so we must move them out of .entry.text. To ensure that neither are kprobed after being moved out of .entry.text, we must explicitly blacklist them, requiring a new NOKPROBE() asm helper. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08arm64: consistently use bl for C exception entryMark Rutland
In most cases, our exception entry assembly branches to C handlers with a BL instruction, but in cases where we do not expect to return, we use B instead. While this is correct today, it means that backtraces for fatal exceptions miss the entry assembly (as the LR is stale at the point we call C code), while non-fatal exceptions have the entry assembly in the LR. In subsequent patches, we will need the LR to be set in these cases in order to backtrace reliably. This patch updates these sites to use a BL, ensuring consistency, and preparing for backtrace rework. An ASM_BUG() is added after each of these new BLs, which both catches unexpected returns, and ensures that the LR value doesn't point to another function label. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08arm64: Add ASM_BUG()Mark Rutland
Currently. we can only use BUG() from C code, though there are situations where we would like an equivalent mechanism in assembly code. This patch refactors our BUG() definition such that it can be used in either C or assembly, in the form of a new ASM_BUG(). The refactoring requires the removal of escape sequences, such as '\n' and '\t', but these aren't strictly necessary as we can use ';' to terminate assembler statements. The low-level assembly is factored out into <asm/asm-bug.h>, with <asm/bug.h> retained as the C wrapper. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Martin <dave.martin@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08ACPI/IORT: Handle PCI aliases properly for IOMMUsRobin Murphy
When a PCI device has DMA quirks, we need to ensure that an upstream IOMMU knows about all possible aliases, since the presence of a DMA quirk does not preclude the device still also emitting transactions (e.g. MSIs) on its 'real' RID. Similarly, the rules for bridge aliasing are relatively complex, and some bridges may only take ownership of transactions under particular transient circumstances, leading again to multiple RIDs potentially being seen at the IOMMU for the given device. Take all this into account in iort_iommu_configure() by mapping every RID produced by the alias walk, not just whichever one comes out last. Since adding any more internal PTR_ERR() juggling would have confused me no end, a bit of refactoring happens in the process - we know where to find the ops if everything succeeded, so we're free to just pass regular error codes around up until then. CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> CC: Hanjun Guo <hanjun.guo@linaro.org> CC: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [lorenzo.pieralisi@arm.com: tagged __get_pci_rid __maybe_unused] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2017-08-07arm64: Decode information from ESR upon mem faultsJulien Thierry
When receiving unhandled faults from the CPU, description is very sparse. Adding information about faults decoded from ESR. Added defines to esr.h corresponding ESR fields. Values are based on ARM Archtecture Reference Manual (DDI 0487B.a), section D7.2.28 ESR_ELx, Exception Syndrome Register (ELx) (pages D7-2275 to D7-2280). New output is of the form: [ 77.818059] Mem abort info: [ 77.820826] Exception class = DABT (current EL), IL = 32 bits [ 77.826706] SET = 0, FnV = 0 [ 77.829742] EA = 0, S1PTW = 0 [ 77.832849] Data abort info: [ 77.835713] ISV = 0, ISS = 0x00000070 [ 77.839522] CM = 0, WnR = 1 Signed-off-by: Julien Thierry <julien.thierry@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: fix "%lu" in a pr_alert() call] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-07arm64: Abstract syscallno manipulationDave Martin
The -1 "no syscall" value is written in various ways, shared with the user ABI in some places, and generally obscure. This patch attempts to make things a little more consistent and readable by replacing all these uses with a single #define. A couple of symbolic helpers are provided to clarify the intent further. Because the in-syscall check in do_signal() is changed from >= 0 to != NO_SYSCALL by this patch, different behaviour may be observable if syscallno is set to values less than -1 by a tracer. However, this is not different from the behaviour that is already observable if a tracer sets syscallno to a value >= __NR_(compat_)syscalls. It appears that this can cause spurious syscall restarting, but that is not a new behaviour either, and does not appear harmful. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-07arm64: syscallno is secretly an int, make it officialDave Martin
The upper 32 bits of the syscallno field in thread_struct are handled inconsistently, being sometimes zero extended and sometimes sign-extended. In fact, only the lower 32 bits seem to have any real significance for the behaviour of the code: it's been OK to handle the upper bits inconsistently because they don't matter. Currently, the only place I can find where those bits are significant is in calling trace_sys_enter(), which may be unintentional: for example, if a compat tracer attempts to cancel a syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the syscall-enter-stop, it will be traced as syscall 4294967295 rather than -1 as might be expected (and as occurs for a native tracer doing the same thing). Elsewhere, reads of syscallno cast it to an int or truncate it. There's also a conspicuous amount of code and casting to bodge around the fact that although semantically an int, syscallno is stored as a u64. Let's not pretend any more. In order to preserve the stp x instruction that stores the syscall number in entry.S, this patch special-cases the layout of struct pt_regs for big endian so that the newly 32-bit syscallno field maps onto the low bits of the stored value. This is not beautiful, but benchmarking of the getpid syscall on Juno suggests indicates a minor slowdown if the stp is split into an stp x and stp w. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-06Linux 4.13-rc4v4.13-rc4Linus Torvalds
2017-08-06Merge tag 'platform-drivers-x86-v4.13-4' of ↵Linus Torvalds
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fix from Darren Hart: "Fix loop preventing some platforms from waking up via the power button in s2idle: - intel-vbtn: match power button on press rather than release" * tag 'platform-drivers-x86-v4.13-4' of git://git.infradead.org/linux-platform-drivers-x86: platform/x86: intel-vbtn: match power button on press rather than release
2017-08-06Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "A large number of ext4 bug fixes and cleanups for v4.13" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix copy paste error in ext4_swap_extents() ext4: fix overflow caused by missing cast in ext4_resize_fs() ext4, project: expand inode extra size if possible ext4: cleanup ext4_expand_extra_isize_ea() ext4: restructure ext4_expand_extra_isize ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize ext4: make xattr inode reads faster ext4: inplace xattr block update fails to deduplicate blocks ext4: remove unused mode parameter ext4: fix warning about stack corruption ext4: fix dir_nlink behaviour ext4: silence array overflow warning ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize ext4: release discard bio after sending discard commands ext4: convert swap_inode_data() over to use swap() on most of the fields ext4: error should be cleared if ea_inode isn't added to the cache ext4: Don't clear SGID when inheriting ACLs ext4: preserve i_mode if __ext4_set_acl() fails ext4: remove unused metadata accounting variables ext4: correct comment references to ext4_ext_direct_IO()
2017-08-06Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS fixes from Ralf Baechle: "This fixes two build issues for ralink platforms, both due to missing #includes which used to be included indirectly via other headers" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: ralink: mt7620: Add missing header MIPS: ralink: Fix build error due to missing header
2017-08-06Fix compat_sys_sigpending breakageDmitry V. Levin
The latest change of compat_sys_sigpending in commit 8f13621abced ("sigpending(): move compat to native") has broken it in two ways. First, it tries to write 4 bytes more than userspace expects: sizeof(old_sigset_t) == sizeof(long) == 8 instead of sizeof(compat_old_sigset_t) == sizeof(u32) == 4. Second, on big endian architectures these bytes are being written in the wrong order. This bug was found by strace test suite. Reported-by: Anatoly Pugachev <matorola@gmail.com> Inspired-by: Eugene Syromyatnikov <evgsyr@gmail.com> Fixes: 8f13621abced ("sigpending(): move compat to native") Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-06ext4: fix copy paste error in ext4_swap_extents()Maninder Singh
This bug was found by a static code checker tool for copy paste problems. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Signed-off-by: Vaneet Narang <v.narang@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-06ext4: fix overflow caused by missing cast in ext4_resize_fs()Jerry Lee
On a 32-bit platform, the value of n_blcoks_count may be wrong during the file system is resized to size larger than 2^32 blocks. This may caused the superblock being corrupted with zero blocks count. Fixes: 1c6bd7173d66 Signed-off-by: Jerry Lee <jerrylee@qnap.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org # 3.7+
2017-08-06ext4, project: expand inode extra size if possibleMiao Xie
When upgrading from old format, try to set project id to old file first time, it will return EOVERFLOW, but if that file is dirtied(touch etc), changing project id will be allowed, this might be confusing for users, we could try to expand @i_extra_isize here too. Reported-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Miao Xie <miaoxie@huawei.com> Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-06ext4: cleanup ext4_expand_extra_isize_ea()Miao Xie
Clean up some goto statement, make ext4_expand_extra_isize_ea() clearer. Signed-off-by: Miao Xie <miaoxie@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06ext4: restructure ext4_expand_extra_isizeMiao Xie
Current ext4_expand_extra_isize just tries to expand extra isize, if someone is holding xattr lock or some check fails, it will give up. So rename its name to ext4_try_to_expand_extra_isize. Besides that, we clean up unnecessary check and move some relative checks into it. Signed-off-by: Miao Xie <miaoxie@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06ext4: fix forgetten xattr lock protection in ext4_expand_extra_isizeMiao Xie
We should avoid the contention between the i_extra_isize update and the inline data insertion, so move the xattr trylock in front of i_extra_isize update. Signed-off-by: Miao Xie <miaoxie@huawei.com> Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06ext4: make xattr inode reads fasterTahsin Erdogan
ext4_xattr_inode_read() currently reads each block sequentially while waiting for io operation to complete before moving on to the next block. This prevents request merging in block layer. Add a ext4_bread_batch() function that starts reads for all blocks then optionally waits for them to complete. A similar logic is used in ext4_find_entry(), so update that code to use the new function. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05ext4: inplace xattr block update fails to deduplicate blocksTahsin Erdogan
When an xattr block has a single reference, block is updated inplace and it is reinserted to the cache. Later, a cache lookup is performed to see whether an existing block has the same contents. This cache lookup will most of the time return the just inserted entry so deduplication is not achieved. Running the following test script will produce two xattr blocks which can be observed in "File ACL: " line of debugfs output: mke2fs -b 1024 -I 128 -F -O extent /dev/sdb 1G mount /dev/sdb /mnt/sdb touch /mnt/sdb/{x,y} setfattr -n user.1 -v aaa /mnt/sdb/x setfattr -n user.2 -v bbb /mnt/sdb/x setfattr -n user.1 -v aaa /mnt/sdb/y setfattr -n user.2 -v bbb /mnt/sdb/y debugfs -R 'stat x' /dev/sdb | cat debugfs -R 'stat y' /dev/sdb | cat This patch defers the reinsertion to the cache so that we can locate other blocks with the same contents. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
2017-08-05ext4: remove unused mode parameterTahsin Erdogan
ext4_alloc_file_blocks() does not use its mode parameter. Remove it. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05ext4: fix warning about stack corruptionArnd Bergmann
After commit 62d1034f53e3 ("fortify: use WARN instead of BUG for now"), we get a warning about possible stack overflow from a memcpy that was not strictly bounded to the size of the local variable: inlined from 'ext4_mb_seq_groups_show' at fs/ext4/mballoc.c:2322:2: include/linux/string.h:309:9: error: '__builtin_memcpy': writing between 161 and 1116 bytes into a region of size 160 overflows the destination [-Werror=stringop-overflow=] We actually had a bug here that would have been found by the warning, but it was already fixed last year in commit 30a9d7afe70e ("ext4: fix stack memory corruption with 64k block size"). This replaces the fixed-length structure on the stack with a variable-length structure, using the correct upper bound that tells the compiler that everything is really fine here. I also change the loop count to check for the same upper bound for consistency, but the existing code is already correct here. Note that while clang won't allow certain kinds of variable-length arrays in structures, this particular instance is fine, as the array is at the end of the structure, and the size is strictly bounded. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05ext4: fix dir_nlink behaviourAndreas Dilger
The dir_nlink feature has been enabled by default for new ext4 filesystems since e2fsprogs-1.41 in 2008, and was automatically enabled by the kernel for older ext4 filesystems since the dir_nlink feature was added with ext4 in kernel 2.6.28+ when the subdirectory count exceeded EXT4_LINK_MAX-1. Automatically adding the file system features such as dir_nlink is generally frowned upon, since it could cause the file system to not be mountable on older kernel, thus preventing the administrator from rolling back to an older kernel if necessary. In this case, the administrator might also want to disable the feature because glibc's fts_read() function does not correctly optimize directory traversal for directories that use st_nlinks field of 1 to indicate that the number of links in the directory are not tracked by the file system, and could fail to traverse the full directory hierarchy. Fortunately, in the past ten years very few users have complained about incomplete file system traversal by glibc's fts_read(). This commit also changes ext4_inc_count() to allow i_nlinks to reach the full EXT4_LINK_MAX links on the parent directory (including "." and "..") before changing i_links_count to be 1. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196405 Signed-off-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05ext4: silence array overflow warningDan Carpenter
I get a static checker warning: fs/ext4/ext4.h:3091 ext4_set_de_type() error: buffer overflow 'ext4_type_by_mode' 15 <= 15 It seems unlikely that we would hit this read overflow in real life, but it's also simple enough to make the array 16 bytes instead of 15. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesizeJan Kara
ext4_find_unwritten_pgoff() does not properly handle a situation when starting index is in the middle of a page and blocksize < pagesize. The following command shows the bug on filesystem with 1k blocksize: xfs_io -f -c "falloc 0 4k" \ -c "pwrite 1k 1k" \ -c "pwrite 3k 1k" \ -c "seek -a -r 0" foo In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048, SEEK_DATA) will return the correct result. Fix the problem by neglecting buffers in a page before starting offset. Reported-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> CC: stable@vger.kernel.org # 3.8+
2017-08-05platform/x86: intel-vbtn: match power button on press rather than releaseMario Limonciello
This fixes a problem where the system gets stuck in a loop unable to wakeup via power button in s2idle. The problem happens because: - press power button: - system emits 0xc0 (power press), event ignored - system emits 0xc1 (power release), event processed, emited as KEY_POWER - set wakeup_mode to true - system goes to s2idle - press power button - system emits 0xc0 (power press), wakeup_mode is true, system wakes - system emits 0xc1 (power release), event processed, emited as KEY_POWER - system goes to s2idle again To avoid this situation, process the presses (which matches what intel-hid does too). Verified on an Dell XPS 9365 Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-08-05Merge tag 'media/v4.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "This series is larger than I would like to submit for -rc4. My original intent were to sent it to either -rc2 or -rc3. Unfortunately, due to my vacations, I got a lot of pending stuff after my return, and had to do some biz trips, with prevented me to send this earlier. Several fixes: - some fixes at atomisp staging driver - several gcc 7 warning fixes - cleanup media SVG files, in order to fix PDF build on some distros - fix random Kconfig build of venus driver - some fixes for the venus driver - some changes from semaphone to mutex in ngene's driver - some locking fixes at dib0700 driver - several fixes on ngene's driver and frontends to make it properly support some new boards added on Kernel 4.13 - some fixes to CEC drivers - omap_vout: vrfb: convert to dmaengine - docs-rst: document EBUSY for VIDIOC_S_FMT Please notice that the big diffstat changes here are at the SVG files. Visually, the images look the same, but the file size is now a lot smaller than before, and they don't use some XML tags that would cause them to be badly parsed by some ImageMagick versions, or to require a lot of memory by TeTex, with would break PDF output on some distributions" * tag 'media/v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (68 commits) media: atomisp2: array underflow in imx_enum_frame_size() media: atomisp2: array underflow in ap1302_enum_frame_size() media: atomisp2: Array underflow in atomisp_enum_input() media: platform: davinci: drop VPFE_CMD_S_CCDC_RAW_PARAMS media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl media: venus: don't abuse dma_alloc for non-DMA allocations media: venus: hfi: fix error handling in hfi_sys_init_done() media: venus: fix compile-test build on non-qcom ARM platform media: venus: mark PM functions as __maybe_unused media: cec-notifier: small improvements media: pulse8-cec: persistent_config should be off by default media: cec: cec_transmit_attempt_done: ignore CEC_TX_STATUS_MAX_RETRIES media: staging: atomisp: array underflow in ioctl media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds media: svg: avoid too long lines media: svg files: simplify files media: selection.svg: simplify the SVG file media: vimc: set id_table for platform drivers media: staging: atomisp: disable warnings with cc-disable-warning media: davinci: variable 'common' set but not used ...
2017-08-05ext4: release discard bio after sending discard commandsDaeho Jeong
We've changed the discard command handling into parallel manner. But, in this change, I forgot decreasing the usage count of the bio which was used to send discard request. I'm sorry about that. Fixes: a015434480dc ("ext4: send parallel discards on commit completions") Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2017-08-05Merge tag 'gpio-v4.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: - LP87565: set the proper output level for direction_output. - stm32: fix the kernel build by selecting the hierarchical irqdomain symbol properly - this happens to be done in the pin control framework but whatever, it had dependencies to GPIO so we need to apply it here. - Select the hierarchical IRQ domain also for Xgene. - Fix wakeups to work on MXC. - Fix up the device tree binding on Exar that went astray, also add the right bindings. - Fix the unwanted events for edges from the library. - Fix the unbalanced chanined IRQ on the Tegra. * tag 'gpio-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: tegra: fix unbalanced chained_irq_enter/exit gpiolib: skip unwanted events, don't convert them to opposite edge gpio: exar: Use correct property prefix and document bindings gpio: gpio-mxc: Fix: higher 16 GPIOs usable as wake source gpio: xgene-sb: select IRQ_DOMAIN_HIERARCHY pinctrl: stm32: select IRQ_DOMAIN_HIERARCHY instead of depends on gpio: lp87565: Set proper output level and direction for direction_output MAINTAINERS: Add entry for Whiskey Cove PMIC GPIO driver
2017-08-04Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A handful of critical fixes for changes introduce this merge window. - The TI sci_clk_get() API was pretty broken and nobody noticed. - There were some CPUfreq crashes on C.H.I.P devices because we failed to propagate rates up the clk tree. - Also, the Intel Atom PMC clk driver needs to mark a clk critical if the firmware has it enabled already so that audio doesn't get killed on Baytrail. - Gemini devices have a dead serial console because the reset control usage in the serial driver assume one method of reset that gemini doesn't support (this will be fixed in the next version in the reset framework so this is the small fix for -rc series). - Finally we have two rate calculation fixes, one for Exynos and one for Meson SoCs, that fix rate inconsistencies" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: keystone: sci-clk: Fix sci_clk_get clk: meson: mpll: fix mpll0 fractional part ignored clk: samsung: exynos5420: The EPLL rate table corrections clk: sunxi-ng: sun5i: Add clk_set_rate_parent to the CPU clock clk: x86: Do not gate clocks enabled by the firmware clk: gemini: Fix reset regression
2017-08-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "ARM: - Yet another race with VM destruction plugged - A set of small vgic fixes x86: - Preserve pending INIT - RCU fixes in paravirtual async pf, VM teardown, and VMXOFF emulation - nVMX interrupt injection and dirty tracking fixes - initialize to make UBSAN happy" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm/arm64: vgic: Use READ_ONCE fo cmpxchg KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit" KVM: nVMX: mark vmcs12 pages dirty on L2 exit kvm: nVMX: don't flush VMCS12 during VMXOFF or VCPU teardown KVM: nVMX: do not pin the VMCS12 KVM: avoid using rcu_dereference_protected KVM: X86: init irq->level in kvm_pv_kick_cpu_op KVM: X86: Fix loss of pending INIT due to race KVM: async_pf: make rcu irq exit if not triggered from idle task KVM: nVMX: fixes to nested virt interrupt injection KVM: nVMX: do not fill vm_exit_intr_error_code in prepare_vmcs12 KVM: arm/arm64: Handle hva aging while destroying the vm KVM: arm/arm64: PMU: Fix overflow interrupt injection KVM: arm/arm64: Fix bug in advertising KVM_CAP_MSI_DEVID capability
2017-08-04Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "The recent irq core changes unearthed API abuse in the HPET code, which manifested itself in a suspend/resume regression. The fix replaces the cruft with the proper function calls and cures the regression" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hpet: Cure interface abuse in the resume path
2017-08-04Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for a multiplication overflow in the timer code on 32bit systems" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Fix overflow in get_next_timer_interrupt
2017-08-04Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "This comes a bit later than I planned, and as a consequence is a larger than it should be. Most of the changes are devicetree fixes, across lots of platforms: Renesas, Samsung Exynos, Marvell EBU, TI OMAP, Rockchips, Amlogic Meson, Sigma Desings Tango, Allwinner SUNxi and TI Davinci. Also across many platforms, I applied an older series of simple randconfig build fixes. This includes making the CONFIG_MTD_XIP option compile again, which had been broken for many years and probably has not been missed, but it felt wrong to just remove it completely. The only other changes are: - We enable HWSPINLOCK in defconfig to get some Qualcomm boards to work out of the box. - A few regression fixes for Texas Instruments OMAP2+. - A boot regression fix for the Renesas regulator quirk. - A suspend/resume fix for Uniphier SoCs, fixing the resume of the system bus" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (43 commits) ARM: dts: tango4: Request RGMII RX and TX clock delays bus: uniphier-system-bus: set up registers when resuming ARM64: dts: marvell: armada-37xx: Fix the number of GPIO on south bridge ARM: shmobile: rcar-gen2: Fix deadlock in regulator quirk arm64: defconfig: enable missing HWSPINLOCK ARM: pxa: select both FB and FB_W100 for eseries ARM: ixp4xx: fix ioport_unmap definition ARM: ep93xx: use ARM_PATCH_PHYS_VIRT correctly ARM: mmp: mark usb_dma_mask as __maybe_unused ARM: omap2: mark unused functions as __maybe_unused ARM: omap1: avoid unused variable warning ARM: sirf: mark sirfsoc_init_late as __maybe_unused ARM: ixp4xx: use normal prototype for {read,write}s{b,w,l} ARM: omap1/ams-delta: warn about failed regulator enable ARM: rpc: rename RAM_SIZE macro ARM: w90x900: normalize clk API ARM: ep93xx: normalize clk API ARM: dts: sun8i: a83t: Switch to CCU device tree binding macros arm64: allwinner: sun50i-a64: Correct emac register size ARM: dts: sunxi: h3/h5: Correct emac register size ...