summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2014-12-18KVM: move APIC types to arch/x86/Paolo Bonzini
They are not used anymore by IA64, move them away. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-15Merge tag 'kvm-arm-for-3.19-take2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD Second round of changes for KVM for arm/arm64 for v3.19; fixes reboot problems, clarifies VCPU init, and fixes a regression concerning the VGIC init flow. Conflicts: arch/ia64/kvm/kvm-ia64.c [deleted in HEAD and modified in kvmarm]
2014-12-15arm/arm64: KVM: Require in-kernel vgic for the arch timersChristoffer Dall
It is curently possible to run a VM with architected timers support without creating an in-kernel VGIC, which will result in interrupts from the virtual timer going nowhere. To address this issue, move the architected timers initialization to the time when we run a VCPU for the first time, and then only initialize (and enable) the architected timers if we have a properly created and initialized in-kernel VGIC. When injecting interrupts from the virtual timer to the vgic, the current setup should ensure that this never calls an on-demand init of the VGIC, which is the only call path that could return an error from kvm_vgic_inject_irq(), so capture the return value and raise a warning if there's an error there. We also change the kvm_timer_init() function from returning an int to be a void function, since the function always succeeds. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13arm/arm64: KVM: Add (new) vgic_initialized macroChristoffer Dall
Some code paths will need to check to see if the internal state of the vgic has been initialized (such as when creating new VCPUs), so introduce such a macro that checks the nr_cpus field which is set when the vgic has been initialized. Also set nr_cpus = 0 in kvm_vgic_destroy, because the error path in vgic_init() will call this function, and code should never errornously assume the vgic to be properly initialized after an error. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13arm/arm64: KVM: Rename vgic_initialized to vgic_readyChristoffer Dall
The vgic_initialized() macro currently returns the state of the vgic->ready flag, which indicates if the vgic is ready to be used when running a VM, not specifically if its internal state has been initialized. Rename the macro accordingly in preparation for a more nuanced initialization flow. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13arm/arm64: KVM: vgic: move reset initialization into vgic_init_maps()Peter Maydell
VGIC initialization currently happens in three phases: (1) kvm_vgic_create() (triggered by userspace GIC creation) (2) vgic_init_maps() (triggered by userspace GIC register read/write requests, or from kvm_vgic_init() if not already run) (3) kvm_vgic_init() (triggered by first VM run) We were doing initialization of some state to correspond with the state of a freshly-reset GIC in kvm_vgic_init(); this is too late, since it will overwrite changes made by userspace using the register access APIs before the VM is run. Move this initialization earlier, into the vgic_init_maps() phase. This fixes a bug where QEMU could successfully restore a saved VM state snapshot into a VM that had already been run, but could not restore it "from cold" using the -loadvm command line option (the symptoms being that the restored VM would run but interrupts were ignored). Finally rename vgic_init_maps to vgic_init and renamed kvm_vgic_init to kvm_vgic_map_resources. [ This patch is originally written by Peter Maydell, but I have modified it somewhat heavily, renaming various bits and moving code around. If something is broken, I am to be blamed. - Christoffer ] Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-04kvm: optimize GFN to memslot lookup with large slots amountIgor Mammedov
Current linear search doesn't scale well when large amount of memslots is used and looked up slot is not in the beginning memslots array. Taking in account that memslots don't overlap, it's possible to switch sorting order of memslots array from 'npages' to 'base_gfn' and use binary search for memslot lookup by GFN. As result of switching to binary search lookup times are reduced with large amount of memslots. Following is a table of search_memslot() cycles during WS2008R2 guest boot. boot, boot + ~10 min mostly same of using it, slot lookup randomized lookup max average average cycles cycles cycles 13 slots : 1450 28 30 13 slots : 1400 30 40 binary search 117 slots : 13000 30 460 117 slots : 2000 35 180 binary search Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04kvm: search_memslots: add simple LRU memslot cachingIgor Mammedov
In typical guest boot workload only 2-3 memslots are used extensively, and at that it's mostly the same memslot lookup operation. Adding LRU cache improves average lookup time from 46 to 28 cycles (~40%) for this workload. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-25kvm: add a memslot flag for incoherent memory regionsArd Biesheuvel
Memory regions may be incoherent with the caches, typically when the guest has mapped a host system RAM backed memory region as uncached. Add a flag KVM_MEMSLOT_INCOHERENT so that we can tag these memslots and handle them appropriately when mapping them. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-11-25kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()Ard Biesheuvel
This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. The problem being addressed by the patch above was that some ARM code based the memory mapping attributes of a pfn on the return value of kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should be mapped as device memory. However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, and the existing non-ARM users were already using it in a way which suggests that its name should probably have been 'kvm_is_reserved_pfn' from the beginning, e.g., whether or not to call get_page/put_page on it etc. This means that returning false for the zero page is a mistake and the patch above should be reverted. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-11-24KVM: x86: move device assignment out of kvm_host.hPaolo Bonzini
Create a new header, and hide the device assignment functions there. Move struct kvm_assigned_dev_kernel to assigned-dev.c by modifying arch/x86/kvm/iommu.c to take a PCI device struct. Based on a patch by Radim Krcmar <rkrcmark@redhat.com>. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23kvm: x86: move assigned-dev.c and iommu.c to arch/x86/Radim Krčmář
Now that ia64 is gone, we can hide deprecated device assignment in x86. Notable changes: - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl() The easy parts were removed from generic kvm code, remaining - kvm_iommu_(un)map_pages() would require new code to be moved - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21kvm: remove IA64 ioctlsRadim Krčmář
KVM ia64 is no longer present so new applications shouldn't use them. The main problem is that they most likely didn't work even before, because of a conflict in the #defines: #define KVM_SET_GUEST_DEBUG _IOW(KVMIO, 0x9b, struct kvm_guest_debug) #define KVM_IA64_VCPU_SET_STACK _IOW(KVMIO, 0x9b, void *) The argument to KVM_SET_GUEST_DEBUG is: struct kvm_guest_debug { __u32 control; __u32 pad; struct kvm_guest_debug_arch arch; }; struct kvm_guest_debug_arch { }; meaning that sizeof(struct kvm_guest_debug) == sizeof(void *) == 8 and KVM_SET_GUEST_DEBUG == KVM_IA64_VCPU_SET_STACK. KVM_SET_GUEST_DEBUG is handled in virt/kvm/kvm_main.c before even calling kvm_arch_vcpu_ioctl (which would have handled KVM_IA64_VCPU_SET_STACK), so KVM_IA64_VCPU_SET_STACK would just return -EINVAL. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/Paolo Bonzini
ia64 does not need them anymore. Ack notifiers become x86-specific too. Suggested-by: Gleb Natapov <gleb@kernel.org> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-03kvm: drop unsupported capabilities, fix documentationMichael S. Tsirkin
No kernel ever reported KVM_CAP_DEVICE_MSIX, KVM_CAP_DEVICE_MSI, KVM_CAP_DEVICE_ASSIGNMENT, KVM_CAP_DEVICE_DEASSIGNMENT. This makes the documentation wrong, and no application ever written to use these capabilities has a chance to work correctly. The only way to detect support is to try, and test errno for ENOTTY. That's unfortunate, but we can't fix the past. Document the actual semantics, and drop the definitions from the exported header to make it easier for application developers to note and fix the bug. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-02Merge tag 'for-linus-20141102' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull MTD fixes from Brian Norris: "Three main MTD fixes for 3.18: - A regression from 3.16 which was noticed in 3.17. With the restructuring of the m25p80.c driver and the SPI NOR library framework, we omitted proper listing of the SPI device IDs. This means m25p80.c wouldn't auto-load (modprobe) properly when built as a module. For now, we duplicate the device IDs into both modules. - The OMAP / ELM modules were depending on an implicit link ordering. Use deferred probing so that the new link order (in 3.18-rc) can still allow for successful probing. - Fix suspend/resume support for LH28F640BF NOR flash" * tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd: mtd: cfi_cmdset_0001.c: fix resume for LH28F640BF chips mtd: omap: fix mtd devices not showing up mtd: m25p80,spi-nor: Fix module aliases for m25p80 mtd: spi-nor: make spi_nor_scan() take a chip type name, not spi_device_id mtd: m25p80: get rid of spi_get_device_id
2014-11-02Merge tag 'scsi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of six patches consisting of: - two MAINTAINER updates - two scsi-mq fixs for the old parallel interface (not every request is tagged and we need to set the right flags to populate the SPI tag message) - a fix for a memory leak in scatterlist traversal caused by a preallocation update in 3.17 - an ipv6 fix for cxgbi" [ The scatterlist fix also came in separately through the block layer tree ] * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: MAINTAINERS: ufs - remove self MAINTAINERS: change hpsa and cciss maintainer libcxgbi : support ipv6 address host_param scsi: set REQ_QUEUE for the blk-mq case Revert "block: all blk-mq requests are tagged" lib/scatterlist: fix memory leak with scsi-mq
2014-11-02Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Nothing too astounding or major: radeon, i915, vmwgfx, armada and exynos. Biggest ones: - vmwgfx has one big locking regression fix - i915 has come displayport fixes - radeon has some stability and a memory alloc failure - armada and exynos have some vblank fixes" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (24 commits) drm/exynos: correct connector->dpms field before resuming drm/exynos: enable vblank after DPMS on drm/exynos: init kms poll at the end of initialization drm/exynos: propagate plane initialization errors drm/exynos: vidi: fix build warning drm/exynos: remove explicit encoder/connector de-initialization drm/exynos: init vblank with real number of crtcs drm/vmwgfx: Filter out modes those cannot be supported by the current VRAM size. drm/vmwgfx: Fix hash key computation drm/vmwgfx: fix lock breakage drm/i915/dp: only use training pattern 3 on platforms that support it drm/radeon: remove some buggy dead code drm/i915: Ignore VBT backlight check on Macbook 2, 1 drm/radeon: remove invalid pci id drm/radeon: dpm fixes for asrock systems radeon: clean up coding style differences in radeon_get_bios() drm/radeon: Use drm_malloc_ab instead of kmalloc_array drm/radeon/dpm: disable ulv support on SI drm/i915: Fix GMBUSFREQ on vlv/chv drm/i915: Ignore long hpds on eDP ports ...
2014-11-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "A bunch of assorted fixes, most of them followups to overlayfs merge" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ovl: initialize ->is_cursor Return short read or 0 at end of a raw device, not EIO isofs: don't bother with ->d_op for normal case isofs_cmp(): we'll never see a dentry for . or .. overlayfs: fix lockdep misannotation ovl: fix check for cursor overlayfs: barriers for opening upper-layer directory rcu: Provide counterpart to rcu_dereference() for non-RCU situations staging: android: logger: Fix log corruption regression
2014-10-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "A bit has accumulated, but it's been a week or so since my last batch of post-merge-window fixes, so... 1) Missing module license in netfilter reject module, from Pablo. Lots of people ran into this. 2) Off by one in mac80211 baserate calculation, from Karl Beldan. 3) Fix incorrect return value from ax88179_178a driver's set_mac_addr op, which broke use of it with bonding. From Ian Morgan. 4) Checking of skb_gso_segment()'s return value was not all encompassing, it can return an SKB pointer, a pointer error, or NULL. Fix from Florian Westphal. This is crummy, and longer term will be fixed to just return error pointers or a real SKB. 6) Encapsulation offloads not being handled by skb_gso_transport_seglen(). From Florian Westphal. 7) Fix deadlock in TIPC stack, from Ying Xue. 8) Fix performance regression from using rhashtable for netlink sockets. The problem was the synchronize_net() invoked for every socket destroy. From Thomas Graf. 9) Fix bug in eBPF verifier, and remove the strong dependency of BPF on NET. From Alexei Starovoitov. 10) In qdisc_create(), use the correct interface to allocate ->cpu_bstats, otherwise the u64_stats_sync member isn't initialized properly. From Sabrina Dubroca. 11) Off by one in ip_set_nfnl_get_byindex(), from Dan Carpenter. 12) nf_tables_newchain() was erroneously expecting error pointers from netdev_alloc_pcpu_stats(). It only returna a valid pointer or NULL. From Sabrina Dubroca. 13) Fix use-after-free in _decode_session6(), from Li RongQing. 14) When we set the TX flow hash on a socket, we mistakenly do so before we've nailed down the final source port. Move the setting deeper to fix this. From Sathya Perla. 15) NAPI budget accounting in amd-xgbe driver was counting descriptors instead of full packets, fix from Thomas Lendacky. 16) Fix total_data_buflen calculation in hyperv driver, from Haiyang Zhang. 17) Fix bcma driver build with OF_ADDRESS disabled, from Hauke Mehrtens. 18) Fix mis-use of per-cpu memory in TCP md5 code. The problem is that something that ends up being vmalloc memory can't be passed to the crypto hash routines via scatter-gather lists. From Eric Dumazet. 19) Fix regression in promiscuous mode enabling in cdc-ether, from Olivier Blin. 20) Bucket eviction and frag entry killing can race with eachother, causing an unlink of the object from the wrong list. Fix from Nikolay Aleksandrov. 21) Missing initialization of spinlock in cxgb4 driver, from Anish Bhatt. 22) Do not cache ipv4 routing failures, otherwise if the sysctl for forwarding is subsequently enabled this won't be seen. From Nicolas Cavallari" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (131 commits) drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch mode drivers: net: cpsw: Fix broken loop condition in switch mode net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0 stmmac: pci: set default of the filter bins net: smc91x: Fix gpios for device tree based booting mpls: Allow mpls_gso to be built as module mpls: Fix mpls_gso handler. r8152: stop submitting intr for -EPROTO netfilter: nft_reject_bridge: restrict reject to prerouting and input netfilter: nft_reject_bridge: don't use IP stack to reject traffic netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing drivers/net: macvtap and tun depend on INET drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets drivers/net: Disable UFO through virtio net: skb_fclone_busy() needs to detect orphaned skb gre: Use inner mac length when computing tunnel length mlx4: Avoid leaking steering rules on flow creation error flow net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN ...
2014-10-31Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Various scheduler fixes all over the place: three SCHED_DL fixes, three sched/numa fixes, two generic race fixes and a comment fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/dl: Fix preemption checks sched: Update comments for CLONE_NEWNS sched: stop the unbound recursion in preempt_schedule_context() sched/fair: Fix division by zero sysctl_numa_balancing_scan_size sched/fair: Care divide error in update_task_scan_period() sched/numa: Fix unsafe get_task_struct() in task_numa_assign() sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer() sched/deadline: Don't replenish from a !SCHED_DEADLINE entity sched: Fix race between task_group and sched_task_group
2014-10-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus on the kernel side: - a revert for a newly introduced PMU driver which isn't complete yet and where we ran out of time with fixes (to be tried again in v3.19) - this makes up for a large chunk of the diffstat. - compilation warning fixes - a printk message fix - event_idx usage fixes/cleanups" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Trivial typo fix for --demangle perf tools: Fix report -F dso_from for data without branch info perf tools: Fix report -F dso_to for data without branch info perf tools: Fix report -F symbol_from for data without branch info perf tools: Fix report -F symbol_to for data without branch info perf tools: Fix report -F mispredict for data without branch info perf tools: Fix report -F in_tx for data without branch info perf tools: Fix report -F abort for data without branch info perf tools: Make CPUINFO_PROC an array to support different kernel versions perf callchain: Use global caching provided by libunwind perf/x86/intel: Revert incomplete and undocumented Broadwell client support perf/x86: Fix compile warnings for intel_uncore perf: Fix typos in sample code in the perf_event.h header perf: Fix and clean up initialization of pmu::event_idx perf: Fix bogus kernel printk perf diff: Add missing hists__init() call at tool start
2014-10-31Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Ingo Molnar: "The tree contains two RCU fixes and a compiler quirk comment fix" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Make rcu_barrier() understand about missing rcuo kthreads compiler/gcc4+: Remove inaccurate comment about 'asm goto' miscompiles rcu: More on deadlock between CPU hotplug and expedited grace periods
2014-10-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== netfilter/ipvs fixes for net The following patchset contains fixes for netfilter/ipvs. This round of fixes is larger than usual at this stage, specifically because of the nf_tables bridge reject fixes that I would like to see in 3.18. The patches are: 1) Fix a null-pointer dereference that may occur when logging errors. This problem was introduced by 4a4739d56b0 ("ipvs: Pull out crosses_local_route_boundary logic") in v3.17-rc5. 2) Update hook mask in nft_reject_bridge so we can also filter out packets from there. This fixes 36d2af5 ("netfilter: nf_tables: allow to filter from prerouting and postrouting"), which needs this chunk to work. 3) Two patches to refactor common code to forge the IPv4 and IPv6 reject packets from the bridge. These are required by the nf_tables reject bridge fix. 4) Fix nft_reject_bridge by avoiding the use of the IP stack to reject packets from the bridge. The idea is to forge the reject packets and inject them to the original port via br_deliver() which is now exported for that purpose. 5) Restrict nft_reject_bridge to bridge prerouting and input hooks. the original skbuff may cloned after prerouting when the bridge stack needs to flood it to several bridge ports, it is too late to reject the traffic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-31netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functionsPablo Neira Ayuso
That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip6_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_ip6hdr_put(): to build the IPv6 header. * nf_reject_ip6_tcphdr_put(): to build the TCP header. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-31netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functionsPablo Neira Ayuso
That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_iphdr_put(): to build the IPv4 header. * nf_reject_ip_tcphdr_put(): to build the TCP header. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-31Return short read or 0 at end of a raw device, not EIODavid Jeffery
Author: David Jeffery <djeffery@redhat.com> Changes to the basic direct I/O code have broken the raw driver when reading to the end of a raw device. Instead of returning a short read for a read that extends partially beyond the device's end or 0 when at the end of the device, these reads now return EIO. The raw driver needs the same end of device handling as was added for normal block devices. Using blkdev_read_iter, which has the needed size checks, prevents the EIO conditions at the end of the device. Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-30drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packetsBen Hutchings
UFO is now disabled on all drivers that work with virtio net headers, but userland may try to send UFO/IPv6 packets anyway. Instead of sending with ID=0, we should select identifiers on their behalf (as we used to). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 916e4cf46d02 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30net: skb_fclone_busy() needs to detect orphaned skbEric Dumazet
Some drivers are unable to perform TX completions in a bound time. They instead call skb_orphan() Problem is skb_fclone_busy() has to detect this case, otherwise we block TCP retransmits and can freeze unlucky tcp sessions on mostly idle hosts. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 1f3279ae0c13 ("tcp: avoid retransmits of TCP packets hanging in host queues") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30Merge branch 'urgent-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent Pull two RCU fixes from Paul E. McKenney: " - Complete the work of commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and expedited grace periods), which was intended to allow synchronize_sched_expedited() to be safely used when holding locks acquired by CPU-hotplug notifiers. This commit makes the put_online_cpus() avoid the deadlock instead of just handling the get_online_cpus(). - Complete the work of commit 35ce7f29a44a (rcu: Create rcuo kthreads only for onlined CPUs), which was intended to allow RCU to avoid allocating unneeded kthreads on systems where the firmware says that there are more CPUs than are really present. This commit makes rcu_barrier() aware of the mismatch, so that it doesn't hang waiting for non-existent CPUs. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-29Merge branch 'akpm' (incoming from Andrew Morton)Linus Torvalds
Merge misc fixes from Andrew Morton: "21 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits) mm/balloon_compaction: fix deflation when compaction is disabled sh: fix sh770x SCIF memory regions zram: avoid NULL pointer access in concurrent situation mm/slab_common: don't check for duplicate cache names ocfs2: fix d_splice_alias() return code checking mm: rmap: split out page_remove_file_rmap() mm: memcontrol: fix missed end-writeback page accounting mm: page-writeback: inline account_page_dirtied() into single caller lib/bitmap.c: fix undefined shift in __bitmap_shift_{left|right}() drivers/rtc/rtc-bq32k.c: fix register value memory-hotplug: clear pgdat which is allocated by bootmem in try_offline_node() drivers/rtc/rtc-s3c.c: fix initialization failure without rtc source clock kernel/kmod: fix use-after-free of the sub_info structure drivers/rtc/rtc-pm8xxx.c: rework to support pm8941 rtc mm, thp: fix collapsing of hugepages on madvise drivers: of: add return value to of_reserved_mem_device_init() mm: free compound page with correct order gcov: add ARM64 to GCOV_PROFILE_ALL fsnotify: next_i is freed during fsnotify_unmount_inodes. mm/compaction.c: avoid premature range skip in isolate_migratepages_range ...
2014-10-29mm: memcontrol: fix missed end-writeback page accountingJohannes Weiner
Commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page migration to uncharge the old page right away. The page is locked, unmapped, truncated, and off the LRU, but it could race with writeback ending, which then doesn't unaccount the page properly: test_clear_page_writeback() migration wait_on_page_writeback() TestClearPageWriteback() mem_cgroup_migrate() clear PCG_USED mem_cgroup_update_page_stat() if (PageCgroupUsed(pc)) decrease memcg pages under writeback release pc->mem_cgroup->move_lock The per-page statistics interface is heavily optimized to avoid a function call and a lookup_page_cgroup() in the file unmap fast path, which means it doesn't verify whether a page is still charged before clearing PageWriteback() and it has to do it in the stat update later. Rework it so that it looks up the page's memcg once at the beginning of the transaction and then uses it throughout. The charge will be verified before clearing PageWriteback() and migration can't uncharge the page as long as that is still set. The RCU lock will protect the memcg past uncharge. As far as losing the optimization goes, the following test results are from a microbenchmark that maps, faults, and unmaps a 4GB sparse file three times in a nested fashion, so that there are two negative passes that don't account but still go through the new transaction overhead. There is no actual difference: old: 33.195102545 seconds time elapsed ( +- 0.01% ) new: 33.199231369 seconds time elapsed ( +- 0.03% ) The time spent in page_remove_rmap()'s callees still adds up to the same, but the time spent in the function itself seems reduced: # Children Self Command Shared Object Symbol old: 0.12% 0.11% filemapstress [kernel.kallsyms] [k] page_remove_rmap new: 0.12% 0.08% filemapstress [kernel.kallsyms] [k] page_remove_rmap Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: <stable@vger.kernel.org> [3.17.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-29mm: page-writeback: inline account_page_dirtied() into single callerJohannes Weiner
A follow-up patch would have changed the call signature. To save the trouble, just fold it instead. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: <stable@vger.kernel.org> [3.17.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-29mm, thp: fix collapsing of hugepages on madviseDavid Rientjes
If an anonymous mapping is not allowed to fault thp memory and then madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never collapse this memory into thp memory. This occurs because the madvise(2) handler for thp, hugepage_madvise(), clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags until the final action of madvise_behavior(). This causes the khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when the vma had previously had VM_NOHUGEPAGE set. Fix this by passing the correct vma flags to the khugepaged mm slot handler. There's no chance khugepaged can run on this vma until after madvise_behavior() returns since we hold mm->mmap_sem. It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags in hugepage_advise(), but I didn't want to introduce special case behavior into madvise_behavior(). I think it's best to just let it always set vma->vm_flags itself. Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Suleiman Souhlal <suleiman@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-29drivers: of: add return value to of_reserved_mem_device_init()Marek Szyprowski
Driver calling of_reserved_mem_device_init() might be interested if the initialization has been successful or not, so add support for returning error code. This fixes a build warining caused by commit 7bfa5ab6fa1b ("drivers: dma-coherent: add initialization from device tree"), which has been merged without this change and without fixing function return value. Fixes: 7bfa5ab6fa1b1 ("drivers: dma-coherent: add initialization from device tree") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Josh Cartwright <joshc@codeaurora.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-29Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer fixes from Jens Axboe: "A small collection of fixes for the current kernel. This contains: - Two error handling fixes from Jan Kara. One for null_blk on failure to add a device, and the other for the block/scsi_ioctl SCSI_IOCTL_SEND_COMMAND fixing up the error jump point. - A commit added in the merge window for the bio integrity bits unfortunately disabled merging for all requests if CONFIG_BLK_DEV_INTEGRITY wasn't set. Reverse the logic, so that integrity checking wont disallow merges when not enabled. - A fix from Ming Lei for merging and generating too many segments. This caused a BUG in virtio_blk. - Two error handling printk() fixups from Robert Elliott, improving the information given when we rate limit. - Error handling fixup on elevator_init() failure from Sudip Mukherjee. - A fix from Tony Battersby, fixing up a memory leak in the scatterlist handling with scsi-mq" * 'for-linus' of git://git.kernel.dk/linux-block: block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not defined lib/scatterlist: fix memory leak with scsi-mq block: fix wrong error return in elevator_init() scsi: Fix error handling in SCSI_IOCTL_SEND_COMMAND null_blk: Cleanup error recovery in null_add_dev() blk-merge: recaculate segment if it isn't less than max segments fs: clarify rate limit suppressed buffer I/O errors fs: merge I/O error prints into one line
2014-10-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - workarounds for a couple of misbehaving Elan Touchscreens, by Adel Gadllah - fix for TransducerSerialNumber field implementation, by Jason Gerecke - a couple of new HID usages (added by HUT), by Olivier Gay * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: input: Fix TransducerSerialNumber implementation HID: add keyboard input assist hid usages HID: usbhid: enable always-poll quirk for Elan Touchscreen 016f HID: usbhid: enable always-poll quirk for Elan Touchscreen 009b
2014-10-28block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not definedMartin K. Petersen
Commit 4eaf99beadce switched to returning bool and as a result reversed the logic of the integrity merge checks. However, the empty stubs used when the block integrity code is compiled out were still returning 0. Make these stubs return "true". Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Michael L. Semon <mlsemon35@gmail.com> Tested-by: Michael L. Semon <mlsemon35@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-28overlayfs: fix lockdep misannotationMiklos Szeredi
In an overlay directory that shadows an empty lower directory, say /mnt/a/empty102, do: touch /mnt/a/empty102/x unlink /mnt/a/empty102/x rmdir /mnt/a/empty102 It's actually harmless, but needs another level of nesting between I_MUTEX_CHILD and I_MUTEX_NORMAL. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-28rcu: Provide counterpart to rcu_dereference() for non-RCU situationsPaul E. McKenney
Although rcu_dereference() and friends can be used in situations where object lifetimes are being managed by something other than RCU, the resulting sparse and lockdep-RCU noise can be annoying. This commit therefore supplies a lockless_dereference(), which provides the protection for dereferences without the RCU-related debugging noise. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-28usbnet: add a callback for set_rx_modeOlivier Blin
To delegate promiscuous mode and multicast filtering to the subdriver. Signed-off-by: Olivier Blin <olivier.blin@softathome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28skbuff.h: fix kernel-doc warning for headers_endRandy Dunlap
Fix kernel-doc warning in <linux/skbuff.h> by making both headers_start and headers_end private fields. Warning(..//include/linux/skbuff.h:654): No description found for parameter 'headers_end[0]' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28rcu: Make rcu_barrier() understand about missing rcuo kthreadsPaul E. McKenney
Commit 35ce7f29a44a (rcu: Create rcuo kthreads only for onlined CPUs) avoids creating rcuo kthreads for CPUs that never come online. This fixes a bug in many instances of firmware: Instead of lying about their age, these systems instead lie about the number of CPUs that they have. Before commit 35ce7f29a44a, this could result in huge numbers of useless rcuo kthreads being created. It appears that experience indicates that I should have told the people suffering from this problem to fix their broken firmware, but I instead produced what turned out to be a partial fix. The missing piece supplied by this commit makes sure that rcu_barrier() knows not to post callbacks for no-CBs CPUs that have not yet come online, because otherwise rcu_barrier() will hang on systems having firmware that lies about the number of CPUs. It is tempting to simply have rcu_barrier() refuse to post a callback on any no-CBs CPU that does not have an rcuo kthread. This unfortunately does not work because rcu_barrier() is required to wait for all pending callbacks. It is therefore required to wait even for those callbacks that cannot possibly be invoked. Even if doing so hangs the system. Given that posting a callback to a no-CBs CPU that does not yet have an rcuo kthread can hang rcu_barrier(), It is tempting to report an error in this case. Unfortunately, this will result in false positives at boot time, when it is perfectly legal to post callbacks to the boot CPU before the scheduler has started, in other words, before it is legal to invoke rcu_barrier(). So this commit instead has rcu_barrier() avoid posting callbacks to CPUs having neither rcuo kthread nor pending callbacks, and has it complain bitterly if it finds CPUs having no rcuo kthread but some pending callbacks. And when rcu_barrier() does find CPUs having no rcuo kthread but pending callbacks, as noted earlier, it has no choice but to hang indefinitely. Reported-by: Yanko Kaneti <yaneti@declera.com> Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com> Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Eric B Munson <emunson@akamai.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Eric B Munson <emunson@akamai.com> Tested-by: Jay Vosburgh <jay.vosburgh@canonical.com> Tested-by: Yanko Kaneti <yaneti@declera.com> Tested-by: Kevin Fenzi <kevin@scrye.com> Tested-by: Meelis Roos <mroos@linux.ee>
2014-10-28drm/radeon: remove invalid pci idAlex Deucher
0x4c6e is a secondary device id so should not be used by the driver. Noticed-by: Mark Kettenis <mark.kettenis@xs4all.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2014-10-28compiler/gcc4+: Remove inaccurate comment about 'asm goto' miscompilesSteven Noonan
The bug referenced by the comment in this commit was not completely fixed in GCC 4.8.2, as I mentioned in a thread back in February: https://lkml.org/lkml/2014/2/12/797 The conclusion at that time was to make the quirk unconditional until the bug could be found and fixed in GCC. Unfortunately, when I submitted the patch (commit a9f18034) I left a comment in that claimed the bug was fixed in GCC 4.8.2+. This comment is inaccurate, and should be removed. Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1414274982-14040-1-git-send-email-steven@uplinklabs.net Cc: Ingo Molnar <mingo@kernel.org>
2014-10-28perf: Fix typos in sample code in the perf_event.h headerAndy Lutomirski
struct perf_event_mmap_page has members called "index" and "cap_user_rdpmc". Spell them correctly in the examples. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/320ba26391a8123cc16e5f02d24d34bd404332fd.1412313343.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28sched: Update comments for CLONE_NEWNSChen Hanxiao
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-api@vger.kernel.org Link: http://lkml.kernel.org/r/1412674147-8941-1-git-send-email-chenhanxiao@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28scsi: set REQ_QUEUE for the blk-mq caseChristoph Hellwig
To generate the right SPI tag messages we need to properly set QUEUE_FLAG_QUEUED in the request_queue and mirror it to the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Cc: stable@vger.kernel.org
2014-10-28Revert "block: all blk-mq requests are tagged"Christoph Hellwig
This reverts commit fb3ccb5da71273e7f0d50b50bc879e50cedd60e7. SCSI-2/SPI actually needs the tagged/untagged flag in the request to work properly. Revert this patch and add a follow on to set it in the right place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Cc: stable@vger.kernel.org
2014-10-27Merge tag 'media/v3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A series of driver fixes: - a few compilation fixes with randconfigs - one potential compilation breakage on userspace due to the usage of a gcc extension - several warnings fixed - some other random driver fixes" * tag 'media/v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (22 commits) [media] s5p-jpeg: Avoid -Wuninitialized warning in s5p_jpeg_parse_hdr [media] s5p-fimc: Only build suspend/resume for PM [media] s5p-jpeg: Only build suspend/resume for PM [media] Remove references to non-existent PLAT_S5P symbol [media] videobuf-dma-contig: set vm_pgoff to be zero to pass the sanity check in vm_iomap_memory() [media] tw68: remove bogus I2C_ALGOBIT dependency [media] usbvision-video: two use after frees [media] tw68: remove deprecated IRQF_DISABLED [media] xc5000: use after free in release() [media] em28xx-input: NULL dereference on error [media] wl128x: fix fmdbg compiler warning Revert "[media] v4l2-dv-timings: fix a sparse warning" [media] hackrf: harmless off by one in debug code [media] cx23885: initialize config structs for T9580 [media] v4l: uvcvideo: Fix buffer completion size check [media] vivid: fix buffer overrun [media] saa7146: Create a device name before it's used [media] em28xx: fix uninitialized variable warning [media] vivid: fix Kconfig FB dependency [media] anysee: make sure loading modules is const ...