summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-02-20x86/xen: Distribute switch variables for initializationKees Cook
Variables declared in a switch statement before any case statements cannot be automatically initialized with compiler instrumentation (as they are not part of any execution flow). With GCC's proposed automatic stack variable initialization feature, this triggers a warning (and they don't get initialized). Clang's automatic stack variable initialization (via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also doesn't initialize such variables[1]. Note that these warnings (or silent skipping) happen before the dead-store elimination optimization phase, so even when the automatic initializations are later elided in favor of direct initializations, the warnings remain. To avoid these problems, move such variables into the "case" where they're used or lift them up into the main function body. arch/x86/xen/enlighten_pv.c: In function ‘xen_write_msr_safe’: arch/x86/xen/enlighten_pv.c:904:12: warning: statement will never be executed [-Wswitch-unreachable] 904 | unsigned which; | ^~~~~ [1] https://bugs.llvm.org/show_bug.cgi?id=44916 Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200220062318.69299-1-keescook@chromium.org Reviewed-by: Juergen Gross <jgross@suse.com> [boris: made @which an 'unsigned int'] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2020-02-20mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()Catalin Marinas
Currently the arm64 kernel ignores the top address byte passed to brk(), mmap() and mremap(). When the user is not aware of the 56-bit address limit or relies on the kernel to return an error, untagging such pointers has the potential to create address aliases in user-space. Passing a tagged address to munmap(), madvise() is permitted since the tagged pointer is expected to be inside an existing mapping. The current behaviour breaks the existing glibc malloc() implementation which relies on brk() with an address beyond 56-bit to be rejected by the kernel. Remove untagging in the above functions by partially reverting commit ce18d171cb73 ("mm: untag user pointers in mmap/munmap/mremap/brk"). In addition, update the arm64 tagged-address-abi.rst document accordingly. Link: https://bugzilla.redhat.com/1797052 Fixes: ce18d171cb73 ("mm: untag user pointers in mmap/munmap/mremap/brk") Cc: <stable@vger.kernel.org> # 5.4.x- Cc: Florian Weimer <fweimer@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reported-by: Victor Stinner <vstinner@redhat.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-02-20docs: arm64: fix trivial spelling enought to enough in memory.rstScott Branden
Fix trivial spelling error enought to enough in memory.rst. Cc: trivial@kernel.org Signed-off-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-02-19ext4: add cond_resched() to __ext4_find_entry()Shijie Luo
We tested a soft lockup problem in linux 4.19 which could also be found in linux 5.x. When dir inode takes up a large number of blocks, and if the directory is growing when we are searching, it's possible the restart branch could be called many times, and the do while loop could hold cpu a long time. Here is the call trace in linux 4.19. [ 473.756186] Call trace: [ 473.756196] dump_backtrace+0x0/0x198 [ 473.756199] show_stack+0x24/0x30 [ 473.756205] dump_stack+0xa4/0xcc [ 473.756210] watchdog_timer_fn+0x300/0x3e8 [ 473.756215] __hrtimer_run_queues+0x114/0x358 [ 473.756217] hrtimer_interrupt+0x104/0x2d8 [ 473.756222] arch_timer_handler_virt+0x38/0x58 [ 473.756226] handle_percpu_devid_irq+0x90/0x248 [ 473.756231] generic_handle_irq+0x34/0x50 [ 473.756234] __handle_domain_irq+0x68/0xc0 [ 473.756236] gic_handle_irq+0x6c/0x150 [ 473.756238] el1_irq+0xb8/0x140 [ 473.756286] ext4_es_lookup_extent+0xdc/0x258 [ext4] [ 473.756310] ext4_map_blocks+0x64/0x5c0 [ext4] [ 473.756333] ext4_getblk+0x6c/0x1d0 [ext4] [ 473.756356] ext4_bread_batch+0x7c/0x1f8 [ext4] [ 473.756379] ext4_find_entry+0x124/0x3f8 [ext4] [ 473.756402] ext4_lookup+0x8c/0x258 [ext4] [ 473.756407] __lookup_hash+0x8c/0xe8 [ 473.756411] filename_create+0xa0/0x170 [ 473.756413] do_mkdirat+0x6c/0x140 [ 473.756415] __arm64_sys_mkdirat+0x28/0x38 [ 473.756419] el0_svc_common+0x78/0x130 [ 473.756421] el0_svc_handler+0x38/0x78 [ 473.756423] el0_svc+0x8/0xc [ 485.755156] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [tmp:5149] Add cond_resched() to avoid soft lockup and to provide a better system responding. Link: https://lore.kernel.org/r/20200215080206.13293-1-luoshijie1@huawei.com Signed-off-by: Shijie Luo <luoshijie1@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org
2020-02-19ext4: fix a data race in EXT4_I(inode)->i_disksizeQian Cai
EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by KCSAN, BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4] write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127: ext4_write_end+0x4e3/0x750 [ext4] ext4_update_i_disksize at fs/ext4/ext4.h:3032 (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046 (inlined by) ext4_write_end at fs/ext4/inode.c:1287 generic_perform_write+0x208/0x2a0 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb47 entry_SYSCALL_64_after_hwframe+0x49/0xbe read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37: ext4_writepages+0x10ac/0x1d00 [ext4] mpage_map_and_submit_extent at fs/ext4/inode.c:2468 (inlined by) ext4_writepages at fs/ext4/inode.c:2772 do_writepages+0x5e/0x130 __writeback_single_inode+0xeb/0xb20 writeback_sb_inodes+0x429/0x900 __writeback_inodes_wb+0xc4/0x150 wb_writeback+0x4bd/0x870 wb_workfn+0x6b4/0x960 process_one_work+0x54c/0xbe0 worker_thread+0x80/0x650 kthread+0x1e0/0x200 ret_from_fork+0x27/0x50 Reported by Kernel Concurrency Sanitizer on: CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 Workqueue: writeback wb_workfn (flush-7:0) Since only the read is operating as lockless (outside of the "i_data_sem"), load tearing could introduce a logic bug. Fix it by adding READ_ONCE() for the read and WRITE_ONCE() for the write. Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/1581085751-31793-1-git-send-email-cai@lca.pw Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-02-20Merge branch 'linux-5.6' of git://github.com/skeggsb/linux into drm-fixesDave Airlie
Nothing major here, another TU1xx modesetting fix, and hooking up ACR/GR support on TU11x now that NVIDIA have made the firmware available. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Ben Skeggs <skeggsb@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv64yBq4KHJ8D-5HQ5eeotApJSMiD+V2ut4f3BonUggf0Q@mail.gmail.com
2020-02-19hwmon: (acpi_power_meter) Fix lockdep splatGuenter Roeck
Damien Le Moal reports a lockdep splat with the acpi_power_meter, observed with Linux v5.5 and later. ====================================================== WARNING: possible circular locking dependency detected 5.6.0-rc2+ #629 Not tainted ------------------------------------------------------ python/1397 is trying to acquire lock: ffff888619080070 (&resource->lock){+.+.}, at: show_power+0x3c/0xa0 [acpi_power_meter] but task is already holding lock: ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (kn->count#119){++++}: __kernfs_remove+0x626/0x7e0 kernfs_remove_by_name_ns+0x41/0x80 remove_attrs+0xcb/0x3c0 [acpi_power_meter] acpi_power_meter_notify+0x1f7/0x310 [acpi_power_meter] acpi_ev_notify_dispatch+0x198/0x1f3 acpi_os_execute_deferred+0x4d/0x70 process_one_work+0x7c8/0x1340 worker_thread+0x94/0xc70 kthread+0x2ed/0x3f0 ret_from_fork+0x24/0x30 -> #0 (&resource->lock){+.+.}: __lock_acquire+0x20be/0x49b0 lock_acquire+0x127/0x340 __mutex_lock+0x15b/0x1350 show_power+0x3c/0xa0 [acpi_power_meter] dev_attr_show+0x3f/0x80 sysfs_kf_seq_show+0x216/0x410 seq_read+0x407/0xf90 vfs_read+0x152/0x2c0 ksys_read+0xf3/0x1d0 do_syscall_64+0x95/0x1010 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#119); lock(&resource->lock); lock(kn->count#119); lock(&resource->lock); *** DEADLOCK *** 4 locks held by python/1397: #0: ffff8890242d64e0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x9b/0xb0 #1: ffff889040be74e0 (&p->lock){+.+.}, at: seq_read+0x6b/0xf90 #2: ffff8890448eb880 (&of->mutex){+.+.}, at: kernfs_seq_start+0x47/0x160 #3: ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160 stack backtrace: CPU: 10 PID: 1397 Comm: python Not tainted 5.6.0-rc2+ #629 Hardware name: Supermicro Super Server/X11DPL-i, BIOS 3.1 05/21/2019 Call Trace: dump_stack+0x97/0xe0 check_noncircular+0x32e/0x3e0 ? print_circular_bug.isra.0+0x1e0/0x1e0 ? unwind_next_frame+0xb9a/0x1890 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe ? graph_lock+0x79/0x170 ? __lockdep_reset_lock+0x3c0/0x3c0 ? mark_lock+0xbc/0x1150 __lock_acquire+0x20be/0x49b0 ? mark_held_locks+0xe0/0xe0 ? stack_trace_save+0x91/0xc0 lock_acquire+0x127/0x340 ? show_power+0x3c/0xa0 [acpi_power_meter] ? device_remove_bin_file+0x10/0x10 ? device_remove_bin_file+0x10/0x10 __mutex_lock+0x15b/0x1350 ? show_power+0x3c/0xa0 [acpi_power_meter] ? show_power+0x3c/0xa0 [acpi_power_meter] ? mutex_lock_io_nested+0x11f0/0x11f0 ? lock_downgrade+0x6a0/0x6a0 ? kernfs_seq_start+0x47/0x160 ? lock_acquire+0x127/0x340 ? kernfs_seq_start+0x6a/0x160 ? device_remove_bin_file+0x10/0x10 ? show_power+0x3c/0xa0 [acpi_power_meter] show_power+0x3c/0xa0 [acpi_power_meter] dev_attr_show+0x3f/0x80 ? memset+0x20/0x40 sysfs_kf_seq_show+0x216/0x410 seq_read+0x407/0xf90 ? security_file_permission+0x16f/0x2c0 vfs_read+0x152/0x2c0 Problem is that reading an attribute takes the kernfs lock in the kernfs code, then resource->lock in the driver. During an ACPI notification, the opposite happens: The resource lock is taken first, followed by the kernfs lock when sysfs attributes are removed and re-created. Presumably this is now seen due to some locking related changes in kernfs after v5.4, but it was likely always a problem. Fix the problem by not blindly acquiring the lock in the notification function. It is only needed to protect the various update functions. However, those update functions are called anyway when sysfs attributes are read. This means that we can just stop calling those functions from the notifier, and the resource lock in the notifier function is no longer needed. That leaves two situations: First, METER_NOTIFY_CONFIG removes and re-allocates capability strings. While it did so under the resource lock, _displaying_ those strings was not protected, creating a race condition. To solve this problem, selectively protect both removal/creation and reporting of capability attributes with the resource lock. Second, removing and re-creating the attribute files is no longer protected by the resource lock. That doesn't matter since access to each individual attribute is protected by the kernfs lock. Userspace may get messed up if attributes disappear and reappear under its nose, but that is not different than today, and there is nothing we can do about it without major driver restructuring. Last but not least, when removing the driver, remove attribute functions first, then release capability strings. This avoids yet another race condition. Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com> Cc: Damien Le Moal <Damien.LeMoal@wdc.com> Cc: stable@vger.kernel.org # v5.5+ Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-02-19Merge tag 'linux-kselftest-5.6-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "Fixes to build failures and other test bugs" * tag 'linux-kselftest-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: openat2: fix build error on newer glibc selftests: use LDLIBS for libraries instead of LDFLAGS selftests: fix too long argument selftests: allow detection of build failures Kernel selftests: tpm2: check for tpm support selftests/ftrace: Have pid filter test use instance flag selftests: fix spelling mistaked "chaigned" -> "chained"
2020-02-19dt-bindings: media: csi: Fix clocks descriptionMaxime Ripard
Commit 1de243b07666 ("media: dt-bindings: media: sun4i-csi: Add compatible for CSI1 on A10/A20") introduced support for the CSI1 controller on A10 and A20 that unlike CSI0 doesn't have an ISP and therefore only have two clocks, the bus and module clocks. The clocks and clock-names properties have thus been modified to allow either two or tree clocks. However, the current list has the ISP clock at the second position, which means the bindings expects a list of either bus and isp, or bus, isp and mod. The initial intent of the patch was obviously to have bus and mod in the former case. Let's fix the binding so that it validates properly. Fixes: 1de243b07666 ("media: dt-bindings: media: sun4i-csi: Add compatible for CSI1 on A10/A20") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Rob Herring <robh@kernel.org>
2020-02-19dt-bindings: media: csi: Add interconnects propertiesMaxime Ripard
The Allwinner CSI controller is sitting beside the MBUS that is represented as an interconnect. Make sure that the interconnect properties are valid in the binding. Fixes: 7866d6903ce8 ("media: dt-bindings: media: sun4i-csi: Add compatible for CSI0 on R40") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Rob Herring <robh@kernel.org>
2020-02-19dt-bindings: net: mdio: remove compatible string from exampleGrygorii Strashko
Remove vendor specific compatible string from example, otherwise DT YAML schemas validation may trigger warnings specific to TI ti,davinci_mdio and not to the generic MDIO example. For example, the "bus_freq" is required for davinci_mdio, but not required for generic mdio example. As result following warning will be produced: mdio.example.dt.yaml: mdio@5c030000: 'bus_freq' is a required property Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Rob Herring <robh@kernel.org>
2020-02-19dt-bindings: memory-controller: Update example for Tegra124 EMCThierry Reding
The example in the Tegra124 EMC device tree binding looks like an old version that doesn't contain all the required fields. Update it with a version from the current DTS files to fix the make dt_binding_check target. Reported-by: Rob Herring <robh+dt@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com> [robh: also fix missing '#reset-cells'] Signed-off-by: Rob Herring <robh@kernel.org>
2020-02-20Merge tag 'drm-msm-fixes-2020-02-16' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes + fix UBWC on GPU and display side for sc7180 + fix DSI suspend/resume issue encountered on sc7180 + fix some breakage on so called "linux-android" devices (fallout from sc7180/a618 support, not seen earlier due to bootloader/firmware differences) + couple other misc fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGshz5K3tJd=NsBSHq6HGT-ZRa67qt+iN=U2ZFO2oD8kuw@mail.gmail.com
2020-02-20Merge tag 'amd-drm-fixes-5.6-2020-02-19' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.6-2020-02-19: amdgpu: - HDCP fixes - xclk fix for raven - GFXOFF fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200219173954.3847-1-alexander.deucher@amd.com
2020-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf 2020-02-19 The following pull-request contains BPF updates for your *net* tree. We've added 10 non-merge commits during the last 10 day(s) which contain a total of 10 files changed, 93 insertions(+), 31 deletions(-). The main changes are: 1) batched bpf hashtab fixes from Brian and Yonghong. 2) various selftests and libbpf fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2020-02-19 This series contains fixes to the ice driver. Brett fixes an issue where if a user sets an odd [tx|rx]-usecs value through ethtool, the request is denied because the hardware is set to have an ITR with 2us granularity. Also fix an issue where the VF has not been completely removed/reset after being unbound from the host driver, so resolve this by waiting for the VF remove/reset process to happen before checking if the VF is disabled. Michal fixes an issue, where when the user changes flow control via ethtool, the OS is told the link is going down when that may not be the case. Before the fix, the only way to get out of this state was to take the interface down and up again. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19udp: rehash on disconnectWillem de Bruijn
As of the below commit, udp sockets bound to a specific address can coexist with one bound to the any addr for the same port. The commit also phased out the use of socket hashing based only on port (hslot), in favor of always hashing on {addr, port} (hslot2). The change broke the following behavior with disconnect (AF_UNSPEC): server binds to 0.0.0.0:1337 server connects to 127.0.0.1:80 server disconnects client connects to 127.0.0.1:1337 client sends "hello" server reads "hello" // times out, packet did not find sk On connect the server acquires a specific source addr suitable for routing to its destination. On disconnect it reverts to the any addr. The connect call triggers a rehash to a different hslot2. On disconnect, add the same to return to the original hslot2. Skip this step if the socket is going to be unhashed completely. Fixes: 4cdeeee9252a ("net: udp: prefer listeners bound to an address") Reported-by: Pavel Roskin <plroskin@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19net/tls: Fix to avoid gettig invalid tls recordRohit Maheshwari
Current code doesn't check if tcp sequence number is starting from (/after) 1st record's start sequnce number. It only checks if seq number is before 1st record's end sequnce number. This problem will always be a possibility in re-transmit case. If a record which belongs to a requested seq number is already deleted, tls_get_record will start looking into list and as per the check it will look if seq number is before the end seq of 1st record, which will always be true and will return 1st record always, it should in fact return NULL. As part of the fix, start looking each record only if the sequence number lies in the list else return NULL. There is one more check added, driver look for the start marker record to handle tcp packets which are before the tls offload start sequence number, hence return 1st record if the record is tls start marker and seq number is before the 1st record's starting sequence number. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19dt-bindings: mmc: omap-hsmmc: Fix SDIO interruptTomas Paukrt
SDIO interrupt must be specified correctly as IRQ_TYPE_LEVEL_LOW instead of GPIO_ACTIVE_LOW. Signed-off-by: Tomas Paukrt <tomaspaukrt@email.cz> Signed-off-by: Rob Herring <robh@kernel.org>
2020-02-19bpf: Fix a potential deadlock with bpf_map_do_batchYonghong Song
Commit 057996380a42 ("bpf: Add batch ops to all htab bpf map") added lookup_and_delete batch operation for hash table. The current implementation has bpf_lru_push_free() inside the bucket lock, which may cause a deadlock. syzbot reports: -> #2 (&htab->buckets[i].lock#2){....}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 htab_lru_map_delete_node+0xce/0x2f0 kernel/bpf/hashtab.c:593 __bpf_lru_list_shrink_inactive kernel/bpf/bpf_lru_list.c:220 [inline] __bpf_lru_list_shrink+0xf9/0x470 kernel/bpf/bpf_lru_list.c:266 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:340 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x87c/0x1670 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop+0x2c/0xa0 kernel/bpf/hashtab.c:132 __htab_lru_percpu_map_update_elem+0x67e/0xa90 kernel/bpf/hashtab.c:1069 bpf_percpu_hash_update+0x16e/0x210 kernel/bpf/hashtab.c:1585 bpf_map_update_value.isra.0+0x2d7/0x8e0 kernel/bpf/syscall.c:181 generic_map_update_batch+0x41f/0x610 kernel/bpf/syscall.c:1319 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x9b7/0x41e0 kernel/bpf/syscall.c:3460 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&loc_l->lock){....}: check_prev_add kernel/locking/lockdep.c:2475 [inline] check_prevs_add kernel/locking/lockdep.c:2580 [inline] validate_chain kernel/locking/lockdep.c:2970 [inline] __lock_acquire+0x2596/0x4a00 kernel/locking/lockdep.c:3954 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:159 bpf_common_lru_push_free kernel/bpf/bpf_lru_list.c:516 [inline] bpf_lru_push_free+0x250/0x5b0 kernel/bpf/bpf_lru_list.c:555 __htab_map_lookup_and_delete_batch+0x8d4/0x1540 kernel/bpf/hashtab.c:1374 htab_lru_map_lookup_and_delete_batch+0x34/0x40 kernel/bpf/hashtab.c:1491 bpf_map_do_batch+0x3f5/0x510 kernel/bpf/syscall.c:3348 __do_sys_bpf+0x1f7d/0x41e0 kernel/bpf/syscall.c:3456 __se_sys_bpf kernel/bpf/syscall.c:3355 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:3355 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Possible unsafe locking scenario: CPU0 CPU2 ---- ---- lock(&htab->buckets[i].lock#2); lock(&l->lock); lock(&htab->buckets[i].lock#2); lock(&loc_l->lock); *** DEADLOCK *** To fix the issue, for htab_lru_map_lookup_and_delete_batch() in CPU0, let us do bpf_lru_push_free() out of the htab bucket lock. This can avoid the above deadlock scenario. Fixes: 057996380a42 ("bpf: Add batch ops to all htab bpf map") Reported-by: syzbot+a38ff3d9356388f2fb83@syzkaller.appspotmail.com Reported-by: syzbot+122b5421d14e68f29cd1@syzkaller.appspotmail.com Suggested-by: Hillf Danton <hdanton@sina.com> Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Brian Vazquez <brianvv@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200219234757.3544014-1-yhs@fb.com
2020-02-19bpf: Do not grab the bucket spinlock by default on htab batch opsBrian Vazquez
Grabbing the spinlock for every bucket even if it's empty, was causing significant perfomance cost when traversing htab maps that have only a few entries. This patch addresses the issue by checking first the bucket_cnt, if the bucket has some entries then we go and grab the spinlock and proceed with the batching. Tested with a htab of size 50K and different value of populated entries. Before: Benchmark Time(ns) CPU(ns) --------------------------------------------- BM_DumpHashMap/1 2759655 2752033 BM_DumpHashMap/10 2933722 2930825 BM_DumpHashMap/200 3171680 3170265 BM_DumpHashMap/500 3639607 3635511 BM_DumpHashMap/1000 4369008 4364981 BM_DumpHashMap/5k 11171919 11134028 BM_DumpHashMap/20k 69150080 69033496 BM_DumpHashMap/39k 190501036 190226162 After: Benchmark Time(ns) CPU(ns) --------------------------------------------- BM_DumpHashMap/1 202707 200109 BM_DumpHashMap/10 213441 210569 BM_DumpHashMap/200 478641 472350 BM_DumpHashMap/500 980061 967102 BM_DumpHashMap/1000 1863835 1839575 BM_DumpHashMap/5k 8961836 8902540 BM_DumpHashMap/20k 69761497 69322756 BM_DumpHashMap/39k 187437830 186551111 Fixes: 057996380a42 ("bpf: Add batch ops to all htab bpf map") Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200218172552.215077-1-brianvv@google.com
2020-02-19ice: Wait for VF to be reset/ready before configurationBrett Creeley
The configuration/command below is failing when the VF in the xml file is already bound to the host iavf driver. pci_0000_af_0_0.xml: <interface type='hostdev' managed='yes'> <source> <address type='pci' domain='0x0000' bus='0xaf' slot='0x0' function='0x0'/> </source> <mac address='00:de:ad:00:11:01'/> </interface> > virsh attach-device domain_name pci_0000_af_0_0.xml error: Failed to attach device from pci_0000_af_0_0.xml error: Cannot set interface MAC/vlanid to 00:de:ad:00:11:01/0 for ifname ens1f1 vf 0: Device or resource busy This is failing because the VF has not been completely removed/reset after being unbound (via the virsh command above) from the host iavf driver and ice_set_vf_mac() checks if the VF is disabled before waiting for the reset to finish. Fix this by waiting for the VF remove/reset process to happen before checking if the VF is disabled. Also, since many functions for VF administration on the PF were more or less calling the same 3 functions (ice_wait_on_vf_reset(), ice_is_vf_disabled(), and ice_check_vf_init()) move these into the helper function ice_check_vf_ready_for_cfg(). Then call this function in any flow that attempts to configure/query a VF from the PF. Lastly, increase the maximum wait time in ice_wait_on_vf_reset() to 800ms, and modify/add the #define(s) that determine the wait time. This was done for robustness because in rare/stress cases VF removal can take a max of ~800ms and previously the wait was a max of ~300ms. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19ice: Don't tell the OS that link is going downMichal Swiatkowski
Remove code that tell the OS that link is going down when user change flow control via ethtool. When link is up it isn't certain that link goes down after 0x0605 aq command. If link doesn't go down, OS thinks that link is down, but physical link is up. To reset this state user have to take interface down and up. If link goes down after 0x0605 command, FW send information about that and after that driver tells the OS that the link goes down. So this code in ethtool is unnecessary. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19ice: Don't reject odd values of usecs set by userBrett Creeley
Currently if a user sets an odd [tx|rx]-usecs value through ethtool, the request is denied because the hardware is set to have an ITR granularity of 2us. This caused poor customer experience. Fix this by aligning to a register allowed value, which results in rounding down. Also, print a once per ring container type message to be clear about our intentions. Also, change the ITR_TO_REG define to be the bitwise and of the ITR setting and the ICE_ITR_MASK. This makes the purpose of ITR_TO_REG more obvious. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19bridge: br_stp: Use built-in RCU list checkingMadhuparna Bhowmik
list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19nfc: pn544: Fix occasional HW initialization failureDmitry Osipenko
The PN544 driver checks the "enable" polarity during of driver's probe and it's doing that by turning ON and OFF NFC with different polarities until enabling succeeds. It takes some time for the hardware to power-down, and thus, to deassert the IRQ that is raised by turning ON the hardware. Since the delay after last power-down of the polarity-checking process is missed in the code, the interrupt may trigger immediately after installing the IRQ handler (right after the checking is done), which results in IRQ handler trying to touch the disabled HW and ends with marking NFC as 'DEAD' during of the driver's probe: pn544_hci_i2c 1-002a: NFC: nfc_en polarity : active high pn544_hci_i2c 1-002a: NFC: invalid len byte shdlc: llc_shdlc_recv_frame: NULL Frame -> link is dead This patch fixes the occasional NFC initialization failure on Nexus 7 device. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERFKim Phillips
Commit aaf248848db50 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter") added support for access to the free-running counter via 'perf -e msr/irperf/', but when exercised, it always returns a 0 count: BEFORE: $ perf stat -e instructions,msr/irperf/ true Performance counter stats for 'true': 624,833 instructions 0 msr/irperf/ Simply set its enable bit - HWCR bit 30 - to make it start counting. Enablement is restricted to all machines advertising IRPERF capability, except those susceptible to an erratum that makes the IRPERF return bad values. That erratum occurs in Family 17h models 00-1fh [1], but not in F17h models 20h and above [2]. AFTER (on a family 17h model 31h machine): $ perf stat -e instructions,msr/irperf/ true Performance counter stats for 'true': 621,690 instructions 622,490 msr/irperf/ [1] Revision Guide for AMD Family 17h Models 00h-0Fh Processors [2] Revision Guide for AMD Family 17h Models 30h-3Fh Processors The revision guides are available from the bugzilla Link below. [ bp: Massage commit message. ] Fixes: aaf248848db50 ("perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Link: http://lkml.kernel.org/r/20200214201805.13830-1-kim.phillips@amd.com
2020-02-19net: hsr: Pass lockdep expression to RCU listsAmol Grover
node_db is traversed using list_for_each_entry_rcu outside an RCU read-side critical section but under the protection of hsr->list_lock. Hence, add corresponding lockdep expression to silence false-positive warnings, and harden RCU lists. Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19Merge tag 'mlx5-fixes-2020-02-18' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2020-02-18 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v5.3 ('net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepa') For -stable v5.4 ('net/mlx5: DR, Fix matching on vport gvmi') ('net/mlx5e: Fix crash in recovery flow without devlink reporter') For -stable v5.5 ('net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDY') ('net/mlx5e: Don't clear the whole vf config when switching modes') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-19Merge tag 'iommu-fixes-v5.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Compile warning fix for the Intel IOMMU driver - Fix kdump boot with Intel IOMMU enabled and in passthrough mode - Disable AMD IOMMU on a Laptop/Embedded platform because the delay it introduces in DMA transactions causes screen flickering there with 4k monitors - Make domain_free function in QCOM IOMMU driver robust and not leak memory/dereference NULL pointers - Fix ARM-SMMU module parameter prefix names * tag 'iommu-fixes-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Restore naming of driver parameter prefix iommu/qcom: Fix bogus detach logic iommu/amd: Disable IOMMU on Stoney Ridge systems iommu/vt-d: Simplify check in identity_mapping() iommu/vt-d: Remove deferred_attach_domain() iommu/vt-d: Do deferred attachment in iommu_need_mapping() iommu/vt-d: Move deferred device attachment into helper function iommu/vt-d: Add attach_deferred() helper iommu/vt-d: Fix compile warning from intel-svm.h
2020-02-19Merge tag 'sound-5.6-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The only largish change in this pull request is about the revert of the recent max98090 and its relevant patches due to regressions. Other than that, all small fixes for ALSA core (covering KCSAN fuzzer warnings in ALSA sequencer and rawmidi), Intel SOF HD-audio fixes, AMD ACP fixes, usual HD-audio quirks, and various ASoC fixes" * tag 'sound-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda: Use scnprintf() for printing texts for sysfs/procfs ALSA: hda/realtek - Apply quirk for yet another MSI laptop ASoC: sun8i-codec: Fix setting DAI data format ALSA: hda/realtek - Apply quirk for MSI GP63, too ASoC: amd: ACP needs to be powered off in BIOS. ASoC: hdmi-codec: set plugged_cb to NULL when component removing ASoC: dapm: remove snd_soc_dapm_put_enum_double_locked ASoC: max98090: revert invalid fix for handling SHDN ALSA: rawmidi: Avoid bit fields for state flags ALSA: seq: Fix concurrent access to queue current tick/time ALSA: seq: Avoid concurrent access to queue flags ASoC: codec2codec: avoid invalid/double-free of pcm runtime ASoC: amd: Buffer Size instead of MAX Buffer ASoC: SOF: Intel: hda: move i915 init earlier ASoC: SOF: Intel: hda: fix ordering bug in resume flow ALSA: hda: do not override bus codec_mask in link_get() ASoC: atmel: fix atmel_ssc_set_audio link failure ASoC: fsl_sai: Fix exiting path on probing failure
2020-02-20nvme: Fix uninitialized-variable warningKeith Busch
gcc may detect a false positive on nvme using an unintialized variable if setting features fails. Since this is not a fast path, explicitly initialize this variable to suppress the warning. Reported-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-02-19s390/qdio: fill SBALEs with absolute addressesJulian Wiedmann
sbale->addr holds an absolute address (or for some FCP usage, an opaque request ID), and should only be used with proper virt/phys translation. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-02-19s390/qdio: fill SL with absolute addressesJulian Wiedmann
As the comment says, sl->sbal holds an absolute address. qeth currently solves this through wild casting, while zfcp doesn't care. Handle this properly in the code that actually builds the SL. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> [for qdio] Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-02-19x86/boot/compressed: Don't declare __force_order in kaslr_64.cH.J. Lu
GCC 10 changed the default to -fno-common, which leads to LD arch/x86/boot/compressed/vmlinux ld: arch/x86/boot/compressed/pgtable_64.o:(.bss+0x0): multiple definition of `__force_order'; \ arch/x86/boot/compressed/kaslr_64.o:(.bss+0x0): first defined here make[2]: *** [arch/x86/boot/compressed/Makefile:119: arch/x86/boot/compressed/vmlinux] Error 1 Since __force_order is already provided in pgtable_64.c, there is no need to declare __force_order in kaslr_64.c. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200124181811.4780-1-hjl.tools@gmail.com
2020-02-19drm/amdgpu/display: clean up hdcp workqueue handlingAlex Deucher
Use the existence of the workqueue itself to determine when to enable HDCP features rather than sprinkling asic checks all over the code. Also add a check for the existence of the hdcp workqueue in the irq handling on the off chance we get and HPD RX interrupt with the CP bit set. This avoids a crash if the driver doesn't support HDCP for a particular asic. Fixes: 96a3b32e67236f ("drm/amd/display: only enable HDCP for DCN+") Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206519 Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-19drm/amdgpu: add is_raven_kicker judgement for raven1changzhu
The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-20nvme-pci: Use single IRQ vector for old Apple modelsAndy Shevchenko
People reported that old Apple machines are not working properly if the non-first IRQ vector is in use. Set quirk for that models to limit IRQ to use first vector only. Based on original patch by GitHub user npx001. Link: https://github.com/Dunedan/mbp-2016-linux/issues/9 Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Leif Liddy <leif.liddy@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-02-20nvme/pci: Add sleep quirk for Samsung and Toshiba drivesShyjumon N
The Samsung SSD SM981/PM981 and Toshiba SSD KBG40ZNT256G on the Lenovo C640 platform experience runtime resume issues when the SSDs are kept in sleep/suspend mode for long time. This patch applies the 'Simple Suspend' quirk to these configurations. With this patch, the issue had not been observed in a 1+ day test. Reviewed-by: Jon Derrick <jonathan.derrick@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Shyjumon N <shyjumon.n@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-02-19arm64: memory: Add missing brackets to untagged_addr() macroWill Deacon
Add brackets around the evaluation of the 'addr' parameter to the untagged_addr() macro so that the cast to 'u64' applies to the result of the expression. Cc: <stable@vger.kernel.org> Fixes: 597399d0cb91 ("arm64: tags: Preserve tags for addresses translated via TTBR1") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-02-19s390: remove obsolete ieee_emulation_warningsStephen Kitt
s390 math emulation was removed with commit 5a79859ae0f3 ("s390: remove 31 bit support"), rendering ieee_emulation_warnings useless. The code still built because it was protected by CONFIG_MATHEMU, which was no longer selectable. This patch removes the sysctl_ieee_emulation_warnings declaration and the sysctl entry declaration. Link: https://lkml.kernel.org/r/20200214172628.3598516-1-steve@sk2.org Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Stephen Kitt <steve@sk2.org> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-02-19iommu/arm-smmu: Restore naming of driver parameter prefixWill Deacon
Extending the Arm SMMU driver to allow for modular builds changed KBUILD_MODNAME to be "arm_smmu_mod" so that a single module could be built from the multiple existing object files without the need to rename any source files. This inadvertently changed the name of the driver parameters, which may lead to runtime issues if bootloaders are relying on the old names for correctness (e.g. "arm-smmu.disable_bypass=0"). Although MODULE_PARAM_PREFIX can be overridden to restore the old naming for builtin parameters, only the new name is matched by modprobe and so loading the driver as a module would cause parameters specified on the kernel command line to be ignored. Instead, rename "arm_smmu_mod" to "arm_smmu". Whilst it's a bit of a bodge, this allows us to create a single module without renaming any files and makes use of the fact that underscores and hyphens can be used interchangeably in parameter names. Cc: Robin Murphy <robin.murphy@arm.com> Cc: Russell King <linux@armlinux.org.uk> Reported-by: Li Yang <leoyang.li@nxp.com> Fixes: cd221bd24ff5 ("iommu/arm-smmu: Allow building as a module") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-19iommu/qcom: Fix bogus detach logicRobin Murphy
Currently, the implementation of qcom_iommu_domain_free() is guaranteed to do one of two things: WARN() and leak everything, or dereference NULL and crash. That alone is terrible, but in fact the whole idea of trying to track the liveness of a domain via the qcom_domain->iommu pointer as a sanity check is full of fundamentally flawed assumptions. Make things robust and actually functional by not trying to be quite so clever. Reported-by: Brian Masney <masneyb@onstation.org> Tested-by: Brian Masney <masneyb@onstation.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Stephan Gerhold <stephan@gerhold.net> Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-19iommu/amd: Disable IOMMU on Stoney Ridge systemsKai-Heng Feng
Serious screen flickering when Stoney Ridge outputs to a 4K monitor. Use identity-mapping and PCI ATS doesn't help this issue. According to Alex Deucher, IOMMU isn't enabled on Windows, so let's do the same here to avoid screen flickering on 4K monitor. Cc: Alex Deucher <alexander.deucher@amd.com> Bug: https://gitlab.freedesktop.org/drm/amd/issues/961 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-02-18net/mlx5: DR, Handle reformat capability over sw-steering tablesErez Shitrit
On flow table creation, send the relevant flags according to what the FW currently supports. When FW doesn't support reformat option over SW-steering managed table, the driver shouldn't pass this. Fixes: 988fd6b32d07 ("net/mlx5: DR, Pass table flags at creation to lower layer") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix lowest FDB pool sizePaul Blakey
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Don't clear the whole vf config when switching modesDmytro Linkin
There is no need to reset all vf config (except link state) between legacy and switchdev modes changes. Also, set link state to AUTO, when legacy enabled. Fixes: 3b83b6c2e024 ("net/mlx5e: Clear VF config when switching modes") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: DR, Fix matching on vport gvmiHamdan Igbaria
Set vport gvmi in the tag, only when source gvmi is set in the bit mask. Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Fix crash in recovery flow without devlink reporterAya Levin
When health reporters are not supported, recovery function is invoked directly, not via devlink health reporters. In this direct flow, the recover function input parameter was passed incorrectly and is causing a kernel oops. This patch is fixing the input parameter. Following call trace is observed on rx error health reporting. Internal error: Oops: 96000007 [#1] PREEMPT SMP Process kworker/u16:4 (pid: 4584, stack limit = 0x00000000c9e45703) Call trace: mlx5e_rx_reporter_err_rq_cqe_recover+0x30/0x164 [mlx5_core] mlx5e_health_report+0x60/0x6c [mlx5_core] mlx5e_reporter_rq_cqe_err+0x6c/0x90 [mlx5_core] mlx5e_rq_err_cqe_work+0x20/0x2c [mlx5_core] process_one_work+0x168/0x3d0 worker_thread+0x58/0x3d0 kthread+0x108/0x134 Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDYAya Levin
Initialize RQ doorbell counters to zero prior to moving an RQ from RST to RDY state. Per HW spec, when RQ is back to RDY state, the descriptor ID on the completion is reset. The doorbell record must comply. Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reported-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>