Age | Commit message (Collapse) | Author |
|
Check through our work list for additional items. This normally
will only have one item, but occasionally may have another
job waiting. There really is no need reschedule ourself here.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The event notification queue is set up a little differently in the
NIC and so the notifyq q and cq descriptor structures need to be
contiguous, which got missed in an earlier patch that separated
out the q and cq descriptor allocations. That patch was aimed at
making the big tx and rx descriptor queue allocations easier to
manage - the notifyq is much smaller and doesn't need to be split.
This patch simply adds an if/else and slightly different code for
the notifyq descriptor allocation.
Fixes: ea5a8b09dc3a ("ionic: reduce contiguous memory allocation requirement")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drivers/s390/net/ctcm_fsms.h: fsm_action_nop - only declaration left
after commit 04885948b101 ("ctc: removal of the old ctc driver")
drivers/s390/net/ctcm_mpc.h: ctcmpc_open - only declaration left after
commit 293d984f0e36 ("ctcm: infrastructure for replaced ctc driver")
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
- Add/delete some blanks, white spaces and braces.
- Fix misindentations.
- Adjust a deprecated header include, and htons() conversion.
- Remove extra 'return' statements.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace our custom version of netdev_name().
Once we started to allocate the netdev at probe time with
commit d3d1b205e89f ("s390/qeth: allocate netdevice early"), this
stopped working as intended anyway.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The discipline struct is a fixed group of function pointers.
So declare the L2 and L3 disciplines as constant.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For OSA devices that are _not_ configured in prio-queue mode, give users
the option of selecting the number of active TX queues.
This requires setting up the HW queues with a reasonable default QoS
value in the QIB's PQUE parm area.
As with the other device types, we bring up the device with a minimal
number of TX queues for compatibility reasons.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use a proper struct, and only program the QIB extensions for devices
where they are supported.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When re-initializing a device, we can hit a situation where
qeth_osa_set_output_queues() detects that it supports more or less
HW TX queues than before. Right now we adjust dev->real_num_tx_queues
from right there, but
1. it's getting more & more complicated to cover all cases, and
2. we can't re-enable the actually expected number of TX queues later
because we lost the needed information.
So keep track of the wanted TX queues (on initial setup, and whenever
its changed via .set_channels), and later use that information when
re-enabling the netdevice.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
From: Saeed Mahameed <saeedm@nvidia.com>
====================
This series introduces some fixes to mlx5 driver.
v1->v2:
- Patch #1 Don't return while mutex is held. (Dave)
v2->v3:
- Drop patch #1, will consider a better approach (Jakub)
- use cpu_relax() instead of cond_resched() (Jakub)
- while(i--) to reveres a loop (Jakub)
- Drop old mellanox email sign-off and change the committer email
(Jakub)
Please pull and let me know if there is any problem.
For -stable v4.15
('net/mlx5e: Fix VLAN cleanup flow')
('net/mlx5e: Fix VLAN create flow')
For -stable v4.16
('net/mlx5: Fix request_irqs error flow')
For -stable v5.4
('net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU')
('net/mlx5: Avoid possible free of command entry while timeout comp handler')
For -stable v5.7
('net/mlx5e: Fix return status when setting unsupported FEC mode')
For -stable v5.8
('net/mlx5e: Fix race condition on nhe->n pointer in neigh update')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.10
Third set of patches for v5.10. Lots of iwlwifi patches this time, but
also few patches ath11k and of course smaller changes to other
drivers.
Major changes:
rtw88
* properly recover from firmware crashes on 8822c
* dump firmware crash log
iwlwifi
* protected Target Wake Time (TWT) implementation
* support disabling 5.8GHz channels via ACPI
* support VHT extended NSS capability
* enable Target Wake Time (TWT) by default
ath11k
* improvements to QCA6390 PCI support to make it more usable
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Via the OCELOT_MASK_MODE_REDIRECT flag put in the IS2 action vector, it
is possible to replace previous forwarding decisions with the port mask
installed in this rule.
I have studied Table 54 "MASK_MODE and PORT_MASK Combinations" from the
VSC7514 documentation and it appears to behave sanely when this rule is
installed in either lookup 0 or 1. Namely, a redirect in lookup 1 will
overwrite the forwarding decision taken by any entry in lookup 0.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The issue which led to the introduction of this check was that MAC_ETYPE
rules, such as filters on dst_mac and src_mac, would only match non-IP
frames. There is a knob in VCAP_S2_CFG which forces all IP frames to be
treated as non-IP, which is what we're currently doing if the user
requested a dst_mac filter, in order to maintain sanity.
But that knob is actually per IS2 lookup. And the good thing with
exposing the lookups to the user via tc chains is that we're now able to
offload MAC_ETYPE keys to one lookup, and IP keys to the other lookup.
So let's do that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We were installing TCAM rules with the LOOKUP field as unmasked, meaning
that all entries were matching on all lookups. Now that lookups are
exposed as individual chains, let's make the LOOKUP explicit when
offloading TCAM entries.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
VCAP ES0 is an egress VCAP operating on all outgoing frames.
This patch added ES0 driver to support vlan push action of tc filter.
Usage:
tc filter add dev swp1 egress protocol 802.1Q flower indev swp0 skip_sw \
vlan_id 1 vlan_prio 1 action vlan push id 2 priority 2
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
VCAP IS1 is a VCAP module which can filter on the most common L2/L3/L4
Ethernet keys, and modify the results of the basic QoS classification
and VLAN classification based on those flow keys.
There are 3 VCAP IS1 lookups, mapped over chains 10000, 11000 and 12000.
Currently the driver is hardcoded to use IS1_ACTION_TYPE_NORMAL half
keys.
Note that the VLAN_MANGLE has been omitted for now. In hardware, the
VCAP_IS1_ACT_VID_REPLACE_ENA field replaces the classified VLAN
(metadata associated with the frame) and not the VLAN from the header
itself. There are currently some issues which need to be addressed when
operating in standalone, or in bridge with vlan_filtering=0 modes,
because in those cases the switch ports have VLAN awareness disabled,
and changing the classified VLAN to anything other than the pvid causes
the packets to be dropped. Another issue is that on egress, we expect
port tagging to push the classified VLAN, but port tagging is disabled
in the modes mentioned above, so although the classified VLAN is
replaced, it is not visible in the packet transmitted by the switch.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For Ocelot switches, there are 2 ingress pipelines for flow offload
rules: VCAP IS1 (Ingress Classification) and IS2 (Security Enforcement).
IS1 and IS2 support different sets of actions. The pipeline order for a
packet on ingress is:
Basic classification -> VCAP IS1 -> VCAP IS2
Furthermore, IS1 is looked up 3 times, and IS2 is looked up twice (each
TCAM entry can be configured to match only on the first lookup, or only
on the second, or on both etc).
Because the TCAMs are completely independent in hardware, and because of
the fixed pipeline, we actually have very limited options when it comes
to offloading complex rules to them while still maintaining the same
semantics with the software data path.
This patch maps flow offload rules to ingress TCAMs according to a
predefined chain index number. There is going to be a script in
selftests that clarifies the usage model.
There is also an egress TCAM (VCAP ES0, the Egress Rewriter), which is
modeled on top of the default chain 0 of the egress qdisc, because it
doesn't have multiple lookups.
Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com>
Co-developed-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the mscc_ocelot_switch_lib is common between a pure switchdev and
a DSA driver, the procedure of retrieving a net_device for a certain
port index differs, as those are registered by their individual
front-ends.
Up to now that has been dealt with by always passing the port index to
the switch library, but now, we're going to need to work with net_device
pointers from the tc-flower offload, for things like indev, or mirred.
It is not desirable to refactor that, so let's make sure that the flower
offload core has the ability to translate between a net_device and a
port index properly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
At this stage, the tc-flower offload of mscc_ocelot can only delegate
rules to the VCAP IS2 security enforcement block. These rules have, in
hardware, separate bits for policing and for overriding the destination
port mask and/or copying to the CPU. So it makes sense that we attempt
to expose some more of that low-level complexity instead of simply
choosing between a single type of action.
Something similar happens with the VCAP IS1 block, where the same action
can contain enable bits for VLAN classification and for QoS
classification at the same time.
So model the action structure after the hardware description, and let
the high-level ocelot_flower.c construct an action vector from multiple
tc actions.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Another set of changes, this time with:
* lots more S1G band support
* 6 GHz scanning, finally
* kernel-doc fixes
* non-split wiphy dump fixes in nl80211
* various other small cleanups/features
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In iscsci driver, iscsi_tcp_segment_map() uses the following code to
check whether the page should or not be handled by sendpage:
if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
The "page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)" part is to
make sure the page can be sent to network layer's zero copy path. This
part is exactly what sendpage_ok() does.
This patch uses use sendpage_ok() in iscsi_tcp_segment_map() to replace
the original open coded checks.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Chris Leech <cleech@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In _drbd_send_page() a page is checked by following code before sending
it by kernel_sendpage(),
(page_count(page) < 1) || PageSlab(page)
If the check is true, this page won't be send by kernel_sendpage() and
handled by sock_no_sendpage().
This kind of check is exactly what macro sendpage_ok() does, which is
introduced into include/linux/net.h to solve a similar send page issue
in nvme-tcp code.
This patch uses macro sendpage_ok() to replace the open coded checks to
page type and refcount in _drbd_send_page(), as a code cleanup.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But for pages allocated by __get_free_pages() without
__GFP_COMP, which also have refcount as 0, they are still sent by
kernel_sendpage() to remote end, this is problematic.
The new introduced helper sendpage_ok() checks both PageSlab tag and
page_count counter, and returns true if the checking page is OK to be
sent by kernel_sendpage().
This patch fixes the page checking issue of nvme_tcp_try_send_data()
with sendpage_ok(). If sendpage_ok() returns true, send this page by
kernel_sendpage(), otherwise use sock_no_sendpage to handle this page.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use more generic eth_platform_get_mac_address() which can get a MAC
address from other than DT platform specific sources too. Check if the
obtained address is valid.
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
v2:
If reading the MAC address from eeprom fail don't throw an error, use randomly
generated MAC instead. Either way the adapter will soldier on and the return
type of set_ethernet_addr() can be reverted to void.
v1:
Fix a bug in set_ethernet_addr() which does not take into account possible
errors (or partial reads) returned by its helpers. This can potentially lead to
writing random data into device's MAC address registers.
Signed-off-by: Petko Manolov <petko.manolov@konsulko.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Some pin control fixes here. All of them are driver fixes, the Intel
Cherryview being the most interesting one.
- Fix a mux problem for I2C in the MVEBU driver.
- Fix a really hairy inversion problem in the Intel Cherryview
driver.
- Fix the register for the sdc2_clk in the Qualcomm SM8250 driver.
- Check the virtual GPIO boot failur in the Mediatek driver"
* tag 'pinctrl-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: mediatek: check mtk_is_virt_gpio input parameter
pinctrl: qcom: sm8250: correct sdc2_clk
pinctrl: cherryview: Preserve CHV_PADCTRL1_INVRXTX_TXDATA flag on GPIOs
pinctrl: mvebu: Fix i2c sda definition for 98DX3236
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Fix rockchip regression in rockchip_pcie_valid_device() (Lorenzo
Pieralisi)
- Add Pali Rohár as aardvark PCI maintainer (Pali Rohár)
* tag 'pci-v5.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
MAINTAINERS: Add Pali Rohár as aardvark PCI maintainer
PCI: rockchip: Fix bus checks in rockchip_pcie_valid_device()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two patches in driver frameworks. The iscsi one corrects a bug induced
by a BPF change to network locking and the other is a regression we
introduced"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()
scsi: target: Fix lun lookup for TARGET_SCF_LOOKUP_LUN_FROM_TAG case
|
|
Indicate to the DSA receive path that we need to untage the bridge PVID,
this allows us to remove the dsa_untag_bridge_pvid() calls from
net/dsa/tag_brcm.c.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current neigh update event handler implementation takes reference to
neighbour structure, assigns it to nhe->n, tries to schedule workqueue task
and releases the reference if task was already enqueued. This results
potentially overwriting existing nhe->n pointer with another neighbour
instance, which causes double release of the instance (once in neigh update
handler that failed to enqueue to workqueue and another one in neigh update
workqueue task that processes updated nhe->n pointer instead of original
one):
[ 3376.512806] ------------[ cut here ]------------
[ 3376.513534] refcount_t: underflow; use-after-free.
[ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd
grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c
_core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw]
[ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6
[ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
[ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0
[ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13
[ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286
[ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000
[ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40
[ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d
[ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000
[ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900
[ 3376.541444] FS: 0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000
[ 3376.542732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0
[ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3376.546344] PKRU: 55555554
[ 3376.546911] Call Trace:
[ 3376.547479] mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core]
[ 3376.548299] process_one_work+0x1d8/0x390
[ 3376.548977] worker_thread+0x4d/0x3e0
[ 3376.549631] ? rescuer_thread+0x3e0/0x3e0
[ 3376.550295] kthread+0x118/0x130
[ 3376.550914] ? kthread_create_worker_on_cpu+0x70/0x70
[ 3376.551675] ret_from_fork+0x1f/0x30
[ 3376.552312] ---[ end trace d84e8f46d2a77eec ]---
Fix the bug by moving work_struct to dedicated dynamically-allocated
structure. This enabled every event handler to work on its own private
neighbour pointer and removes the need for handling the case when task is
already enqueued.
Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When interface is attached while in promiscuous mode and with VLAN
filtering turned off, both configurations are not respected and VLAN
filtering is performed.
There are 2 flows which add the any-vid rules during interface attach:
VLAN creation table and set rx mode. Each is relaying on the other to
add any-vid rules, eventually non of them does.
Fix this by adding any-vid rules on VLAN creation regardless of
promiscuous mode.
Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Prior to this patch unloading an interface in promiscuous mode with RX
VLAN filtering feature turned off - resulted in a warning. This is due
to a wrong condition in the VLAN rules cleanup flow, which left the
any-vid rules in the VLAN steering table. These rules prevented
destroying the flow group and the flow table.
The any-vid rules are removed in 2 flows, but none of them remove it in
case both promiscuous is set and VLAN filtering is off. Fix the issue by
changing the condition of the VLAN table cleanup flow to clean also in
case of promiscuous mode.
mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 20 wasn't destroyed, refcount > 1
mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 19 wasn't destroyed, refcount > 1
mlx5_core 0000:00:08.0: mlx5_destroy_flow_table:2112:(pid 28729): Flow table 262149 wasn't destroyed, refcount > 1
...
...
------------[ cut here ]------------
FW pages counter is 11560 after reclaiming all pages
WARNING: CPU: 1 PID: 28729 at
drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:660
mlx5_reclaim_startup_pages+0x178/0x230 [mlx5_core]
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
mlx5_function_teardown+0x2f/0x90 [mlx5_core]
mlx5_unload_one+0x71/0x110 [mlx5_core]
remove_one+0x44/0x80 [mlx5_core]
pci_device_remove+0x3e/0xc0
device_release_driver_internal+0xfb/0x1c0
device_release_driver+0x12/0x20
pci_stop_bus_device+0x68/0x90
pci_stop_and_remove_bus_device+0x12/0x20
hv_eject_device_work+0x6f/0x170 [pci_hyperv]
? __schedule+0x349/0x790
process_one_work+0x206/0x400
worker_thread+0x34/0x3f0
? process_one_work+0x400/0x400
kthread+0x126/0x140
? kthread_park+0x90/0x90
ret_from_fork+0x22/0x30
---[ end trace 6283bde8d26170dc ]---
Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Verify the configured FEC mode is supported by at least a single link
mode before applying the command. Otherwise fail the command and return
"Operation not supported".
Prior to this patch, the command was successful, yet it falsely set all
link modes to FEC auto mode - like configuring FEC mode to auto. Auto
mode is the default configuration if a link mode doesn't support the
configured FEC mode.
Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Declare GRE offload support with respect to the inner protocol. Add a
list of supported inner protocols on which the driver can offload
checksum and GSO. For other protocols, inform the stack to do the needed
operations. There is no noticeable impact on GRE performance.
Fixes: 2729984149e6 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited commit introduced the following coverity issue at function
mlx5_tc_ct_rule_to_tuple_nat:
- Memory - corruptions (OVERRUN)
Overrunning array "tuple->ip.src_v6.in6_u.u6_addr32" of 4 4-byte
elements at element index 7 (byte offset 31) using index
"ip6_offset" (which evaluates to 7).
In case of IPv6 destination address rewrite, ip6_offset values are
between 4 to 7, which will cause memory overrun of array
"tuple->ip.src_v6.in6_u.u6_addr32" to array
"tuple->ip.dst_v6.in6_u.u6_addr32".
Fixed by writing the value directly to array
"tuple->ip.dst_v6.in6_u.u6_addr32" in case ip6_offset values are
between 4 to 7.
Fixes: bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Prior to this fix, in Striding RQ mode the driver was vulnerable when
receiving packets in the range (stride size - headroom, stride size].
Where stride size is calculated by mtu+headroom+tailroom aligned to the
closest power of 2.
Usually, this filtering is performed by the HW, except for a few cases:
- Between 2 VFs over the same PF with different MTUs
- On bluefield, when the host physical function sets a larger MTU than
the ARM has configured on its representor and uplink representor.
When the HW filtering is not present, packets that are larger than MTU
might be harmful for the RQ's integrity, in the following impacts:
1) Overflow from one WQE to the next, causing a memory corruption that
in most cases is unharmful: as the write happens to the headroom of next
packet, which will be overwritten by build_skb(). In very rare cases,
high stress/load, this is harmful. When the next WQE is not yet reposted
and points to existing SKB head.
2) Each oversize packet overflows to the headroom of the next WQE. On
the last WQE of the WQ, where addresses wrap-around, the address of the
remainder headroom does not belong to the next WQE, but it is out of the
memory region range. This results in a HW CQE error that moves the RQ
into an error state.
Solution:
Add a page buffer at the end of each WQE to absorb the leak. Actually
the maximal overflow size is headroom but since all memory units must be
of the same size, we use page size to comply with UMR WQEs. The increase
in memory consumption is of a single page per RQ. Initialize the mkey
with all MTTs pointing to a default page. When the channels are
activated, UMR WQEs will redirect the RX WQEs to the actual memory from
the RQ's pool, while the overflow MTTs remain mapped to the default page.
Fixes: 73281b78a37a ("net/mlx5e: Derive Striding RQ size from MTU")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Increase granularity of the error path to avoid unneeded free/release.
Fix the cleanup to be symmetric to the order of creation.
Fixes: 0ddf543226ac ("xdp/mlx5: setup xdp_rxq_info")
Fixes: 422d4c401edd ("net/mlx5e: RX, Split WQ objects for different RQ types")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fix error flow handling in request_irqs which try to free irq
that we failed to request.
It fixes the below trace.
WARNING: CPU: 1 PID: 7587 at kernel/irq/manage.c:1684 free_irq+0x4d/0x60
CPU: 1 PID: 7587 Comm: bash Tainted: G W OE 4.15.15-1.el7MELLANOXsmp-x86_64 #1
Hardware name: Advantech SKY-6200/SKY-6200, BIOS F2.00 08/06/2020
RIP: 0010:free_irq+0x4d/0x60
RSP: 0018:ffffc9000ef47af0 EFLAGS: 00010282
RAX: ffff88001476ae00 RBX: 0000000000000655 RCX: 0000000000000000
RDX: ffff88001476ae00 RSI: ffffc9000ef47ab8 RDI: ffff8800398bb478
RBP: ffff88001476a838 R08: ffff88001476ae00 R09: 000000000000156d
R10: 0000000000000000 R11: 0000000000000004 R12: ffff88001476a838
R13: 0000000000000006 R14: ffff88001476a888 R15: 00000000ffffffe4
FS: 00007efeadd32740(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc9cc010008 CR3: 00000001a2380004 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
mlx5_irq_table_create+0x38d/0x400 [mlx5_core]
? atomic_notifier_chain_register+0x50/0x60
mlx5_load_one+0x7ee/0x1130 [mlx5_core]
init_one+0x4c9/0x650 [mlx5_core]
pci_device_probe+0xb8/0x120
driver_probe_device+0x2a1/0x470
? driver_allows_async_probing+0x30/0x30
bus_for_each_drv+0x54/0x80
__device_attach+0xa3/0x100
pci_bus_add_device+0x4a/0x90
pci_iov_add_virtfn+0x2dc/0x2f0
pci_enable_sriov+0x32e/0x420
mlx5_core_sriov_configure+0x61/0x1b0 [mlx5_core]
? kstrtoll+0x22/0x70
num_vf_store+0x4b/0x70 [mlx5_core]
kernfs_fop_write+0x102/0x180
__vfs_write+0x26/0x140
? rcu_all_qs+0x5/0x80
? _cond_resched+0x15/0x30
? __sb_start_write+0x41/0x80
vfs_write+0xad/0x1a0
SyS_write+0x42/0x90
do_syscall_64+0x60/0x110
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Fixes: 24163189da48 ("net/mlx5: Separate IRQ request/free from EQ life cycle")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
In case of pci is offline reclaim_pages_cmd() will still try to call
the FW to release FW pages, cmd_exec() in this case will return a silent
success without actually calling the FW.
This is wrong and will cause page leaks, what we should do is to detect
pci offline or command interface un-available before tying to access the
FW and manually release the FW pages in the driver.
In this patch we share the code to check for FW command interface
availability and we call it in sensitive places e.g. reclaim_pages_cmd().
Alternative fix:
1. Remove MLX5_CMD_OP_MANAGE_PAGES form mlx5_internal_err_ret_value,
command success simulation list.
2. Always Release FW pages even if cmd_exec fails in reclaim_pages_cmd().
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
It is possible that new command entry index allocation will temporarily
fail. The new command holds the semaphore, so it means that a free entry
should be ready soon. Add one second retry mechanism before returning an
error.
Patch "net/mlx5: Avoid possible free of command entry while timeout comp
handler" increase the possibility to bump into this temporarily failure
as it delays the entry index release for non-callback commands.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Once driver detects a command interface command timeout, it warns the
user and returns timeout error to the caller. In such case, the entry of
the command is not evacuated (because only real event interrupt is allowed
to clear command interface entry). If the HW event interrupt
of this entry will never arrive, this entry will be left unused forever.
Command interface entries are limited and eventually we can end up without
the ability to post a new command.
In addition, if driver will not consume the EQE of the lost interrupt and
rearm the EQ, no new interrupts will arrive for other commands.
Add a resiliency mechanism for manually polling the command EQ in case of
a command timeout. In case resiliency mechanism will find non-handled EQE,
it will consume it, and the command interface will be fully functional
again. Once the resiliency flow finished, wait another 5 seconds for the
command interface to complete for this command entry.
Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency flow.
Add an async EQ spinlock to avoid races between resiliency flows and real
interrupts that might run simultaneously.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Upon command completion timeout, driver simulates a forced command
completion. In a rare case where real interrupt for that command arrives
simultaneously, it might release the command entry while the forced
handler might still access it.
Fix that by adding an entry refcount, to track current amount of allowed
handlers. Command entry to be released only when this refcount is
decremented to zero.
Command refcount is always initialized to one. For callback commands,
command completion handler is the symmetric flow to decrement it. For
non-callback commands, it is wait_func().
Before ringing the doorbell, increment the refcount for the real completion
handler. Once the real completion handler is called, it will decrement it.
For callback commands, once the delayed work is scheduled, increment the
refcount. Upon callback command completion handler, we will try to cancel
the timeout callback. In case of success, we need to decrement the callback
refcount as it will never run.
In addition, gather the entry index free and the entry free into a one
flow for all command types release.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
As part of driver unload, it destroys the commands EQ (via FW command).
As the commands EQ is destroyed, FW will not generate EQEs for any command
that driver sends afterwards. Driver should poll for later commands status.
Driver commands mode metadata is updated before the commands EQ is
actually destroyed. This can lead for double completion handle by the
driver (polling and interrupt), if a command is executed and completed by
FW after the mode was changed, but before the EQ was destroyed.
Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only DESTROY_EQ command can be executed during this time period.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"Two fixes for this week:
- The addition of a symbol export for clint_time_val, which has been
inlined into some timex functions and can be used by drivers.
- A fix to avoid calling get_cycles() before the timers have been
probed.
These both only effect !MMU systems"
* tag 'riscv-for-linus-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Check clint_time_val before use
clocksource: clint: Export clint_time_val for modules
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix one more issue related to the recent RCU-lockdep changes, a
typo in documentation and add a missing return statement to
intel_pstate.
Specifics:
- Fix up RCU usage for cpuidle on the ARM imx6q platform (Ulf
Hansson)
- Fix typo in the PM documentation (Yoann Congal)
- Add return statement that is missing after recent changes in the
intel_pstate driver (Zhang Rui)"
* tag 'pm-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ARM: imx6q: Fixup RCU usage for cpuidle
Documentation: PM: Fix a reStructuredText syntax error
cpufreq: intel_pstate: Fix missing return statement
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO fixes from Greg KH:
"Here are two small IIO driver fixes for 5.9-rc8 that resolve some
reported issues:
- driver name fixed in one driver
- device name typo fixed
Both have been in linux-next for a while with no reported problems"
* tag 'staging-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: adc: qcom-spmi-adc5: fix driver name
iio: adc: ad7124: Fix typo in device name
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Some late GPIO fixes for the v5.9 series:
- Fix compiler warnings on the OMAP when PM is disabled
- Clear the interrupt when setting edge sensitivity on the Spreadtrum
driver.
- Fix up spurious interrupts on the TC35894.
- Support threaded interrupts on the Siox controller.
- Fix resource leaks on the mockup driver.
- Fix line event handling in syscall compatible mode for the
character device.
- Fix an unitialized variable in the PCA953A driver.
- Fix access to all GPIO IRQs on the Aspeed AST2600.
- Fix line direction on the AMD FCH driver.
- Use the bitmap API instead of compiler intrinsics for bit
manipulation in the PCA953x driver"
* tag 'gpio-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: pca953x: Correctly initialize registers 6 and 7 for PCA957x
gpio: pca953x: Use bitmap API over implicit GCC extension
gpio: amd-fch: correct logic of GPIO_LINE_DIRECTION
gpio: aspeed: fix ast2600 bank properties
gpio/aspeed-sgpio: don't enable all interrupts by default
gpio/aspeed-sgpio: enable access to all 80 input & output sgpios
gpio: pca953x: Fix uninitialized pending variable
gpiolib: Fix line event handling in syscall compatible mode
gpio: mockup: fix resource leak in error path
gpio: siox: explicitly support only threaded irqs
gpio: tc35894: fix up tc35894 interrupt configuration
gpio: sprd: Clear interrupt when setting the type as edge
gpio: omap: Fix warnings if PM is disabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
- Fix deadlock when removing MEMSTICK host
- Workaround broken CMDQ on Intel GLK based IRBIS models
* tag 'mmc-v5.9-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models
memstick: Skip allocating card when removing host
|
|
2 recent commits:
cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE
on the 9 / "Laptop" chasis-type")
1fac39fd0316 ("platform/x86: intel-vbtn: Also handle tablet-mode switch on
"Detachable" and "Portable" chassis-types")
Enabled reporting of SW_TABLET_MODE on more devices since the vbtn ACPI
interface is used by the firmware on some of those devices to report this.
Testing has shown that unconditionally enabling SW_TABLET_MODE reporting
on all devices with a chassis type of 8 ("Portable") or 10 ("Notebook")
which support the VGBS method is a very bad idea.
Many of these devices are normal laptops (non 2-in-1) models with a VGBS
which always returns 0, which we translate to SW_TABLET_MODE=1. This in
turn causes userspace (libinput) to suppress events from the builtin
keyboard and touchpad, making the laptop essentially unusable.
Since the problem of wrongly reporting SW_TABLET_MODE=1 in combination
with libinput, leads to a non-usable system. Where as OTOH many people will
not even notice when SW_TABLET_MODE is not being reported, this commit
changes intel_vbtn_has_switches() to use a DMI based allow-list.
The new DMI based allow-list matches on the 31 ("Convertible") and
32 ("Detachable") chassis-types, as these clearly are 2-in-1s and
so far if they support the intel-vbtn ACPI interface they all have
properly working SW_TABLET_MODE reporting.
Besides these 2 generic matches, it also contains model specific matches
for 2-in-1 models which use a different chassis-type and which are known
to have properly working SW_TABLET_MODE reporting.
This has been tested on the following 2-in-1 devices:
Dell Venue 11 Pro 7130 vPro
HP Pavilion X2 10-p002nd
HP Stream x360 Convertible PC 11
Medion E1239T
Fixes: cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type")
BugLink: https://forum.manjaro.org/t/keyboard-and-touchpad-only-work-on-kernel-5-6/22668
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1175599
Cc: Barnabás Pőcze <pobrn@protonmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
the HP Pavilion 11 x360"
After discussion, see the Link tag, it appears that this is not good enough.
So, revert it now and apply a better fix.
This reverts commit d823346876a970522ff9e4d2b323c9b734dcc4de.
Link: https://lore.kernel.org/platform-driver-x86/s5hft71klxl.wl-tiwai@suse.de/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|