Age | Commit message (Collapse) | Author |
|
After calling ethtool -g it was not possible to adjust the TX ring
size again:
# ethtool -g eth1
Ring parameters for eth1:
Pre-set maximums:
RX: 4
RX Mini: n/a
RX Jumbo: n/a
TX: 10
Current hardware settings:
RX: 4
RX Mini: n/a
RX Jumbo: n/a
TX: 10
# ethtool -G eth1 tx 8
netlink error: Invalid argument
The reason for this is that the readonly setting rx_pending get
initialized and after that the range check in qcaspi_set_ringparam()
fails regardless of the provided parameter. So fix this by accepting
the exposed RX defaults. Instead of adding another magic number
better use a new define here.
Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20231206141222.52029-3-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The qca_spi driver stop and restart the SPI kernel thread
(via ndo_stop & ndo_open) in case of TX ring changes. This is
a big issue because it allows userspace to prevent restart of
the SPI kernel thread (via signals). A subsequent change of
TX ring wrongly assume a valid spi_thread pointer which result
in a crash.
So prevent this by stopping the network traffic handling and
temporary park the SPI thread.
Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20231206141222.52029-2-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Initialise various workqueue tasks and queue interrupt poll task
before the first invocation of any control net APIs. Since
octep_ctrl_net_get_info was called before the control net receive
work task was initialised or even the interrupt poll task was
queued, the function call wasn't returning actual firmware
info queried from Octeon.
Fixes: 8d6198a14e2b ("octeon_ep: support to fetch firmware info")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Link: https://lore.kernel.org/r/20231206135228.2591659-1-srasheed@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In __team_options_register, team_options are allocated and appended to
the team's option_list.
If one option instance allocation fails, the "inst_rollback" cleanup
path frees the previously allocated options but doesn't remove them from
the team's option_list.
This leaves dangling pointers that can be dereferenced later by other
parts of the team driver that iterate over options.
This patch fixes the cleanup path to remove the dangling pointers from
the list.
As far as I can tell, this uaf doesn't have much security implications
since it would be fairly hard to exploit (an attacker would need to make
the allocation of that specific small object fail) but it's still nice
to fix.
Cc: stable@vger.kernel.org
Fixes: 80f7c6683fe0 ("team: add support for per-port options")
Signed-off-by: Florent Revest <revest@chromium.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20231206123719.1963153-1-revest@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2023-12-04
This series provides bug fixes to mlx5 driver.
V1->V2:
- Drop commit #9 ("net/mlx5e: Forbid devlink reload if IPSec rules are
offloaded"), we are working on a better fix
Please pull and let me know if there is any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide a list of valid protocols for which the driver will provide
it's deferred xmit handler.
When using DSA_TAG_PROTO_KSZ8795 protocol, it does not provide a
"connect" method, therefor ksz_connect() is not allocating ksz_tagger_data.
This avoids the following null pointer dereference:
ksz_connect_tag_protocol from dsa_register_switch+0x9ac/0xee0
dsa_register_switch from ksz_switch_register+0x65c/0x828
ksz_switch_register from ksz_spi_probe+0x11c/0x168
ksz_spi_probe from spi_probe+0x84/0xa8
spi_probe from really_probe+0xc8/0x2d8
Fixes: ab32f56a4100 ("net: dsa: microchip: ptp: add packet transmission timestamping")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231206071655.1626479-1-sean@geanix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The neighbour event callback call the function nfp_tun_write_neigh,
this function will take a mutex lock and it is in soft irq context,
change the work queue to process the neighbour event.
Move the nfp_tun_write_neigh function out of range rcu_read_lock/unlock()
in function nfp_tunnel_request_route_v4 and nfp_tunnel_request_route_v6.
Fixes: abc210952af7 ("nfp: flower: tunnel neigh support bond offload")
CC: stable@vger.kernel.org # 6.2+
Signed-off-by: Hui Zhou <hui.zhou@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-12-05 (ice, i40e, iavf)
This series contains updates to ice, i40e and iavf drivers.
Michal fixes incorrect usage of VF MSIX value and index calculation for
ice.
Marcin restores disabling of Rx VLAN filtering which was inadvertently
removed for ice.
Ivan Vecera corrects improper messaging of MFS port for i40e.
Jake fixes incorrect checking of coalesce values on iavf.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: validate tx_coalesce_usecs even if rx_coalesce_usecs is zero
i40e: Fix unexpected MFS warning message
ice: Restore fix disabling RX VLAN filtering
ice: change vfs.num_msix_per to vf->num_msix
====================
Link: https://lore.kernel.org/r/20231205211918.2123019-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In 4a56212774ac, USXGMII support was added for 6393X, but this was
lost in the PCS conversion (the blamed commit), most likely because
these efforts where more or less done in parallel.
Restore this feature by porting Michal's patch to fit the new
implementation.
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Michal Smulski <michal.smulski@ooma.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Fixes: e5b732a275f5 ("net: dsa: mv88e6xxx: convert 88e639x to phylink_pcs")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Link: https://lore.kernel.org/r/20231205221359.3926018-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On new silicons the TX channels for transmit level has increased.
This patch fixes the respective register offset range to
configure the newly added channels.
Fixes: b279bbb3314e ("octeontx2-af: NIX Tx scheduler queue config support")
Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If mcs resources are attached to PF/VF. These resources need
to be freed on FLR. This patch add missing mcs flr call on PF FLR.
Fixes: bd69476e86fc ("octeontx2-af: cn10k: mcs: Install a default TCAM for normal traffic")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This patch adds the miss mcs stats register
for mcs supported platforms.
Fixes: 9312150af8da ("octeontx2-af: cn10k: mcs: Support for stats collection")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On latest silicon versions SA cam entries increased to 256.
This patch fixes the datatype of sa_entries in mcs_hw_info
struct to u16 to hold 256 entries.
Fixes: 080bbd19c9dd ("octeontx2-af: cn10k: mcs: Add mailboxes for port related operations")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When MCS external bypass is disabled, MCS returns additional
2 credits(32B) for every packet Tx'ed on LMAC. To account for
these extra credits, NIX_AF_TX_LINKX_NORM_CREDIT.CC_MCS_CNT
needs to be configured as otherwise NIX Tx credits would overflow
and will never be returned to idle state credit count
causing issues with credit control and MTU change.
This patch fixes the same by configuring CC_MCS_CNT at probe
time for MCS enabled SoC's
Fixes: bd69476e86fc ("octeontx2-af: cn10k: mcs: Install a default TCAM for normal traffic")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If a xge port just connect with an optical module and no fiber,
it may have a fake link up because there may be interference on
the hardware. This patch adds an anti-shake to avoid the problem.
And the time of anti-shake is base on tests.
Fixes: b917078c1c10 ("net: hns: Add ACPI support to check SFP present")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Upon changing the tx feature, the hns driver will modify the
maybe_stop_tx() and fill_desc() functions, if the modify happens
during packet sending, will cause the hardware and software
pointers do not match, and the port can not work anymore.
This patch deletes the maybe_stop_tx() and fill_desc() functions
modification when setting tx feature, and use the skb_is_gro()
to determine which functions to use in the tx path.
Fixes: 38f616da1c28 ("net:hns: Add support of ethtool TSO set option for Hip06 in HNS")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If is_ptp_ring == true in the loop of __aq_ring_xdp_clean function,
then a timestamp is stored from a packet in a field of skb object,
which is not allocated at the moment of the call (skb == NULL).
Generalize aq_ptp_extract_ts and other affected functions so they don't
work with struct sk_buff*, but with struct skb_shared_hwtstamps*.
Found by Linux Verification Center (linuxtesting.org) with SVACE
Fixes: 26efaef759a1 ("net: atlantic: Implement xdp data plane")
Signed-off-by: Daniil Maximov <daniil31415it@gmail.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://lore.kernel.org/r/20231204085810.1681386-1-daniil31415it@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The ASUS USB-C2500 is an RTL8156 based 2.5G Ethernet controller.
Add the vendor and product ID values to the driver. This makes Ethernet
work with the adapter.
Signed-off-by: Kelly Kane <kelly@hawknetworks.com>
Link: https://lore.kernel.org/r/20231203011712.6314-1-kelly@hawknetworks.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently ionic_dim_work() is incorrect when in
split interrupt mode. This is because the interrupt
rate is only being changed for the Rx side even for
dim running on Tx. Fix this by using the qcq from
the container_of macro. Also, introduce some local
variables for a bit of cleanup.
Fixes: a6ff85e0a2d9 ("ionic: remove intr coalesce update from napi")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231204192234.21017-3-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Our friendly kernel test robot has reminded us that with a new
check we have a warning about a potential string truncation.
In this case it really doesn't hurt anything, but it is worth
addressing especially since there really is no reason to reserve
so many bytes for our queue names. It seems that cutting the
queue name buffer length in half stops the complaint.
Fixes: c06107cabea3 ("ionic: more ionic name tweaks")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311300201.lO8v7mKU-lkp@intel.com/
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231204192234.21017-2-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When flow_indr_dev_register() fails, bnxt_init_tc will free
bp->tc_info through kfree(). However, the caller function
bnxt_init_one() will ignore this failure and call
bnxt_shutdown_tc() on failure of bnxt_dl_register(), where
a use-after-free happens. Fix this issue by setting
bp->tc_info to NULL after kfree().
Fixes: 627c89d00fb9 ("bnxt_en: flow_offload: offload tunnel decap rules via indirect callbacks")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Link: https://lore.kernel.org/r/20231204024004.8245-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Based on the previous allocated packet, page_offset can be not null
in veth_convert_skb_to_xdp_buff routine.
Take into account page fragment offset during the skb paged area copy
in veth_convert_skb_to_xdp_buff().
Fixes: 2d0de67da51a ("net: veth: use newly added page pool API for veth with xdp")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Link: https://lore.kernel.org/r/eddfe549e7e626870071930964ac3c38a1dc8068.1701702000.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In __iavf_set_coalesce, the driver checks both ec->rx_coalesce_usecs and
ec->tx_coalesce_usecs for validity. It does this via a chain if if/else-if
blocks. If every single branch of the series of if statements exited, this
would be fine. However, the rx_coalesce_usecs is checked against zero to
print an informative message if use_adaptive_rx_coalesce is enabled. If
this check is true, it short circuits the entire chain of statements,
preventing validation of the tx_coalesce_usecs field.
Indeed, since commit e792779e6b63 ("iavf: Prevent changing static ITR
values if adaptive moderation is on") the iavf driver actually rejects any
change to the tx_coalesce_usecs or rx_coalesce_usecs when
use_adaptive_tx_coalesce or use_adaptive_rx_coalesce is enabled, making
this checking a bit redundant.
Fix this error by removing the unnecessary and redundant checks for
use_adaptive_rx_coalesce and use_adaptive_tx_coalesce. Since zero is a
valid value, and since the tx_coalesce_usecs and rx_coalesce_usecs fields
are already unsigned, remove the minimum value check. This allows assigning
an ITR value ranging from 0-8160 as described by the printed message.
Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Commit 3a2c6ced90e1 ("i40e: Add a check to see if MFS is set") added
a warning message that reports unexpected size of port's MFS (max
frame size) value. This message use for the port number local
variable 'i' that is wrong.
In i40e_probe() this 'i' variable is used only to iterate VSIs
to find FDIR VSI:
<code>
...
/* if FDIR VSI was set up, start it now */
for (i = 0; i < pf->num_alloc_vsi; i++) {
if (pf->vsi[i] && pf->vsi[i]->type == I40E_VSI_FDIR) {
i40e_vsi_open(pf->vsi[i]);
break;
}
}
...
</code>
So the warning message use for the port number index of FDIR VSI
if this exists or pf->num_alloc_vsi if not.
Fix the message by using 'pf->hw.port' for the port number.
Fixes: 3a2c6ced90e1 ("i40e: Add a check to see if MFS is set")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix setting dis_rx_filtering depending on whether port vlan is being
turned on or off. This was originally fixed in commit c793f8ea15e3 ("ice:
Fix disabling Rx VLAN filtering with port VLAN enabled"), but while
refactoring ice_vf_vsi_init_vlan_ops(), the fix has been lost. Restore the
fix along with the original comment from that change.
Also delete duplicate lines in ice_port_vlan_on().
Fixes: 2946204b3fa8 ("ice: implement bridge port vlan")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
vfs::num_msix_per should be only used as default value for
vf->num_msix. For other use cases vf->num_msix should be used, as VF can
have different MSI-X amount values.
Fix incorrect register index calculation. vfs::num_msix_per and
pf->sriov_base_vector shouldn't be used after implementation of changing
MSI-X amount on VFs. Instead vf->first_vector_idx should be used, as it
is storing value for first irq index.
Fixes: fe1c5ca2fe76 ("ice: implement num_msix field per VF")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The rvu_dl will be freed in rvu_npa_health_reporters_destroy(rvu_dl)
after the create_workqueue fails, and after that free, the rvu_dl will
be translate back through rvu_npa_health_reporters_create,
rvu_health_reporters_create, and rvu_register_dl. Finally it goes to the
err_dl_health label, being freed again in
rvu_health_reporters_destroy(rvu) by rvu_npa_health_reporters_destroy.
In the second calls of rvu_npa_health_reporters_destroy, however,
it uses rvu_dl->rvu_npa_health_reporter, which is already freed at
the end of rvu_npa_health_reporters_destroy in the first call.
So this patch prevents the first destroy by instantly returning -ENONMEN
when create_workqueue fails. In addition, since the failure of
create_workqueue is the only entrence of label err, it has been
integrated into the error-handling path of create_workqueue.
Fixes: f1168d1e207c ("octeontx2-af: Add devlink health reporters for NPA")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Geethasowjanya Akula <gakula@marvell.com>
Link: https://lore.kernel.org/r/20231202095902.3264863-1-alexious@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The mlx5_esw_offloads_devlink_port() function returns error pointers, not
NULL.
Fixes: 7bef147a6ab6 ("net/mlx5: Don't skip vport check")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Previously, when comparing the net namespaces, the case where the netdev
doesn't exist wasn't taken into account, and therefore can cause a crash.
In such a case, the comparing function should return false, as there is no
netdev->net to compare the devlink->net to.
Furthermore, this will result in an attempt to enter switchdev mode
without a netdev to fail, and which is the desired result as there is no
meaning in switchdev mode without a net device.
Fixes: 662404b24a4c ("net/mlx5e: Block entering switchdev mode with ns inconsistency")
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Current sync reset flow is not supported when PCIe bridge connected
directly to mlx5 device has HotPlug interrupt enabled and can be
triggered on link state change event. Return nack on reset request in
such case.
Fixes: 92501fa6e421 ("net/mlx5: Ack on sync_reset_request only if PF can do reset_now")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
If post action is not supported, eg. ignore_flow_level is not
supported, don't offload post action rule. Otherwise, will hit
panic [1].
Fix it by checking if post action table is valid or not.
[1]
[445537.863880] BUG: unable to handle page fault for address: ffffffffffffffb1
[445537.864617] #PF: supervisor read access in kernel mode
[445537.865244] #PF: error_code(0x0000) - not-present page
[445537.865860] PGD 70683a067 P4D 70683a067 PUD 70683c067 PMD 0
[445537.866497] Oops: 0000 [#1] PREEMPT SMP NOPTI
[445537.867077] CPU: 19 PID: 248742 Comm: tc Kdump: loaded Tainted: G O 6.5.0+ #1
[445537.867888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[445537.868834] RIP: 0010:mlx5e_tc_post_act_add+0x51/0x130 [mlx5_core]
[445537.869635] Code: c0 0d 00 00 e8 20 96 c6 d3 48 85 c0 0f 84 e5 00 00 00 c7 83 b0 01 00 00 00 00 00 00 49 89 c5 31 c0 31 d2 66 89 83 b4 01 00 00 <49> 8b 44 24 10 83 23 df 83 8b d8 01 00 00 04 48 89 83 c0 01 00 00
[445537.871318] RSP: 0018:ffffb98741cef428 EFLAGS: 00010246
[445537.871962] RAX: 0000000000000000 RBX: ffff8df341167000 RCX: 0000000000000001
[445537.872704] RDX: 0000000000000000 RSI: ffffffff954844e1 RDI: ffffffff9546e9cb
[445537.873430] RBP: ffffb98741cef448 R08: 0000000000000020 R09: 0000000000000246
[445537.874160] R10: 0000000000000000 R11: ffffffff943f73ff R12: ffffffffffffffa1
[445537.874893] R13: ffff8df36d336c20 R14: ffffffffffffffa1 R15: ffff8df341167000
[445537.875628] FS: 00007fcd6564f800(0000) GS:ffff8dfa9ea00000(0000) knlGS:0000000000000000
[445537.876425] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[445537.877090] CR2: ffffffffffffffb1 CR3: 00000003b5884001 CR4: 0000000000770ee0
[445537.877832] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[445537.878564] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[445537.879300] PKRU: 55555554
[445537.879797] Call Trace:
[445537.880263] <TASK>
[445537.880713] ? show_regs+0x6e/0x80
[445537.881232] ? __die+0x29/0x70
[445537.881731] ? page_fault_oops+0x85/0x160
[445537.882276] ? search_exception_tables+0x65/0x70
[445537.882852] ? kernelmode_fixup_or_oops+0xa2/0x120
[445537.883432] ? __bad_area_nosemaphore+0x18b/0x250
[445537.884019] ? bad_area_nosemaphore+0x16/0x20
[445537.884566] ? do_kern_addr_fault+0x8b/0xa0
[445537.885105] ? exc_page_fault+0xf5/0x1c0
[445537.885623] ? asm_exc_page_fault+0x2b/0x30
[445537.886149] ? __kmem_cache_alloc_node+0x1df/0x2a0
[445537.886717] ? mlx5e_tc_post_act_add+0x51/0x130 [mlx5_core]
[445537.887431] ? mlx5e_tc_post_act_add+0x30/0x130 [mlx5_core]
[445537.888172] alloc_flow_post_acts+0xfb/0x1c0 [mlx5_core]
[445537.888849] parse_tc_actions+0x582/0x5c0 [mlx5_core]
[445537.889505] parse_tc_fdb_actions+0xd7/0x1f0 [mlx5_core]
[445537.890175] __mlx5e_add_fdb_flow+0x1ab/0x2b0 [mlx5_core]
[445537.890843] mlx5e_add_fdb_flow+0x56/0x120 [mlx5_core]
[445537.891491] ? debug_smp_processor_id+0x1b/0x30
[445537.892037] mlx5e_tc_add_flow+0x79/0x90 [mlx5_core]
[445537.892676] mlx5e_configure_flower+0x305/0x450 [mlx5_core]
[445537.893341] mlx5e_rep_setup_tc_cls_flower+0x3d/0x80 [mlx5_core]
[445537.894037] mlx5e_rep_setup_tc_cb+0x5c/0xa0 [mlx5_core]
[445537.894693] tc_setup_cb_add+0xdc/0x220
[445537.895177] fl_hw_replace_filter+0x15f/0x220 [cls_flower]
[445537.895767] fl_change+0xe87/0x1190 [cls_flower]
[445537.896302] tc_new_tfilter+0x484/0xa50
Fixes: f0da4daa3413 ("net/mlx5e: Refactor ct to use post action infrastructure")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Automatic Verification <verifier@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shachar Kagan <skagan@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
|
|
Due to the cited patch, devlink health commands take devlink lock and
this may result in deadlock for mlx5e_tx_reporter as it takes local
state_lock before calling devlink health report and on the other hand
devlink health commands such as diagnose for same reporter take local
state_lock after taking devlink lock (see kernel log below).
To fix it, remove local state_lock from mlx5e_tx_timeout_work() before
calling devlink_health_report() and take care to cancel the work before
any call to close channels, which may free the SQs that should be
handled by the work. Before cancel_work_sync(), use current_work() to
check we are not calling it from within the work, as
mlx5e_tx_timeout_work() itself may close the channels and reopen as part
of recovery flow.
While removing state_lock from mlx5e_tx_timeout_work() keep rtnl_lock to
ensure no change in netdev->real_num_tx_queues, but use rtnl_trylock()
and a flag to avoid deadlock by calling cancel_work_sync() before
closing the channels while holding rtnl_lock too.
Kernel log:
======================================================
WARNING: possible circular locking dependency detected
6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1 Not tainted
------------------------------------------------------
kworker/u16:2/65 is trying to acquire lock:
ffff888122f6c2f8 (&devlink->lock_key#2){+.+.}-{3:3}, at: devlink_health_report+0x2f1/0x7e0
but task is already holding lock:
ffff888121d20be0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_tx_timeout_work+0x70/0x280 [mlx5_core]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&priv->state_lock){+.+.}-{3:3}:
__mutex_lock+0x12c/0x14b0
mlx5e_rx_reporter_diagnose+0x71/0x700 [mlx5_core]
devlink_nl_cmd_health_reporter_diagnose_doit+0x212/0xa50
genl_family_rcv_msg_doit+0x1e9/0x2f0
genl_rcv_msg+0x2e9/0x530
netlink_rcv_skb+0x11d/0x340
genl_rcv+0x24/0x40
netlink_unicast+0x438/0x710
netlink_sendmsg+0x788/0xc40
sock_sendmsg+0xb0/0xe0
__sys_sendto+0x1c1/0x290
__x64_sys_sendto+0xdd/0x1b0
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
-> #0 (&devlink->lock_key#2){+.+.}-{3:3}:
__lock_acquire+0x2c8a/0x6200
lock_acquire+0x1c1/0x550
__mutex_lock+0x12c/0x14b0
devlink_health_report+0x2f1/0x7e0
mlx5e_health_report+0xc9/0xd7 [mlx5_core]
mlx5e_reporter_tx_timeout+0x2ab/0x3d0 [mlx5_core]
mlx5e_tx_timeout_work+0x1c1/0x280 [mlx5_core]
process_one_work+0x7c2/0x1340
worker_thread+0x59d/0xec0
kthread+0x28f/0x330
ret_from_fork+0x1f/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&priv->state_lock);
lock(&devlink->lock_key#2);
lock(&priv->state_lock);
lock(&devlink->lock_key#2);
*** DEADLOCK ***
4 locks held by kworker/u16:2/65:
#0: ffff88811a55b138 ((wq_completion)mlx5e#2){+.+.}-{0:0}, at: process_one_work+0x6e2/0x1340
#1: ffff888101de7db8 ((work_completion)(&priv->tx_timeout_work)){+.+.}-{0:0}, at: process_one_work+0x70f/0x1340
#2: ffffffff84ce8328 (rtnl_mutex){+.+.}-{3:3}, at: mlx5e_tx_timeout_work+0x53/0x280 [mlx5_core]
#3: ffff888121d20be0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_tx_timeout_work+0x70/0x280 [mlx5_core]
stack backtrace:
CPU: 1 PID: 65 Comm: kworker/u16:2 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x7d
check_noncircular+0x278/0x300
? print_circular_bug+0x460/0x460
? find_held_lock+0x2d/0x110
? __stack_depot_save+0x24c/0x520
? alloc_chain_hlocks+0x228/0x700
__lock_acquire+0x2c8a/0x6200
? register_lock_class+0x1860/0x1860
? kasan_save_stack+0x1e/0x40
? kasan_set_free_info+0x20/0x30
? ____kasan_slab_free+0x11d/0x1b0
? kfree+0x1ba/0x520
? devlink_health_do_dump.part.0+0x171/0x3a0
? devlink_health_report+0x3d5/0x7e0
lock_acquire+0x1c1/0x550
? devlink_health_report+0x2f1/0x7e0
? lockdep_hardirqs_on_prepare+0x400/0x400
? find_held_lock+0x2d/0x110
__mutex_lock+0x12c/0x14b0
? devlink_health_report+0x2f1/0x7e0
? devlink_health_report+0x2f1/0x7e0
? mutex_lock_io_nested+0x1320/0x1320
? trace_hardirqs_on+0x2d/0x100
? bit_wait_io_timeout+0x170/0x170
? devlink_health_do_dump.part.0+0x171/0x3a0
? kfree+0x1ba/0x520
? devlink_health_do_dump.part.0+0x171/0x3a0
devlink_health_report+0x2f1/0x7e0
mlx5e_health_report+0xc9/0xd7 [mlx5_core]
mlx5e_reporter_tx_timeout+0x2ab/0x3d0 [mlx5_core]
? lockdep_hardirqs_on_prepare+0x400/0x400
? mlx5e_reporter_tx_err_cqe+0x1b0/0x1b0 [mlx5_core]
? mlx5e_tx_reporter_timeout_dump+0x70/0x70 [mlx5_core]
? mlx5e_tx_reporter_dump_sq+0x320/0x320 [mlx5_core]
? mlx5e_tx_timeout_work+0x70/0x280 [mlx5_core]
? mutex_lock_io_nested+0x1320/0x1320
? process_one_work+0x70f/0x1340
? lockdep_hardirqs_on_prepare+0x400/0x400
? lock_downgrade+0x6e0/0x6e0
mlx5e_tx_timeout_work+0x1c1/0x280 [mlx5_core]
process_one_work+0x7c2/0x1340
? lockdep_hardirqs_on_prepare+0x400/0x400
? pwq_dec_nr_in_flight+0x230/0x230
? rwlock_bug.part.0+0x90/0x90
worker_thread+0x59d/0xec0
? process_one_work+0x1340/0x1340
kthread+0x28f/0x330
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
</TASK>
Fixes: c90005b5f75c ("devlink: Hold the instance lock in health callbacks")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
IPsec FDB offload can only work with FW steering as of now,
disable the cap upon non FW steering.
And since the IPSec cap is dynamic now based on steering mode.
Cleanup the resources if they exist instead of checking the
IPsec cap again.
Fixes: edd8b295f9e2 ("Merge branch 'mlx5-ipsec-packet-offload-support-in-eswitch-mode'")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
After IPSec TX tables are destroyed, the flow rules in TC rhashtable,
which have the destination to IPSec, are restored to the original
one, the uplink.
However, when the device is in switchdev mode and unload driver with
IPSec rules configured, TC rhashtable cleanup is done before IPSec
cleanup, which means tc_ht->tbl is already freed when walking TC
rhashtable, in order to restore the destination. So add the checking
before walking to avoid unexpected behavior.
Fixes: d1569537a837 ("net/mlx5e: Modify and restore TC rules for IPSec TX rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Currently eswitch mode_lock is so heavy, for example, it's locked
during the whole process of the mode change, which may need to hold
other locks. As the mode_lock is also used by IPSec to block mode and
encap change now, it is easy to cause lock dependency.
Since some of protections are also done by devlink lock, the eswitch
mode_lock is not needed at those places, and thus the possibility of
lockdep issue is reduced.
Fixes: c8e350e62fc5 ("net/mlx5e: Make TC and IPsec offloads mutually exclusive on a netdev")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
IPsec NAT-T packets are UDP encapsulated packets over ESP normal ones.
In case they arrive to RX, the SPI and ESP are located in inner header,
while the check was performed on outer header instead.
That wrong check caused to the situation where received rekeying request
was missed and caused to rekey timeout, which "compensated" this failure
by completing rekeying.
Fixes: d65954934937 ("net/mlx5e: Support IPsec NAT-T functionality")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
After IPsec decryption it isn't enough to only check the IPsec syndrome
but need to also check the ASO syndrome in order to verify that the
operation was actually successful.
Verify that both syndromes are actually zero and in case not drop the
packet and increment the appropriate flow counter for the drop reason.
Fixes: 6b5c45e16e43 ("net/mlx5e: Configure IPsec packet offload flow steering")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
After previous commit, which unified various IPsec creation modes,
there is no need to have struct mlx5e_ipsec_rx exposed in global
IPsec header. Move it to ipsec_fs.c to be placed together with
already existing struct mlx5e_ipsec_tx.
Fixes: 1762f132d542 ("net/mlx5e: Support IPsec packet offload for RX in switchdev mode")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Change normal IPsec flow to use the same creation/destruction functions
for status flow table as that of ESW, which first of all refines the
code to have less code duplication.
And more importantly, the ESW status table handles IPsec syndrome
checks at steering by HW, which is more efficient than the previous
behaviour we had where it was copied to WQE meta data and checked
by the driver.
Fixes: 1762f132d542 ("net/mlx5e: Support IPsec packet offload for RX in switchdev mode")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
According to RFC4303, section "3.3.3. Sequence Number Generation",
the first packet sent using a given SA will contain a sequence
number of 1.
However if user didn't set seq/oseq, the HW used zero as first sequence
packet number. Such misconfiguration causes to drop of first packet
if replay window protection was enabled in SA.
To fix it, set sequence number to be at least 1.
Fixes: 7db21ef4566e ("net/mlx5e: Set IPsec replay sequence numbers")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Users can configure IPsec replay window size, but mlx5 driver didn't
honor their choice and set always 32bits. Fix assignment logic to
configure right size from the beginning.
Fixes: 7db21ef4566e ("net/mlx5e: Set IPsec replay sequence numbers")
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The status bits of register MAC_FPE_CTRL_STS are clear on read. Using
32-bit read for MAC_FPE_CTRL_STS in dwmac5_fpe_configure() and
dwmac5_fpe_send_mpacket() clear the status bits. Then the stmmac interrupt
handler missing FPE event status and leads to FPE handshaking failure and
retries.
To avoid clear status bits of MAC_FPE_CTRL_STS in dwmac5_fpe_configure()
and dwmac5_fpe_send_mpacket(), add fpe_csr to stmmac_fpe_cfg structure to
cache the control bits of MAC_FPE_CTRL_STS and to avoid reading
MAC_FPE_CTRL_STS in those methods.
Fixes: 5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure")
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jianheng Zhang <Jianheng.Zhang@synopsys.com>
Link: https://lore.kernel.org/r/CY5PR12MB637225A7CF529D5BE0FBE59CBF81A@CY5PR12MB6372.namprd12.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
coalescing
The current adaptive interrupt coalescing code updates only rx
packet stats for dim algorithm. This patch also updates tx packet
stats which will be useful when there is only tx traffic.
Also moved configuring hardware adaptive interrupt setting to
driver dim callback.
Fixes: 6e144b47f560 ("octeontx2-pf: Add support for adaptive interrupt coalescing")
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://lore.kernel.org/r/20231201053330.3903694-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Probe of Sohard Arcnet cards fails,
if 2 or more cards are installed in a system.
See kernel log:
[ 2.759203] arcnet: arcnet loaded
[ 2.763648] arcnet:com20020: COM20020 chipset support (by David Woodhouse et al.)
[ 2.770585] arcnet:com20020_pci: COM20020 PCI support
[ 2.772295] com20020 0000:02:00.0: enabling device (0000 -> 0003)
[ 2.772354] (unnamed net_device) (uninitialized): PLX-PCI Controls
...
[ 3.071301] com20020 0000:02:00.0 arc0-0 (uninitialized): PCI COM20020: station FFh found at F080h, IRQ 101.
[ 3.071305] com20020 0000:02:00.0 arc0-0 (uninitialized): Using CKP 64 - data rate 2.5 Mb/s
[ 3.071534] com20020 0000:07:00.0: enabling device (0000 -> 0003)
[ 3.071581] (unnamed net_device) (uninitialized): PLX-PCI Controls
...
[ 3.369501] com20020 0000:07:00.0: Led pci:green:tx:0-0 renamed to pci:green:tx:0-0_1 due to name collision
[ 3.369535] com20020 0000:07:00.0: Led pci:red:recon:0-0 renamed to pci:red:recon:0-0_1 due to name collision
[ 3.370586] com20020 0000:07:00.0 arc0-0 (uninitialized): PCI COM20020: station E1h found at C000h, IRQ 35.
[ 3.370589] com20020 0000:07:00.0 arc0-0 (uninitialized): Using CKP 64 - data rate 2.5 Mb/s
[ 3.370608] com20020: probe of 0000:07:00.0 failed with error -5
commit 5ef216c1f848 ("arcnet: com20020-pci: add rotary index support")
changes the device name of all COM20020 based PCI cards,
even if only some cards support this:
snprintf(dev->name, sizeof(dev->name), "arc%d-%d", dev->dev_id, i);
The error happens because all Sohard Arcnet cards would be called arc0-0,
since the Sohard Arcnet cards don't have a PLX rotary coder.
I.e. EAE Arcnet cards have a PLX rotary coder,
which sets the first decimal, ensuring unique devices names.
This patch adds two new card feature flags to indicate
which cards support LEDs and the PLX rotary coder.
For EAE based cards the names still depend on the PLX rotary coder
(untested, since missing EAE hardware).
For Sohard based cards, this patch will result in devices
being called arc0, arc1, ... (tested).
Signed-off-by: Thomas Reichinger <thomas.reichinger@sohard.de>
Fixes: 5ef216c1f848 ("arcnet: com20020-pci: add rotary index support")
Link: https://lore.kernel.org/r/20231130113503.6812-1-thomas.reichinger@sohard.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Delay loops in r8152 should break out if RTL8152_INACCESSIBLE is set
so that they don't delay too long if the device becomes
inaccessible. Add the break to the loop in r8153_aldps_en().
Fixes: 4214cc550bf9 ("r8152: check if disabling ALDPS is finished")
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Delay loops in r8152 should break out if RTL8152_INACCESSIBLE is set
so that they don't delay too long if the device becomes
inaccessible. Add the break to the loop in r8153_pre_firmware_1().
Fixes: 9370f2d05a2a ("r8152: support request_firmware for RTL8153")
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Delay loops in r8152 should break out if RTL8152_INACCESSIBLE is set
so that they don't delay too long if the device becomes
inaccessible. Add the break to the loop in
r8156b_wait_loading_flash().
Fixes: 195aae321c82 ("r8152: support new chips")
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previous commits added checks for RTL8152_INACCESSIBLE in the loops in
the driver. There are still a few more that keep tripping the driver
up in error cases and make things take longer than they should. Add
those in.
All the loops that are part of this commit existed in some form or
another since the r8152 driver was first introduced, though
RTL8152_INACCESSIBLE was known as RTL8152_UNPLUG before commit
715f67f33af4 ("r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE")
Fixes: ac718b69301c ("net/usb: new driver for RTL8152")
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As of commit d9962b0d4202 ("r8152: Block future register access if
register access fails") there is a race condition that can happen
between the USB device reset thread and napi_enable() (not) getting
called during rtl8152_open(). Specifically:
* While rtl8152_open() is running we get a register access error
that's _not_ -ENODEV and queue up a USB reset.
* rtl8152_open() exits before calling napi_enable() due to any reason
(including usb_submit_urb() returning an error).
In that case:
* Since the USB reset is perform in a separate thread asynchronously,
it can run at anytime USB device lock is not held - even before
rtl8152_open() has exited with an error and caused __dev_open() to
clear the __LINK_STATE_START bit.
* The rtl8152_pre_reset() will notice that the netif_running() returns
true (since __LINK_STATE_START wasn't cleared) so it won't exit
early.
* rtl8152_pre_reset() will then hang in napi_disable() because
napi_enable() was never called.
We can fix the race by making sure that the r8152 reset routines don't
run at the same time as we're opening the device. Specifically we need
the reset routines in their entirety rely on the return value of
netif_running(). The only way to reliably depend on that is for them
to hold the rntl_lock() mutex for the duration of reset.
Grabbing the rntl_lock() mutex for the duration of reset seems like a
long time, but reset is not expected to be common and the rtnl_lock()
mutex is already held for long durations since the core grabs it
around the open/close calls.
Fixes: d9962b0d4202 ("r8152: Block future register access if register access fails")
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When FIFO reaches near full state, device will issue pause frame.
If pause slot is enabled(set to 1), in this time, device will issue
pause frame only once. But if pause slot is disabled(set to 0), device
will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive
packets, device FIFO will reach near full state and keep sending
pause frames. That will impact entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in
developer mode running a test image (and v5.10 kernel):
1) ping -f $CHROMEBOX (from workstation on same local network)
2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
3) ping $ROUTER (wait until ping fails from workstation)
Takes about ~20-30 seconds after step 2 for the local network to
stop working.
Fix this issue by enabling pause slot to only send pause frame once
when FIFO reaches near full state.
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
Reported-by: Grant Grundler <grundler@chromium.org>
Tested-by: Grant Grundler <grundler@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: ChunHao Lin <hau@realtek.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/20231129155350.5843-1-hau@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|