Age | Commit message (Collapse) | Author |
|
Drivers call netdev_tx_completed_queue() right before
netif_txq_maybe_wake(). If BQL is enabled netdev_tx_completed_queue()
should issue a memory barrier, so we can depend on that separating
the stop check from the consumer index update, instead of adding
another barrier in netif_txq_maybe_wake().
This matters more than the barriers on the xmit path, because
the wake condition is almost always true. So we issue the
consumer side barrier often.
Wrap netdev_tx_completed_queue() in a local helper to issue
the barrier even if BQL is disabled. Keep the same semantics
as netdev_tx_completed_queue() (barrier only if bytes != 0)
to make it clear that the barrier is conditional.
Plus since macro gets pkt/byte counts as arguments now -
we can skip waking if there were no packets completed.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert ixgbe to use the new macros, I think a lot of people
copy the ixgbe code. The only functional change is that the
unlikely() in ixgbe_clean_tx_irq() turns into a likely()
inside the new macro and no longer includes
total_packets && netif_carrier_ok(tx_ring->netdev)
which is probably for the best, anyway.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VLAN filters info is currently being held in a list and 2 bitmaps
(active_cvlans and active_svlans). We are experiencing some racing where
data is not in sync in the list and bitmaps. For example, the VLAN is
initially added to the list but only when the PF replies, it is added to
the bitmap. If a user adds many V2 VLANS before the PF responds:
while [ $((i++)) ]
ip l add l eth0 name eth0.$i type vlan id $i
we might end up with more VLAN list entries than the designated limit.
Also, The "ip link show" will show more links added than the PF limit.
On the other and, the bitmaps are only used to check the number of VLAN
filters and to re-enable the filters when the interface goes from DOWN to
UP.
This patch gets rid of the bitmaps and uses the list only. To do that,
the states of the VLAN filter are modified:
1 - IAVF_VLAN_REMOVE: the entry needs to be totally removed after informing
the PF. This is the "ip link del eth0.$i" path.
2 - IAVF_VLAN_DISABLE: (new) the netdev went down. The filter needs to be
removed from the PF and then marked INACTIVE.
3 - IAVF_VLAN_INACTIVE: (new) no PF filter exists, but the user did not
delete the VLAN.
Fixes: 48ccc43ecf10 ("iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev config")
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The VLAN filter states are currently being saved as individual bits.
This is error prone as multiple bits might be mistakenly set.
Fix by replacing the bits with a single state enum. Also, add an
"ACTIVE" state for filters that are accepted by the PF.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Conflicts:
drivers/net/ethernet/google/gve/gve.h
3ce934558097 ("gve: Secure enough bytes in the first TX desc for all TCP pkts")
75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format")
https://lore.kernel.org/all/20230406104927.45d176f5@canb.auug.org.au/
https://lore.kernel.org/all/c5872985-1a95-0bc8-9dcc-b6f23b439e9d@tessares.net/
Adjacent changes:
net/can/isotp.c
051737439eae ("can: isotp: fix race between isotp_sendsmg() and isotp_release()")
96d1c81e6a04 ("can: isotp: add module parameter for maximum pdu size")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reset the FDIR counters when FDIR inits. Without this patch,
when VF initializes or resets, all the FDIR counters are not
cleaned, which may cause unexpected behaviors for future FDIR
rule create (e.g., rule conflict).
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When adding a FDIR filter, if ice_vc_fdir_set_irq_ctx returns failure,
the inserted fdir entry will not be removed and if ice_vc_fdir_write_fltr
returns failure, the fdir context info for irq handler will not be cleared
which may lead to inconsistent or memory leak issue. This patch refines
failure cases to resolve this issue.
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Simei Su <simei.su@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently in the i40e driver there is no implementation of different
MAC address handling depending on whether it is a legacy or primary.
Introduce new checks for VF to be able to specify its primary MAC
address based on the VIRTCHNL_ETHER_ADDR_PRIMARY type.
Primary MAC address are treated differently compared to legacy
ones in a scenario where:
1. If a unicast MAC is being added and it's specified as
VIRTCHNL_ETHER_ADDR_PRIMARY, then replace the current
default_lan_addr.addr.
2. If a unicast MAC is being deleted and it's type
is specified as VIRTCHNL_ETHER_ADDR_PRIMARY, then zero the
hw_lan_addr.addr.
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-30 (documentation, ice)
This series contains updates to driver documentation and the ice driver.
Tony removes links and addresses related to the out-of-tree driver from the
Intel ethernet driver documentation.
Jake removes a comment that is no longer valid to the ice driver.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: remove comment about not supporting driver reinit
Documentation/eth/intel: Remove references to SourceForge
Documentation/eth/intel: Update address for driver support
====================
Link: https://lore.kernel.org/r/20230330165935.2503604-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Conflicts:
drivers/net/ethernet/mediatek/mtk_ppe.c
3fbe4d8c0e53 ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting")
924531326e2d ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit 31c8db2c4fa7 ("ice: implement devlink reinit action"), the ice
driver does support driver re-initialization via devlink reload. Remove the
stale comment indicating that the driver lacks this support from the
ice_devlink_ops structure.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix invalid registers dump from ethtool -d ethX after adapter self test
by ethtool -t ethY. It causes invalid data display.
The problem was caused by overwriting i40e_reg_list[].elements
which is common for ethtool self test and dump.
Fixes: 22dd9ae8afcc ("i40e: Rework register diagnostic")
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230328172659.3906413-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The code implicitly assumes that the list iterator finds a correct
handle. If 'vsi_handle' is not found the 'old_agg_vsi_info' was
pointing to an bogus memory location. For safety a separate list
iterator variable should be used to make the != NULL check on
'old_agg_vsi_info' correct under any circumstances.
Additionally Linus proposed to avoid any use of the list iterator
variable after the loop, in the attempt to move the list iterator
variable declaration into the macro to avoid any potential misuse after
the loop. Using it in a pointer comparison after the loop is undefined
behavior and should be omitted if possible [1].
Fixes: 37c592062b16 ("ice: remove the VSI info from previous agg")
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel <jkl820.git@gmail.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add profile conflict check while adding some FDIR rules to avoid
unexpected flow behavior, rules may have conflict including:
IPv4 <---> {IPv4_UDP, IPv4_TCP, IPv4_SCTP}
IPv6 <---> {IPv6_UDP, IPv6_TCP, IPv6_SCTP}
For example, when we create an FDIR rule for IPv4, this rule will work
on packets including IPv4, IPv4_UDP, IPv4_TCP and IPv4_SCTP. But if we
then create an FDIR rule for IPv4_UDP and then destroy it, the first
FDIR rule for IPv4 cannot work on pkt IPv4_UDP then.
To prevent this unexpected behavior, we add restriction in software
when creating FDIR rules by adding necessary profile conflict check.
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The current implementation causes ice_vsi_update() to update all VSI
fields based on the cached VSI context. This also assumes that the
ICE_AQ_VSI_PROP_Q_OPT_VALID bit is set. This can cause problems if the
VSI context is not correctly synced by the driver. Fix this by only
updating the fields that correspond to ICE_AQ_VSI_PROP_Q_OPT_VALID.
Also, make sure to save the updated result in the cached VSI context
on success.
Fixes: 348048e724a0 ("ice: Implement iidc operations")
Co-developed-by: Robert Malz <robertx.malz@intel.com>
Signed-off-by: Robert Malz <robertx.malz@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
make modules W=1 returns:
.../ice/ice_txrx_lib.c:448: warning: Function parameter or member 'first_idx' not described in 'ice_finalize_xdp_rx'
.../ice/ice_txrx.c:948: warning: Function parameter or member 'ntc' not described in 'ice_get_rx_buf'
.../ice/ice_txrx.c:1038: warning: Excess function parameter 'rx_buf' description in 'ice_construct_skb'
Fix these warnings by adding and deleting the deviant arguments.
Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Fixes: d7956d81f150 ("ice: Pull out next_to_clean bump out of ice_put_rx_buf()")
CC: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Conflicts:
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
6e9d51b1a5cb ("net/mlx5e: Initialize link speed to zero")
1bffcea42926 ("net/mlx5e: Add devlink hairpin queues parameters")
https://lore.kernel.org/all/20230324120623.4ebbc66f@canb.auug.org.au/
https://lore.kernel.org/all/20230321211135.47711-1-saeed@kernel.org/
Adjacent changes:
drivers/net/phy/phy.c
323fe43cf9ae ("net: phy: Improved PHY error reporting in state machine")
4203d84032e2 ("net: phy: Ensure state transitions are processed from phy_stop()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-21 (ice)
This series contains updates to ice driver only.
Piotr sets first_desc field for proper handling of Flow Director
packets.
Michal moves error checking for VF earlier in function to properly return
error before other checks/reporting; he also corrects VSI filter removal to
be done during VSI removal and not rebuild.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: remove filters only if VSI is deleted
ice: check if VF exists before mode check
ice: fix rx buffers handling for flow director packets
====================
Link: https://lore.kernel.org/r/20230321183641.2849726-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DMA coalescing is not applicable for i225 parts. This patch comes to tidy
up the driver code.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There was a problem with resuming ping after conducting a PCI reset.
This commit adds two functions, igbvf_io_prepare and igbvf_io_done,
which, after being added to the pci_error_handlers struct,
will prepare the drivers for a PCI reset and then bring the interface up
and reset it after. This will prevent the driver from ending up in
incorrect state. Test_and_set_bit is highly reliable in this context,
so we are not including a timeout in this commit
This introduces 900ms - 1100ms of overhead to this operation but it's in
non-time-critical flow. And also allows the driver to continue
functioning after the reset.
Functionality documented in ethernet-controller-i350-datasheet 4.2.1.3
https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers/docs.html
Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com>
Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Driver's .adjfine interface functions use adjust_by_scaled_ppm and
diff_by_scaled_ppm introduced in commit 1060707e3809
("ptp: introduce helpers to adjust by scaled parts per million")
to calculate the required adjustment in a concise manner,
but not igb_ptp_adjfine_82580.
Fix it by introducing IGB_82580_BASE_PERIOD and changing function logic
to use diff_by_scaled_ppm.
Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Initialize to zero structures to build a valid
Tx Packet used for the filter programming.
Fixes: a9219b332f52 ("i40e: VLAN field for flow director")
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When a system with E810 with existing VFs gets rebooted the following
hang may be observed.
Pid 1 is hung in iavf_remove(), part of a network driver:
PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
#0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b
During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.
Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().
Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Filters shouldn't be removed in VSI rebuild path. Removing them on PF
VSI results in no rule for PF MAC after changing for example queues
amount.
Remove all filters only in the VSI remove flow. As unload should also
cause the filter to be removed introduce, a new function ice_stop_eth().
It will unroll ice_start_eth(), so remove filters and close VSI.
Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Setting trust on VF should return EINVAL when there is no VF. Move
checking for switchdev mode after checking if VF exists.
Fixes: c54d209c78b8 ("ice: Wait for VF to be reset/ready before configuration")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Kalyan Kodamagula <kalyan.kodamagula@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Adding flow director filters stopped working correctly after
commit 2fba7dc5157b ("ice: Add support for XDP multi-buffer
on Rx side"). As a result, only first flow director filter
can be added, adding next filter leads to NULL pointer
dereference attached below.
Rx buffer handling and reallocation logic has been optimized,
however flow director specific traffic was not accounted for.
As a result driver handled those packets incorrectly since new
logic was based on ice_rx_ring::first_desc which was not set
in this case.
Fix this by setting struct ice_rx_ring::first_desc to next_to_clean
for flow director received packets.
[ 438.544867] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 438.551840] #PF: supervisor read access in kernel mode
[ 438.556978] #PF: error_code(0x0000) - not-present page
[ 438.562115] PGD 7c953b2067 P4D 0
[ 438.565436] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 438.569794] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 6.2.0-net-bug #1
[ 438.577531] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[ 438.588470] RIP: 0010:ice_clean_rx_irq+0x2b9/0xf20 [ice]
[ 438.593860] Code: 45 89 f7 e9 ac 00 00 00 8b 4d 78 41 31 4e 10 41 09 d5 4d 85 f6 0f 84 82 00 00 00 49 8b 4e 08 41 8b 76
1c 65 8b 3d 47 36 4a 3f <48> 8b 11 48 c1 ea 36 39 d7 0f 85 a6 00 00 00 f6 41 08 02 0f 85 9c
[ 438.612605] RSP: 0018:ff8c732640003ec8 EFLAGS: 00010082
[ 438.617831] RAX: 0000000000000800 RBX: 00000000000007ff RCX: 0000000000000000
[ 438.624957] RDX: 0000000000000800 RSI: 0000000000000000 RDI: 0000000000000000
[ 438.632089] RBP: ff4ed275a2158200 R08: 00000000ffffffff R09: 0000000000000020
[ 438.639222] R10: 0000000000000000 R11: 0000000000000020 R12: 0000000000001000
[ 438.646356] R13: 0000000000000000 R14: ff4ed275d0daffe0 R15: 0000000000000000
[ 438.653485] FS: 0000000000000000(0000) GS:ff4ed2738fa00000(0000) knlGS:0000000000000000
[ 438.661563] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 438.667310] CR2: 0000000000000000 CR3: 0000007c9f0d6006 CR4: 0000000000771ef0
[ 438.674444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 438.681573] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 438.688697] PKRU: 55555554
[ 438.691404] Call Trace:
[ 438.693857] <IRQ>
[ 438.695877] ? profile_tick+0x17/0x80
[ 438.699542] ice_msix_clean_ctrl_vsi+0x24/0x50 [ice]
[ 438.702571] ice 0000:b1:00.0: VF 1: ctrl_vsi irq timeout
[ 438.704542] __handle_irq_event_percpu+0x43/0x1a0
[ 438.704549] handle_irq_event+0x34/0x70
[ 438.704554] handle_edge_irq+0x9f/0x240
[ 438.709901] iavf 0000:b1:01.1: Failed to add Flow Director filter with status: 6
[ 438.714571] __common_interrupt+0x63/0x100
[ 438.714580] common_interrupt+0xb4/0xd0
[ 438.718424] iavf 0000:b1:01.1: Rule ID: 127 dst_ip: 0.0.0.0 src_ip 0.0.0.0 UDP: dst_port 4 src_port 0
[ 438.722255] </IRQ>
[ 438.722257] <TASK>
[ 438.722257] asm_common_interrupt+0x22/0x40
[ 438.722262] RIP: 0010:cpuidle_enter_state+0xc8/0x430
[ 438.722267] Code: 6e e9 25 ff e8 f9 ef ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 d7 f1 24 ff 45
84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[ 438.722269] RSP: 0018:ffffffff86003e50 EFLAGS: 00000246
[ 438.784108] RAX: ff4ed2738fa00000 RBX: ffbe72a64fc01020 RCX: 0000000000000000
[ 438.791234] RDX: 0000000000000000 RSI: ffffffff858d84de RDI: ffffffff85893641
[ 438.798365] RBP: 0000000000000002 R08: 0000000000000002 R09: 000000003158af9d
[ 438.805490] R10: 0000000000000008 R11: 0000000000000354 R12: ffffffff862365a0
[ 438.812622] R13: 000000661b472a87 R14: 0000000000000002 R15: 0000000000000000
[ 438.819757] cpuidle_enter+0x29/0x40
[ 438.823333] do_idle+0x1b6/0x230
[ 438.826566] cpu_startup_entry+0x19/0x20
[ 438.830492] rest_init+0xcb/0xd0
[ 438.833717] arch_call_rest_init+0xa/0x30
[ 438.837731] start_kernel+0x776/0xb70
[ 438.841396] secondary_startup_64_no_verify+0xe5/0xeb
[ 438.846449] </TASK>
Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There are likely no users of this driver as the hardware has been
discontinued since 2010. Remove the driver and all references to it
in documentation.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-16 (igb, igbvf, igc)
This series contains updates to igb, igbvf, and igc drivers.
Lin Ma removes rtnl_lock() when disabling SRIOV on remove which was
causing deadlock on igb.
Akihiko Odaki delays enabling of SRIOV on igb to prevent early messages
that could get ignored and clears MAC address when PF returns nack on
reset; indicating no MAC address was assigned for igbvf.
Gaosheng Cui frees IRQs in error path for igbvf.
Akashi Takahiro fixes logic on checking TAPRIO gate support for igc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-16 (iavf)
This series contains updates to iavf driver only.
Alex fixes incorrect check against Rx hash feature and corrects payload
value for IPv6 UDP packet.
Ahmed removes bookkeeping of VLAN 0 filter as it always exists and can
cause a false max filter error message.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: do not track VLAN 0 filters
iavf: fix non-tunneled IPv6 UDP packet type and hashing
iavf: fix inverted Rx hash condition leading to disabled hash
====================
Link: https://lore.kernel.org/r/20230316155316.1554931-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
net/wireless/nl80211.c
b27f07c50a73 ("wifi: nl80211: fix puncturing bitmap policy")
cbbaf2bb829b ("wifi: nl80211: add a command to enable/disable HW timestamping")
https://lore.kernel.org/all/20230314105421.3608efae@canb.auug.org.au
tools/testing/selftests/net/Makefile
62199e3f1658 ("selftests: net: Add VXLAN MDB test")
13715acf8ab5 ("selftest: Add test for bind() conflicts.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ice_qp_dis() intends to stop a given queue pair that is a target of xsk
pool attach/detach. One of the steps is to disable interrupts on these
queues. It currently is broken in a way that txq irq is turned off
*after* HW flush which in turn takes no effect.
ice_qp_dis():
-> ice_qvec_dis_irq()
--> disable rxq irq
--> flush hw
-> ice_vsi_stop_tx_ring()
-->disable txq irq
Below splat can be triggered by following steps:
- start xdpsock WITHOUT loading xdp prog
- run xdp_rxq_info with XDP_TX action on this interface
- start traffic
- terminate xdpsock
[ 256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018
[ 256.319560] #PF: supervisor read access in kernel mode
[ 256.324775] #PF: error_code(0x0000) - not-present page
[ 256.329994] PGD 0 P4D 0
[ 256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G OE 6.2.0-rc5+ #51
[ 256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[ 256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice]
[ 256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44
[ 256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206
[ 256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f
[ 256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80
[ 256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000
[ 256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000
[ 256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600
[ 256.421990] FS: 0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000
[ 256.430207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0
[ 256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 256.457770] PKRU: 55555554
[ 256.460529] Call Trace:
[ 256.463015] <TASK>
[ 256.465157] ? ice_xmit_zc+0x6e/0x150 [ice]
[ 256.469437] ice_napi_poll+0x46d/0x680 [ice]
[ 256.473815] ? _raw_spin_unlock_irqrestore+0x1b/0x40
[ 256.478863] __napi_poll+0x29/0x160
[ 256.482409] net_rx_action+0x136/0x260
[ 256.486222] __do_softirq+0xe8/0x2e5
[ 256.489853] ? smpboot_thread_fn+0x2c/0x270
[ 256.494108] run_ksoftirqd+0x2a/0x50
[ 256.497747] smpboot_thread_fn+0x1c1/0x270
[ 256.501907] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 256.506594] kthread+0xea/0x120
[ 256.509785] ? __pfx_kthread+0x10/0x10
[ 256.513597] ret_from_fork+0x29/0x50
[ 256.517238] </TASK>
In fact, irqs were not disabled and napi managed to be scheduled and run
while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers
was already freed.
To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also
while at it, remove redundant ice_clean_rx_ring() call - this is handled
in ice_qp_clean_rings().
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The check introduced in the commit a5fd39464a40 ("igc: Lift TAPRIO schedule
restriction") can detect a false positive error in some corner case.
For instance,
tc qdisc replace ... taprio num_tc 4
...
sched-entry S 0x01 100000 # slot#1
sched-entry S 0x03 100000 # slot#2
sched-entry S 0x04 100000 # slot#3
sched-entry S 0x08 200000 # slot#4
flags 0x02 # hardware offload
Here the queue#0 (the first queue) is on at the slot#1 and #2,
and off at the slot#3 and #4. Under the current logic, when the slot#4
is examined, validate_schedule() returns *false* since the enablement
count for the queue#0 is two and it is already off at the previous slot
(i.e. #3). But this definition is truely correct.
Let's fix the logic to enforce a strict validation for consecutively-opened
slots.
Fixes: a5fd39464a40 ("igc: Lift TAPRIO schedule restriction")
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
vf reset nack actually represents the reset operation itself is
performed but no address is assigned. Therefore, e1000_reset_hw_vf
should fill the "perm_addr" with the zero address and return success on
such an occasion. This prevents its callers in netdev.c from saying PF
still resetting, and instead allows them to correctly report that no
address is assigned.
Fixes: 6ddbc4cf1f4d ("igb: Indicate failure on vf reset for empty mac address")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In igbvf_request_msix(), irqs have not been freed on the err path,
we need to free it. Fix it.
Fixes: d4e0fe01a38a ("igbvf: add new driver to support 82576 virtual functions")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Enabling SR-IOV causes the virtual functions to make requests to the
PF via the mailbox. Notably, E1000_VF_RESET request will happen during
the initialization of the VF. However, unless the reinit is done, the
VMMB interrupt, which delivers mailbox interrupt from VF to PF will be
kept masked and such requests will be silently ignored.
Enable SR-IOV at the very end of the procedure to configure the device
for SR-IOV so that the PF is configured properly for SR-IOV when a VF is
activated.
Fixes: fa44f2f185f7 ("igb: Enable SR-IOV configuration via PCI sysfs interface")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The commit 6faee3d4ee8b ("igb: Add lock to avoid data race") adds
rtnl_lock to eliminate a false data race shown below
(FREE from device detaching) | (USE from netdev core)
igb_remove | igb_ndo_get_vf_config
igb_disable_sriov | vf >= adapter->vfs_allocated_count?
kfree(adapter->vf_data) |
adapter->vfs_allocated_count = 0 |
| memcpy(... adapter->vf_data[vf]
The above race will never happen and the extra rtnl_lock causes deadlock
below
[ 141.420169] <TASK>
[ 141.420672] __schedule+0x2dd/0x840
[ 141.421427] schedule+0x50/0xc0
[ 141.422041] schedule_preempt_disabled+0x11/0x20
[ 141.422678] __mutex_lock.isra.13+0x431/0x6b0
[ 141.423324] unregister_netdev+0xe/0x20
[ 141.423578] igbvf_remove+0x45/0xe0 [igbvf]
[ 141.423791] pci_device_remove+0x36/0xb0
[ 141.423990] device_release_driver_internal+0xc1/0x160
[ 141.424270] pci_stop_bus_device+0x6d/0x90
[ 141.424507] pci_stop_and_remove_bus_device+0xe/0x20
[ 141.424789] pci_iov_remove_virtfn+0xba/0x120
[ 141.425452] sriov_disable+0x2f/0xf0
[ 141.425679] igb_disable_sriov+0x4e/0x100 [igb]
[ 141.426353] igb_remove+0xa0/0x130 [igb]
[ 141.426599] pci_device_remove+0x36/0xb0
[ 141.426796] device_release_driver_internal+0xc1/0x160
[ 141.427060] driver_detach+0x44/0x90
[ 141.427253] bus_remove_driver+0x55/0xe0
[ 141.427477] pci_unregister_driver+0x2a/0xa0
[ 141.428296] __x64_sys_delete_module+0x141/0x2b0
[ 141.429126] ? mntput_no_expire+0x4a/0x240
[ 141.429363] ? syscall_trace_enter.isra.19+0x126/0x1a0
[ 141.429653] do_syscall_64+0x5b/0x80
[ 141.429847] ? exit_to_user_mode_prepare+0x14d/0x1c0
[ 141.430109] ? syscall_exit_to_user_mode+0x12/0x30
[ 141.430849] ? do_syscall_64+0x67/0x80
[ 141.431083] ? syscall_exit_to_user_mode_prepare+0x183/0x1b0
[ 141.431770] ? syscall_exit_to_user_mode+0x12/0x30
[ 141.432482] ? do_syscall_64+0x67/0x80
[ 141.432714] ? exc_page_fault+0x64/0x140
[ 141.432911] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Since the igb_disable_sriov() will call pci_disable_sriov() before
releasing any resources, the netdev core will synchronize the cleanup to
avoid any races. This patch removes the useless rtnl_(un)lock to guarantee
correctness.
CC: stable@vger.kernel.org
Fixes: 6faee3d4ee8b ("igb: Add lock to avoid data race")
Reported-by: Corinna Vinschen <vinschen@redhat.com>
Link: https://lore.kernel.org/intel-wired-lan/ZAcJvkEPqWeJHO2r@calimero.vinschen.de/
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When an interface with the maximum number of VLAN filters is brought up,
a spurious error is logged:
[257.483082] 8021q: adding VLAN 0 to HW filter on device enp0s3
[257.483094] iavf 0000:00:03.0 enp0s3: Max allowed VLAN filters 8. Remove existing VLANs or disable filtering via Ethtool if supported.
The VF driver complains that it cannot add the VLAN 0 filter.
On the other hand, the PF driver always adds VLAN 0 filter on VF
initialization. The VF does not need to ask the PF for that filter at
all.
Fix the error by not tracking VLAN 0 filters altogether. With that, the
check added by commit 0e710a3ffd0c ("iavf: Fix VF driver counting VLAN 0
filters") in iavf_virtchnl.c is useless and might be confusing if left as
it suggests that we track VLAN 0.
Fixes: 0e710a3ffd0c ("iavf: Fix VF driver counting VLAN 0 filters")
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently, IAVF's decode_rx_desc_ptype() correctly reports payload type
of L4 for IPv4 UDP packets and IPv{4,6} TCP, but only L3 for IPv6 UDP.
Originally, i40e, ice and iavf were affected.
Commit 73df8c9e3e3d ("i40e: Correct UDP packet header for non_tunnel-ipv6")
fixed that in i40e, then
commit 638a0c8c8861 ("ice: fix incorrect payload indicator on PTYPE")
fixed that for ice.
IPv6 UDP is L4 obviously. Fix it and make iavf report correct L4 hash
type for such packets, so that the stack won't calculate it on CPU when
needs it.
Fixes: 206812b5fccb ("i40e/i40evf: i40e implementation for skb_set_hash")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Condition, which checks whether the netdev has hashing enabled is
inverted. Basically, the tagged commit effectively disabled passing flow
hash from descriptor to skb, unless user *disables* it via Ethtool.
Commit a876c3ba59a6 ("i40e/i40evf: properly report Rx packet hash")
fixed this problem, but only for i40e.
Invert the condition now in iavf and unblock passing hash to skbs again.
Fixes: 857942fd1aa1 ("i40e: Fix Rx hash reported to the stack by our driver")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
ice: refactor mailbox overflow detection
Jake Keller says:
The primary motivation of this series is to cleanup and refactor the mailbox
overflow detection logic such that it will work with Scalable IOV. In
addition a few other minor cleanups are done while I was working on the
code in the area.
First, the mailbox overflow functions in ice_vf_mbx.c are refactored to
store the data per-VF as an embedded structure in struct ice_vf, rather than
stored separately as a fixed-size array which only works with Single Root
IOV. This reduces the overall memory footprint when only a handful of VFs
are used.
The overflow detection functions are also cleaned up to reduce the need for
multiple separate calls to determine when to report a VF as potentially
malicious.
Finally, the ice_is_malicious_vf function is cleaned up and moved into
ice_virtchnl.c since it is not Single Root IOV specific, and thus does not
belong in ice_sriov.c
I could probably have done this in fewer patches, but I split pieces out to
hopefully aid in reviewing the overall sequence of changes. This does cause
some additional thrash as it results in intermediate versions of the
refactor, but I think its worth it for making each step easier to
understand.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: call ice_is_malicious_vf() from ice_vc_process_vf_msg()
ice: move ice_is_malicious_vf() to ice_virtchnl.c
ice: print message if ice_mbx_vf_state_handler returns an error
ice: pass mbxdata to ice_is_malicious_vf()
ice: remove unnecessary &array[0] and just use array
ice: always report VF overflowing mailbox even without PF VSI
ice: declare ice_vc_process_vf_msg in ice_virtchnl.h
ice: initialize mailbox snapshot earlier in PF init
ice: merge ice_mbx_report_malvf with ice_mbx_vf_state_handler
ice: remove ice_mbx_deinit_snapshot
ice: move VF overflow message count into struct ice_mbx_vf_info
ice: track malicious VFs in new ice_mbx_vf_info structure
ice: convert ice_mbx_clear_malvf to void and use WARN
ice: re-order ice_mbx_reset_snapshot function
====================
Link: https://lore.kernel.org/r/20230313182123.483057-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
RDMA is not supported in ice on a PF that has been added to a bonded
interface. To enforce this, when an interface enters a bond, we unplug
the auxiliary device that supports RDMA functionality. This unplug
currently happens in the context of handling the netdev bonding event.
This event is sent to the ice driver under RTNL context. This is causing
a deadlock where the RDMA driver is waiting for the RTNL lock to complete
the removal.
Defer the unplugging/re-plugging of the auxiliary device to the service
task so that it is not performed under the RTNL lock context.
Cc: stable@vger.kernel.org # 6.1.x
Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Link: https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/
Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
Fixes: 4eace75e0853 ("RDMA/irdma: Report the correct link speed")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230310194833.3074601-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
i40e: support XDP multi-buffer
Tirthendu Sarkar says:
This patchset adds multi-buffer support for XDP. Tx side already has
support for multi-buffer. This patchset focuses on Rx side. The last
patch contains actual multi-buffer changes while the previous ones are
preparatory patches.
On receiving the first buffer of a packet, xdp_buff is built and its
subsequent buffers are added to it as frags. While 'next_to_clean' keeps
pointing to the first descriptor, the newly introduced 'next_to_process'
keeps track of every descriptor for the packet.
On receiving EOP buffer the XDP program is called and appropriate action
is taken (building skb for XDP_PASS, reusing page for XDP_DROP, adjusting
page offsets for XDP_{REDIRECT,TX}).
The patchset also streamlines page offset adjustments for buffer reuse
to make it easier to post process the rx_buffers after running XDP prog.
With this patchset there does not seem to be any performance degradation
for XDP_PASS and some improvement (~1% for XDP_TX, ~5% for XDP_DROP) when
measured using xdp_rxq_info program from samples/bpf/ for 64B packets.
v1: https://lore.kernel.org/netdev/20230306210822.3381942-1-anthony.l.nguyen@intel.com/
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
i40e: add support for XDP multi-buffer Rx
i40e: add xdp_buff to i40e_ring struct
i40e: introduce next_to_process to i40e_ring
i40e: use frame_sz instead of recalculating truesize for building skb
i40e: Change size to truesize when using i40e_rx_buffer_flip()
i40e: add pre-xdp page_count in rx_buffer
i40e: change Rx buffer size for legacy-rx to support XDP multi-buffer
i40e: consolidate maximum frame size calculation for vsi
====================
Link: https://lore.kernel.org/r/20230309212819.1198218-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The main loop in __ice_clean_ctrlq first checks if a VF might be malicious
before calling ice_vc_process_vf_msg(). This results in duplicate code in
both functions to obtain a reference to the VF, and exports the
ice_is_malicious_vf() from ice_virtchnl.c unnecessarily.
Refactor ice_is_malicious_vf() to be a static function that takes a pointer
to the VF. Call this in ice_vc_process_vf_msg() just after we obtain a
reference to the VF by calling ice_get_vf_by_id.
Pass the mailbox data from the __ice_clean_ctrlq function into
ice_vc_process_vf_msg() instead of calling ice_is_malicious_vf().
This reduces the number of exported functions and avoids the need to obtain
the VF reference twice for every mailbox message.
Note that the state check for ICE_VF_STATE_DIS is kept in
ice_is_malicious_vf() and we call this before checking that state in
ice_vc_process_vf_msg. This is intentional, as we stop responding to VF
messages from a VF once we detect that it may be overflowing the mailbox.
This ensures that we continue to silently ignore the message as before
without responding via ice_vc_send_msg_to_vf().
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_is_malicious_vf() function is currently implemented in ice_sriov.c
This function is not Single Root specific, and a future change is going to
refactor the ice_vc_process_vf_msg() function to call this instead of
calling it before ice_vc_process_vf_msg() in the main loop of
__ice_clean_ctrlq.
To make that change easier to review, first move this function into
ice_virtchnl.c but leave the call in __ice_clean_ctrlq() alone.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
If ice_mbx_vf_state_handler() returns an error, the ice_is_malicious_vf()
function just exits without printing anything.
Instead, use dev_warn_ratelimited to print a warning that we were unable to
check the status for this VF. The _ratelimited variant is used to avoid
potentially spamming the log if this function is failing consistently for
every single mailbox message.
Also we can drop the "goto" as it simply skips over a report_malvf check.
That variable should always be false if ice_mbx_vf_state_handler returns
non-zero.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_is_malicious_vf() function takes information about the current
state of the mailbox during a single interrupt. This information includes
the number of messages processed so far, as well as the number of pending
messages not yet processed.
A future refactor is going to make ice_vc_process_vf_msg() call
ice_is_malicious_vf() instead of having it called separately in ice_main.c
This change will require passing all the necessary arguments into
ice_vc_process_vf_msg().
To make this simpler, have the main loop fill in the struct ice_mbx_data
and pass that rather than passing in the num_msg_proc and num_msg_pending.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In ice_is_malicious_vf we print the VF MAC address using %pM by passing the
address of the first element of vf->dev_lan_addr. This is equivalent to
just passing vf->dev_lan_addr, so do that.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In ice_is_malicious_vf we report a message warning the system administrator
when a VF is potentially spamming the PF with asynchronous messages that
could overflow the PF mailbox.
The specific message was requested by our customer support team to include
the VF and PF MAC address. In some cases we may not be able to locate the
PF VSI to obtain the MAC address for the PF. The current implementation
discards the message entirely in this case. Fix this to instead print a
zero address in that case so that we always print something here. Note that
dev_warn will also include the PCI device information allowing another
mechanism for determining on which PF the potentially malicious VF belongs.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_vc_process_vf_msg function is the main entry point for handling
virtchnl messages. This function is defined in ice_virtchnl.c but its
declaration is still in ice_sriov.c
The ice_sriov.c file used to contain all of the virtualization logic until
commit bf93bf791cec ("ice: introduce ice_virtchnl.c and ice_virtchnl.h")
moved the virtchnl logic to its own file.
The ice_vc_process_vf_msg function should have had its declaration moved to
ice_virtchnl.h then. Fix this.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Now that we no longer depend on the number of VFs being allocated, we can
move the ice_mbx_init_snapshot function earlier. This will be required by
Scalable IOV as we will not be calling ice_sriov_configure for Scalable
VFs.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|