summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/i40e/i40e_main.c
AgeCommit message (Collapse)Author
2017-10-31i40e: only redistribute MSI-X vectors when neededShannon Nelson
Whether or not there are vectors_left, we only need to redistribute our vectors if we didn't get as many as we requested. With the current check, the code will try to redistribute even if we did in fact get all the vectors we requested - this can happen when we have more CPUs than we do vectors. This restores an earlier check to be sure we only redistribute if we didn't get the full count we requested. Fixes: 4ce20abc645f (i40e: fix MSI-X vector redistribution if hw limit is reached) Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-31i40e: mark PM functions as __maybe_unusedArnd Bergmann
A cleanup of the PM code left an incorrect #ifdef in place, leading to a harmless build warning: drivers/net/ethernet/intel/i40e/i40e_main.c:12223:12: error: 'i40e_resume' defined but not used [-Werror=unused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:12185:12: error: 'i40e_suspend' defined but not used [-Werror=unused-function] It's easier to use __maybe_unused attributes here, since you can't pick the wrong one. Fixes: 0e5d3da40055 ("i40e: use newer generic PM support instead of legacy PM callbacks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-19Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-17 This series contains updates to i40e and ethtool. Alan provides most of the changes in this series which are mainly fixes and cleanups. Renamed the ethtool "cmd" variable to "ks", since the new ethtool API passes us ksettings structs instead of command structs. Cleaned up an ifdef that was not accomplishing anything. Added function header comments to provide better documentation. Fixed two issues in i40e_get_link_ksettings(), by calling ethtool_link_ksettings_zero_link_mode() to ensure the advertising and link masks are cleared before we start setting bits. Cleaned up and fixed code comments which were incorrect. Separated the setting of autoneg in i40e_phy_types_to_ethtool() into its own conditional to clarify what PHYs support and advertise autoneg, and makes it easier to add new PHY types in the future. Added ethtool functionality to intersect two link masks together to find the common ground between them. Overhauled i40e to ensure that the new ethtool API macros are being used, instead of the old ones. Fixed the usage of unsigned 64-bit division which is not supported on all architectures. Sudheer adds support for 25G Active Optical Cables (AOC) and Active Copper Cables (ACC) PHY types. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18ethernet/intel: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Switches test of .data field to .function, since .data will be going away. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-17i40e: fix u64 division usageAlan Brady
Commit 52eb1ff93e98 ("i40e: Add support setting TC max bandwidth rates") and commit 1ea6f21ae530 ("i40e: Refactor VF BW rate limiting") add some needed functionality for TC bandwidth rate limiting. Unfortunately they introduce several usages of unsigned 64-bit division which needs to be handled special by the kernel to support all architectures. Fixes: 52eb1ff93e98 ("i40e: Add support setting TC max bandwidth rates") Fixes: 1ea6f21ae530 ("i40e: Refactor VF BW rate limiting") Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-13i40e: Add support setting TC max bandwidth ratesAmritha Nambiar
This patch enables setting up maximum Tx rates for the traffic classes in i40e. The maximum rate is offloaded to the hardware through the mqprio framework by specifying the mode option as 'channel' and shaper option as 'bw_rlimit' and is configured for the VSI. Configuring minimum Tx rate limit is not supported in the device. The minimum usable value for Tx rate is 50Mbps. Example: # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\ queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\ max_rate 4Gbit 5Gbit To dump the bandwidth rates: # tc qdisc show dev eth0 qdisc mqprio 804a: root tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 queues:(0:3) (4:7) mode:channel shaper:bw_rlimit max_rate:4Gbit 5Gbit Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-13i40e: Refactor VF BW rate limitingAmritha Nambiar
This patch refactors the BW rate limiting for Tx traffic on the VF to be reused in the next patch for rate limiting Tx traffic for the VSIs on the PF as well. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-13i40e: Enable 'channel' mode in mqprio for TC configsAmritha Nambiar
The i40e driver is modified to enable the new mqprio hardware offload mode and factor the TCs and queue configuration by creating channel VSIs. In this mode, the priority to traffic class mapping and the user specified queue ranges are used to configure the traffic classes by setting the mode option to 'channel'. Example: map 0 0 0 0 1 2 2 3 queues 2@0 2@2 1@4 1@5\ hw 1 mode channel qdisc mqprio 8038: root tc 4 map 0 0 0 0 1 2 2 3 0 0 0 0 0 0 0 0 queues:(0:1) (2:3) (4:4) (5:5) mode:channel shaper:dcb The HW channels created are removed and all the queue configuration is set to default when the qdisc is detached from the root of the device. This patch also disables setting up channels via ethtool (ethtool -L) when the TCs are configured using mqprio scheduler. The patch also limits setting ethtool Rx flow hash indirection (ethtool -X eth0 equal N) to max queues configured via mqprio. The Rx flow hash indirection input through ethtool should be validated so that it is within in the queue range configured via tc/mqprio. The bound checking is achieved by reporting the current rss size to the kernel when queues are configured via mqprio. Example: map 0 0 0 1 0 2 3 0 queues 2@0 4@2 8@6 11@14\ hw 1 mode channel Cannot set RX flow hash configuration: Invalid argument Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-13i40e: Add infrastructure for queue channel supportAmritha Nambiar
This patch sets up the infrastructure for offloading TCs and queue configurations to the hardware by creating HW channels(VSI). A new channel is created for each of the traffic class configuration offloaded via mqprio framework except for the first TC (TC0). TC0 for the main VSI is also reconfigured as per user provided queue parameters. Queue counts that are not power-of-2 are handled by reconfiguring RSS by reprogramming LUTs using the queue count value. This patch also handles configuring the TX rings for the channels, setting up the RX queue map for channel. Also, the channels so created are removed and all the queue configuration is set to default when the qdisc is detached from the root of the device. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-13i40e: Add macro for PF reset bitAmritha Nambiar
Introduce a macro for the bit setting the PF reset flag and update its usages. This makes it easier to use this flag in functions to be introduced in future without encountering checkpatch issues related to alignment and line over 80 characters. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09i40e: fix a typoRami Rosen
This patch fixes a typo in i40e_vsi_alloc_arrays() documentation. The first parameter name should be "vsi" instead of "type". Signed-off-by: Rami Rosen <rami.rosen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09i40e: allow XPS with QoS enabledJacob Keller
Recently, the kernel gained support for enabling XPS and QoS at the same time. Thus, we no longer need to worry about the number of traffic classes when enabling XPS. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09i40e: reduce lrxqthresh from 2 to 1Jacob Keller
The lrxq thresh value tells hardware to immediately interrupt when there are fewer than N*64 packets left in the ring. Counter intuitively, empirical testing has shown that decreasing this value from 2 to 1, and thus changing from an immediate interrupt at fewer than 128 descriptors down to 64 descriptors causes a small increase in the maximum total packets per second we can receive. This increase occurs even when we're polling with interrupts masked, as the hardware must still handle interrupts internally even if we've disabled them in software. Also reduce the value for any VFs we allocate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09i40e/i40evf: always set the CLEARPBA flag when re-enabling interruptsJacob Keller
In the past we changed driver behavior to not clear the PBA when re-enabling interrupts. This change was motivated by the flawed belief that clearing the PBA would cause a lost interrupt if a receive interrupt occurred while interrupts were disabled. According to empirical testing this isn't the case. Additionally, the data sheet specifically says that we should set the CLEARPBA bit when re-enabling interrupts in a polling setup. This reverts commit 40d72a509862 ("i40e/i40evf: don't lose interrupts") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-09i40e/i40evf: fix incorrect default ITR values on driver loadJacob Keller
The ITR register expects to be programmed in units of 2 microseconds. Because of this, all of the drivers I40E_ITR_* constants are in terms of this 2 microsecond register. Unfortunately, the rx_itr_default value is expected to be programmed in microseconds. Effectively the driver defaults to an ITR value of half the expected value (in terms of minimum microseconds between interrupts). Fix this by changing the default values to be calculated using ITR_REG_TO_USEC macro which indicates that we're converting from the register units into microseconds. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e: implement split PCI error reset handlerAlan Brady
This patch implements the PCI error handler reset_prepare and reset_done. This allows us to handle function level reset. Without this patch we are unable to perform and recover from an FLR correctly and this will cause VFs to be unable to recover from an FLR on the PF. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e: Properly maintain flow director filters listFilip Sadowski
When there is no space for more flow director filters and user requested to add a new one it is rejected by firmware and automatically removed from the filter list maintained by driver. This behaviour is correct. Afterwards existing filter can be removed making free slot for the new one. This however causes the newly added filter to be accepted by firmware but removed from driver filter list resulting in not showing after issuing 'ethtool -n <dev_name>'. This happened due to not clearing the variable pf->fd_inv which stores filter number to be removed from the list when firmware refused to add the requested filter. It caused the filter with this specific ID to be constantly removed once it was added to the list although it has been accepted by firmware and effectively applied to the NIC. It was fixed by clearing pf->fd_inv variable after removal of the filter from the list when it was rejected by firmware. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e: Display error message if module does not meet thermal requirementsFilip Sadowski
This patch causes error message to be displayed when NIC detects insertion of module that does not meet thermal requirements. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e: fix merge errorAlice Michael
This patch removes some code that was accidentally added to the wrong function with a merge error. Fixes: c53934c6d1b1 ("i40e: fix: do not sleep in netdev_ops") Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e/i40evf: use DECLARE_BITMAP for stateJesse Brandeburg
When using set_bit and friends, we should be using actual bitmaps, and fix all the locations where we might access it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e: re-enable PTP L4 capabilities for XL710 if FW >6.0Alan Brady
Starting with XL710 FW 5.3 PTP L4 was disabled for XL710 due to a bug. The bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be re-enabled on those devices with updated firmware. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e/i40evf: spread CPU affinity hints across online CPUs onlyJacob Keller
Currently, when setting up the IRQ for a q_vector, we set an affinity hint based on the v_idx of that q_vector. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in simultaneous multithreading (SMT) scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling in q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores each one containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In general case, when SMT is off the cpu_online_mask has only C bits set: 0, 1*N, 2*N, ..., C*(N-1) where C == # of cores; N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. Instead, we should only assign hints for CPUs which are online. Even better, the kernel already provides a function, cpumask_local_spread() which takes an index and returns a CPU, spreading the interrupts across local NUMA nodes first, and then remote ones if necessary. Since we generally have a 1:1 mapping between vectors and CPUs, there is no real advantage to spreading vectors to local CPUs first. In order to avoid mismatch of the default XPS hints, we'll pass -1 so that it spreads across all CPUs without regard to the node locality. Note that we don't need to change the q_vector->affinity_mask as this is initialized to cpu_possible_mask, until an actual affinity is set and then notified back to us. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-06i40e: add private flag to control source pruningMitch Williams
By default, our devices do source pruning, that is, they drop receive packets that have the source MAC matching one of the receive filters. Unfortunately, this breaks ARP monitoring in channel bonding, as the bonding driver expects devices to receive ARPs containing their own source address. Add an ethtool private flag to control this feature. Also, remove the netif_running() check when we process our private flags. It's OK to reset when the device is closed and in most cases we need the reset the apply these changes. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-02i40e: Stop dropping 802.1ad tags - eth proto 0x88a8Scott Peterson
Enable i40e to pass traffic with VLAN tags using the 802.1ad ethernet protocol ID (0x88a8). This requires NIC firmware providing version 1.7 of the API. With older NIC firmware 802.1ad tagged packets will continue to be dropped. No VLAN offloads nor RSS are supported for 802.1ad VLANs. Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-02i40e: limit lan queue count in large CPU count machineShannon Nelson
When a machine has more CPUs than queue pairs, e.g. 512 cores, the counting gets a little funky and turns off Flow Director with the message: not enough queues for Flow Director. Flow Director feature is disabled This patch limits the number of lan queues initially allocated to be sure we have some left for FD and other features. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: refactor FW version checkingMitch Williams
The i40e driver now supports two different devices with two different firmware versions. So be smart about how we handle these. Move the FW version macros to the appropriate header file, and add a convenience macro that checks the version based on the device. Then use this macro to check whether or not the driver can use the new link info API. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: shutdown all IRQs and disable MSI-X when suspendedJacob Keller
On some platforms with a large number of CPUs, we will allocate many IRQ vectors. When hibernating, the system will attempt to migrate all of the vectors back to CPU0 when shutting down all the other CPUs. It is possible that we have so many vectors that it cannot re-assign them to CPU0. This is even more likely if we have many devices installed in one platform. The end result is failure to hibernate, as it is not possible to shutdown the CPUs. We can avoid this by disabling MSI-X and clearing our interrupt scheme when the device is suspended. A more ideal solution would be some method for the stack to properly handle this for all drivers, rather than on a case-by-case basis for each driver to fix itself. However, until this more ideal solution exists, we can do our part and shutdown our IRQs during suspend, which should allow systems with a large number of CPUs to safely suspend or hibernate. It may be worth investigating if we should shut down even further when we suspend as it may make the path cleaner, but this was the minimum fix for the hibernation issue mentioned here. Testing-hints: This affects systems with a large number of CPUs, and with multiple devices enabled. Without this change, those platforms are unable to hibernate at all. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: prevent service task from running while we're suspendedJacob Keller
Although the service task does check the suspended status before running, it might already be part way through running when we go to suspend. Lets ensure that the service task is stopped and will not be restarted again until we finish resuming. This ensures that service task code does not cause strange interactions with the suspend/resume handlers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: don't clear suspended state until we finish resumingJacob Keller
When handling suspend and resume callbacks we want to make sure that (a) we don't suspend again if we're already suspended and (b) we don't resume again if we're already resuming. Lets make sure we test_and_set the __I40E_SUSPENDED bit in i40e_suspend which ensures that a suspend call when already suspended will exit early. Additionally, if __I40E_SUSPENDED is not set when we begin resuming, exit early as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: use newer generic PM support instead of legacy PM callbacksJacob Keller
Stop using the old legacy PM support, since we now have stable support for the newer generic PM callbacks. This has several advantages. First, we no longer have to manage our own pci_save_state() and power changes, as it's preferred to have the PCI stack do this. Second, these routines get called for both hibernate and suspend to ram, so we can have the driver properly handle all the suspend/resume flows that it needs to. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: use separate state bit for miscellaneous IRQ setupJacob Keller
We currently (mis)use the __I40E_RECOVERY_PENDING bit to determine when we should actually request a new IRQ in i40e_setup_misc_vector(). This led to a design mistake where we open-coded the re-setup of the miscellaneous vector in i40e_resume() instead of using the function provided. If we did not open-code this and instead tried to use the i40e_setup_misc_vector() function, it would lead to never reallocating the IRQ. This would lead to a second i40e_suspend() call failing to free the vector due to a NULL pointer dereference. A future patch is going to re-work how the i40e_suspend() and i40e_resume() flows work to clear all IRQ vectors, which would require us to use i40e_setup_misc_vector() directly. Since during this time the __I40E_RECOVERY_PENDING bit is set, we'll never re-allocate the vector. Rather than leaving the open-coded setup in i40e_resume() lets just fix the problem properly in i40e_setup_misc_vector(). Introduce a new state bit which indicates when the IRQ has been assigned, which will be set when i40e_setup_misc_vector is first called. This ultimately resolves the issue of re-requesting the vector, without overloading the __I40E_RECOVERY_PENDING state. This ensures that the suspend/resume cycle can use the setup function instead of open-coding the re-request during resume. Additionally, since the only callers of i40e_stop_misc_vector also want to free it, move this code directly into the function to avoid duplication. Due to the new functionality, rename it to i40e_free_misc_vector(). This lets us drop the extra calls to free and re-enable the vector during i40e_suspend() and i40e_resume(). We don't need to call i40e_setup_misc_Vector() in i40e_resume() because it gets called by the i40e_rebuild() call. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: fix for flow director counters not wrapping as expectedMariusz Stachura
An errata with GLQF_PCNT causes it to not wrap as expected. This can cause an error in flow director statistics. This patch resets affected counters just after reading. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: relax warning message in case of version mismatchMariusz Stachura
Fortville and Fort Park devices are often on different firmware release schedules. This change relaxes the minor version warning message, so it is only displayed for older FW warning version for old firmware Fortville 3 or earlier. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: simplify member variable accessesSudheer Mogilappagari
This commit replaces usage of vsi->back in i40e_print_link_message() (which is actually a PF pointer) with temp variable. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: Fix link down message when interface is brought upSudheer Mogilappagari
i40e_print_link_message() is intended to compare new link state with current link state and print log message only if the new state is different from current state. However in current driver the new state does not get updated when link is going down because of the if condition. When an interface is brought down, vsi->state is set to I40E_VSI_DOWN in i40e_vsi_close() and later i40e_print_link_message() does not get invoked in i40e_link_event due to if condition. Hence link down message doesn't appear when link is going down. The down state is seen later during i40e_open() and old state gets printed. The actual link state doesn't get updated in i40e_close() or i40e_open() but when i40e_handle_link_event is called inside i40e_clean_adminq_subtask. This change allows i40e_print_link_message() to be called when interface is going down and keeps the state information updated. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-09-29i40e: Fix unqualified module message while bringing link upSudheer Mogilappagari
In current driver, when ifconfig ethx up is done, the link state doesn't transition to UP inside i40e_open(). It changes after AQ command response is handled in i40e_handle_link_event(). When pf->hw.phy.link_info.link_info is DOWN inside i40e_open(), The state is transient and invalid. So log message gets printed based on incorrect info (i.e link_info and an_info). This commit removes check for unqualified module inside i40e_up_complete(). The existing check in i40e_handle_link_event() logs the error message based on correct link state information. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-27i40e: initialize our affinity_mask based on cpu_possible_maskJacob Keller
On older kernels a call to irq_set_affinity_hint does not guarantee that the IRQ affinity will be set. If nothing else on the system sets the IRQ affinity this can result in a bug in the i40e_napi_poll() routine where we notice that our interrupt fired on the "wrong" CPU according to our internal affinity_mask variable. This results in a bug where we continuously tell NAPI to stop polling to move the interrupt to a new CPU, but the CPU never changes because our affinity mask does not match the actual mask setup for the IRQ. The root problem is a mismatched affinity mask value. So lets initialize the value to cpu_possible_mask instead. This ensures that prior to the first time we get an IRQ affinity notification we'll have the mask set to include every possible CPU. We use cpu_possible_mask instead of cpu_online_mask since the former is almost certainly never going to change, while the later might change after we've made a copy. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-27i40e: remove workaround for resetting XPSJacob Keller
Since commit 3ffa037d7f78 ("i40e: Set XPS bit mask to zero in DCB mode") we've tried to reset the XPS settings by building a custom empty CPU mask. This workaround is not necessary because we're not really removing the XPS setting, but simply setting it so that no CPU is valid. Second, we shorten the code further by using zalloc_cpumask_var instead of a separate call to bitmap_zero(). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-27i40e: Fix for unused value issue found by static analysisCarolyn Wyborny
This patch fixes an issue where an error return value is set, but without an immediate exit, the value can be overwritten by the following code execution. The condition at this point is not fatal, so remove the error assignment and comment the intent for future code maintainers Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-27i40e: 25G FEC status improvementsMariusz Stachura
This patch improves the system log message. The log message will be expanded to include the FEC mode the FW requested before link was established. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-27i40e: force VMDQ device name truncationJacob Keller
In new versions of GCC since 7.x a new warning exists which warns when a string is truncated before all of the format can be completed. When we setup VMDQ netdev names we are copying a pre-existing interface name which could be up to 15 characters in length. Since we also add 4 bytes, v, the literal %, the d and a \0 null, we would overrun the available size unless snprintf truncated for us. The snprintf call will of course truncate on the end, so lets instead modify the code to force truncation of the copied netdev name by 4 characters, to create enough space for the 4 bytes we're adding. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e: use cpumask_copy instead of direct assignmentJacob Keller
According to the header file cpumask.h, we shouldn't be directly copying a cpumask_t, since its a bitmap and might not be copied correctly. Lets use the provided cpumask_copy() function instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e: move check for avoiding VID=0 filters into i40e_vsi_add_vlanJacob Keller
In i40e_vsi_add_vlan we treat attempting to add VID=0 as an error, because it does not do what the caller might expect. We already special case VID=0 in i40e_vlan_rx_add_vid so that we avoid this error when adding the VLAN. This special casing is necessary so that we do not add the VLAN=0 filter since we don't want to stop receiving untagged traffic. Unfortunately, not all callers of i40e_vsi_add_vlan are aware of this, including when we add VLANs from a VF device. Rather than special casing every single caller of i40e_vsi_add_vlan, lets just move this check internally. This makes the code simpler because the caller does not need to be aware of how VLAN=0 is special, and we don't forget to add this check in new places. This fixes a harmless error message displaying when adding a VLAN from within a VF. The message was meaningless but there is no reason to confuse end users and system administrators, and this is now avoided. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e: Detect ATR HW Evict NVM issue and disable the featureAnjali Singhai Jain
This patch fixes a problem with the HW ATR eviction feature where the NVM setting was incorrect. This patch detects the issue on X720 adapters and disables the feature if the NVM setting is incorrect. Without this patch, HW ATR Evict feature does not work on broken NVMs and is not detected either. If the HW ATR Evict feature is disabled the SW Eviction feature will take effect. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e: remove workaround for Open Firmware MAC addressJacob Keller
Since commit b499ffb0a22c ("i40e: Look up MAC address in Open Firmware or IDPROM"), we've had support for obtaining the MAC address form Open Firmware or IDPROM. This code relied on sending the Open Firmware address directly to the device firmware instead of relying on our MAC/VLAN filter list. Thus, a work around was introduced in commit b1b15df59232 ("i40e: Explicitly write platform-specific mac address after PF reset") We refactored the Open Firmware address enablement code in the ill-named commit 41c4c2b50d52 ("i40e: allow look-up of MAC address from Open Firmware or IDPROM") Since this refactor, we no longer even set I40E_FLAG_PF_MAC. Further, we don't need this work around, because we actually store the MAC address as part of the MAC/VLAN filter hash. Thus, we will restore the address correctly upon reset. The refactor above failed to revert the workaround, so do that now. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e: separate hw_features from runtime changing flagsJacob Keller
The number of flags found in pf->flags has grown quite large, and there are a lot of different types of flags. Most of the flags are simply hardware features which are enabled on some firmware or some MAC types. Other flags are dynamic run-time flags which enable or disable certain features of the driver. Separate these two types of flags into pf->hw_features and pf->flags. The hw_features list will contain a set of features which are enabled at init time. This will not contain toggles or otherwise dynamically changing features. These flags should not need atomic protections, as they will be set once during init and then be essentially read only. Everything else will remain in the flags variable. These flags may be modified at any time during run time. A future patch may wish to convert these flags into set_bit/clear_bit/test_bit or similar approach to ensure atomic correctness. The I40E_FLAG_MFP_ENABLED flag may be a good fit for hw_features but currently is used by ethtool in the private flags settings, and thus has been left as part of flags. Additionally, I40E_FLAG_DCB_CAPABLE may be a good fit for the hw_features but this patch has not tried to untangle it yet. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e: Fix a bug with VMDq RSS queue allocationAnjali Singhai Jain
The X722 pf flag setup should happen before the VMDq RSS queue count is initialized for VMDq VSI to get the right number of queues for RSS in case of X722 devices. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-25i40e/i40evf: adjust packet size to account for double VLANsMitch Williams
Now that the kernel supports double VLAN tags, we should at least play nice. Adjust the max packet size to account for two VLAN tags, not just one. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-07net: sched: get rid of struct tc_to_netdevJiri Pirko
Get rid of struct tc_to_netdev which is now just unnecessary container and rather pass per-type structures down to drivers directly. Along with that, consolidate the naming of per-type structure variables in cls_*. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>