summaryrefslogtreecommitdiff
path: root/drivers/scsi/lpfc/lpfc_crtn.h
AgeCommit message (Collapse)Author
2021-08-24scsi: lpfc: Add cmf_info sysfs entryJames Smart
Allow abbreviated cm framework status information to be obtained via sysfs. Link: https://lore.kernel.org/r/20210816162901.121235-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24scsi: lpfc: Add support for maintaining the cm statistics bufferJames Smart
Add the logic to move the congestion management and event information into the cmd statistics buffer maintained for the adapter. The update includes rolling up values for the last minute, hour, and day information. Link: https://lore.kernel.org/r/20210816162901.121235-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24scsi: lpfc: Add support for the CM frameworkJames Smart
Complete the enablement of the cm framework feature in the adapter. Perform the following: - Detect the presence of the congestion management framework feature. When the cm framework is present: - Issue the SET_FEATURE command to enable the feature. - Register the cm statistics buffer with the adapter. - Read the cm enablement buffer to determine the cm framework state for cm management. When cm management is enabled: - Monitor all FPIN and congestion signalling events, incrementing counters. - Regularly sync with the adapter to communicate congestion events and to receive an rx request limit. - Monitor requests for rx data and ensure that no more than the adapter prescribed limit is issued on the link. If the limit is exceeded, SCSI and/or NVMe traffic is temporarily suspended. - Maintain the minute, hourly, daily statistics buffer. - Monitor for congestion enablement change events, causing a reread of the enablement buffer and acting on any change in enablement. And: - Add teardown logic, including buffer deregistration, on adapter detachment or reset. Link: https://lore.kernel.org/r/20210816162901.121235-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24scsi: lpfc: Add cmfsync WQE supportJames Smart
When congestion mgmt is enabled, cmf has the driver regularly issue a command to synchronize reporting of congestion mgmt events such as fpin and signal delivery. This patch adds the definition of the CMF_SYNC WQE and its CQE fields as well as support for issuing the command. The patch also adds the few remaining cmf-related SLI additions, such as feature definition for enablement of CMF and notifications to the driver if the cm enablement mode changes. Link: https://lore.kernel.org/r/20210816162901.121235-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24scsi: lpfc: Add support for cm enablement bufferJames Smart
As part of the cmf framework, the firmware maintains a table with congestion related state information, specifically whether enabled and if enabled, whether monitoring or actively managing congestion. Add definition of the table and add support to read the table from the adapter and determine if it is enabled. In support of this, the READ_OBJECT mailbox command definition is added to the driver. Link: https://lore.kernel.org/r/20210816162901.121235-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24scsi: lpfc: Add cm statistics buffer supportJames Smart
The cmf framework requires the driver to maintain a cm statistics table, accessible inband, of congestion related statistics that are reported per minute, rolled up to per hour, and rolled up again per day. Several days worth may be maintained. The table is registered with the adapter when the MIB feature is enabled. Add definition of the table and add support to register the table with the adapter. Includes definition and initialization of event counters that are later added to the statistics table. Link: https://lore.kernel.org/r/20210816162901.121235-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24scsi: lpfc: Add EDC ELS supportJames Smart
When congestion management is enabled, issue EDC ELS to register congestion signaling capabilities with the fabric. The response handling will process the fabric parameters and set the reporting parameters. Similarly, add support for receiving an EDC request from the fabric generating a corresponding response. Implement handlers for congestion signals from the fabric and maintain statistics for them. Link: https://lore.kernel.org/r/20210816162901.121235-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completesJames Smart
On an RSCN event, the nodes specified in RSCN payload and in MAPPED state are moved to NPR state in order to revalidate the login. This triggers an immediate unregister from SCSI/NVMe backend. The assumption is that the node may be missing. The re-registration with the backend happens after either relogin (PLOGI/PRLI; if ADISC is disabled or login truly lost) or when ADISC completes successfully (rediscover with ADISC enabled). However, the NVMe-FC standard provides for an RSCN to be triggered when the remote port supports a discovery controller and there was a change of discovery log content. As the remote port typically also supports storage subsystems, this unregister causes all storage controller connections to fail and require reconnect. Correct by reworking the code to ensure that the unregistration only occurs when a login state is truly terminated, thereby leaving the NVMe storage controllers in place. The changes made are: - Retain node state in ADISC_ISSUE when scheduling ADISC ELS retry. - Do not clear wwpn/wwnn values upon ADISC failure. - Move MAPPED nodes to NPR during RSCN processing, but do not unregister with transport. On GIDFT completion, identify missing nodes (not marked NLP_NPR_2B_DISC) and unregister them. - Perform unregistration for nodes that will go through ADISC processing if ADISC completion fails. - Successful ADISC completion will move node back to MAPPED state. Link: https://lore.kernel.org/r/20210707184351.67872-16-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10scsi: lpfc: vmid: Add datastructure for supporting VMID in lpfcGaurav Srivastava
Add the primary datastructures needed to implement VMID in the lpfc driver. Maintain the capability, current state, and hash table for the vmid/appid along with other information. This implementation supports the two versions of vmid implementation (app header and priority tagging). Link: https://lore.kernel.org/r/20210608043556.274139-5-muneendra.kumar@broadcom.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21scsi: lpfc: Fix node handling for Fabric Controller and Domain ControllerJames Smart
During link bounce testing, RPI counts were seen to differ from the number of nodes. For fabric and domain controllers, a temporary RPI is assigned, but the code isn't registering it. If the nodes do go away, such as on link down, the temporary RPI isn't being released. Change the way these two fabric services are managed, make them behave like any other remote port. Register the RPI and register with the transport. Never leave the nodes in a NPR or UNUSED state where their RPI is in limbo. This allows them to follow normal dev_loss_tmo handling, RPI refcounting, and normal removal rules. It also allows fabric I/Os to use the RPI for traffic requests. Note: There is some logic that still has a couple of exceptions when the Domain controller (0xfffcXX). There are cases where the fabric won't have a valid login but will send RDP. Other times, it will it send a LOGO then an RDP. It makes for ad-hoc behavior to manage the node. Exceptions are documented in the code. Link: https://lore.kernel.org/r/20210514195559.119853-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logicJames Smart
SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3 does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have code to issue the command. The command will always fail. Remove the code for the mailbox command and leave only the resulting "failure path" logic. Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotagJames Smart
Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in lpfc_els_free_iocb(). A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the changes was to convert from building/sending an abort within the routine to using a common routine. The reworked routine passes, without modification, the pring ptr to the new common routine. The older routine had logic to check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were passing SLI-3 pointers even when not on an SLI-4 adapter. The new routine is missing this check and adapt, so the SLI-3 ring pointers are being used in SLI-4 paths. Fix by cleaning up the calling routines. In review, there is no need to pass the ring ptr argument to abort_iocb at all. The routine can look at the adapter type itself and reference the proper ring. Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com Fixes: db7531d2b377 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers") Cc: <stable@vger.kernel.org> # v5.11+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Update copyrights for 12.8.0.7 and 12.8.0.8 changesJames Smart
For the files modified in 2021 via the 12.8.0.7 and 12.8.0.8 patch sets, update the copyright for 2021. Link: https://lore.kernel.org/r/20210301171821.3427-23-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04scsi: lpfc: Fix dropped FLOGI during pt2pt discovery recoveryJames Smart
When connected in pt2pt mode, there is a scenario where the remote port significantly delays sending a response to our FLOGI, but acts on the FLOGI it sent us and proceeds to PLOGI/PRLI. The FLOGI ends up timing out and kicks off recovery logic. End result is a lot of unnecessary state changes and lots of discovery messages being logged. Fix by terminating the FLOGI and noop'ing its completion if we have already accepted the remote ports FLOGI and are now processing PLOGI. Link: https://lore.kernel.org/r/20210301171821.3427-13-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Implement health checking when aborting I/OJames Smart
Several errors have occurred where the adapter stops or fails but does not raise the register values for the driver to detect failure. Thus driver is unaware of the failure. The failure typically results in I/O timeouts, the I/O timeout handler failing (after several seconds), and the error handler escalating recovery policy and resulting in more errors. Eventually, the driver is in a position where things have spiraled and it can't do recovery because other recovery ops are still outstanding and it becomes unusable. Resolve the situation by having the I/O timeout handler (actually a els, SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O, perform a mailbox command and look for a response from the hardware. If the mailbox command fails, it will mark the adapter offline and then invoke the adapter reset handler to clean up. The new I/O timeout test will be limited to a test every 5s. If there are multiple I/O timeouts concurrently, only the 1st I/O timeout will generate the mailbox command. Further testing will only occur once a timeout occurs after a 5s delay from the last mailbox command has expired. Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix NVMe recovery after mailbox timeoutJames Smart
If a mailbox command times out, the SLI port is deemed in error and the port is reset. The HBA cleanup is not returning I/Os to the NVMe layer before the port is unregistered. This is due to the HBA being marked offline (!SLI_ACTIVE) and cleanup being done by the mailbox timeout handler rather than an general adapter reset routine. The mailbox timeout handler mailbox handler only cleaned up SCSI I/Os. Fix by reworking the mailbox handler to: - After handling the mailbox error, detect the board is already in failure (may be due to another error), and leave cleanup to the other handler. - If the mailbox command timeout is initial detector of the port error, continue with the board cleanup and marking the adapter offline (!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine. The generic reset adapter routine that is subsequently invoked, will clean up the I/Os. - Have the reset adapter routine flush all NVMe and SCSI I/Os if the adapter has been marked failed (!SLI_ACTIVE). - Rework the NVMe I/O terminate routine to take a status code to fail the I/O with and update so that cleaned up I/O calls the wqe completion routine. Currently it is bypassing the wqe cleanup and calling the NVMe I/O completion directly. The wqe completion routine will take care of data structure and node cleanup then call the NVMe I/O completion handler. Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17scsi: lpfc: Update changed file copyrights for 2020James Smart
Update Copyright in files changed by the 12.8.0.6 patch set to 2020 Link: https://lore.kernel.org/r/20201115192646.12977-18-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlersJames Smart
This patch reworks the abort interfaces such that SLI-3 retains the iocb-based formatting and completions and SLI-4 now uses native WQEs and completion routines. The following changes are made: - The code is refactored from a confusing 2 routine sequence of xx_abort_iotag_issue(), which creates/formats and abort cmd, and xx_issue_abort_tag(), which then issues and handles the completion of the abort cmd - into a single interface of xx_issue_abort_iotag(). The new interface will determine whether SLI-3 or SLI-4 and then call the appropriate handler. A completion handler can now be specified to address the differences in completion handling. Note: original code is all iocb based, with SLI-4 converting to SLI-3 for the SCSI/ELS path, and NVMe natively using wqes. - The SLI-3 side is refactored: The older iocb-base lpfc_sli_issue_abort_iotag() routine is combined with the logic of lpfc_sli_abort_iotag_issue() as well as the iocb-specific code in lpfc_abort_handler() and lpfc_sli_abort_iocb() to create the new single SLI-3 abort routine that formats and issues the iocb. - The SLI-4 side is refactored and added to: The native WQE abort code in NVMe is moved to the new SLI-4 issue_abort_iotag() routine. Items in SCSI that set fields not set by NVMe is migrated into the new routine. Thus the routine supports NVMe and SCSI initiators. The nvmet block (target) formats the abort slightly different (like the old NVMe initiator) thus it has its own prep routine stolen from NVMe initiator and it retains the current code it has for issuing the WQE (does not use the commonized routine the initiators do). SLI-4 completion handlers were also added. - lpfc_abort_handler now becomes a wrapper that determines whether SLI-3 or SLI-4 and calls the proper abort handler. Link: https://lore.kernel.org/r/20201115192646.12977-16-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17scsi: lpfc: Enable common send_io interface for SCSI and NVMeJames Smart
To set up common use by the SCSI and NVMe I/O paths, create a new routine that issues FCP I/O commands which can be used by either protocol. The new routine addresses SLI-3 vs SLI-4 differences within its implementation. Replace the (SLI-3 centric) iocb routine in the SCSI path with this new WQE-centric common routine. Link: https://lore.kernel.org/r/20201115192646.12977-13-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17scsi: lpfc: Enable common wqe_template support for both SCSI and NVMeJames Smart
The driver is currently using SLI-4 WQE templates only for NVMe. Refactor the template and the placement of the service routine so that it can be used by both SCSI and NVMe. Link: https://lore.kernel.org/r/20201115192646.12977-12-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17scsi: lpfc: Rework remote port ref counting and node freeingJames Smart
When a remote port is disconnected and disappears, its node structure (ndlp) stays allocated and on a vport node list. While on the list it can be matched, thus requires validation checks on state to be added in numerous code paths. If the node comes back, its possible for there to be multiple node structures for the same device on the vport node list. There is no reason to keep the node structure around after it is no longer in existence, and the current implementation creates problems for itself (multiple nodes) and lots of unnecessary code for state validation. Additionally, the reference taking on the node structure didn't follow the normal model used by the kernel kref api. It included lots of odd logic to match state with reference count. The combination of this odd logic plus the way it was implicitly used in the discovery engine made its reference taking implementation suspect and extremely hard to follow. Change the driver such that the reference taking routines are now normal ref increments/decrements and callout on refcount=0. With this in place, the rework can be done such that the node structure is fully removed and deallocated when the remote port no longer exists and all references are removed. This removal logic, and the basic ref counting are intrically tied, thus in a single patch. Link: https://lore.kernel.org/r/20201115192646.12977-2-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02scsi: lpfc: Add an internal trace log bufferDick Kennedy
The current logging methods typically end up requesting a reproduction with a different logging level set to figure out what happened. This was mainly by design to not clutter the kernel log messages with things that were typically not interesting and the messages themselves could cause other issues. When looking to make a better system, it was seen that in many cases when more data was wanted was when another message, usually at KERN_ERR level, was logged. And in most cases, what the additional logging that was then enabled was typically. Most of these areas fell into the discovery machine. Based on this summary, the following design has been put in place: The driver will maintain an internal log (256 elements of 256 bytes). The "additional logging" messages that are usually enabled in a reproduction will be changed to now log all the time to the internal log. A new logging level is defined - LOG_TRACE_EVENT. When this level is set (it is not by default) and a message marked as KERN_ERR is logged, all the messages in the internal log will be dumped to the kernel log before the KERN_ERR message is logged. There is a timestamp on each message added to the internal log. However, this timestamp is not converted to wall time when logged. The value of the timestamp is solely to give a crude time reference for the messages. Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-09lpfc: nvmet: Add support for NVME LS request hosthandleJames Smart
As the nvmet layer does not have the concept of a remoteport object, which can be used to identify the entity on the other end of the fabric that is to receive an LS, the hosthandle was introduced. The driver passes the hosthandle, a value representative of the remote port, with a ls request receive. The LS request will create the association. The transport will remember the hosthandle for the association, and if there is a need to initiate a LS request to the remote port for the association, the hosthandle will be used. When the driver loses connectivity with the remote port, it needs to notify the transport that the hosthandle is no longer valid, allowing the transport to terminate associations related to the hosthandle. This patch adds support to the driver for the hosthandle. The driver will use the ndlp pointer of the remote port for the hosthandle in calls to nvmet_fc_rcv_ls_req(). The discovery engine is updated to invalidate the hosthandle whenever connectivity with the remote port is lost. Signed-off-by: Paul Ely <paul.ely@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09lpfc: Refactor NVME LS receive handlingJames Smart
In preparation for supporting both intiator mode and target mode receiving NVME LS's, commonize the existing NVME LS request receive handling found in the base driver and in the nvmet side. Using the original lpfc_nvmet_unsol_ls_event() and lpfc_nvme_unsol_ls_buffer() routines as a templates, commonize the reception of an NVME LS request. The common routine will validate the LS request, that it was received from a logged-in node, and allocate a lpfc_async_xchg_ctx that is used to manage the LS request. The role of the port is then inspected to determine which handler is to receive the LS - nvme or nvmet. As such, the nvmet handler is tied back in. A handler is created in nvme and is stubbed out. Signed-off-by: Paul Ely <paul.ely@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09lpfc: Refactor nvmet_rcv_ctx to create lpfc_async_xchg_ctxJames Smart
To support FC-NVME-2 support (actually FC-NVME (rev 1) with Ammendment 1), both the nvme (host) and nvmet (controller/target) sides will need to be able to receive LS requests. Currently, this support is in the nvmet side only. To prepare for both sides supporting LS receive, rename lpfc_nvmet_rcv_ctx to lpfc_async_xchg_ctx and commonize the definition. Signed-off-by: Paul Ely <paul.ely@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-29scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSIJames Smart
Currently driver ktime stats, measuring code paths, is NVME-specific. Convert the stats routines such that the code paths are generic, providing status for NVME and SCSI. Added ktime stat calls in SCSI queuecommand and cmpl routines. Link: https://lore.kernel.org/r/20200322181304.37655-11-jsmart2021@gmail.com Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-26scsi: lpfc: Fix scsi host template for SLI3 vportsJames Smart
SCSI layer sends driver IOs with more s/g segments than driver can handle. This results in "Too many sg segments from dma_map_sg. Config 64, seg_cnt 219" error messages from the lpfc_scsi_prep_dma_buf_s3() routine. The was due to use the driver using individual templates for pport and vport, host reset enabled or not, nvme vs scsi, etc. In the end, there was a combination for a vport that didn't match the pport. Rather than enumerating more templates and more discretionary assignments, revert to a base template that is copied to a template specific to the pport/vport. Then, based on role, attributes and sli type, modify the fields that are different for that port. Added a log message to lpfc_create_port to validate values. Link: https://lore.kernel.org/r/20200322181304.37655-5-jsmart2021@gmail.com Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-18scsi: lpfc: add RDF registration and Link Integrity FPIN loggingJames Smart
This patch modifies lpfc to register for Link Integrity events via the use of an RDF ELS and to perform Link Integrity FPIN logging. Specifically, the driver was modified to: - Format and issue the RDF ELS immediately following SCR registration. This registers the ability of the driver to receive FPIN ELS. - Adds decoding of the FPIN els into the received descriptors, with logging of the Link Integrity event information. After decoding, the ELS is delivered to the scsi fc transport to be delivered to any user-space applications. - To aid in logging, simple helpers were added to create enum to name string lookup functions that utilize the initialization helpers from the fc_els.h header. - Note: base header definitions for the ELS's don't populate the descriptor payloads. As such, lpfc creates it's own version of the structures, using the base definitions (mostly headers) and additionally declaring the descriptors that will complete the population of the ELS. Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21scsi: lpfc: Fix Fabric hostname registration if system hostname changesJames Smart
There are reports of multiple ports on the same system displaying different hostnames in fabric FDMI displays. Currently, the driver registers the hostname at initialization and obtains the hostname via init_utsname()->nodename queried at the time the FC link comes up. Unfortunately, if the machine hostname is updated after initialization, such as via DHCP or admin command, the value registered initially will be incorrect. Fix by having the driver save the hostname that was registered with FDMI. The driver then runs a heartbeat action that will check the hostname. If the name changes, reregister the FMDI data. The hostname is used in RSNN_NN, FDMI RPA and FDMI RHBA. Link: https://lore.kernel.org/r/20191218235808.31922-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list()James Smart
Compilation can fail due to having an inline function reference where the function body is not present. Fix by removing the inline tag. Fixes: 93a4d6f40198 ("scsi: lpfc: Add registration for CPU Offline/Online events") Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06scsi: lpfc: Add registration for CPU Offline/Online eventsJames Smart
The recent affinitization didn't address cpu offlining/onlining. If an interrupt vector is shared and the low order cpu owning the vector is offlined, as interrupts are managed, the vector is taken offline. This causes the other CPUs sharing the vector will hang as they can't get io completions. Correct by registering callbacks with the system for Offline/Online events. When a cpu is taken offline, its eq, which is tied to an interrupt vector is found. If the cpu is the "owner" of the vector and if the eq/vector is shared by other CPUs, the eq is placed into a polled mode. Additionally, code paths that perform io submission on the "sharing CPUs" will check the eq state and poll for completion after submission of new io to a wq that uses the eq. Similarly, when a cpu comes back online and owns an offlined vector, the eq is taken out of polled mode and rearmed to start driving interrupts for eq. Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30scsi: lpfc: Fix NVMe ABTS in response to receiving an ABTSJames Smart
When the port, running as a nvme target, receives an ABTS, it submits commands to the adapter to Abort i/o outstanding in the adapter. The Abort command formatting routine left a command field set to zero, which instructs the adapter to generate an ABTS on the wire as part of cleaning up the I/O. This is common operation for an initiator, but not for a target. Fix the driver to check whether an ABTS had been received for the I/O, and if so, change the Abort command formatting so that the ABTS generation is disabled (IA=1). No need to ABTS it when the other side already has. Also refactored the code such that there is a single routine being used for nvme or nvmet ABORT requests, and IA is an argument. Link: https://lore.kernel.org/r/20190922035906.10977-11-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-29scsi: lpfc: Remove bg debugfs buffersJames Smart
Capturing and downloading dif command data and dif data was done a dozen years ago and no longer being used. Also creates a potential security hole. Remove the debugfs buffer for dif debugging. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> CC: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> CC: Hannes Reinecke <hare@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pairJames Smart
Currently, each hardware queue, typically allocated per-cpu, consists of a WQ/CQ pair per protocol. Meaning if both SCSI and NVMe are supported 2 WQ/CQ pairs will exist for the hardware queue. Separate queues are unnecessary. The current implementation wastes memory backing the 2nd set of queues, and the use of double the SLI-4 WQ/CQ's means less hardware queues can be supported which means there may not always be enough to have a pair per cpu. If there is only 1 pair per cpu, more cpu's may get their own WQ/CQ. Rework the implementation to use a single WQ/CQ pair by both protocols. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19scsi: lpfc: Fix hang when downloading fw on port enabled for nvmeJames Smart
As part of firmware download, the adapter is reset. On the adapter the reset causes the function to stop and all outstanding io is terminated (without responses). The reset path then starts teardown of the adapter, starting with deregistration of the remote ports with the nvme-fc transport. The local port is then deregistered and the driver waits for local port deregistration. This never finishes. The remote port deregistrations terminated the nvme controllers, causing them to send aborts for all the outstanding io. The aborts were serviced in the driver, but stalled due to its state. The nvme layer then stops to reclaim it's outstanding io before continuing. The io must be returned before the reset on the controller is deemed complete and the controller delete performed. The remote port deregistration won't complete until all the controllers are terminated. And the local port deregistration won't complete until all controllers and remote ports are terminated. Thus things hang. The issue is the reset which stopped the adapter also stopped all the responses that would drive i/o completions, and the aborts were also stopped that stopped i/o completions. The driver, when resetting the adapter like this, needs to be generating the completions as part of the adapter reset so that I/O complete (in error), and any aborts are not queued. Fix by adding flush routines whenever the adapter port has been reset or discovered in error. The flush routines will generate the completions for the scsi and nvme outstanding io. The abort ios, if waiting, will be caught and flushed as well. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs, mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the removal of the osst driver (I heard from Willem privately that he would like the driver removed because all his test hardware has failed). Plus number of minor changes, spelling fixes and other trivia. The big merge conflict this time around is the SPDX licence tags. Following discussion on linux-next, we believe our version to be more accurate than the one in the tree, so the resolution is to take our version for all the SPDX conflicts" Note on the SPDX license tag conversion conflicts: the SCSI tree had done its own SPDX conversion, which in some cases conflicted with the treewide ones done by Thomas & co. In almost all cases, the conflicts were purely syntactic: the SCSI tree used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the treewide conversion had used the new-style ones ("GPL-2.0-only" and "GPL-2.0-or-later"). In these cases I picked the new-style one. In a few cases, the SPDX conversion was actually different, though. As explained by James above, and in more detail in a pre-pull-request thread: "The other problem is actually substantive: In the libsas code Luben Tuikov originally specified gpl 2.0 only by dint of stating: * This file is licensed under GPLv2. In all the libsas files, but then muddied the water by quoting GPLv2 verbatim (which includes the or later than language). So for these files Christoph did the conversion to v2 only SPDX tags and Thomas converted to v2 or later tags" So in those cases, where the spdx tag substantially mattered, I took the SCSI tree conversion of it, but then also took the opportunity to turn the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag. Similarly, when there were whitespace differences or other differences to the comments around the copyright notices, I took the version from the SCSI tree as being the more specific conversion. Finally, in the spdx conversions that had no conflicts (because the treewide ones hadn't been done for those files), I just took the SCSI tree version as-is, even if it was old-style. The old-style conversions are perfectly valid, even if the "-only" and "-or-later" versions are perhaps more descriptive. * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits) scsi: qla2xxx: move IO flush to the front of NVME rport unregistration scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition scsi: qla2xxx: on session delete, return nvme cmd scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1 scsi: megaraid_sas: Introduce various Aero performance modes scsi: megaraid_sas: Use high IOPS queues based on IO workload scsi: megaraid_sas: Set affinity for high IOPS reply queues scsi: megaraid_sas: Enable coalescing for high IOPS queues scsi: megaraid_sas: Add support for High IOPS queues scsi: megaraid_sas: Add support for MPI toolbox commands scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD scsi: megaraid_sas: Handle sequence JBOD map failure at driver level scsi: megaraid_sas: Don't send FPIO to RL Bypass queue scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout scsi: megaraid_sas: Call disable_irq from process IRQ poll scsi: megaraid_sas: Remove few debug counters from IO path ...
2019-06-21lpfc: add support for translating an RSCN rcv into a discovery rescanJames Smart
This patch updates RSCN receive processing to check for the remote port being an NVME port, and if so, invoke the nvme_fc callback to rescan the remote port. The rescan will generate a discovery udev event. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-06-21lpfc: add support to generate RSCN events for nportJames Smart
This patch adds general RSCN support: - The ability to transmit an RSCN to the port on the other end of the link (regular port if pt2pt, or fabric controller if fabric). - And general recognition of an RSCN ELS when an ELS is received. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-06-18scsi: lpfc: Separate CQ processing for nvmet_fc upcallsJames Smart
Currently the driver is notified of new command frame receipt by CQEs. As part of the CQE processing, the driver upcalls the nvmet_fc transport to deliver the command. nvmet_fc, as part of receiving the command builds out a context for it, where one of the first steps is to allocate memory for the io. When running with tests that do large ios (1MB), it was found on some systems, the total number of outstanding I/O's, at 1MB per, completely consumed the system's memory. Thus additional ios were getting blocked in the memory allocator. Given that this blocked the lpfc thread processing CQEs, there were lots of other commands that were received and which are then held up, and given CQEs are serially processed, the aggregate delays for an IO waiting behind the others became cummulative - enough so that the initiator hit timeouts for the ios. The basic fix is to avoid the direct upcall and instead schedule a work item for each io as it is received. This allows the cq processing to complete very quickly, and each io can then run or block on it's own. However, this general solution hurts latency when there are few ios. As such, implemented the fix such that the driver watches how many CQEs it has processed sequentially in one run. As long as the count is below a threshold, the direct nvmet_fc upcall will be made. Only when the count is exceeded will it revert to work scheduling. Given that debug of this showed a surprisingly long delay in cq processing, the io timer stats were updated to better reflect the processing of the different points. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Update 12.2.0.0 file copyrights to 2019James Smart
For files modified as part of 12.2.0.0 patches, update copyright to 2019 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queuesJames Smart
So far MSIX vector allocation assumed it would be 1:1 with hardware queues. However, there are several reasons why fewer MSIX vectors may be allocated than hardware queues such as the platform being out of vectors or adapter limits being less than cpu count. This patch reworks the MSIX/EQ relationships with the per-cpu hardware queues so they can function independently. MSIX vectors will be equitably split been cpu sockets/cores and then the per-cpu hardware queues will be mapped to the vectors most efficient for them. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Adapt partitioned XRI lists to efficient sharingJames Smart
The XRI get/put lists were partitioned per hardware queue. However, the adapter rarely had sufficient resources to give a large number of resources per queue. As such, it became common for a cpu to encounter a lack of XRI resource and request the upper io stack to retry after returning a BUSY condition. This occurred even though other cpus were idle and not using their resources. Create as efficient a scheme as possible to move resources to the cpus that need them. Each cpu maintains a small private pool which it allocates from for io. There is a watermark that the cpu attempts to keep in the private pool. The private pool, when empty, pulls from a global pool from the cpu. When the cpu's global pool is empty it will pull from other cpu's global pool. As there many cpu global pools (1 per cpu or hardware queue count) and as each cpu selects what cpu to pull from at different rates and at different times, it creates a radomizing effect that minimizes the number of cpu's that will contend with each other when the steal XRI's from another cpu's global pool. On io completion, a cpu will push the XRI back on to its private pool. A watermark level is maintained for the private pool such that when it is exceeded it will move XRI's to the CPU global pool so that other cpu's may allocate them. On NVME, as heartbeat commands are critical to get placed on the wire, a single expedite pool is maintained. When a heartbeat is to be sent, it will allocate an XRI from the expedite pool rather than the normal cpu private/global pools. On any io completion, if a reduction in the expedite pools is seen, it will be replenished before the XRI is placed on the cpu private pool. Statistics are added to aid understanding the XRI levels on each cpu and their behaviors. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting.James Smart
SLI4 nvme functions are passing the SLI3 ring number when posting wqe to hardware. This should be indicating the hardware queue to use, not the ring number. Replace ring number with the hardware queue that should be used. Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb routine that properly adapts. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Partition XRI buffer list across Hardware QueuesJames Smart
Once the IO buff allocations were made shared, there was a single XRI buffer list shared by all hardware queues. A single list isn't great for performance when shared across the per-cpu hardware queues. Create a separate XRI IO buffer get/put list for each Hardware Queue. As SGLs and associated IO buffers get allocated/posted to the firmware; round robin their assignment across all available hardware Queues so that there is an equitable assignment. Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic for XRI allocation. Add a debugfs interface to display hardware queue statistics Added new empty_io_bufs counter to track if a cpu runs out of XRIs. Replace common_ variables/names with io_ to make meanings clearer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Remove extra vector and SLI4 queue for ExpresslaneJames Smart
There is a extra queue and msix vector for expresslane. Now that the driver will be doing queues per cpu, this oddball queue is no longer needed. Expresslane will utilize the normal per-cpu queues. Updated debugfs sli4 queue output to go along with the change Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05scsi: lpfc: Implement common IO buffers between NVME and SCSIJames Smart
Currently, both NVME and SCSI get their IO buffers from separate pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split between protocols. Eliminate the independent pools and use a single pool. Each buffer structure now has a common section and a protocol section. Per protocol routines for SGL initialization are removed and replaced by common routines. Initialization of the buffers is only done on the common area. All other fields, which are protocol specific, are initialized when the buffer is allocated for use in the per-protocol allocation routine. In the past, the SCSI side allocated IO buffers as part of slave_alloc calls until the maximum XRIs for SCSI was reached. As all XRIs are now common and may be used for either protocol, allocation for everything is done as part of adapter initialization and the scsi side has no action in slave alloc. As XRI's are no longer split, the lpfc_xri_split module parameter is removed. Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put routines. All SLI4 adapters utilize the new IO buffer scheme Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19scsi: lpfc: Adding ability to reset chip via pci bus resetJames Smart
This patch adds a "pci_bus_reset" option to the board_mode sysfs attribute. This option uses the pci_reset_bus() api to reset the PCIe link the adapter is on, which will reset the chip/adapter. Prior to issuing this option, all functions on the same chip must be placed in the offline state by the admin. After the reset, all of the instances may be brought online again. The primary purpose of this functionality is to support cases where firmware update required a chip reset but the admin did not want to reboot the machine in order to instantiate the firmware update. Sanity checks take place prior to the reset to ensure the adapter is the sole entity on the PCIe bus and that all functions are in the offline state. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07scsi: lpfc: Fix driver release of fw-logging buffersJames Smart
On driver termination, after the driver stops fw logging by writing a register on the chip, the driver immediately unmaps and frees the logging buffer, without confirming in any way that the chip has received the write and terminated the logging. As termination on the chip is not immediate, the chip may issue a dma request to the now unmapped dma buffer, resulting in a iommu fault. Change the driver to receive a confirmation that logging ahs been terminated. As the driver always issues an SLI reset with the device as part of shutdown, and as part of that is receiving confirmation that the reset is complete - the driver was modified to perform the write to disable fw logging prior to the SLI reset and only free the fw log buffer after the SLI reset is complete. That guarantees use of the fw log buffer is fully terminated when it is unmapped. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07scsi: lpfc: Fix discovery failures during port failovers with lots of vportsJames Smart
The driver is getting hit with 100s of RSCNs during remote port address changes. Each of those RSCN's ends up generating UNREG_RPI and REG_PRI mailbox commands. The discovery engine within the driver doesn't wait for the mailbox command completions. Instead it sets state flags and moves forward. At some point, there's a massive backlog of mailbox commands which take time for the adapter to process. Additionally, it appears there were duplicate events from the switch so the driver generated duplicate mailbox commands for the same remote port. During this window, failures on PLOGI and PRLI ELS's are see as the adapter is rejecting them as they are for remote ports that still have pending mailbox commands. Streamline the discovery engine so that PLOGI log checks for outstanding UNREG_RPIs and defer the processing until the commands complete. This better synchronizes the ELS transmission vs the RPI registrations. Filter out multiple UNREG_RPIs being queued up for the same remote port. Beef up log messages in this area. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06scsi: lpfc: Implement GID_PT on Nameserver query to support faster failoverJames Smart
The switches seem to respond faster to GID_PT vs GID_FT NameServer queries. Add support for GID_PT to be used over GID_FT to enable faster storage failover detection. Includes addition of new module parameter to select between GID_PT and GID_FT (GID_FT is default). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>