summaryrefslogtreecommitdiff
path: root/drivers/cxl/core/pci.c
AgeCommit message (Collapse)Author
2024-05-08cxl: Add post-reset warning if reset results in loss of previously committed ↵Dave Jiang
HDM decoders Secondary Bus Reset (SBR) is equivalent to a device being hot removed and inserted again. Doing a SBR on a CXL type 3 device is problematic if the exported device memory is part of system memory that cannot be offlined. The event is equivalent to violently ripping out that range of memory from the kernel. While the hardware requires the "Unmask SBR" bit set in the Port Control Extensions register and the kernel currently does not unmask it, user can unmask this bit via setpci or similar tool. The driver does not have a way to detect whether a reset coming from the PCI subsystem is a Function Level Reset (FLR) or SBR. The only way to detect is to note if a decoder is marked as enabled in software but the decoder control register indicates it's not committed. Add a helper function to find discrepancy between the decoder software state versus the hardware register state. Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240502165851.1948523-6-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-08PCI/CXL: Move CXL Vendor ID to pci_ids.hDave Jiang
Move PCI_DVSEC_VENDOR_ID_CXL in CXL private code to PCI_VENDOR_ID_CXL in pci_ids.h in order to be utilized in PCI subsystem. While the CXL Vendor ID (0x1e98) is not listed in the PCI SIG "Member Companies" database at https://pcisig.com/membership/member-companies, the SIG has confirmed that it is reserved by CXL. Link: https://lore.kernel.org/r/20240502165851.1948523-2-dave.jiang@intel.com Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Link: https://lore.kernel.org/linux-cxl/20240402172323.GA1818777@bhelgaas/ Signed-off-by: Dave Jiang <dave.jiang@intel.com> [bhelgaas: update commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-03-13lib/firmware_table: Provide buffer length argument to cdat_table_parse()Robert Richter
There exist card implementations with a CDAT table using a fixed size buffer, but with entries filled in that do not fill the whole table length size. Then, the last entry in the CDAT table may not mark the end of the CDAT table buffer specified by the length field in the CDAT header. It can be shorter with trailing unused (zero'ed) data. The actual table length is determined while reading all CDAT entries of the table with DOE. If the table is greater than expected (containing zero'ed trailing data), the CDAT parser fails with: [ 48.691717] Malformed DSMAS table length: (24:0) [ 48.702084] [CDAT:0x00] Invalid zero length [ 48.711460] cxl_port endpoint1: Failed to parse CDAT: -22 In addition, a check of the table buffer length is missing to prevent an out-of-bound access then parsing the CDAT table. Hardening code against device returning borked table. Fix that by providing an optional buffer length argument to acpi_parse_entries_array() that can be used by cdat_table_parse() to propagate the buffer size down to its users to check the buffer length. This also prevents a possible out-of-bound access mentioned. Add a check to warn about a malformed CDAT table length. Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Len Brown <lenb@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/ZdEnopFO0Tl3t2O1@rric.localdomain Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/pci: Get rid of pointer arithmetic reading CDAT tableRobert Richter
Reading the CDAT table using DOE requires a Table Access Response Header in addition to the CDAT entry. In current implementation this has caused offsets with sizeof(__le32) to the actual buffers. This led to hardly readable code and even bugs. E.g., see fix of devm_kfree() in read_cdat_data(): commit c65efe3685f5 ("cxl/cdat: Free correct buffer on checksum error") Rework code to avoid calculations with sizeof(__le32). Introduce struct cdat_doe_rsp for this which contains the Table Access Response Header and a variable payload size for various data structures afterwards to access the CDAT table and its CDAT Data Structures without recalculating buffer offsets. Cc: Lukas Wunner <lukas@wunner.de> Cc: Fan Ni <nifan.cxl@gmail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240216155844.406996-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/pci: Rename DOE mailbox handle to doe_mbRobert Richter
Trivial variable rename for the DOE mailbox handle from cdat_doe to doe_mb. The variable name cdat_doe is too ambiguous, use doe_mb that is commonly used for the mailbox. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240216155844.406996-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS windowRobert Richter
The Linux CXL subsystem is built on the assumption that HPA == SPA. That is, the host physical address (HPA) the HDM decoder registers are programmed with are system physical addresses (SPA). During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1, 8.1.3.8) are checked if the memory is enabled and the CXL range is in a HPA window that is described in a CFMWS structure of the CXL host bridge (cxl-3.1, 9.18.1.3). Now, if the HPA is not an SPA, the CXL range does not match a CFMWS window and the CXL memory range will be disabled then. The HDM decoder stops working which causes system memory being disabled and further a system hang during HDM decoder initialization, typically when a CXL enabled kernel boots. Prevent a system hang and do not disable the HDM decoder if the decoder's CXL range is not found in a CFMWS window. Note the change only fixes a hardware hang, but does not implement HPA/SPA translation. Support for this can be added in a follow on patch series. Signed-off-by: Robert Richter <rrichter@amd.com> Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240216160113.407141-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-29cxl/pci: Skip to handle RAS errors if CXL.mem device is detachedLi Ming
The PCI AER model is an awkward fit for CXL error handling. While the expectation is that a PCI device can escalate to link reset to recover from an AER event, the same reset on CXL amounts to a surprise memory hotplug of massive amounts of memory. At present, the CXL error handler attempts some optimistic error handling to unbind the device from the cxl_mem driver after reaping some RAS register values. This results in a "hopeful" attempt to unplug the memory, but there is no guarantee that will succeed. A subsequent AER notification after the memdev unbind event can no longer assume the registers are mapped. Check for memdev bind before reaping status register values to avoid crashes of the form: BUG: unable to handle page fault for address: ffa00000195e9100 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page [...] RIP: 0010:__cxl_handle_ras+0x30/0x110 [cxl_core] [...] Call Trace: <TASK> ? __die+0x24/0x70 ? page_fault_oops+0x82/0x160 ? kernelmode_fixup_or_oops+0x84/0x110 ? exc_page_fault+0x113/0x170 ? asm_exc_page_fault+0x26/0x30 ? __pfx_dpc_reset_link+0x10/0x10 ? __cxl_handle_ras+0x30/0x110 [cxl_core] ? find_cxl_port+0x59/0x80 [cxl_core] cxl_handle_rp_ras+0xbc/0xd0 [cxl_core] cxl_error_detected+0x6c/0xf0 [cxl_core] report_error_detected+0xc7/0x1c0 pci_walk_bus+0x73/0x90 pcie_do_recovery+0x23f/0x330 Longer term, the unbind and PCI_ERS_RESULT_DISCONNECT behavior might need to be replaced with a new PCI_ERS_RESULT_PANIC. Fixes: 6ac07883dbb5 ("cxl/pci: Add RCH downstream port error logging") Cc: stable@vger.kernel.org Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Ming <ming4.li@intel.com> Link: https://lore.kernel.org/r/20240129131856.2458980-1-ming4.li@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-22cxl: Calculate and store PCI link latency for the downstream portsDave Jiang
The latency is calculated by dividing the flit size over the bandwidth. Add support to retrieve the flit size for the CXL switch device and calculate the latency of the PCIe link. Cache the latency number with cxl_dport. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319621931.2212653.6800240203604822886.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-08cxl/cdat: Free correct buffer on checksum errorIra Weiny
The new 6.7-rc1 kernel now checks the checksum on CDAT data. While using a branch of Fan's DCD qemu work (and specifying DCD devices), the following splat was observed. WARNING: CPU: 1 PID: 1384 at drivers/base/devres.c:1064 devm_kfree+0x4f/0x60 ... RIP: 0010:devm_kfree+0x4f/0x60 ... ? devm_kfree+0x4f/0x60 read_cdat_data+0x1a0/0x2a0 [cxl_core] cxl_port_probe+0xdf/0x200 [cxl_port] ... The issue in qemu is still unknown but the spat is a straight forward bug in the CDAT checksum processing code. Use a CDAT buffer variable to ensure the devm_free() works correctly on error. Fixes: 670e4e88f3b1 ("cxl: Add checksum verification to CDAT from CXL") Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: http://lore.kernel.org/r/20231116-fix-cdat-devm-free-v1-1-b148b40707d7@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-11-02cxl/pci: Change CXL AER support check to use native AERTerry Bowman
Native CXL protocol errors are delivered to the OS through AER reporting. The owner of AER owns CXL Protocol error management with respect to _OSC negotiation.[1] CXL device errors are handled by a separate interrupt with native control gated by _OSC control field 'CXL Memory Error Reporting Control'. The CXL driver incorrectly checks for 'CXL Memory Error Reporting Control' before accessing AER registers and caching RCH downport AER registers. Replace the current check in these 2 cases with native AER checks. [1] CXL 3.0 - 9.17.2 CXL _OSC, Table-9-26, Interpretation of CXL _OSC Support Fields, p.641 Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Link: https://lore.kernel.org/r/20231102155232.1421261-1-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-31Merge branch 'for-6.7/cxl-qtg' into cxl/nextDan Williams
Merge some prep-work for CXL QOS class support. This cycle saw large collisions with mm on this topic, so the bulk of this topic needs to wait.
2023-10-27cxl: Add support for reading CXL switch CDAT tableDave Jiang
Add read_cdat_data() call in cxl_switch_port_probe() to allow reading of CDAT data for CXL switches. read_cdat_data() needs to be adjusted for the retrieving of the PCIe device depending on if the passed in port is endpoint or switch. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169713682855.2205276.6418370379144967443.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27cxl: Add checksum verification to CDAT from CXLDave Jiang
A CDAT table is available from a CXL device. The table is read by the driver and cached in software. With the CXL subsystem needing to parse the CDAT table, the checksum should be verified. Add checksum verification after the CDAT table is read from device. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169713682277.2205276.2687265961314933628.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27cxl/pci: Disable root port interrupts in RCH modeTerry Bowman
The RCH root port contains root command AER registers that should not be enabled.[1] Disable these to prevent root port interrupts. [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-17-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27cxl/pci: Add RCH downstream port error loggingTerry Bowman
RCH downstream port error logging is missing in the current CXL driver. The missing AER and RAS error logging is needed for communicating driver error details to userspace. Update the driver to include PCIe AER and CXL RAS error logging. Add RCH downstream port error handling into the existing RCiEP handler. The downstream port error handler is added to the RCiEP error handler because the downstream port is implemented in a RCRB, is not PCI enumerable, and as a result is not directly accessible to the PCI AER root port driver. The AER root port driver calls the RCiEP handler for handling RCD errors and RCH downstream port protocol errors. Update existing RCiEP correctable and uncorrectable handlers to also call the RCH handler. The RCH handler will read the RCH AER registers, check for error severity, and if an error exists will log using an existing kernel AER trace routine. The RCH handler will also log downstream port RAS errors if they exist. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-16-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27cxl/pci: Map RCH downstream AER registers for logging protocol errorsTerry Bowman
The restricted CXL host (RCH) error handler will log protocol errors using AER and RAS status registers. The AER and RAS registers need to be virtually memory mapped before enabling interrupts. Create the initializer function devm_cxl_setup_parent_dport() for this when the endpoint is connected with the dport. The initialization sets up the RCH RAS and AER mappings. Add 'struct cxl_regs' to 'struct cxl_dport' for saving a pointer to the RCH downstream port's AER and RAS registers. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-15-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27cxl/pci: Update CXL error logging to use RAS register addressTerry Bowman
The CXL error handler currently only logs endpoint RAS status. The CXL topology includes several components providing RAS details to be logged during error handling.[1] Update the current handler's RAS logging to use a RAS register address. Also, update the error handler function names to be consistent with correctable and uncorrectable RAS. This will allow for adding support to log other CXL component's RAS details in the future. [1] CXL3.0 Table 8-22 CXL_Capability_ID Assignment Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-14-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27cxl/pci: Add RCH downstream port AER register discoveryRobert Richter
Restricted CXL host (RCH) downstream port AER information is not currently logged while in the error state. One problem preventing the error logging is the AER and RAS registers are not accessible. The CXL driver requires changes to find RCH downstream port AER and RAS registers for purpose of error logging. RCH downstream ports are not enumerated during a PCI bus scan and are instead discovered using system firmware, ACPI in this case.[1] The downstream port is implemented as a Root Complex Register Block (RCRB). The RCRB is a 4k memory block containing PCIe registers based on the PCIe root port.[2] The RCRB includes AER extended capability registers used for reporting errors. Note, the RCH's AER Capability is located in the RCRB memory space instead of PCI configuration space, thus its register access is different. Existing kernel PCIe AER functions can not be used to manage the downstream port AER capabilities and RAS registers because the port was not enumerated during PCI scan and the registers are not PCI config accessible. Discover RCH downstream port AER extended capability registers. Use MMIO accesses to search for extended AER capability in RCRB register space. [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-12-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25Merge branch 'for-6.5/cxl-rch-eh' into for-6.5/cxlDan Williams
Pick up the first half of the RCH error handling series. The back half needs some fixups for test regressions. Small conflicts with the PMU work around register enumeration and setup helpers.
2023-06-25Revert "cxl/port: Enable the HDM decoder capability for switch ports"Dan Williams
commit eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports") ...was added on the observation of CXL memory not being accessible after setting up a region on a "cold-plugged" device. A "cold-plugged" CXL device is one that was not present at boot, so platform-firmware/BIOS has no chance to set it up. While it is true that the debug found the enable bit clear in the host-bridge's instance of the global control register (CXL 3.0 8.2.4.19.2 CXL HDM Decoder Global Control Register), that bit is described as: "This bit is only applicable to CXL.mem devices and shall return 0 on CXL Host Bridges and Upstream Switch Ports." So it is meant to be zero, and further testing confirmed that this "fix" had no effect on the failure. Revert it, and be more vigilant about proposed fixes in the future. Since the original copied stable@, flag this revert for stable@ as well. Cc: <stable@vger.kernel.org> Fixes: eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168685882012.3475336.16733084892658264991.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25cxl: Rename 'uport' to 'uport_dev'Dan Williams
For symmetry with the recent rename of ->dport_dev for a 'struct cxl_dport', add the "_dev" suffix to the ->uport property of a 'struct cxl_port'. These devices represent the downstream-port-device and upstream-port-device respectively in the CXL/PCIe topology. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-6-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-05-18cxl: Wait Memory_Info_Valid before access memory related infoDave Jiang
The Memory_Info_Valid bit (CXL 3.0 8.1.3.8.2) indicates that the CXL Range Size High and Size Low registers are valid. The bit must be set within 1 second of reset deassertion to the device. Check valid bit before we check the Memory_Active bit when waiting for cxl_await_media_ready() to ensure that the memory info is valid for consumption. Also ensures both DVSEC ranges 1 and 2 are ready if DVSEC Capability indicates they are both supported. Fixes: 523e594d9cc0 ("cxl/pci: Implement wait for media active") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168444687469.3134781.11033518965387297327.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-05-18cxl/port: Enable the HDM decoder capability for switch portsDan Williams
Derick noticed, when testing hot plug, that hot-add behaves nominally after a removal. However, if the hot-add is done without a prior removal, CXL.mem accesses fail. It turns out that the original implementation of the port driver and region programming wrongly assumed that platform-firmware always enables the host-bridge HDM decoder capability. Add support turning on switch-level HDM decoders in the case where platform-firmware has not. The implementation is careful to only arrange for the enable to be undone if the current instance of the driver was the one that did the enable. This is to interoperate with platform-firmware that may expect CXL.mem to remain active after the driver is shutdown. This comes at the cost of potentially not shutting down the enable on kexec flows, but it is mitigated by the fact that the related HDM decoders still need to be enabled on an individual basis. Cc: <stable@vger.kernel.org> Reported-by: Derick Marks <derick.w.marks@intel.com> Fixes: 54cdbf845cf7 ("cxl/port: Add a driver for 'struct cxl_port' objects") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/168437998331.403037.15719879757678389217.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-05-13cxl: Add missing return to cdat read error pathDave Jiang
Add a return to the error path when cxl_cdat_read_table() fails. Current code continues with the table pointer points to freed memory. Fixes: 7a877c923995 ("cxl/pci: Simplify CDAT retrieval error path") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/168382793506.3510737.4792518576623749076.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-22cxl/port: Fix port to pci device assumptions in read_cdat_data()Dan Williams
Not all endpoint CXL ports are associated with PCI devices. The cxl_test infrastructure models 'struct cxl_port' instances hosted by platform devices. Teach read_cdat_data() to be careful about non-pci hosted cxl_memdev instances. Otherwise, cxl_test crashes with this signature: RIP: 0010:xas_start+0x6d/0x290 [..] Call Trace: <TASK> xas_load+0xa/0x50 xas_find+0x25b/0x2f0 xa_find+0x118/0x1d0 pci_find_doe_mailbox+0x51/0xc0 read_cdat_data+0x45/0x190 [cxl_core] cxl_port_probe+0x10a/0x1e0 [cxl_port] cxl_bus_probe+0x17/0x50 [cxl_core] Some other cleanups are included like removing the single-use @uport variable, and removing the indirection through 'struct cxl_dev_state' to lookup the device that registered the memdev and may be a pci device. Fixes: af0a6c3587dc ("cxl/pci: Use CDAT DOE mailbox created by PCI core") Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/168213190748.708404.16215095414060364800.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18cxl/pci: Rightsize CDAT response allocationLukas Wunner
Jonathan notes that cxl_cdat_get_length() and cxl_cdat_read_table() allocate 32 dwords for the DOE response even though it may be smaller. In the case of cxl_cdat_get_length(), only the second dword of the response is of interest (it contains the length). So reduce the allocation to 2 dwords and let DOE discard the remainder. In the case of cxl_cdat_read_table(), a correctly sized allocation for the full CDAT already exists. Let DOE write each table entry directly into that allocation. There's a snag in that the table entry is preceded by a Table Access Response Header (1 dword, CXL 3.0 table 8-14). Save the last dword of the previous table entry, let DOE overwrite it with the header of the next entry and restore it afterwards. The resulting CDAT is preceded by 4 unavoidable useless bytes. Increase the allocation size accordingly. The buffer overflow check in cxl_cdat_read_table() becomes unnecessary because the remaining bytes in the allocation are tracked in "length", which is passed to DOE and limits how many bytes it writes to the allocation. Additionally, cxl_cdat_read_table() bails out if the DOE response is truncated due to insufficient space. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/7a4e1f86958a79a70f29b96a92199522f08f8322.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18cxl/pci: Simplify CDAT retrieval error pathDave Jiang
The cdat.table and cdat.length fields in struct cxl_port are set before CDAT retrieval and must therefore be unset on failure. Simplify by setting only on success. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: https://lore.kernel.org/linux-cxl/20230209113422.00007017@Huawei.com/ Signed-off-by: Dave Jiang <dave.jiang@intel.com> [lukas: rebase and rephrase commit message] Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/7a5c7104fb6a3016dbaec1c5d0ed34619ea11a0c.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18cxl/pci: Use CDAT DOE mailbox created by PCI coreLukas Wunner
The PCI core has just been amended to create a pci_doe_mb struct for every DOE instance on device enumeration. Drop creation of a (duplicate) CDAT DOE mailbox on cxl probing in favor of the one already created by the PCI core. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/becaf70e8faf9681d474200117d62d7eaac46cca.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18cxl/pci: Use synchronous API for DOELukas Wunner
A synchronous API for DOE has just been introduced. Convert CXL CDAT retrieval over to it. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/c329c0a21c11c3b524ce2336b0bbb3c80a28c415.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-03cxl/pci: Handle excessive CDAT lengthLukas Wunner
If the length in the CDAT header is larger than the concatenation of the header and all table entries, then the CDAT exposed to user space contains trailing null bytes. Not every consumer may be able to handle that. Per Postel's robustness principle, "be liberal in what you accept" and silently reduce the cached length to avoid exposing those null bytes. Fixes: c97006046c79 ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Link: https://lore.kernel.org/r/6d98b3c7da5343172bd3ccabfabbc1f31c079d74.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-03cxl/pci: Handle truncated CDAT entriesLukas Wunner
If truncated CDAT entries are received from a device, the concatenation of those entries constitutes a corrupt CDAT, yet is happily exposed to user space. Avoid by verifying response lengths and erroring out if truncation is detected. The last CDAT entry may still be truncated despite the checks introduced herein if the length in the CDAT header is too small. However, that is easily detectable by user space because it reaches EOF prematurely. A subsequent commit which rightsizes the CDAT response allocation closes that remaining loophole. The two lines introduced here which exceed 80 chars are shortened to less than 80 chars by a subsequent commit which migrates to a synchronous DOE API and replaces "t.task.rv" by "rc". The existing acpi_cdat_header and acpi_table_cdat struct definitions provided by ACPICA cannot be used because they do not employ __le16 or __le32 types. I believe that cannot be changed because those types are Linux-specific and ACPI is specified for little endian platforms only, hence doesn't care about endianness. So duplicate the structs. Fixes: c97006046c79 ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Link: https://lore.kernel.org/r/bce3aebc0e8e18a1173425a7a865b232c3912963.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-03cxl/pci: Handle truncated CDAT headerLukas Wunner
cxl_cdat_get_length() only checks whether the DOE response size is sufficient for the Table Access response header (1 dword), but not the succeeding CDAT header (1 dword length plus other fields). It thus returns whatever uninitialized memory happens to be on the stack if a truncated DOE response with only 1 dword was received. Fix it. Fixes: c97006046c79 ("cxl/port: Read CDAT table") Reported-by: Ming Li <ming4.li@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/000e69cd163461c8b1bc2cf4155b6e25402c29c7.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-03-21cxl/pci: Fix CDAT retrieval on big endianLukas Wunner
The CDAT exposed in sysfs differs between little endian and big endian arches: On big endian, every 4 bytes are byte-swapped. PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors such as pci_read_config_dword() implicitly swap bytes on big endian. That way, the macros in include/uapi/linux/pci_regs.h work regardless of the arch's endianness. For an example of implicit byte-swapping, see ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx (Load Word Byte-Reverse Indexed). DOE Read/Write Data Mailbox Registers are unlike other registers in Configuration Space in that they contain or receive a 4 byte portion of an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f). They need to be copied to or from the request/response buffer verbatim. So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit byte-swapping. The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume implicit byte-swapping. Byte-swap requests after constructing them with those macros and byte-swap responses before parsing them. Change the request and response type to __le32 to avoid sparse warnings. Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for consistency. Fixes: c97006046c79 ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-16Merge branch 'for-6.3/cxl-events' into cxl/nextDan Williams
Include some additional fixups for event support for v6.3, namely, rationalize the identifiers in the trace output and fixup a kdoc comment.
2023-02-16cxl/trace: Standardize device information outputIra Weiny
The trace points were written to take a struct device input for the trace. In CXL multiple device objects are associated with each CXL hardware device. Using different device objects in the trace point can lead to confusion for users. The PCIe device is nice to have, but the user space tooling relies on the memory device naming. It is better to have those device names reported. Change all trace points to take struct cxl_memdev as a standard and report that name. Furthermore, standardize on the name 'memdev' in both /sys/kernel/tracing/trace and cxl-cli monitor output. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20230208-cxl-event-names-v2-1-fca130c2c68b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14Merge branch 'for-6.3/cxl-rr-emu' into cxl/nextDan Williams
Pick up the CXL DVSEC range register emulation for v6.3, and resolve conflicts with the cxl_port_probe() split (from for-6.3/cxl-ram-region) and event handling (from for-6.3/cxl-events).
2023-02-14cxl/pci: Remove locked check for dvsec_range_allowed()Dave Jiang
Remove the CXL_DECODER_F_LOCK check to be permissive of platform BIOSes that allow CXL.mem to be remapped. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640370085.935665.13128321011001358077.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decodersDave Jiang
CXL rev3 spec 8.1.3 RCDs may not have HDM register blocks. Create a fake HDM with information from the CXL PCIe DVSEC registers. The decoder count will be set to the HDM count retrieved from the DVSEC cap register. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640368994.935665.15831225724059704620.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14cxl/hdm: Emulate HDM decoder from DVSEC range registersDave Jiang
In the case where HDM decoder register block exists but is not programmed and at the same time the DVSEC range register range is active, populate the CXL decoder object 'cxl_decoder' with info from DVSEC range registers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640368454.935665.13806415120298330717.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14cxl/pci: Refactor cxl_hdm_decode_init()Dave Jiang
With the previous refactoring of DVSEC range registers out of cxl_hdm_decode_init(), it basically becomes a skeleton function. Squash __cxl_hdm_decode_init() with cxl_hdm_decode_init() to simplify the code. cxl_hdm_decode_init() now returns more error codes than just -EBUSY. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640367916.935665.12898404758336059003.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14cxl/port: Export cxl_dvsec_rr_decode() to cxl_portDave Jiang
Call cxl_dvsec_rr_decode() in the beginning of cxl_port_probe() and preserve the decoded information in a local 'struct cxl_endpoint_dvsec_info'. This info can be passed to various functions later on in order to support the HDM decoder emulation. The invocation of cxl_dvsec_rr_decode() in cxl_hdm_decode_init() is removed and a pointer to the 'struct cxl_endpoint_dvsec_info' is passed in. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640367377.935665.2848747799651019676.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14cxl/pci: Break out range register decoding from cxl_hdm_decode_init()Dave Jiang
There are 2 scenarios that requires additional handling. 1. A device that has active ranges in DVSEC range registers (RR) but no HDM decoder register block. 2. A device that has both RR active and HDM, but the HDM decoders are not programmed. The goal is to create emulated decoder software structs based on the RR. Move the CXL DVSEC range register decoding code block from cxl_hdm_decode_init() to its own function. Refactor code in preparation for the HDM decoder emulation. There is no functionality change to the code. Name the new function to cxl_dvsec_rr_decode(). The only change is to set range->start and range->end to CXL_RESOURCE_NONE and skipping the reading of base registers if the range size is 0, which equates to range not active. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640366839.935665.11816388524993234329.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-10Merge branch 'for-6.3/cxl-ram-region' into cxl/nextDan Williams
Include the support for enumerating and provisioning ram regions for v6.3. This also include a default policy change for ram / volatile device-dax instances to assign them to the dax_kmem driver by default.
2023-02-10kernel/range: Uplevel the cxl subsystem's range_contains() helperDan Williams
In support of the CXL subsystem's use of 'struct range' to track decode address ranges, add a common range_contains() implementation with identical semantics as resource_contains(); The existing 'range_contains()' in lib/stackinit_kunit.c is namespaced with a 'stackinit_' prefix. Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601998163.1924368.6067392174077323935.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-07Merge branch 'for-6.3/cxl' into cxl/nextDan Williams
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3. Resolve a conflict with the fix and move of cxl_report_and_clear() from pci.c to core/pci.c.
2023-01-04cxl/pci: Move tracepoint definitions to drivers/cxl/core/Dan Williams
CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This also organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders, however that would also entail a move away from the expander-specific cxl_dev_state context, save that for later. Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system, however, it is unlikely that a single platform will ever load both drivers simultaneously. Cc: Steven Rostedt <rostedt@goodmis.org> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-05Merge branch 'for-6.2/cxl-aer' into for-6.2/cxlDan Williams
Pick up CXL AER handling and correctable error extensions. Resolve conflicts with cxl_pmem_wq reworks and RCH support.
2022-12-03cxl/core/regs: Make cxl_map_{component, device}_regs() device genericDan Williams
There is no need to carry the barno and the block offset through the stack, just convert them to a resource base immediately. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974411035.1608150.8605988708101648442.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14cxl: Unify debug messages when calling devm_cxl_add_dport()Robert Richter
CXL dports are added in a couple of code paths using devm_cxl_add_dport(). Debug messages are individually generated, but are incomplete and inconsistent. Change this by moving its generation to devm_cxl_add_dport(). This unifies the messages and reduces code duplication. Also, generate messages on failure. Use a __devm_cxl_add_dport() wrapper to keep the readability of the error exits. Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20221018132341.76259-5-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19cxl/port: Read CDAT tableIra Weiny
The per-device CDAT data provides performance data that is relevant for mapping which CXL devices can participate in which CXL ranges by QTG (QoS Throttling Group) (per ECN: CXL 2.0 CEDT CFMWS & QTG_DSM) [1]. The QTG association specified in the ECN is advisory. Until the cxl_acpi driver grows support for invoking the QTG _DSM method the CDAT data is only of interest to userspace that may need it for debug purposes. Search the DOE mailboxes available, query CDAT data, cache the data and make it available via a sysfs binary attribute per endpoint at: /sys/bus/cxl/devices/endpointX/CDAT ...similar to other ACPI-structured table data in /sys/firmware/ACPI/tables. The CDAT is relative to 'struct cxl_port' objects since switches in addition to endpoints can host a CDAT instance. Switch CDAT support is not implemented. This does not support table updates at runtime. It will always provide whatever was there when first cached. It is also the case that table updates are not expected outside of explicit DPA address map affecting commands like Set Partition with the immediate flag set. Given that the driver does not support Set Partition with the immediate flag set there is no current need for update support. Link: https://www.computeexpresslink.org/spec-landing [1] Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> [djbw: drop in-kernel parsing infra for now, and other minor fixups] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220719205249.566684-7-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>