summaryrefslogtreecommitdiff
path: root/drivers/perf/hisilicon
AgeCommit message (Collapse)Author
2024-11-06perf: Switch back to struct platform_driver::remove()Uwe Kleine-König
After commit 0edb555a65d1 ("platform: Make platform_driver::remove() return void") .remove() is (again) the right callback to implement for platform drivers. Convert all platform drivers below drivers/perf to use .remove(), with the eventual goal to drop struct platform_driver::remove_new(). As .remove() and .remove_new() have the same prototypes, conversion is done by just changing the structure member name in the driver initializer. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20241027180313.410964-2-u.kleine-koenig@baylibre.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]Yicong Yang
Currently users can get the Root Ports supported by the PCIe PMU by "bus" sysfs attributes which indicates the PCIe bus number where Root Ports are located. This maybe insufficient since Root Ports supported by different PCIe PMUs may be located on the same PCIe bus. So export the BDF range the Root Ports additionally. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Fix TLP headers bandwidth countingYicong Yang
We make the initial value of event ctrl register as HISI_PCIE_INIT_SET and modify according to the user options. This will make TLP headers bandwidth only counting never take effect since HISI_PCIE_INIT_SET configures to count the TLP payloads bandwidth. Fix this by making the initial value of event ctrl register as 0. Fixes: 17d573984d4d ("drivers/perf: hisi: Add TLP filter support") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Record hardware counts correctlyYicong Yang
Currently we set the period and record it as the initial value of the counter without checking it's set to the hardware successfully or not. However the counter maybe unwritable if the target event is unsupported by the device. In such case we will pass user a wrong count: [start counts when setting the period] hwc->prev_count = 0x8000000000000000 device.counter_value = 0 // the counter is not set as the period [when user reads the counter] event->count = device.counter_value - hwc->prev_count = 0x8000000000000000 // wrong. should be 0. Fix this by record the hardware counter counts correctly when setting the period. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-10perf: add missing MODULE_DESCRIPTION() macrosJeff Johnson
With ARCH=x86, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/marvell_cn10k_ddr_pmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/nvidia_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/ampere_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/cxl_pmu.o Add the missing invocation of the MODULE_DESCRIPTION() macro to all files which have a MODULE_LICENSE(). This includes drivers/perf/hisilicon/hisi_uncore_pmu.c which, although it did not produce a warning with the x86 allmodconfig configuration, may cause this warning with arm64 configurations. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240709-md-drivers-perf-v3-1-513275b75ed0@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-22Merge tag 'driver-core-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the small set of driver core and kernfs changes for 6.10-rc1. Nothing major here at all, just a small set of changes for some driver core apis, and minor fixups. Included in here are: - sysfs_bin_attr_simple_read() helper added and used - device_show_string() helper added and used All usages of these were acked by the various maintainers. Also in here are: - kernfs minor cleanup - removed unused functions - typo fix in documentation - pay attention to sysfs_create_link() failures in module.c finally All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: device property: Fix a typo in the description of device_get_child_node_count() kernfs: mount: Remove unnecessary ‘NULL’ values from knparent scsi: Use device_show_string() helper for sysfs attributes platform/x86: Use device_show_string() helper for sysfs attributes perf: Use device_show_string() helper for sysfs attributes IB/qib: Use device_show_string() helper for sysfs attributes hwmon: Use device_show_string() helper for sysfs attributes driver core: Add device_show_string() helper for sysfs attributes treewide: Use sysfs_bin_attr_simple_read() helper sysfs: Add sysfs_bin_attr_simple_read() helper module: don't ignore sysfs_create_link() failures driver core: Remove unused platform_notify, platform_notify_remove
2024-05-04perf: Use device_show_string() helper for sysfs attributesLukas Wunner
Deduplicate sysfs ->show() callbacks which expose a string at a static memory location. Use the newly introduced device_show_string() helper in the driver core instead by declaring those sysfs attributes with DEVICE_STRING_ATTR_RO(). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/3a297850312b4ecb62d6872121de04496900f502.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-28drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()Hao Chen
pci_alloc_irq_vectors() allocates an irq vector. When devm_add_action() fails, the irq vector is not freed, which leads to a memory leak. Replace the devm_add_action with devm_add_action_or_reset to ensure the irq vector can be destroyed when it fails. Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28drivers/perf: hisi: hns3: Fix out-of-bound access when valid event groupJunhao He
The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/} Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Hao Chen <chenhao418@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28drivers/perf: hisi_pcie: Fix out-of-bound access when valid event groupJunhao He
The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HISI_PCIE_MAX_COUNTERS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}' Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-hns3: Assign parents for event_source deviceJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-6-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-uncore: Assign parents for event_source devicesJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-4-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-pcie: Assign parent for event_source deviceJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-2-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_uncore: Avoid placing cpumask on the stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-9-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_pcie: Avoid placing cpumask on the stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-8-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx()Junhao He
The function xxx_find_related_event() scan all working events to find related events. During this process, we also can find the idle counters. If not found related events, return the first idle counter to simplify the code. Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Relax the check on related eventsJunhao He
If we use two events with the same filter and related event type (see the following example), the driver check whether they are related events and are in the same group, otherwise the function hisi_pcie_pmu_find_related_event() return -EINVAL, then the 2nd event cannot count but the 1st event is running, although the PCIe PMU has other idle counters. In this case, The perf event scheduler will make the two events to multiplex a counter, if the user use the formula (1st event_value / 2nd event_value) to calculate the bandwidth, he/she won't get the correct value, because they are not counting at the same period. This patch tries to fix this by making the related events to use different idle counters if they are not in the same event group. And finally, I'm going to say. The related events are best used in the same group [1]. There are two ways to know if they are related events. a) By event name, such as the latency events "xxx_latency, xxx_cnt" or bandwidth events "xxx_flux, xxx_time". b) By event type, such as "event=0xXXXX, event=0x1XXXX". Use group to count the related events: [1] -e "{pmu_name/xxx_latency,port=1/,pmu_name/xxx_cnt,port=1/}" example: 1st event: hisi_pcie0_core1/event=0x804,port=1 2nd event: hisi_pcie0_core1/event=0x10804,port=1 test cmd: perf stat -e hisi_pcie0_core1/event=0x804,port=1/ \ -e hisi_pcie0_core1/event=0x10804,port=1/ before patch: 25,281 hisi_pcie0_core1/event=0x804,port=1/ (49.91%) 470,598 hisi_pcie0_core1/event=0x10804,port=1/ (50.09%) after patch: 24,147 hisi_pcie0_core1/event=0x804,port=1/ 474,558 hisi_pcie0_core1/event=0x10804,port=1/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com> Link: https://lore.kernel.org/r/20240223103359.18669-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Check the target filter properlyJunhao He
The PMU can monitor traffic of certain target Root Port or downstream target Endpoint. User can specify the target filter by the "port" or "bdf" option respectively. The PMU can only monitor the Root Port or Endpoint on the same PCIe core so the value of "port" or "bdf" should be valid and will be checked by the driver. Currently at least and only one of "port" and "bdf" option must be set. If "port" filter is not set or is set explicitly to zero (default), driver will regard the user specifies a "bdf" option since "port" option is a bitmask of the target Root Ports and zero is not a valid value. If user not explicitly set "port" or "bdf" filter, the driver uses "bdf" default value (zero) to set target filter, but driver will skip the check of bdf=0, although it's a valid value (meaning 0000:000:00.0). Then the user just gets zero. Therefore, we need to check if both "port" and "bdf" are invalid, then return failure and report warning. Testing: before the patch: 0 hisi_pcie0_core1/rx_mrd_flux/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,132 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,138 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,126 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ after the patch: <not supported> hisi_pcie0_core1/rx_mrd_flux/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,117 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,120 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,123 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Add more events for counting TLP bandwidthYicong Yang
A typical PCIe transaction is consisted of various TLP packets in both direction. For counting bandwidth only memory read events are exported currently. Add memory write and completion counting events of both direction to complete the bandwidth counting. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Fix incorrect counting under metric modeYicong Yang
The metric counting shows incorrect results if the events in the metric group using the same event but different filter options. This is because we only judge the event code to decide whether the event in the metric group should share the same hardware counter, but ignore the settings of the filter. For example, on a platform of 2 ports 0x1 and 0x2 but only port 0x1 has a downstream PCIe NVME device. The metric counting shows both ports have the same counts because we misassign these two events to one same hardware counter: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/ 7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.153863691 seconds time elapsed Fix this by using the whole config rather than the event only to judge whether two events are the same and should share the same hardware counter. With this patch, the metric counting in the above case tends to be corrected: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 0 hisi_pcie0_core1/event=0x0104,port=0x2/ 8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.152875631 seconds time elapsed Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val()Yicong Yang
Factor out retrieving of the register value for the corresponding event from hisi_pcie_config_event_ctrl() into a new function hisi_pcie_pmu_get_event_ctrl_val() allowing future reuse. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter()Yicong Yang
hisi_pcie_pmu_{config,clear}_filter() are config/clear HISI_PCIE_EVENT_CTRL register which contains not only the filter but also the event code. The function names are bit misleading. Rename it to hisi_pcie_pmu_{config,clear}_event_ctrl() to reflects their functions more accurately. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09Junhao He
HiSilicon UC PMU v2 suffers the erratum 162700402 that the PMU counter cannot be set due to the lack of clock under power saving mode. This will lead to error or inaccurate counts. The clock can be enabled by the PMU global enabling control. This patch tries to fix this by set the UC PMU enable before set event period to turn on the clock, and then restore the UC PMU configuration. The counter register can hold its value without a clock. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240227125231.53127-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf: hisilicon: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert these drivers from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/33a8be0641b9447469fb7f6af0a10fb65efa97a3.1702648125.git.u.kleine-koenig@pengutronix.de Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05drivers/perf: hisi: Fix some event id for HiSilicon UC pmuJunhao He
Some event id of HiSilicon uncore UC PMU driver is incorrect, fix them. Fixes: 312eca95e28d ("drivers/perf: hisi: Add support for HiSilicon UC PMU driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231204110425.20354-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24perf: hisi: Fix use-after-free when register pmu failsJunhao He
When we fail to register the uncore pmu, the pmu context may not been allocated. The error handing will call cpuhp_state_remove_instance() to call uncore pmu offline callback, which migrate the pmu context. Since that's liable to lead to some kind of use-after-free. Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been failed to register. Fixes: a0ab25cd82ee ("drivers/perf: hisi: Add support for HiSilicon PA PMU driver") FIxes: 3bf30882c3c7 ("drivers/perf: hisi: Add support for HiSilicon SLLC PMU driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20231024113630.13472-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24drivers/perf: hisi_pcie: Initialize event->cpu only on successYicong Yang
Initialize the event->cpu only on success. To be more reasonable and keep consistent with other PMUs. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231024092954.42297-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24drivers/perf: hisi_pcie: Check the type first in pmu::event_init()Yicong Yang
Check whether the event type matches the PMU type firstly in pmu::event_init() before touching the event. Otherwise we'll change the events of others and lead to incorrect results. Since in perf_init_event() we may call every pmu's event_init() in a certain case, we should not modify the event if it's not ours. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231024092954.42297-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-19drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for ↵Hao Chen
hisi_hns3_pmu uninit process When tearing down a 'hisi_hns3' PMU, we mistakenly run the CPU hotplug callbacks after the device has been unregistered, leading to fireworks when we try to execute empty function callbacks within the driver: | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 | CPU: 0 PID: 15 Comm: cpuhp/0 Tainted: G W O 5.12.0-rc4+ #1 | Hardware name: , BIOS KpxxxFPGA 1P B600 V143 04/22/2021 | pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--) | pc : perf_pmu_migrate_context+0x98/0x38c | lr : perf_pmu_migrate_context+0x94/0x38c | | Call trace: | perf_pmu_migrate_context+0x98/0x38c | hisi_hns3_pmu_offline_cpu+0x104/0x12c [hisi_hns3_pmu] Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been unregistered. Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231019091352.998964-1-shaojijie@huawei.com [will: Rewrote commit message] Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16drivers/perf: hisi: Schedule perf session according to localityYicong Yang
The PCIe PMUs locate on different NUMA node but currently we don't consider it and likely stack all the sessions on the same CPU: [root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask 0 0 0 0 0 0 This can be optimize a bit to use a local CPU for the PMU. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230815131010.2147-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16drivers/perf: hisi: Add support for HiSilicon UC PMU driverJunhao He
On HiSilicon Hip09 platform, there are 4 UC (unified cache) modules on each chip CCL (CPU Cluster). UC is a cache that provides coherence between NUMA and UMA domains. It is located between L2 and Memory System. Many PMU events are supported. Let's support the UC PMU driver using the HiSilicon uncore PMU framework. * rd_req_en : rd_req_en is the abbreviation of read request tracetag enable and allows user to count only read operations. Details are listed in the hisi-pmu document at Documentation/admin-guide/perf/hisi-pmu.rst * srcid_en & srcid: Allows users to filter statistical information based on specific CPU/ICL by srcid. srcid_en depends on rd_req_en being enabled. * uring_channel: Allows users to filter statistical information based on the specified tx request uring channel. uring_channel only supported events: [0x47 ~ 0x59]. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driverJunhao He
Compared to the original PA device, H60PA offers higher bandwidth. The H60PA is a new device and we use HID to differentiate them. The events supported by PAv3 and PAv2 are different. The PAv3 PMU removed some events which are supported by PAv2 PMU. The older PA PMU driver will probe v3 as v2. Therefore PA events displayed by "perf list" cannot work properly. We add the HISI0275 HID for PAv3 PMU to distinguish different. For each H60PA PMU, except for the overflow interrupt register, other functions of the H60PA PMU are the same as the original PA PMU module. It has 8-programable counters and each counter is free-running. Interrupt is supported to handle counter (64-bits) overflow. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-09drivers/perf: hisi: Don't migrate perf to the CPU going to teardownJunhao He
The driver needs to migrate the perf context if the current using CPU going to teardown. By the time calling the cpuhp::teardown() callback the cpu_online_mask() hasn't updated yet and still includes the CPU going to teardown. In current driver's implementation we may migrate the context to the teardown CPU and leads to the below calltrace: ... [ 368.104662][ T932] task:cpuhp/0 state:D stack: 0 pid: 15 ppid: 2 flags:0x00000008 [ 368.113699][ T932] Call trace: [ 368.116834][ T932] __switch_to+0x7c/0xbc [ 368.120924][ T932] __schedule+0x338/0x6f0 [ 368.125098][ T932] schedule+0x50/0xe0 [ 368.128926][ T932] schedule_preempt_disabled+0x18/0x24 [ 368.134229][ T932] __mutex_lock.constprop.0+0x1d4/0x5dc [ 368.139617][ T932] __mutex_lock_slowpath+0x1c/0x30 [ 368.144573][ T932] mutex_lock+0x50/0x60 [ 368.148579][ T932] perf_pmu_migrate_context+0x84/0x2b0 [ 368.153884][ T932] hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu] [ 368.160579][ T932] cpuhp_invoke_callback+0x2a0/0x650 [ 368.165707][ T932] cpuhp_thread_fun+0xe4/0x190 [ 368.170316][ T932] smpboot_thread_fn+0x15c/0x1a0 [ 368.175099][ T932] kthread+0x108/0x13c [ 368.179012][ T932] ret_from_fork+0x10/0x18 ... Use function cpumask_any_but() to find one correct active cpu to fixes this issue. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230608114326.27649-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-04-17drivers/perf: hisi: add NULL check for nameJunhao He
When allocations fails that can be NULL now. If the name provided is NULL, then the initialization process of the PMU type and dev will be skipped in function perf_pmu_register(). Consequently, the PMU will not be able to register into the kernel. Moreover, in the case of unregister the PMU, the function device_del() will need to handle NULL pointers, which potentially can cause issues. So move this allocation above the cpuhp_state_add_instance() and directly return if it does fail. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-04-17drivers/perf: hisi: Remove redundant initialized of pmu->nameJunhao He
"pmu->name" is initialized by perf_pmu_register() function, so remove the redundant initialized in hisi_pmu_init(). Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-19drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu"Junhao He
Use hisi_pmu_init() function to simplify initialization of "cpa_pmu->pmu". Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-19drivers/perf: hisi: Simplify the parameters of hisi_pmu_init()Junhao He
Use "hisi_pmu" to simplify the parameter list for the hisi_pmu_init() function. Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-19drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capabilityJunhao He
Missed initialization the variable of pmu::capabilities when extract the initialization code of hisi_pmu->pmu into a function. HISI UNCORE PMU drivers counters that not support context exclusion. So we have to advertise the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will prevent us from handling events where any exclusion flags are set. Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29drivers/perf: hisi: Add TLP filter supportYicong Yang
The PMU support to filter the TLP when counting the bandwidth with below options: - only count the TLP headers - only count the TLP payloads - count both TLP headers and payloads In the current driver it's default to count the TLP payloads only, which will have an implicity side effects that on the traffic only have header only TLPs, we'll get no data. Make this user configuration through "len_mode" parameter and make it default to count both TLP headers and payloads when user not specified. Also update the documentation for it. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29drivers/perf: hisi: Fix some event id for hisi-pcie-pmuYicong Yang
Some event id of hisi-pcie-pmu is incorrect, fix them. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-07-06drivers/perf: hisi: add driver for HNS3 PMUGuangbin Huang
HNS3(HiSilicon Network System 3) PMU is RCiEP device in HiSilicon SoC NIC, supports collection of performance statistics such as bandwidth, latency, packet rate and interrupt rate. NIC of each SICL has one PMU device for it. Driver registers each PMU device to perf, and exports information of supported events, filter mode of each event, bdf range, hardware clock frequency, identifier and so on via sysfs. Each PMU device has its own registers of control, counters and interrupt, and it supports 8 hardware events, each hardward event has its own registers for configuration, counters and interrupt. Filter options contains: config - select event port - select physical port of nic tc - select tc(must be used with port) func - select PF/VF queue - select queue of PF/VF(must be used with func) intr - select interrupt number(must be used with func) global - select all functions of IO DIE Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20220628063419.38514-3-huangguangbin2@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27perf: hisi: Extract hisi_pmu_initChen Jun
Extract the initialization code of hisi_pmu->pmu into a function Signed-off-by: Chen Jun <chenjun102@huawei.com> Link: https://lore.kernel.org/r/20220516131601.48383-1-chenjun102@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06drivers/perf: hisi: Add Support for CPA PMUQi Liu
On HiSilicon Hip09 platform, there is a CPA (Coherency Protocol Agent) on each SICL (Super IO Cluster) which implements packet format translation, route parsing and traffic statistics. CPA PMU has 8 PMU counters and interrupt is supported to handle counter overflow. Let's support its driver under the framework of HiSilicon PMU driver. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20220415102352.6665-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-06drivers/perf: hisi: Associate PMUs in SICL with CPUs onlineQi Liu
If a PMU is in a SICL (Super IO cluster), it is not appropriate to associate this PMU with a CPU die. So we associate it with all CPUs online, rather than CPUs in the nearest SCCL. As the firmware of Hip09 platform hasn't been published yet, change of PMU driver will not influence backwards compatibility between driver and firmware. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20220415102352.6665-2-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15perf: replace bitmap_weight with bitmap_empty where appropriateYury Norov
In some places, drivers/perf code calls bitmap_weight() to check if any bit of a given bitmap is set. It's better to use bitmap_empty() in that case because bitmap_empty() stops traversing the bitmap as soon as it finds first set bit, while bitmap_weight() counts all bits unconditionally. Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220210224933.379149-13-yury.norov@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14drivers/perf: hisi: Add driver for HiSilicon PCIe PMUQi Liu
PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported to sample bandwidth, latency, buffer occupation etc. Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is registered as a PMU in /sys/bus/event_source/devices, so users can select target PMU, and use filter to do further sets. Filtering options contains: event - select the event. port - select target Root Ports. Information of Root Ports are shown under sysfs. bdf - select requester_id of target EP device. trig_len - set trigger condition for starting event statistics. trig_mode - set trigger mode. 0 means starting to statistic when bigger than trigger condition, and 1 means smaller. thr_len - set threshold for statistics. thr_mode - set threshold mode. 0 means count when bigger than threshold, and 1 means smaller. Acked-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-04drivers/perf: hisi: Fix PA PMU counter offsetShaokun Zhang
The PA PMU counter offset was correct in [1] and the driver has already been verified. We want to keep the register offset using lower case character in later version that is consistent with the existed driver. Since there was no functional change, we didn't do more test. However there is typo when modified the PA PMU counter offset by mistake, so fix this bad mistake. [1] https://www.spinics.net/lists/arm-kernel/msg865263.html Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: John Garry <john.garry@huawei.com> Cc: Qi Liu <liuqi115@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20210928123022.23467-1-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-08perf/hisi: Constify static attribute_group structsRikard Falkeborn
These are only put in an array of pointers to const attribute_group structs. Make them const like the other static attribute_group structs to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20210605221514.73449-1-rikard.falkeborn@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-04drivers/perf: hisi: Fix data source controlShaokun Zhang
'Data source' is a new function for HHA PMU and config / clear interface was wrong by mistake. 'HHA_DATSRC_CTRL' register is mainly used for data source configuration, if we enable bit0 as driver, it will go on count the event and we didn't check it carefully. So fix the issue and do as the initial purpose. Fixes: 932f6a99f9b0 ("drivers/perf: hisi: Add new functions for HHA PMU") Reported-by: kernel test robot <lkp@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1622709291-37996-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01drivers/perf: hisi: use the correct HiSilicon copyrightHao Fang
s/Hisilicon/HiSilicon/. It should use capital S, according to the official website https://www.hisilicon.com/en. Signed-off-by: Hao Fang <fanghao11@huawei.com> Link: https://lore.kernel.org/r/1621679037-15323-1-git-send-email-fanghao11@huawei.com Signed-off-by: Will Deacon <will@kernel.org>