summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
2022-11-04drm/amdgpu: Remove unnecessary register program in SRIOVPeng Ju Zhou
Remove unnecessary register program in SRIOV Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-04drm/amdgpu: Disable MCBP from soc21 for SRIOVYiqing Yao
[why] Start from soc21, CP does not support MCBP, so disable it. [how] Used amgpu_mcbp flag alone instead of checking if is in SRIOV to enable/disable MCBP. Only set flag to enable on asic_type prior to soc21 in SRIOV. Signed-off-by: Yiqing Yao <yiqing.yao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-04drm/amdgpu: Clean up soc21 early init for SRIOVYiqing Yao
Use virt_init_setting instead of per ip version setting. Signed-off-by: Yiqing Yao <yiqing.yao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-04drm/amdgpu: extend halt_if_hws_hang to MESGraham Sider
Hang on MES timeout if halt_if_hws_hang is set to 1. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-04Merge tag 'drm-misc-next-2022-11-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.2: UAPI Changes: Cross-subsystem Changes: - dma-buf: locking improvements - firmware: New API in the RaspberryPi firmware driver used by vc4 Core Changes: - client: Null pointer dereference fix in drm_client_buffer_delete() - mm/buddy: Add back random seed log - ttm: Convert ttm_resource to use size_t for its size, fix for an undefined behaviour Driver Changes: - bridge: - adv7511: use dev_err_probe - it6505: Fix return value check of pm_runtime_get_sync - panel: - sitronix: Fixes and clean-ups - lcdif: Increase DMA burst size - rockchip: runtime_pm improvements - vc4: Fix for a regression preventing the use of 4k @ 60Hz, and further HDMI rate constraints check. - vmwgfx: Cursor improvements Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20221103083437.ksrh3hcdvxaof62l@houat
2022-11-03drm/scheduler: rename dependency callback into prepare_jobChristian König
This now matches much better what this is doing. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-14-christian.koenig@amd.com
2022-11-03drm/amdgpu: use scheduler dependencies for CSChristian König
Entirely remove the sync obj in the job. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-11-christian.koenig@amd.com
2022-11-03drm/amdgpu: use scheduler dependencies for UVD msgsChristian König
Instead of putting that into the job sync object. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-10-christian.koenig@amd.com
2022-11-03drm/amdgpu: use scheduler dependencies for VM updatesChristian König
Instead of putting that into the job sync object. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-9-christian.koenig@amd.com
2022-11-03drm/amdgpu: move explicit sync check into the CSChristian König
This moves the memory allocation out of the critical code path. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-8-christian.koenig@amd.com
2022-11-03drm/amdgpu: cleanup scheduler job initialization v2Christian König
Init the DRM scheduler base class while allocating the job. This makes the whole handling much more cleaner. v2: fix coding style Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-7-christian.koenig@amd.com
2022-11-03drm/amdgpu: drop amdgpu_sync from amdgpu_vmid_grab v2Christian König
Instead return the fence directly. Avoids memory allocation to store the fence. v2: cleanup coding style as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-6-christian.koenig@amd.com
2022-11-03drm/amdgpu: drop the fence argument from amdgpu_vmid_grabChristian König
This is always the job anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-5-christian.koenig@amd.com
2022-11-03drm/amdgpu: use drm_sched_job_add_resv_dependencies for movesChristian König
Use the new common scheduler functions to figure out what to wait for. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-4-christian.koenig@amd.com
2022-11-02drm/amdgpu: Disable GPU reset on SRIOV before remove pci.Gavin Wan
The recent change brought a bug on SRIOV envrionment. It caused unloading amdgpu failed on Guest VM. The reason is that the VF FLR was requested while unloading amdgpu driver, but the VF FLR of SRIOV sequence is wrong while removing PCI device. For SRIOV, the guest driver should not trigger the whole XGMI hive to do the reset. Host driver control how the device been reset. Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2") Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amdgpu: disable GFXOFF during compute for GFX11Graham Sider
Temporary workaround to fix issues observed in some compute applications when GFXOFF is enabled on GFX11. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-11-02drm/amd: Fail the suspend if resources can't be evictedMario Limonciello
If a system does not have swap and memory is under 100% usage, amdgpu will fail to evict resources. Currently the suspend carries on proceeding to reset the GPU: ``` [drm] evicting device resources failed [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <vcn_v3_0> failed -12 [drm] free PSP TMR buffer [TTM] Failed allocating page table [drm] evicting device resources failed amdgpu 0000:03:00.0: amdgpu: MODE1 reset amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset ``` At this point if the suspend actually succeeded I think that amdgpu would have recovered because the GPU would have power cut off and restored. However the kernel fails to continue the suspend from the memory pressure and amdgpu fails to run the "resume" from the aborted suspend. ``` ACPI: PM: Preparing to enter system sleep state S3 SLUB: Unable to allocate memory on node -1, gfp=0xdc0(GFP_KERNEL|__GFP_ZERO) cache: Acpi-State, object size: 80, buffer size: 80, default order: 0, min order: 0 node 0: slabs: 22, objs: 1122, free: 0 ACPI Error: AE_NO_MEMORY, Could not update object reference count (20210730/utdelete-651) [drm:psp_hw_start [amdgpu]] *ERROR* PSP load kdb failed! [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62 amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62). PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -62 amdgpu 0000:03:00.0: PM: failed to resume async: error -62 ``` To avoid this series of unfortunate events, fail amdgpu's suspend when the memory eviction fails. This will let the system gracefully recover and the user can try suspend again when the memory pressure is relieved. Reported-by: post@davidak.de Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2223 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amdgpu: correct MES debugfs versionsGraham Sider
Use mes.sched_version, mes.kiq_version for debugfs as mes.ucode_fw_version does not contain correct versioning information. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-02drm/amdgpu: set fb_modifiers_not_supported in vkmsYifan Zhang
This patch to fix the gdm3 start failure with virual display: /usr/libexec/gdm-x-session[1711]: (II) AMDGPU(0): Setting screen physical size to 270 x 203 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to make import prime FD as pixmap: 22 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument /usr/libexec/gdm-x-session[1711]: (WW) AMDGPU(0): Failed to set mode on CRTC 0 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to enable any CRTC gnome-shell[1840]: Running GNOME Shell (using mutter 42.2) as a X11 window and compositing manager /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument vkms doesn't have modifiers support, set fb_modifiers_not_supported to bring the gdm back. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-01drm/amdgpu: Skip program gfxhub_v3_0_3 system aperture registers under SRIOVYifan Zha
[Why] gfxhub_v3_0_3 system aperture registers are removed from RLCG register access range. [How] Skip access gfxhub_v3_0_3 system aperture registers under SRIOV VF. These registers will be programmed on host side. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-01drm/amdgpu: Skip access SDMA0_F32_CNTL in sdma_v6_0_enable under SRIOVYifan Zha
[Why] SDMA0_F32_CNTL is a PF_only regitser which will be blocked by L1. RLCG will not program the register as well. [How] Skip to program SDMA0_F32_CNTL under SRIOV VF. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-01drm/amdgpu: Skip access GRBM_CNTL under SRIOV on gfx_v11Yifan Zha
[Why] GRBM_CNTL is a PF_only register on gfx_v11. RLCG interface will return "out of range" under SRIOV VF. [How] Skip access GRBM_CNTL under gfx_v11 SRIOV VF. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-01drm/amdgpu: Disable GPU reset on SRIOV before remove pci.Gavin Wan
The recent change brought a bug on SRIOV envrionment. It caused unloading amdgpu failed on Guest VM. The reason is that the VF FLR was requested while unloading amdgpu driver, but the VF FLR of SRIOV sequence is wrong while removing PCI device. For SRIOV, the guest driver should not trigger the whole XGMI hive to do the reset. Host driver control how the device been reset. Fixes: f5c7e7797060 ("drm/amdgpu: Adjust removal control flow for smu v13_0_2") Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-01drm/amdgpu: Enable GFX RAS feature for gfx v11_0_3Candice Li
v1: Support gfx ras feature enablement for gfx v11_0_3. v2: Update function name and error message. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdkfd: Cleanup kfd_dev structMukul Joshi
Cleanup kfd_dev struct by removing ddev and pdev as both drm_device and pci_dev can be fetched from amdgpu_device. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: disable GFXOFF during compute for GFX11Graham Sider
Temporary workaround to fix issues observed in some compute applications when GFXOFF is enabled on GFX11. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amd: Fail the suspend if resources can't be evictedMario Limonciello
If a system does not have swap and memory is under 100% usage, amdgpu will fail to evict resources. Currently the suspend carries on proceeding to reset the GPU: ``` [drm] evicting device resources failed [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <vcn_v3_0> failed -12 [drm] free PSP TMR buffer [TTM] Failed allocating page table [drm] evicting device resources failed amdgpu 0000:03:00.0: amdgpu: MODE1 reset amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset ``` At this point if the suspend actually succeeded I think that amdgpu would have recovered because the GPU would have power cut off and restored. However the kernel fails to continue the suspend from the memory pressure and amdgpu fails to run the "resume" from the aborted suspend. ``` ACPI: PM: Preparing to enter system sleep state S3 SLUB: Unable to allocate memory on node -1, gfp=0xdc0(GFP_KERNEL|__GFP_ZERO) cache: Acpi-State, object size: 80, buffer size: 80, default order: 0, min order: 0 node 0: slabs: 22, objs: 1122, free: 0 ACPI Error: AE_NO_MEMORY, Could not update object reference count (20210730/utdelete-651) [drm:psp_hw_start [amdgpu]] *ERROR* PSP load kdb failed! [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62 amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62). PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -62 amdgpu 0000:03:00.0: PM: failed to resume async: error -62 ``` To avoid this series of unfortunate events, fail amdgpu's suspend when the memory eviction fails. This will let the system gracefully recover and the user can try suspend again when the memory pressure is relieved. Reported-by: post@davidak.de Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2223 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: Add EEPROM I2C address support for ip discoveryCandice Li
1. Update EEPROM_I2C_MADDR_SMU_13_0_0 to EEPROM_I2C_MADDR_54H 2. Add EEPROM I2C address support for smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: Update ras eeprom support for smu v13_0_0 and v13_0_10Candice Li
Enable RAS EEPROM support for smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: Optimize TA load/unload/invoke debugfs interfacesCandice Li
1. Add a function pointer structure ta_funcs to psp context 2. Make the interfaces generic to all TAs 3. Leverage exisitng TA context and remove unused functions 4. Fix return code bugs v2: Add comments for ta funcs macros and correct typo Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: Optimize RAS TA initialization and TA unload funcsCandice Li
1. Save TA unload psp response status 2. Add RAS TA loading status check for initializaiton 3. Drop RAS context teardown to allow RAS TA to be reloaded Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: remove deprecated MES version varsGraham Sider
MES scheduler and kiq versions are stored in mes.sched_version and mes.kiq_version, respectively, which are read from a register after their queues are initialized. Remove mes.ucode_fw_version and mes.data_fw_version which tried to read this versioning info from the firmware headers (which don't contain this information). Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: correct MES debugfs versionsGraham Sider
Use mes.sched_version, mes.kiq_version for debugfs as mes.ucode_fw_version does not contain correct versioning information. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: Move the mutex_lock to protect the return status of ↵Alan Liu
securedisplay command buffer [Why] Before we call psp_securedisplay_invoke(), we call psp_prep_securedisplay_cmd_buf() to prepare and initialize the command buffer. However, we didn't use the mutex_lock to protect the status of command buffer. So when multiple threads are using the command buffer, after thread A return from psp_securedisplay_invoke() and the command buffer status is set to SUCCESS, another thread B may call psp_prep_securedisplay_cmd_buf() and initialize the status to FAILURE again, and cause Thread A to get a failure return status. [How] Move the mutex_lock out of psp_securedisplay_invoke() to its caller to cover psp_prep_securedisplay_cmd_buf() and the code checking the return status of command buffer. Signed-off-by: Alan Liu <HaoPing.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: remove ras_error_status parameter for UMC poison handlerTao Zhou
Make the code simpler. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: add RAS poison handling for MCATao Zhou
For MCA poison, if unmap queue fails, only gpu reset should be triggered without page retirement handling, MCA notifier will do it. v2: handle MCA poison consumption in umc_poison_handler directly. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: use page retirement API in MCA notifierTao Zhou
Make the code more readable. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: add RAS page retirement functions for MCATao Zhou
Define page retirement functions for MCA platform. v2: remove page retirement handling from MCA poison handler, let MCA notifier do page retirement. v3: remove specific poison handler for MCA to simplify code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: set fb_modifiers_not_supported in vkmsYifan Zhang
This patch to fix the gdm3 start failure with virual display: /usr/libexec/gdm-x-session[1711]: (II) AMDGPU(0): Setting screen physical size to 270 x 203 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to make import prime FD as pixmap: 22 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument /usr/libexec/gdm-x-session[1711]: (WW) AMDGPU(0): Failed to set mode on CRTC 0 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to enable any CRTC gnome-shell[1840]: Running GNOME Shell (using mutter 42.2) as a X11 window and compositing manager /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument vkms doesn't have modifiers support, set fb_modifiers_not_supported to bring the gdm back. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/ttm: rework on ttm_resource to use size_t typeSomalapuram Amaranath
Change ttm_resource structure from num_pages to size_t size in bytes. v1 -> v2: change PFN_UP(dst_mem->size) to ttm->num_pages v1 -> v2: change bo->resource->size to bo->base.size at some places v1 -> v2: remove the local variable v1 -> v2: cleanup cmp_size_smaller_first() v2 -> v3: adding missing PFN_UP in ttm_bo_vm_fault_reserved Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221027091237.983582-1-Amaranath.Somalapuram@amd.com Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2022-10-26drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resumePrike Liang
In the S2idle suspend/resume phase the gfxoff is keeping functional so some IP blocks will be likely to reinitialize at gfxoff entry and that will result in failing to program GC registers.Therefore, let disallow gfxoff until AMDGPU IPs reinitialized completely. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.15.x
2022-10-25Merge tag 'drm-misc-next-2022-10-20' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.2: UAPI Changes: - Documentation for page-flip flags Cross-subsystem Changes: - dma-buf: Add unlocked variant of vmapping and attachment-mapping functions Core Changes: - atomic-helpers: CRTC primary plane test fixes - connector: TV API consistency improvements, cmdline parsing improvements - crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper - edid: Fixes for HFVSDB parsing, - fourcc: Addition of the Vivante tiled modifier - makefile: Sort and reorganize the objects files - mode_config: Remove fb_base from drm_mode_config_funcs - sched: Add a module parameter to change the scheduling policy, refcounting fix for fences - tests: Sort the Kunit tests in the Makefile, improvements to the DP-MST tests - ttm: Remove unnecessary drm_mm_clean() call Driver Changes: - New driver: ofdrm - Move all drivers to a common dma-buf locking convention - bridge: - adv7533: Remove dynamic lane switching - it6505: Runtime PM support - ps8640: Handle AUX defer messages - tc358775: Drop soft-reset over I2C - ast: Atomic Gamma LUT Support, Convert to SHMEM, various improvements - lcdif: Support for YUV planes - mgag200: Fix PLL Setup on some revisions - udl: Modesetting improvements, hot-unplug support - vc4: Fix support for PAL-M Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20221020072405.g3o4hxuk75gmeumw@houat
2022-10-24drm/amdgpu: skip mes self test for gc 11.0.3 in recoverYuBiao Wang
Temporary disable mes self teset for gc 11.0.3 during gpu_recovery. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amd: Add IMU fw version to fw version queriesDavid Francis
IMU is a new firmware for GFX11. There are four means by which firmware version can be queried from the driver: device attributes, vf2pf, debugfs, and the AMDGPU_INFO_FW_VERSION option in the amdgpu info ioctl. Add IMU as an option for those four methods. V2: Added debugfs Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amd/pm: allow gfxoff on gc_11_0_3Kenneth Feng
allow gfxoff on gc_11_0_3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdkfd: Fix memory leak in kfd_mem_dmamap_userptr()Rafael Mendonca
If the number of pages from the userptr BO differs from the SG BO then the allocated memory for the SG table doesn't get freed before returning -EINVAL, which may lead to a memory leak in some error paths. Fix this by checking the number of pages before allocating memory for the SG table. Fixes: 264fb4d332f5 ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: Remove ATC L2 access for MMHUB 2.1.xLijo Lazar
MMHUB 2.1.x versions don't have ATCL2. Remove accesses to ATCL2 registers. Since they are non-existing registers, read access will cause a 'Completer Abort' and gets reported when AER is enabled with the below patch. Tagging with the patch so that this is backported along with it. v2: squash in uninitialized warning fix (Nathan Chancellor) Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-10-24amd/amdgpu: fix repeated words in commentswangjianli
Delete the redundant word 'the'. Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resumePrike Liang
In the S2idle suspend/resume phase the gfxoff is keeping functional so some IP blocks will be likely to reinitialize at gfxoff entry and that will result in failing to program GC registers.Therefore, let disallow gfxoff until AMDGPU IPs reinitialized completely. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: skip mes self test for gc 11.0.3 in recoverYuBiao Wang
Temporary disable mes self teset for gc 11.0.3 during gpu_recovery. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>