summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
2019-10-25drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready.Andrey Grodzovsky
For Arcturus the I2C traffic is done through SMU tables and so we must postpone RAS recovery init to after they are ready which is in amdgpu_device_ip_hw_init_phase2. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu: Use ARCTURUS in RAS EEPROM.Andrey Grodzovsky
Add Arcturus EEPROM/I2C support in generic EEPROM code. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu: remove unused parameter in amdgpu_gfx_kiq_free_ringNirmoy Das
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/vcn: Enable VCN2.5 encodingJames Zhu
After VCN2.5 firmware (Version ENC: 1.1 Revision: 11), VCN2.5 encoding can work properly. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/sdma5: do not execute 0-sized IBs (v2)Pelloux-prayer, Pierre-eric
This seems to help with https://bugs.freedesktop.org/show_bug.cgi?id=111481. v2: insert a NOP instead of skipping all 0-sized IBs to avoid breaking older hw Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu: Fix SDMA hang when performing VKexample testchen gong
VKexample test hang during Occlusion/SDMA/Varia runs. Clear XNACK_WATERMK in reg SDMA0_UTCL1_WATERMK to fix this issue. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu: define macros for retire page reservationGuchun Chen
Easy for maintainance. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu: refine reboot debugfs operation in ras case (v3)Guchun Chen
Ras reboot debugfs node allows user one easy control to avoid gpu recovery hang problem and directly reboot system per card basis, after ras uncorrectable error happens. However, it is one common entry, which should get rid of ras_ctrl node and remove ip dependence when inputting by user. So add one new auto_reboot node in ras debugfs dir to achieve this. v2: in commit mssage, add justification why ras reboot debugfs node is needed. v3: use debugfs_create_bool to create debugfs file for boolean value Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amd/powerplay: add lock protection for swSMU APIs V2Evan Quan
This is a quick and low risk fix. Those APIs which are exposed to other IPs or to support sysfs/hwmon interfaces or DAL will have lock protection. Meanwhile no lock protection is enforced for swSMU internal used APIs. Future optimization is needed. V2: strip the lock protection for all swSMU internal APIs Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Feifei Xu <Feifei.Xu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/psp: fix spelling mistake "initliaze" -> "initialize"Colin Ian King
There is a spelling mistake in a DRM_ERROR error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/psp11: fix typo in commentXiaojie Yuan
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/psp11: wait for sOS ready for ring creationXiaojie Yuan
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu: call amdgpu_vm_prt_fini before deleting the root PDPelloux-prayer, Pierre-eric
amdgpu_vm_prt_fini uses "vm->root.base.bo" so it must still be valid when we call it. Fixes: b65709a92156 ("drm/amdgpu: reserve the root PD while freeing PASIDs") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/vce: make some functions staticAlex Deucher
They are not used outside of the file they are defined in. Reviewed-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/vce: fix allocation size in enc ring testAlex Deucher
We need to allocate a large enough buffer for the feedback buffer, otherwise the IB test can overwrite other memory. Reviewed-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdgpu/psp: declare PSP TA firmwarechen gong
Add PSP TA firmware declaration for raven raven2 picasso Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: fix amdgpu trace event print string format errorKevin Wang
the trace event print string format error. (use integer type to handle string) before: amdgpu_test_kev-1556 [002] 138.508781: amdgpu_cs_ioctl: sched_job=8, timeline=gfx_0.0.0, context=177, seqno=1, ring_name=ffff94d01c207bf0, num_ibs=2 after: amdgpu_test_kev-1506 [004] 370.703783: amdgpu_cs_ioctl: sched_job=12, timeline=gfx_0.0.0, context=234, seqno=2, ring_name=gfx_0.0.0, num_ibs=1 change trace event list: 1.amdgpu_cs_ioctl 2.amdgpu_sched_run_job 3.amdgpu_ib_pipe_sync Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu/psp: add psp memory training implementation(v3)Tianci.Yin
add memory training implementation code to save resume time. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: reserve vram for memory training(v4)Tianci.Yin
memory training using specific fixed vram segment, reserve these segments before anyone may allocate it. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: add psp memory training callbacks and macroTianci.Yin
add interface for memory training. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu/atomfirmware: add memory training related helper functions(v3)Tianci.Yin
parse firmware to get memory training capability and fb location. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: introduce psp_v11_0_is_sos_alive interface(v2)Tianci.Yin
introduce psp_v11_0_is_sos_alive func for common use. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: add a generic fb accessing helper function(v3)Tianci.Yin
add a generic helper function for accessing framebuffer via MMIO Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: update amdgpu_discovery to handle revisionTianci.Yin
update amdgpu_discovery to get IP revision. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu/vcn: fix allocation size in enc ring testAlex Deucher
We need to allocate a large enough buffer for the session info, otherwise the IB test can overwrite other memory. - Session info is 128K according to mesa - Use the same session info for create and destroy Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu/uvd7: fix allocation size in enc ring test (v2)Alex Deucher
We need to allocate a large enough buffer for the session info, otherwise the IB test can overwrite other memory. v2: - session info is 128K according to mesa - use the same session info for create and destroy Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu/uvd6: fix allocation size in enc ring test (v2)Alex Deucher
We need to allocate a large enough buffer for the session info, otherwise the IB test can overwrite other memory. v2: - session info is 128K according to mesa - use the same session info for create and destroy Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204241 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: fix S3 failed as RLC safe mode entry stucked in polloing gfx acqPrike Liang
Fix gfx cgpg setting sequence for RLC deadlock at safe mode entry in polling gfx response. The patch can fix VCN IB test failed and DAL get dispaly count failed issue. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-17drm/amdgpu: add GFX_PIPELINE capacity check for updating gfx cgpgPrike Liang
Before disable gfx pipeline power gating need check the flag AMD_PG_SUPPORT_GFX_PIPELINE. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: enable BACO reset for SMU7 based dGPUs (v2)Alex Deucher
Use BACO to reset the GPU if supported on SMU7 based dGPUs. v2: don't use baco on CI parts Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu/soc15: add support for baco reset with swSMUAlex Deucher
Add support for vega20 when the swSMU path is used. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: remove in_baco_reset hackAlex Deucher
It was a vega20 specific hack. Check if we are in reset and what reset method we are using. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: simplify ATPX detectionAlex Deucher
Use the base class rather than the specific class and drop the second loop. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: move gpu reset out of amdgpu_device_suspendAlex Deucher
Move it into the caller. There are cases were we don't want it. We need it for hibernation, but we don't need it for runtime pm, so drop it for runtime pm. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: move pci_save_state into suspend pathAlex Deucher
for amdgpu_device_suspend. This follows the logic in the resume path. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15dmr/amdgpu: Fix crash on SRIOV for ERREVENT_ATHUB_INTERRUPT interrupt.Andrey Grodzovsky
Ignre the ERREVENT_ATHUB_INTERRUPT for systems without RAS. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-and-tested-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: user pages array memory leak fixPhilip Yang
user_pages array should always be freed after validation regardless if user pages are changed after bo is created because with HMM change parse bo always allocate user pages array to get user pages for userptr bo. v2: remove unused local variable and amend commit v3: add back get user pages in gem_userptr_ioctl, to detect application bug where an userptr VMA is not ananymous memory and reject it. Bugzilla: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1844962 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Joe Barnett <thejoe@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amd/powerplay: enable Arcturus runtime VCN dpm on/offEvan Quan
Enable runtime VCN DPM on/off on Arcturus. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: Fix tdr3 could hang with slow compute issueEmily Deng
When index is 1, need to set compute ring timeout for sriov and passthrough. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: fix potential VM faultsChristian König
When we allocate new page tables under memory pressure we should not evict old ones. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: fix error handling in amdgpu_bo_list_createChristian König
We need to drop normal and userptr BOs separately. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: add RAS support for VML2 and ATCL2Dennis Li
v1: Add codes to query the EDC count of VML2 & ATCL2 v2: Rename VML2/ATCL2 registers and drop their mask define v3: Add back the ECC mask for VML2 registers Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: change to query the actual EDC counterDennis Li
For the potential request in the future, change to query the actual EDC counter. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu/soc15: disable doorbell interrupt as part of BACO entry sequenceLe Ma
Workaround to make RAS recovery work in BACO reset. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1Hans de Goede
Bail from the pci_driver probe function instead of from the drm_driver load function. This avoid /dev/dri/card0 temporarily getting registered and then unregistered again, sending unwanted add / remove udev events to userspace. Specifically this avoids triggering the (userspace) bug fixed by this plymouth merge-request: https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59 Note that despite that being a userspace bug, not sending unnecessary udev events is a good idea in general. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-15drm/amdgpu/discovery: reserve discovery data at the top of VRAMXiaojie Yuan
IP Discovery data is TMR fenced by the latest PSP BL, so we need to reserve this region. Tested on navi10/12/14 with VBIOS integrated with latest PSP BL. v2: use DISCOVERY_TMR_SIZE macro as bo size use amdgpu_bo_create_kernel_at() to allocate bo Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10drm/amdgpu: Do not implement power-on for SDMA after do mode2 reset on Renoirchen gong
Find that ring sdma0 test failed if turn on SDMA powergating after do mode2 reset. Perhaps the mode2 reset does not reset the SDMA PG state, SDMA is already powered up so there is no need to ask the SMU to power it up again. So I skip this function for a moment. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10drm/amdgpu/sdma5: fix mask value of POLL_REGMEM packet for pipe syncXiaojie Yuan
sdma will hang once sequence number to be polled reaches 0x1000_0000 Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10drm/amdgpu: fix memory leakNirmoy Das
cleanup error handling code and make sure temporary info array with the handles are freed by amdgpu_bo_list_put() on idr_replace()'s failure. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-10drm/amdgpu: avoid ras error injection for retired pageTao Zhou
check whether a page is bad page before umc error injection, bad page should not be accessed again Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>