summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
2018-11-05drm/amdgpu: cleanup si_dma_ring_test_ibChristian König
Accidentially missed during the last cleanup. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: cleanup uvd_v6_0_ring_test_ringChristian König
Accidentially missed during the last cleanup. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: use ring name instead of idx in tracesChristian König
Further remove using the ring index in messages and traces. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: print an error when the parser can't be initializedSamuel Pitoiset
Similar to other error messages, might help for tracking down issues. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/scheduler: Add drm_sched_job_cleanupSharat Masetty
This patch adds a new API to clean up the scheduler job resources. This is primarliy needed in cases the job was created but was not queued to the scheduler queue. Additionally with this change, the layer which creates the scheduler job also gets to free up the job's resources and this entails moving the dma_fence_put(finished_fence) to the drivers ops free handler routines. Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: remove messages from IB testsChristian König
We already print an error message that an IB test failed in the common code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: cleanup skipping IB test on KIQChristian König
Instead of hard coding the ring type in the function just never provide a test_ib callback. Additional to that remove the emit_ib callback to make sure the nobody ever tries to execute an IB on the KIQ. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: cleanup amdgpu_ib_ring_testsChristian König
Test only initialized rings, use the ring name instead of the index in the error message and note on which device the error occured. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: further ring test cleanupsChristian König
Move all error messages from IP specific code into the common helper. This way we now uses the ring name in the messages instead of the index and note which device is affected as well. Also cleanup error handling in the IP specific code and consequently use ETIMEDOUT when the ring test timed out. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu/amdkfd: clean up mmhub and gfxhub includesAlex Deucher
Use the appropriate mmhub and gfxhub headers rather than adding them to the gmc9 header. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Enable default GPU reset for dGPU on gfx8/9 v3Andrey Grodzovsky
After testing looks like these subset of ASICs has GPU reset working for the most part. Enable reset due to job timeout. v2: Switch from GFX version to ASIC type. v3: Fix identation Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Retire amdgpu_ring.ready flag v4Andrey Grodzovsky
Start using drm_gpu_scheduler.ready isntead. v3: Add helper function to run ring test and set sched.ready flag status accordingly, clean explicit sched.ready sets from the IP specific files. v4: Add kerneldoc and rebase. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/ttm: initialize globals during device init (v2)Christian König
Make sure that the global BO state is always correctly initialized. This allows removing all the device code to initialize it. v2: fix up vbox (Alex) Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/ttm: use a static ttm_mem_global instanceChristian König
As the name says we only need one global instance of ttm_mem_global. Drop all the driver initialization and just use a single exported instance which is initialized during BO global initialization. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Added a few comments for gartOak Zeng
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Christian Konig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdkfd: Use functions from amdgpu to invalidate vmid in kfdYong Zhao
As part of the change, we stop taking the srbm lock, and start to use the same invalidation engine and software lock as amdgpu. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Reorganize amdgpu_gmc_flush_gpu_tlb() for kfd to useYong Zhao
Add a flush_type parameter to that series of functions. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdkfd: Remove unnecessary register setting when invalidating tlb in kfdYong Zhao
Those register settings have been done in gfxhub_v1_0_program_invalidation() and mmhub_v1_0_program_invalidation(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdkfd: page_table_base already have the flags neededYong Zhao
The flags are added when calling amdgpu_gmc_pd_addr(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Reverse the sequence of ctx_mgr_finiRex Zhu
and vm_fini in amdgpu_driver_postclose_kms csa buffer will be created per ctx, when ctx fini, the csa buffer and va will be released. so need to do ctx_mgr fin before vm fini. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu/psp: avoid hard-code fence value pre submissionHawking Zhang
Hard-code submission fence is not a sustainable way as there is more and more run-time psp kernel mode submission from driver to fw Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdkfd: Add proper prefix to functionsAmber Lin
Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage. v2: fix indentation Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Remove unused function pointersAmber Lin
Remove unused function pointers in kfd2kgd structure. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdkfd: Use functions from amdgpu for setting up page table baseYong Zhao
Use the functions from amdgpu to avoid directly programming registers in amdgpu_amdkfd_gfx_v9.c. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Expose *_setup_vm_pt_regs for kfd to useYong Zhao
kfd has the same need to set the VM page table base register, so expose them for kfd to use for better maintainability. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdkfd: Delete unnecessary register settingsYong Zhao
Those register settings have been performed in amdgpu initialization gfxhub_v1_0_setup_vmid_config() and mmhub_v1_0_setup_vmid_config(). So no need to do it again in kfd. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: increase the size of HQD EOP buffersMarek Olšák
Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: put HQD EOP buffers into VRAMMarek Olšák
This increases performance of compute queues. EOP events (PKT3_RELEASE_MEM) are stored into these buffers. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: use scheduler fault instead of reset workChristian König
Signal a fault to the scheduler on an illegal instruction or register access violation instead of kicking of the reset handler directly. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: remove illegal instruction stub from si_dma.cChristian König
Was never used. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: Revised PSP commentsJohn Clements
Revised comments in PSP SOS/Sysdriver loading sequence Signed-off-by: John Clements <clements.jm@gmail.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: fix sdma v4 ring is disabled accidentlyPhilip Yang
For sdma v4, there is bug caused by commit d4e869b6b5d6 ("drm/amdgpu: add ring test for page queue")' local variable ring is reused and changed, so amdgpu_ttm_set_buffer_funcs_status(adev, true) is skipped accidently. As a result, amdgpu_fill_buffer() will fail, kernel message: [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. [ 25.260444] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. [ 25.260627] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. [ 25.290119] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. [ 25.290370] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. [ 25.319971] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. [ 25.320486] amdgpu 0000:19:00.0: [mmhub] VMC page fault (src_id:0 ring:154 vmid:8 pasid:32768, for process pid 0 thread pid 0) [ 25.320533] amdgpu 0000:19:00.0: in page starting at address 0x0000000000000000 from 18 [ 25.320563] amdgpu 0000:19:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00800134 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: add ring test for page queueHuang Rui
We add page queue for sdma to update page table. So here it also needs ring test to verify it workable during the initialization. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: disable SDMA page queue on Vega20Evan Quan
Since we see driver loading failure on Vega20. Keep it disabled until it's ready. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu/sdma4: APUs do not have a page queueAlex Deucher
Don't use the paging queue on APUs. Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: use paging queue for VM page table updatesChristian König
Only for testing, not sure if we should keep it like this. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: activate paging queue on SDMA v4Christian König
Implement all the necessary stuff to get those extra rings working. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: add some [WR]REG32_SDMA macros to sdma_v4_0.cChristian König
Significantly shortens the code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: remove SRIOV specific handling from sdma_v4_0_gfx_resumeChristian König
Just use the same code path for both SRIOV and bare metal. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: remove non gfx specific handling from sdma_v4_0_gfx_resumeChristian König
Needed to start using the paging queue. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: add basics for SDMA page queue supportChristian König
Just the common helper and a new ring in the SDMA instance. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: fix sdma v4 startup under SRIOVChristian König
Under SRIOV we were enabling the ring buffer before it was initialized. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/ttm: Rename ttm_bo_global_{init,release}() to ttm_bo_global_ref_{,}()Thomas Zimmermann
The functions ttm_bo_global_init() and ttm_bo_global_release() do not receive an argument of type struct ttm_bo_global. Both take a struct drm_global_reference that contains points to a struct ttm_bo_global_ref. Renaming them reflects this. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/amdgpu: fix sdma doorbell comments typoFrank.Min
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Frank.Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-01drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"Christian König
This is still completely breaking my Raven system. This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038. Revert until we sort out the sbios and firmware combinations that work correctly. bug: https://bugs.freedesktop.org/show_bug.cgi?id=108606 Cc: stable@vger.kernel.org # v4.19 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-01drm/amdgpu: Fix skipping hangged job reset during gpu recover.Andrey Grodzovsky
Problem: During GPU recover DAL would hang in amdgpu_pm_compute_clocks->amdgpu_fence_wait_empty Fix: Turns out there was a typo introduced by 3320b8d drm/amdgpu: remove job->ring which caused skipping amdgpu_fence_driver_force_completion and so the hangged job was never force signaled and this would cause the hang later in DAL. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-26drm/amdgpu: Fix compute ring 1.0.0 failure after resetAndrey Grodzovsky
Problem: After GPU reset on dGPUs with gfx8 compute ring 1.0.0 fails to pass the ring test. Ring registers inspection shows that it's active and no hang is observed (rptr == wptr) No significant diffs were observed between CP_HQD* registers for the ring in good and bad shape. Fix: No clear reason why but reversing the order of ring tests fixes the problem. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-26drm/amdgpu: fix VM leaf walkingChristian König
Make sure we don't try to go down further after the leave walk already ended. This fixes a crash with a new VM test. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Rex Zhu Rex.Zhu@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-25drm/amdgpu: fix amdgpu_vm_finiChristian König
We should not remove mappings in rbtree_postorder_for_each_entry_safe because that rebalances the tree. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-24drm/amdgpu: Fix null point errorRex Zhu
need to check adev->powerplay.pp_funcs first, becasue from AI, the smu ip can be disabled by user, and the pp_handle is null in this case. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>