Age | Commit message (Collapse) | Author |
|
We need to disable this after the last eviction
call, but before we disable the SDMA IP.
Fixes: b70438004a14 ("drm/amdgpu: move buffer funcs setting up a level")
Link: https://lore.kernel.org/r/87edgv4x3i.fsf@vps.thesusis.net
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Tested-by: Phillip Susi <phill@thesusis.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Phillip Susi <phill@thesusis.net>
Cc: Luben Tuikov <ltuikov89@gmail.com>
|
|
MP0 v13.0.6 SOCs don't support DRM MGCG.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
HDP 4.4.2 clockgating is enabled by default, update the flags
accordingly.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Check if function is implemented before making the call.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only PSPv13.0.6 SOCs take a longer time to reach steady state. Other
PSPv13 based SOCs don't need extended wait. Also, reduce PSPv13.0.6 wait
time.
Cc: stable@vger.kernel.org
Fixes: fc5988907156 ("drm/amdgpu: update retry times for psp vmbx wait")
Fixes: d8c1925ba8cd ("drm/amdgpu: update retry times for psp BL wait")
Link: https://lore.kernel.org/amd-gfx/34dd4c66-f7bf-44aa-af8f-c82889dd652c@amd.com/
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sort error data list to optimize the printing order.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Boot time error query is not available until fw a10109
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disable MCBP(mid command buffer preemption) by default as old Mesa
hangs with it. We shall not enable the feature that breaks old usermode
driver.
Fixes: 50a7c8765ca6 ("drm/amdgpu: enable mcbp by default on gfx9")
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
The legacy region at 0x7F000 maps to valid registers in GC 9.4.3 SOCs.
Use 0x1A000 offset instead as MMIO register remap region.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Check smu v13_0_0 SKU type to select EEPROM I2C address.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
|
|
For 'AMDGPU_FAMILY_SI' family cards, in 'si_common_early_init' func, init
'didt_rreg' and 'didt_wreg' to 'NULL'. But in func
'amdgpu_debugfs_regs_didt_read/write', using 'RREG32_DIDT' 'WREG32_DIDT'
lacks of relevant judgment. And other 'amdgpu_ip_block_version' that use
these two definitions won't be added for 'AMDGPU_FAMILY_SI'.
So, add null pointer judgment before calling.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When dGPU is put into BOCO it may be in D3cold but still able send
PME on display hotplug event. For this to work it must be enabled
as wake source from D3.
When runpm is enabled use pci_wake_from_d3() to mark wakeup as
enabled by default.
Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This worked by luck if the GART aperture ended up at 0. When
we ended up moving GART on some chips, the GART aperture ended
up offsetting the AGP address since the resource->start is
a GART offset, not an MC address. Fix this by moving the AGP
address setup into amdgpu_bo_gpu_offset_no_check().
v2: check mem_type before checking agp
v3: check if the ttm bo has a ttm_tt allocated yet
Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reported-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reported-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: christian.koenig@amd.com
Cc: mario.limonciello@amd.com
|
|
Fix the amdgpu runpm dereference usage count.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Fix a memory overflow issue in the gfx IB test
for some ASICs. At least 20 bytes are needed for
the IB test packet.
v2: correct code indentation errors. (Christian)
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
enable init_registers callback func for nbio v7.11.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Setting register to force ordering to prevent read/write or write/read
hazards for un-cached modes.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
|
|
In nbio v7_9, host driver should not issu gpu reset
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The smu needs to get the rlc power down message to sync the rlc state
with smu, the rlc state updating message need to be sent at while smu
begin suspend sequence , otherwise SMU will crash while RLC state is not
notified by driver, and rlc state probally changed after that
notification, so it needs to notify rlc state to smu at the end of the
suspend sequence in amdgpu_device_suspend() that can make sure the rlc
state is correctly set to SMU.
[ 101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[ 101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
[ 110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[ 110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features.
[ 110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
[ 110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
[ 110.884394] PM: suspend of devices aborted after 21213.620 msecs
[ 110.884402] PM: start suspend of devices aborted after 21213.882 msecs
[ 110.884405] PM: Some devices failed to suspend, or early wake event detected
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
GC IP 9.4.2 and up support TA reporting of the number
of xGMI links between peers.
Tested-by: Vignesh Chander <vignesh.chander@amd.com>
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.
v2: leave it configurable via module parameter
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.
v2: leave it configurable via module parameter
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.
v2: leave it configurable via module parameter
Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a module parameter to control the AGP aperture. The AGP
aperture is an aperture in the GPU's internal address space
which provides direct non-paged access to the platform address
space. This access is non-snooped so only uncached memory
can be accessed.
Add a knob so that we can toggle this for debugging.
Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Should be && rather than ||.
Fixes: b2e1cbe6281f ("drm/amdgpu/gmc11: disable AGP on GC 11.5")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The port num info is firstly introduced with 20.00.01.13 xgmi ta and
make them as part of topology info.
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
fix ras err_data null pointer issue in amdgpu_ras.c
Fixes: 8cc0f5669eb6 ("drm/amdgpu: Support multiple error query modes")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The variable "chunk_ptr" should be a pointer pointing
to a struct drm_amdgpu_cs_chunk instead of to a pointer
of that.
Signed-off-by: YuanShang <YuanShang.Mao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
uvd_entity_init()'
Fixes the following:
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:237: warning: Function parameter or member 'ring' not described in 'amdgpu_vce_entity_init'
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:405: warning: Function parameter or member 'ring' not described in 'amdgpu_uvd_entity_init'
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The valid num_mem_partitions is required during ttm pool fini,
thus move the cleanup at the end of the function.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
MC_VM_AGP_* registers should not be programmed by guest driver.
v2: move early return outside of loop
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Samir Dhume <samir.dhume@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pull drm fixes from Daniel Vetter:
"Dave's VPN to the big machine died, so it's on me to do fixes pr this
and next week while everyone else is at plumbers.
- big pile of amd fixes, but mostly for hw support newly added in 6.7
- i915 fixes, mostly minor things
- qxl memory leak fix
- vc4 uaf fix in mock helpers
- syncobj fix for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE"
* tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm: (78 commits)
drm/amdgpu: fix error handling in amdgpu_vm_init
drm/amdgpu: Fix possible null pointer dereference
drm/amdgpu: move UVD and VCE sched entity init after sched init
drm/amdgpu: move kfd_resume before the ip late init
drm/amd: Explicitly check for GFXOFF to be enabled for s0ix
drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)
drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v5)
drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support
drm/amdgpu: fix software pci_unplug on some chips
drm/amd/display: remove duplicated argument
drm/amdgpu: correct mca debugfs dump reg list
drm/amdgpu: correct acclerator check architecutre dump
drm/amdgpu: add pcs xgmi v6.4.0 ras support
drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3
drm/amdgpu: disable smu v13.0.6 mca debug mode by default
drm/amdgpu: Support multiple error query modes
drm/amdgpu: refine smu v13.0.6 mca dump driver
drm/amdgpu: Do not program PF-only regs in hdp_v4_0.c under SRIOV (v2)
drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV
drm: amd: Resolve Sphinx unexpected indentation warning
...
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.7-2023-11-10:
amdgpu:
- SR-IOV fixes
- DMCUB fixes
- DCN3.5 fixes
- DP2 fixes
- SubVP fixes
- SMU14 fixes
- SDMA4.x fixes
- Suspend/resume fixes
- AGP regression fix
- UAF fixes for some error cases
- SMU 13.0.6 fixes
- Documentation fixes
- RAS fixes
- Hotplug fixes
- Scheduling entity ordering fix
- GPUVM fixes
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110190703.4741-1-alexander.deucher@amd.com
|
|
When clearing the root PD fails we need to properly release it again.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
We need kernel scheduling entities to deal with handle clean up
if apps are not cleaned up properly. With commit 56e449603f0ac5
("drm/sched: Convert the GPU scheduler to variable number of run-queues")
the scheduler entities have to be created after scheduler init, so
change the ordering to fix this.
v2: Leave logic in UVD and VCE code
Fixes: 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: ltuikov89@gmail.com
|
|
The kfd_resume needs to touch GC registers to enable the interrupts,
it needs to be done before GFXOFF is enabled to ensure that the GFX is
not off and GC registers can be touched. So move kfd_resume before the
amdgpu_device_ip_late_init which enables the CGPG/GFXOFF.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If a user has disabled GFXOFF this may cause problems for the suspend
sequence. Ensure that it is enabled in amdgpu_acpi_is_s0ix_active().
The system won't reach the deepest state but it also won't hang.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-fixes for v6.7-rc1:
qxl:
- qxl memory leak fix.
syncobj:
- Fix waiting for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
vc4:
- Fix UAF in mock helpers
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[sima: Stitch together both changelogs from Maarten. Also because of
branch history this contains a few more bugfixes which are already in
v6.6, but I didn't feel like this justifies some backmerge since there
wasn't any real conflict.]
Link: https://patchwork.freedesktop.org/patch/msgid/bc8598ee-d427-4616-8ebd-64107ab9a2d8@linux.intel.com
|
|
W/RREG32_RLC is hardedcoded to use instance 0. W/RREG32_SOC15_RLC
should be used instead when inst != 0.
v2: rebase
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.
Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.
Using amdgpu_sriov_runtime to determine whether to access via kiq or
RLC is sufficient for now.
v5: add condition in amdgpu_device_xcc_w/rreg, remove trace func call
v4: avoid using amdgpu_sriov_w/rreg
v3: use W/RREG32_XCC to handle non-kiq case
v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters
of amdgpu_device_wreg/rreg
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
add pcs xgmi ras error query support for smu v13.0.6.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When software 'pci unplug' using IGT is executed we got a sysfs directory
entry is NULL for differant ras blocks like hdp, umc, etc.
Before call 'sysfs_remove_file_from_group' and 'sysfs_remove_group'
check that 'sd' is not NULL.
[ +0.000001] RIP: 0010:sysfs_remove_group+0x83/0x90
[ +0.000002] Code: 31 c0 31 d2 31 f6 31 ff e9 9a a8 b4 00 4c 89 e7 e8 f2 a2 ff ff eb c2 49 8b 55 00 48 8b 33 48 c7 c7 80 65 94 82 e8 cd 82 bb ff <0f> 0b eb cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
[ +0.000001] RSP: 0018:ffffc90002067c90 EFLAGS: 00010246
[ +0.000002] RAX: 0000000000000000 RBX: ffffffff824ea180 RCX: 0000000000000000
[ +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ +0.000001] RBP: ffffc90002067ca8 R08: 0000000000000000 R09: 0000000000000000
[ +0.000001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ +0.000001] R13: ffff88810a395f48 R14: ffff888101aab0d0 R15: 0000000000000000
[ +0.000001] FS: 00007f5ddaa43a00(0000) GS:ffff88841e800000(0000) knlGS:0000000000000000
[ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000001] CR2: 00007f8ffa61ba50 CR3: 0000000106432000 CR4: 0000000000350ef0
[ +0.000001] Call Trace:
[ +0.000001] <TASK>
[ +0.000001] ? show_regs+0x72/0x90
[ +0.000002] ? sysfs_remove_group+0x83/0x90
[ +0.000002] ? __warn+0x8d/0x160
[ +0.000001] ? sysfs_remove_group+0x83/0x90
[ +0.000001] ? report_bug+0x1bb/0x1d0
[ +0.000003] ? handle_bug+0x46/0x90
[ +0.000001] ? exc_invalid_op+0x19/0x80
[ +0.000002] ? asm_exc_invalid_op+0x1b/0x20
[ +0.000003] ? sysfs_remove_group+0x83/0x90
[ +0.000001] dpm_sysfs_remove+0x61/0x70
[ +0.000002] device_del+0xa3/0x3d0
[ +0.000002] ? ktime_get_mono_fast_ns+0x46/0xb0
[ +0.000002] device_unregister+0x18/0x70
[ +0.000001] i2c_del_adapter+0x26d/0x330
[ +0.000002] arcturus_i2c_control_fini+0x25/0x50 [amdgpu]
[ +0.000236] smu_sw_fini+0x38/0x260 [amdgpu]
[ +0.000241] amdgpu_device_fini_sw+0x116/0x670 [amdgpu]
[ +0.000186] ? mutex_lock+0x13/0x50
[ +0.000003] amdgpu_driver_release_kms+0x16/0x40 [amdgpu]
[ +0.000192] drm_minor_release+0x4f/0x80 [drm]
[ +0.000025] drm_release+0xfe/0x150 [drm]
[ +0.000027] __fput+0x9f/0x290
[ +0.000002] ____fput+0xe/0x20
[ +0.000002] task_work_run+0x61/0xa0
[ +0.000002] exit_to_user_mode_prepare+0x150/0x170
[ +0.000002] syscall_exit_to_user_mode+0x2a/0x50
Cc: Hawking Zhang <hawking.zhang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
avoid driver to touch invalid mca reg.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So driver doesn't touch invalid aca entries.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
add pcs xgmi v6.4.0 ras support
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Change local memory type to MTYPE_UC on revision id 0
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Direct error query mode and firmware error query mode
are supported for now.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
refine smu mca driver to support query ras error from pmfw path.
- correct gfx smu bank hwid (from mp5 to smu bank)
- retire unused callback function in amdgpu_mca_smu_funcs{}
- add new mca_bank_set{} structure to collect mca bank
- move enum mca_reg_idx into amdgpu_mca.h header
- add mca status register field decode macro
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|