Age | Commit message (Collapse) | Author |
|
This reverts commit 92020e81ddbeac351ea4a19bcf01743f32b9c800.
This causes stuttering and timeouts with DMCUB for some users
so revert it until we understand why and safely enable it
to save power.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1887
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: stable@vger.kernel.org
|
|
amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305)
amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC)
amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051
amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
amdgpu: MORE_FAULTS: 0x1
amdgpu: WALKER_ERROR: 0x0
amdgpu: PERMISSION_FAULTS: 0x5
amdgpu: MAPPING_ERROR: 0x0
amdgpu: RW: 0x1
When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0.
There is page fault from MMHUB0.
v2:fix indentation
v3:change subject and fix indentation
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Use the correct adev variable for the drm_fb_helper in
amdgpu_device_gpu_recover(). Noticed by inspection.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Certain GL unit tests for large textures can cause problems
with the OOM killer since there is no way to link this memory
to a process. This was originally mitigated (but not necessarily
eliminated) by limiting the GTT size. The problem is this limit
is often too low for many modern games so just make the limit 1/2
of system memory. The OOM accounting needs to be addressed, but
we shouldn't prevent common 3D applications from being usable
just to potentially mitigate that corner case.
Set default GTT size to max(3G, 1/2 of system ram) by default.
v2: drop previous logic and default to 3/4 of ram
v3: default to half of ram to align with ttm
v4: fix spelling in comment (Kent)
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1942
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The commit below changed the TTM manager size unit from pages to
bytes, but failed to adjust the corresponding calculations in
amdgpu_ioctl.
Fixes: dfa714b88eb0 ("drm/amdgpu: remove GTT accounting v2")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1930
Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6642
Tested-by: Martin Roukala <martin.roukala@mupuf.org>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.19-2022-06-08:
amdgpu:
- DCN 3.1 golden settings fix
- eDP fixes
- DMCUB fixes
- GFX11 fixes and cleanups
- VCN fix for yellow carp
- GMC11 fixes
- RAS fixes
- GPUVM TLB flush fixes
- SMU13 fixes
- VCN3 AV1 regression fix
- VCN2 JPEG fix
- Other misc fixes
amdkfd:
- MMU notifier fix
- Support for more GC 10.3.x families
- Pinned BO handling fix
- Partial migration bug fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608203008.6187-1-alexander.deucher@amd.com
|
|
invalid/prime icahce operation takes effect both pipes cuconrrently,
therefore CP_MES_IC_BASE_LO/HI and CP_MES_MDBASE_LO/HI both have to be
set before prime icache. Otherwise MES hardware gets garbage data in
above regsters and causes page fault
[ 470.873200] amdgpu 0000:33:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:217 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 470.873222] amdgpu 0000:33:00.0: amdgpu: in page starting at address 0x000092cb89b00000 from client 10
[ 470.873234] amdgpu 0000:33:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000BB3
[ 470.873242] amdgpu 0000:33:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 470.873247] amdgpu 0000:33:00.0: amdgpu: MORE_FAULTS: 0x1
[ 470.873251] amdgpu 0000:33:00.0: amdgpu: WALKER_ERROR: 0x1
[ 470.873256] amdgpu 0000:33:00.0: amdgpu: PERMISSION_FAULTS: 0xb
[ 470.873260] amdgpu 0000:33:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 470.873264] amdgpu 0000:33:00.0: amdgpu: RW: 0x0
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add jpeg vmid update under IB submit
Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
The TLB on GFX8 stores each block of 8 PTEs where any of the valid bits
are set.
Fixes: 5255e146c99a ("drm/amdgpu: rework TLB flushing")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The job is not yet initialized here.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2037
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: cdc7893fc93f ("drm/amdgpu: use job and ib structures directly in CS parsers")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
All other chips, from gfx6-gfx10, now include the MODE register at the
end of the wave debug state. This appears to have been missed in gfx11,
so this patch adds in MODE to the debug state for gfx11.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit b992a19085885c096b19625a85c674cb89829ca1.
This causes regression in GPU reset related test.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: ricetons@gmail.com
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Suppress the compile warning below:
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1292
gfx_v11_0_rlc_backdoor_autoload_copy_ucode() warn: should '1 << id' be a 64 bit type?
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The kfd_bo_list is used to restore process BOs after
evictions. As page tables could be destroyed during
evictions, we should also update pinned BOs' page tables
during restoring to make sure they are valid.
So for pinned BOs,
1, Validate them and update their page tables.
2, Don't add eviction fence for them.
v2:
- Don't handle pinned ones specially in BO validation.(Felix)
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add IP GC 10.3.7 in the list of target to have
tmz enabled by default.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
|
|
Pull more drm updates from Dave Airlie:
"This is mostly regular fixes, msm and amdgpu. There is a tegra patch
that is bit of prep work for a 5.20 feature to avoid some inter-tree
syncs, and a couple of late addition amdgpu uAPI changes but best to
get those in early, and the userspace pieces are ready.
msm:
- Limiting WB modes to max sspp linewidth
- Fixing the supported rotations to add 180 back for IGT
- Fix to handle pm_runtime_get_sync() errors to avoid unclocked
access in the bind() path for dpu driver
- Fix the irq_free() without request issue which was a big-time
hitter in the CI-runs.
amdgpu:
- Update fdinfo to the common drm format
- uapi:
- Add VM_NOALLOC GPUVM attribute to prevent buffers for going
into the MALL
- Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that
can be discarded on eviction
- Mesa code which uses these:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466
- Link training fixes
- DPIA fixes
- Misc code cleanups
- Aux fixes
- Hotplug fixes
- More FP clean up
- Misc GFX9/10 fixes
- Fix a possible memory leak in SMU shutdown
- SMU 13 updates
- RAS fixes
- TMZ fixes
- GC 11 updates
- SMU 11 metrics fixes
- Fix coverage blend mode for overlay plane
- Note DDR vs LPDDR memory
- Fuzz fix for CS IOCTL
- Add new PCI DID
amdkfd:
- Clean up hive setup
- Misc fixes
tegra:
- add some prelim 5.20 work to avoid inter-tree mess"
* tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drm: (57 commits)
drm/msm/dpu: Move min BW request and full BW disable back to mdss
drm/msm/dpu: Fix pointer dereferenced before checking
drm/msm/dpu: Remove unused code
drm/msm/disp/dpu1: remove superfluous init
drm/msm/dp: Always clear mask bits to disable interrupts at dp_ctrl_reset_irq_ctrl()
gpu: host1x: Add context bus
drm/amdgpu: add drm-client-id to fdinfo v2
drm/amdgpu: Convert to common fdinfo format v5
drm/amdgpu: bump minor version number
drm/amdgpu: add AMDGPU_VM_NOALLOC v2
drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE
drm/amdgpu: add beige goby PCI ID
drm/amd/pm: Return auto perf level, if unsupported
drm/amdkfd: fix typo in comment
drm/amdgpu/gfx: fix typos in comments
drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.
drm/amdgpu: differentiate between LP and non-LP DDR memory
drm/amdgpu: Resolve pcie_bif RAS recovery bug
drm/amdgpu: clean up asd on the ta_firmware_header_v2_0
drm/amdgpu/discovery: validate VCN and SDMA instances
...
|
|
Adjust the sequence for ras late init and separate ras reset error status
from query status.
v2: squash in fix from Candice
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix aldebaran ras supported check on SRIOV guest side,
the previous check conditicon block all ras feature
on baremetal
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This symbol is not used outside of gfx_v11_0.c, so marks it static.
Fixes the following w1 warning:
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1945:6: warning: no previous
prototype for function 'gfx_v11_0_rlc_stop' [-Wmissing-prototypes].
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes the following w1 warning:
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5873:2: warning: unannotated
fall-through between switch labels [-Wimplicit-fallthrough].
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Wrong fb offset results in dmub f/w errors and white screen.
[drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
[How]
Read aper_base from mmhub because GC is off by default
v2: use BAR for passthrough (Alex)
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Supports AV1. Mesa already has support for this and
doesn't rely on the kernel caps for yellow carp, so
this was already working from an application perspective.
Fixes: 554398174d98 ("amdgpu/nv.c - Added video codec support for Yellow Carp")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2002
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
This symbol is not used outside of imu_v11_0.c, so marks it
static.
Fixes the following w1 warning:
drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:302:6: warning: no previous
prototype for ‘program_imu_rlc_ram’ [-Wmissing-prototypes].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Almost all of MM here. A few things are still getting finished off,
reviewed, etc.
- Yang Shi has improved the behaviour of khugepaged collapsing of
readonly file-backed transparent hugepages.
- Johannes Weiner has arranged for zswap memory use to be tracked and
managed on a per-cgroup basis.
- Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
runtime enablement of the recent huge page vmemmap optimization
feature.
- Baolin Wang contributes a series to fix some issues around hugetlb
pagetable invalidation.
- Zhenwei Pi has fixed some interactions between hwpoisoned pages and
virtualization.
- Tong Tiangen has enabled the use of the presently x86-only
page_table_check debugging feature on arm64 and riscv.
- David Vernet has done some fixup work on the memcg selftests.
- Peter Xu has taught userfaultfd to handle write protection faults
against shmem- and hugetlbfs-backed files.
- More DAMON development from SeongJae Park - adding online tuning of
the feature and support for monitoring of fixed virtual address
ranges. Also easier discovery of which monitoring operations are
available.
- Nadav Amit has done some optimization of TLB flushing during
mprotect().
- Neil Brown continues to labor away at improving our swap-over-NFS
support.
- David Hildenbrand has some fixes to anon page COWing versus
get_user_pages().
- Peng Liu fixed some errors in the core hugetlb code.
- Joao Martins has reduced the amount of memory consumed by
device-dax's compound devmaps.
- Some cleanups of the arch-specific pagemap code from Anshuman
Khandual.
- Muchun Song has found and fixed some errors in the TLB flushing of
transparent hugepages.
- Roman Gushchin has done more work on the memcg selftests.
... and, of course, many smaller fixes and cleanups. Notably, the
customary million cleanup serieses from Miaohe Lin"
* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
mm: kfence: use PAGE_ALIGNED helper
selftests: vm: add the "settings" file with timeout variable
selftests: vm: add "test_hmm.sh" to TEST_FILES
selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
selftests: vm: add migration to the .gitignore
selftests/vm/pkeys: fix typo in comment
ksm: fix typo in comment
selftests: vm: add process_mrelease tests
Revert "mm/vmscan: never demote for memcg reclaim"
mm/kfence: print disabling or re-enabling message
include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
mm: fix a potential infinite loop in start_isolate_page_range()
MAINTAINERS: add Muchun as co-maintainer for HugeTLB
zram: fix Kconfig dependency warning
mm/shmem: fix shmem folio swapoff hang
cgroup: fix an error handling path in alloc_pagecache_max_30M()
mm: damon: use HPAGE_PMD_SIZE
tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
nodemask.h: fix compilation error with GCC12
...
|
|
This is enough to get gputop working :)
v2: rebase and some addition cleanup
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Convert fdinfo format to one documented in drm-usage-stats.rst.
It turned out that the existing implementation was actually completely
nonsense. The calculated percentages indeed represented the usage of the
engine, but with varying time slices.
So 10% usage for application A could mean something completely different
than 10% usage for application B.
Completely nuke that and just use the now standardized nanosecond
interface.
v2: drop the documentation change for now, nuke percentage calculation
v3: only account for each hw_ip, move the time_spend to the ctx mgr.
v4: move general ctx changes into separate patch, rework the fdinfo to
ctx_mgr interface so that all usages are calculated at once, drop
some unecessary and dangerous refcount dance.
v5: add one more comment how we calculate the time spend
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Increase the minor version number to indicate that the new flags are
available.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation.
v2: also add the flag to the allowed flags.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
doesn't needs to be preserved during eviction.
KFD was already using a similar functionality for SVM BOs so replace the
internal flag with the new UAPI.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a beige goby PCI ID.
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Submitting a cs with 0 chunks, causes an oops later, found trying
to execute the wrong userspace driver.
MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo
[172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
[172536.665188] #PF: supervisor read access in kernel mode
[172536.665189] #PF: error_code(0x0000) - not-present page
[172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0
[172536.665195] Oops: 0000 [#1] SMP NOPTI
[172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P O 5.10.81 #1-NixOS
[172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
[172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu]
[172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10
[172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246
[172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68
[172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38
[172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40
[172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28
[172536.665283] FS: 00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000
[172536.665284] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0
[172536.665287] Call Trace:
[172536.665322] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665332] drm_ioctl_kernel+0xaa/0xf0 [drm]
[172536.665338] drm_ioctl+0x201/0x3b0 [drm]
[172536.665369] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665372] ? selinux_file_ioctl+0x135/0x230
[172536.665399] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[172536.665403] __x64_sys_ioctl+0x83/0xb0
[172536.665406] do_syscall_64+0x33/0x40
[172536.665409] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Some applications want to know whether the memory is LP or
not.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Check shared buf instead of init flag for xgmi ta shared buf init
during xgmi ta initialization.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On the psp13 series use ta_firmware_header_v2_0 and the asd firmware
was buildin ta, so needn't request asd firmware separately.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Validate the VCN and SDMA instances against the driver
structure sizes to make sure we don't get into a
situation where the firmware reports more instances than
the driver supports.
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Suppress two compile warnings about "no previous prototype".
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use IP version rather then code name of IPs for
tmz set.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To enable TMZ feature based on IP version needs adev->ip_version
populated but its empty. Move amdgpu_gmc_tmz_set to a place where
ip_version is populated.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
support umc/gfx/sdma ras on guest side
Changed from V1:
move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make sure the queue is not longer active before programming
the kiq EOP registers.
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove the accidental shifts on the values of RPTR_BLOCK_SIZE
in gfx_v8-v11. The bug essentially always programs the
corresponding fields to zero instead of the correct value.
The hardware clamps the min value to 5 so this resulted in a
value of 5 being programmed.
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Let each context have a pointer to the ctx manager and properly
initialize the adev pointer inside the context manager.
Reduce the BUG_ON() in amdgpu_ctx_add_fence() into a WARN_ON() and
directly return the sequence number instead of writing into a parmeter.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Clean up redundant, copy-paste code blocks during the initialization of
the doorbells in mqd_init().
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.19-2022-05-18:
amdgpu:
- Misc code cleanups
- Additional SMU 13.x enablement
- Smartshift fixes
- GFX11 fixes
- Support for SMU 13.0.4
- SMU mutex fix
- Suspend/resume fix
amdkfd:
- static checker fix
- Doorbell/MMIO resource handling fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518205621.5741-1-alexander.deucher@amd.com
|
|
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.
This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu:
always reset the asic in suspend (v2)"). A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.
Avoid doing the reset on dGPUs specifically when using s2idle.
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This fixes a kernel oops when MES is not enabled.
Reported-by: Kenny Ho <Kenny.Ho@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Fixes: 18ee4ce63e0f32 ("drm/amdgpu: add mes unmap legacy queue routine")
Fixes: 3d879e81f0f9ed ("drm/amdgpu: add init support for GFX11 (v2)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|