Age | Commit message (Collapse) | Author |
|
Navi12 has worked fine for a while now.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add device ID for sienna_cichlid.
v2: squash in additional device ids.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Switch from magic numbers to defines for AV1 clockgating.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We never unmapped the regiser BAR on failure.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Set ttm->sg to NULL after kfree, to avoid memory corruption backtrace:
[ 420.932812] kernel BUG at
/build/linux-do9eLF/linux-4.15.0/mm/slub.c:295!
[ 420.934182] invalid opcode: 0000 [#1] SMP NOPTI
[ 420.935445] Modules linked in: xt_conntrack ipt_MASQUERADE
[ 420.951332] Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS
1.5.4 07/09/2020
[ 420.952887] RIP: 0010:__slab_free+0x180/0x2d0
[ 420.954419] RSP: 0018:ffffbe426291fa60 EFLAGS: 00010246
[ 420.955963] RAX: ffff9e29263e9c30 RBX: ffff9e29263e9c30 RCX:
000000018100004b
[ 420.957512] RDX: ffff9e29263e9c30 RSI: fffff3d33e98fa40 RDI:
ffff9e297e407a80
[ 420.959055] RBP: ffffbe426291fb00 R08: 0000000000000001 R09:
ffffffffc0d39ade
[ 420.960587] R10: ffffbe426291fb20 R11: ffff9e49ffdd4000 R12:
ffff9e297e407a80
[ 420.962105] R13: fffff3d33e98fa40 R14: ffff9e29263e9c30 R15:
ffff9e2954464fd8
[ 420.963611] FS: 00007fa2ea097780(0000) GS:ffff9e297e840000(0000)
knlGS:0000000000000000
[ 420.965144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 420.966663] CR2: 00007f16bfffefb8 CR3: 0000001ff0c62000 CR4:
0000000000340ee0
[ 420.968193] Call Trace:
[ 420.969703] ? __page_cache_release+0x3c/0x220
[ 420.971294] ? amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[ 420.972789] kfree+0x168/0x180
[ 420.974353] ? amdgpu_ttm_tt_set_user_pages+0x64/0xc0 [amdgpu]
[ 420.975850] ? kfree+0x168/0x180
[ 420.977403] amdgpu_ttm_tt_unpopulate+0x5e/0x80 [amdgpu]
[ 420.978888] ttm_tt_unpopulate.part.10+0x53/0x60 [amdttm]
[ 420.980357] ttm_tt_destroy.part.11+0x4f/0x60 [amdttm]
[ 420.981814] ttm_tt_destroy+0x13/0x20 [amdttm]
[ 420.983273] ttm_bo_cleanup_memtype_use+0x36/0x80 [amdttm]
[ 420.984725] ttm_bo_release+0x1c9/0x360 [amdttm]
[ 420.986167] amdttm_bo_put+0x24/0x30 [amdttm]
[ 420.987663] amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 420.989165] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
[amdgpu]
[ 420.990666] kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Not being able to create amdgpu sysfs attributes
is not a fatal error warranting not to continue
to try to bring up the display. Thus, if we get
an error trying to create amdgpu sysfs attrs,
report it and continue on to try to bring up
a display.
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Slava Abramov <slava.abramov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The firmware provided via MODULE_FIRMWARE appears in the
module information. External tools(eg. dracut) may use the
list of fw files to include them as appropriate in an initramfs,
thus missing declaration will lead to request firmware failure
in boot time.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With IOMMU enabled, if SDPIF_MMIO_CNTRL_0 is not set
appropriately the system hangs without any trace
during S3.
To ease debug and to ensure that the failure, if any,
was caused by a race conditions that disabled write access to
SDPIF_MMIO_CNTRL_0 register, warn the user about it.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The two accel cleanup paths were mostly the same once refactored.
Just pass a bool to say if the evictions are to be pipelined.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917064132.148521-2-airlied@gmail.com
|
|
Now the bind functions have all the protection explicitly the
drivers can just call them directly, and the api can be unexported
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-5-airlied@gmail.com
|
|
This moves unbind into the driver side on destroy paths.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-4-airlied@gmail.com
|
|
Call the driver first and have it call the common code cleanup.
This is useful later to fix unbind.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-3-airlied@gmail.com
|
|
This moves the generic tracking into the drivers and protects
against reentrancy in the drivers. It fixes up radeon and agp
to be able to query the bound status as that is required.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-2-airlied@gmail.com
|
|
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".
No PASID type change in uapi although it defines PASID as __u64 in
some places.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
|
|
The firmware provided via MODULE_FIRMWARE appears in the
module information. External tools(eg. dracut) may use the
list of fw files to include them as appropriate in an initramfs,
thus missing declaration will lead to request firmware failure
in boot time.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Move these up to the bo level, moving ttm_tt to just being
backing store. Next step is to move the bound flag out.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-6-airlied@gmail.com
|
|
Drivers have to call populate themselves now before binding.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-5-airlied@gmail.com
|
|
This adds 2 getters and 4 setters, however unbound and populated
are currently the same thing, this will change, it also drops
a BUG_ON that seems not that useful.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com
|
|
Create sysfs interface also for sienna_cichlid.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Copy paste typo.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Due to hardware bugs, scatter/gather display on raven requires
a 1:1 IOMMU mapping, however, SME (System Memory Encryption)
requires an indirect IOMMU mapping because the encryption bit
is beyond the DMA mask of the chip. As such, the two are
incompatible.
Acked-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1003:4-9: WARNING: Comparison to bool
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1083:5-11: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:619:15-49: WARNING: Comparison to bool
drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:629:15-49: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c:1243:14-25: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/si.c:1342:5-10: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:562:5-11: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:619:5-11: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3563:5-31: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes coccicheck warning:
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2805:5-11: WARNING: Comparison to bool
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disabling perf events does not specify reset in ABI so stop doing it in
hardware.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add more accurate description of the pe parameter of function
amdgpu_vm_sdma_udpate and amdgpu_vm_cpu_update
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add comments to refect what function does
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Create sysfs interface also for sienna_cichlid.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
SDMA utilization calculations are enabled/disabled by
writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
enable this only for Arcturus.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
To refactor DM IRQ management, all fields used by IRQ is best moved
to a separate struct so that main amdgpu_crtc struct need not be changed
Location of the new struct shall be in DM
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Output RAS init status
If RAS init fails, teardown RAS context
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It needs to add ta DTM/HDCP print to get HDCP/DTM version info when cat
amdgpu_firmware_info
Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
XGMI support is more complicated than single device support as
questions of synchronization between the device recovering from
PCI error and other members of the hive are required.
Leaving this for next round.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reuse exsisting functions from GPU recovery to avoid code
duplications.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Cache the PCI state on boot and before each case where we might
loose it.
v2: Add pci_restore_state while caching the PCI state to avoid
breaking PCI core logic for stuff like suspend/resume.
v3: Extract pci_restore_state from amdgpu_device_cache_pci_state
to avoid superflous restores during GPU resets and suspend/resumes.
v4: Style fixes.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Wait for HW/PSP initiated ASIC reset to complete before
starting the recovery operations.
v2: Remove typo
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
DPC recovery involves ASIC reset just as normal GPU recovery so block
SW GPU schedulers and wait on all concurrent GPU resets.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.
v2: Rename in_dpc to more meaningful name
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add PCI Downstream Port Containment (DPC) with
basic recovery functionality
v2: remove pci_save_state to avoid breaking suspend/resume
v3: Fix style comments
v4: Improve description.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.
v2: modify the description
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Instead of letting TTM make an educated guess based on
some mask all drivers should just specify what caching
they want for their CPU mappings.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/390207/
|
|
As far as I can tell this was never used either and we just
always fallback to the order cached > wc > uncached anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/390142/
|
|
Paul Cercueil needs some patches in -rc5 to apply new patches for ingenic
properly.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
|
It's not supported to specify more than one of those flags.
So it never made sense to make this a flag in the first place.
Nuke the flags and specify directly which memory type to use.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1
|