pm24.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2021-11-09	drm/amd/display: Add callbacks for DMUB HPD IRQ notifications	Nicholas Kazlauskas
	[Why] We need HPD IRQ notifications (RX, short pulse) to properly handle DP MST for DPIA connections. [How] A null pointer exception currently occurs when these are received so add a check to validate that we have a handler installed for the notification. Extend the HPD handler to also handle HPD IRQ (RX) since the logic is the same. Fixes: e27c41d5b068 ("drm/amd/display: Support for DMUB HPD interrupt handling") Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Jude Shih <shenshih@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amd/display: Don't lock connection_mutex for DMUB HPD	Nicholas Kazlauskas
	[Why] Per DRM spec we only need to hold that lock when touching connector->state - which we do not do in that handler. Taking this locking introduces unnecessary dependencies with other threads which is bad for performance and opens up the potential for a deadlock since there are multiple locks being held at once. [How] Remove the connection_mutex lock/unlock routine and just iterate over the drm connectors normally. The iter helpers implicitly lock the connection list so this is safe to do. DC link access also does not need to be guarded since the link table is static at creation - we don't dynamically add or remove links, just streams. Fixes: e27c41d5b068 ("drm/amd/display: Support for DMUB HPD interrupt handling") Reviewed-by: Jude Shih <shenshih@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends	Anson Jacob
	Trivial patch which adds a comment for macro endif's in amdgpu_dm.c Reviewed-by: Ariel Bernstein <Eric.Bernstein@amd.com> Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Anson Jacob <Anson.Jacob@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Anson Jacob <Anson.Jacob@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amdkfd: Fix retry fault drain race conditions	Felix Kuehling
	The check for whether to drain retry faults must be under the mmap write lock to serialize with munmap notifier callbacks. We were also missing checks on child ranges. To fix that, simplify the logic by using a flag rather than checking on each prange. That also allows draining less freqeuntly when many ranges are unmapped at once. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Alex Sierra <Alex.Sierra@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amdkfd: lower the VAs base offset to 8KB	Alex Sierra
	The low 16MB of virtual address space are currently reserved for kernel mode allocations mapped into user virtual address space. This causes conflicts with HMM/SVM mappings at low virtual addresses. We tried to move those kernel mode allocations to the upper half of the 64-bit virtual address space for GFX9, which is naturally reserved for kernel use. However, TBA (trap handler code) has problems to access addresses in the high virtual space. We have decided to set this to 8KB of the lower address space as a temporary fix, while investigate TBA address problem. It is very unlikely for user space to map memory at this low region. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly	Shirish S
	make action upon failure in "drm_atomic_add_affected_connectors()" consistent with the rest of failures in amdgpu_dm_atomic_check(). Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov	shaoyunl
	The KFD pre_reset should be called before reset been executed, it will hold the lock to prevent other rocm process to sent the packlage to hiq during host execute the real reset on the HW Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09	drm/amdgpu: fix uvd crash on Polaris12 during driver unloading	Evan Quan
	There was a change(below) target for such issue: d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload") But the fix for VI ASICs was missing there. This is a supplement for that. Fixes: d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload") Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()	Alex Deucher
	Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not set. Fixes: f7f12b25823c0d ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs	Felix Kuehling
	If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the release_notify callback then the amdgpu_bo is freed. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-By: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amd/amdkfd: Don't sent command to HWS on kfd reset	shaoyunl
	When kfd need to be reset, sent command to HWS might cause hang and get unnecessary timeout. This change try not to touch HW in pre_reset and keep queues to be in the evicted state when the reset is done, so they are not put back on the runlist. These queues will be destroied on process termination. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdgpu: correctly toggle gfx on/off around RLC_SPM_* register access	Evan Quan
	As part of the ib padding process, accessing the RLC_SPM_* register may trigger gfx hang. Since gfxoff may be already kicked during the whole period. To address that, we manually toggle gfx on/off around the RLC_SPM_* register access. This can resolve the gfx hang issue observed on running Talos with RDP launched in parallel. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdgpu: correct xgmi ras error count reset	Tao Zhou
	The error count reset for xgmi3x16 pcs is missed. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amd/pm: Correct DPMS disable IP version check	Mario Limonciello
	Previously there was a check based on chip # for chips that aligned to >=CHIP_NAVI10 to have RLC stopped as part of DPMS check. This was because of gfxclk being controlled by RLC in the newer designs. As part of IP version checking though, this got changed to match IP version for SMU. Because Renoir designs also include smu11 that meant that even GFX9 started to stop RLC earlier. Adjust to match GFX IP version instead of SMU IP version to restore the previous behavior. Fixes: a8967967f6a5 ("drm/amdgpu/amdgpu_smu: convert to IP version checking") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amd/amdgpu: Fix csb.bo pin_count leak on gfx 9	YuBiao Wang
	[Why] csb bo is not unpinned in gfx 9. It will lead to pin_count leak on driver unload. [How] Call bo_free_kernel corresponding to bo_create_kernel in gfx_rlc_init_csb. This will also unify the code path with other gfx versions. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amd/amdgpu: Avoid writing GMC registers under sriov in gmc9	YuBiao Wang
	[Why] For Vega10, disabling gart of gfxhub could mess up KIQ and PSP under sriov mode, and lead to DMAR on host side. [How] Skip writing GMC registers under sriov. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling	Alex Deucher
	sysfs_emit and sysfs_emit_at requrie a page boundary aligned buf address. Make them happy! v2: fix sysfs_emit -> sysfs_emit_at missed conversions Cc: Lang Yu <lang.yu@amd.com> Cc: Darren Powell <darren.powell@amd.com> Fixes: 6db0c87a0a8e ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774 Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdgpu: Make sure to reserve BOs before adding or removing	Kent Russell
	BOs need to be reserved before they are added or removed, so ensure that they are reserved during kfd_mem_attach and kfd_mem_detach Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amdkfd: avoid recursive lock in migrations back to RAM	Alex Sierra
	[Why]: When we call hmm_range_fault to map memory after a migration, we don't expect memory to be migrated again as a result of hmm_range_fault. The driver ensures that all memory is in GPU-accessible locations so that no migration should be needed. However, there is one corner case where hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE back to system memory due to a write-fault when a system memory page in the same range was mapped read-only (e.g. COW). Ranges with individual pages in different locations are usually the result of failed page migrations (e.g. page lock contention). The unexpected migration back to system memory causes a deadlock from recursive locking in our driver. [How]: Creating a task reference new member under svm_range_list struct. Setting this with "current" reference, right before the hmm_range_fault is called. This member is checked against "current" reference at svm_migrate_to_ram callback function. If equal, the migration will be ignored. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05	drm/amd/display: Don't allow partial copy_from_user	Harry Wentland
	There is no reason to allow for partial buffers from userspace in our debugfs. In this particular case callers will zero out the wr_buf but if callers in the future don't do that we might be looking at corrupt data. Linus puts it better than I can in https://lkml.org/lkml/2021/10/26/993 Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: 3.2.160	Aric Cyr
	Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: [FW Promotion] Release 0.0.91	Anthony Koo
	Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: add condition check for dmub notification	Aurabindo Pillai
	[Why & How] In order to have dc_enable_dmub_notifications() more precise, add one more condition to check if dc->debug.dpia_debug.bits.disable_dpia is false. Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Added new DMUB boot option for power optimization	Jake Wang
	[Why] During Z10, root clock gating and memory low power registers needs to to be restored if optimization is enabled in driver. [How] Added new DMUB boot option for root clock gating and memory low power. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Jake Wang <haonan.wang2@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Add MPC meory shutdown support	Jake Wang
	[Why & How] The MPC memory clocks should be powered down when not in use. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Jake Wang <haonan.wang2@amd.com> Reviewed-by: Eric Yang <eric.yang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Added HPO HW control shutdown support	Jake Wang
	[Why] HPO is only used for DP2.0. HPO HW control should be disable when not being used to save power. [How] Shutdown HPO HW control during init hw. Shutdown HPO HW control during stream disable. Enable HPO HW control during stream enable if DP2.0. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Jake Wang <haonan.wang2@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: fix register write sequence for LINK_SQUARE_PATTERN	Wenjing Liu
	[why&how] write LINK_SQUARE_PATTERN_num + 1 for square pulse pattern. Specs requirement to write this register prior to write LINK_QUAL_LANEX_SET. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Reviewed-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Clear encoder assignments when state cleared.	Jimmy Kizito
	[Why] State can be cleared without removing individual streams (by calling dc_remove_stream_from_ctx()). This can leave the encoder assignment module in an incoherent state and cause future assignments to be incorrect. [How] Clear encoder assignments when committing 0 streams or re-initializing hardware. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Force disable planes on any pipe split change	Roman Li
	[Why] In scenario when 1 display connected with pipe split (2 pipes in use) and 3 new displays simultaneously hotplugged via MST hub (4 pipes in use), mpcc may get reprogram to other vtg, remaining busy. In this case waiting for mpcc idle timeouts with error like this: [drm] REG_WAIT timeout 1us * 100000 tries - mpc2_assert_idle_mpcc RIP: 0010:mpc2_assert_mpcc_idle_before_connect Call Trace: dcn20_update_mpcc dcn20_program_front_end_for_ctx dc_commit_state amdgpu_dm_atomic_commit_tail ... [How] Add pipe split change condition to disable dangling plane. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Fix bpc calculation for specific encodings	Bing Guo
	[Why] 1. YCbCr 4:2:2 8bpc/10bpc modes are blocked for HDMI by policy 2. A YCbCr 4:2:0 calculation error blocked some 4:2:0 timing modes [How] YCbCr 4:2:2 8bpc/10bpc modes are allowed for HDMI Fix YCbCr 4:2:0 calculation error Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Bing Guo <Bing.Guo@amd.com> Reviewed-by: Chris Park <chris.park@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: avoid link loss short pulse stuck the system	Yu-ting Shen
	[Why] MST monitor sends link loss short pulse continuous but sink is occupy by HDMI input to lead link training fail. [How] disable link once retraining fail. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Yu-ting Shen <yu-tshen@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Fix dummy p-state hang on monitors with extreme timing	Felipe Clark
	[WHY] It was found that the system would hang on a dummy pstate when playing 4k60 videos on a 1080p 390Hz monitor. [HOW] Properly select the dummy_pstate_latency_ms when firmware assisted memory clock switching is enabled instead of assuming that the highest latency would work for every monitor timing. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Felipe Clark <felclark@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Fix dcn10_log_hubp_states printf format string	Anson Jacob
	Fix spacing issue for the format string. Addresses-Coverity-ID: 1446765: ("Invalid printf format string") Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Anson Jacob <Anson.Jacob@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: dsc engine not disabled after unplug dsc mst hub	Hersen Wu
	[WHY] If timing and bpp of displays on mst hub are not changed, pbn, slot_num for displays should not be changed. Linux user mode may initiate atomic_check with different display configuration after set mode finished. This will call to amdgpu_dm to re-compute payload, slot_num of displays and saved to dm_connect_state. stream->timing.flags.dsc, pbn, slot_num are updated to values which may be different from that were used for set mode. when dsc hub with 3 4k@60hz dp connected, 3 dsc engines are enabled. timing.flags.dsc = 1. timing.flags.dsc are changed to 0 due to atomic check. when dsc hub is unplugged, amdgpu driver check timing.flags.dsc for last mode set and find out flags.dsc = 0, then does not disable dsc. [HOW] check status of displays on dsc mst hubs. re-compute pbn, slot_num, timing.flags.dsc only if there is mode, connect or enable/disable change. Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Hersen Wu <hersenwu@amd.com> Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdgpu: remove duplicated kfd_resume_iommu	James Zhu
	Remove duplicated kfd_resume_iommu which already runs in mdgpu_amdkfd_device_init. Tested-By: Ken Moffat <zarniwhoop@ntlworld.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdgpu: update RLC_PG_DELAY_3 Value to 200us for yellow carp	Aaron Liu
	For yellow carp, the desired CGPG hysteresis value is 0x4E20. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/display: Look at firmware version to determine using dmub on dcn21	Mario Limonciello
	commit 652de07addd2 ("drm/amd/display: Fully switch to dmub for all dcn21 asics") switched over to using dmub on Renoir to fix Gitlab 1735, but this implied a new dependency on newer firmware which might not be met on older kernel versions. Since sw_init runs before hw_init, there is an opportunity to determine whether or not the firmware version is new to adjust the behavior. Cc: Roman.Li@amd.com BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1772 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1735 Fixes: 652de07addd2 ("drm/amd/display: Fully switch to dmub for all dcn21 asics") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdgpu/pm: Don't show pp_power_profile_mode for unsupported devices	Mario Limonciello
	For ASICs not supporting power profile mode, don't show the attribute. Verify that the function has been implemented by the subsystem. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/pm: Adjust returns when power_profile_mode is not supported	Mario Limonciello
	This better aligns that the caller can make a mistake with the buffer and -EINVAL should be returned, but if the hardware doesn't support the feature it should be -EOPNOTSUPP. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/pm: Add missing mutex for pp_get_power_profile_mode	Mario Limonciello
	Prevent possible issues from set and get being called simultaneously. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdgpu/pm: drop pp_power_profile_mode support for yellow carp	Mario Limonciello
	This was added by commit bd8dcea93a7d ("drm/amd/pm: add callbacks to read/write sysfs file pp_power_profile_mode") but the feature was deprecated from PMFW. Remove it from the driver. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdkfd: update gfx target version for Renoir	Graham Sider
	Previously Renoir compiler gfx target version was forced to Raven. Update driver side for completeness. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdgpu: Convert SMU version to decimal in debugfs	Mario Limonciello
	This is more useful when talking to the SMU team to have the information in this format, save one less step to manually do it. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdkfd: Handle incomplete migration to system memory	Felix Kuehling
	If some pages fail to migrate to system memory, don't update prange->actual_loc = 0. This prevents endless CPU page faults after partial migration failures due to contested page locks. Migration to RAM must be complete during migrations from VRAM to VRAM and during evictions. Implement retry and fail if the migration to RAM fails. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdkfd: Avoid thrashing of stack and heap	Felix Kuehling
	Stack and heap pages tend to be shared by many small allocations. Concurrent access by CPU and GPU is therefore likely, which can lead to thrashing. Avoid this by setting the preferred location to system memory. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdkfd: Fix SVM_ATTR_PREFERRED_LOC	Felix Kuehling
	The preferred location should be used as the migration destination whenever it is accessible by the faulting GPU. System memory is always accessible. Peer memory is accessible if it's in the same XGMI hive. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amdgpu: use correct register mask to extract field	Oak Zeng
	Aldebaran has different register mask definitions for regiter MC_VM_XGMI_LFB_CNTL. Use the correct masks to interpret fields of this register. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03	drm/amd/amdgpu: fix bad job hw_fence use after free in advance tdr	Jingwen Chen
	[Why] In advance tdr mode, the real bad job will be resubmitted twice, while in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job is put one more time than other jobs. [How] Adding dma_fence_get before resbumit job in amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-02	Merge tag 'amd-drm-next-5.16-2021-10-29' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.16-2021-10-29: amdgpu: - RAS fixes - Fix a potential memory leak in device tear down - Add a stutter mode quirk - Misc display fixes - Further display FP refactoring - Display USB4 fixes - Display DP2.0 fixes - DCN 3.1 fixes - Display 8 ch audio fix - Fix DMA mask regression for SI parts - Aldebaran fixes amdkfd: - userptr fix - BO lifetime fix - Misc code cleanup UAPI: - Minor header cleanup (no functional change) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211029184338.4863-1-alexander.deucher@amd.com
2021-10-28	drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits	Alex Deucher
	The DMA mask on SI parts is 40 bits not 44. Copy paste typo. Fixes: 244511f386ccb9 ("drm/amdgpu: simplify and cleanup setting the dma mask") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>