path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
Age | Commit message | Author
2021-02-09 | drm/amdgpu: optimize list operation in amdgpu_xgmi | Kevin Wang
simplify the list operation. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 | drm/amdgpu: check hive pointer before access | Hawking Zhang
in case it is an invalid one Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-09 | drm/amdgpu: Fix inconsistent of format with argument type in amdgpu_xgmi.c | Ye Bin
Fix the following warning: [drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c:249]: (warning) %d in format string (no. 1) requires 'int' but the argument type is 'unsigned int'. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
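The warning above is about a mismatch between a printf-style conversion and its argument type. A minimal userspace sketch of the corrected pairing follows; `format_count` and the `count=` message are hypothetical and are not taken from amdgpu_xgmi.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper for illustration only; not code from the driver. */
static int format_count(char *buf, size_t len, unsigned int count)
{
	assert(buf != NULL && len > 0);
	/* "%u" matches unsigned int. "%d" expects a signed int, which is
	 * what the static checker complained about. */
	return snprintf(buf, len, "count=%u", count);
}
```

With `%u`, a value above INT_MAX such as 4000000000 is formatted as written instead of relying on a signed reinterpretation.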
2020-08-24 | drm/amdgpu: Get DRM dev from adev by inline-f | Luben Tuikov
Add a static inline adev_to_drm() to obtain the DRM device pointer from an amdgpu_device pointer. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 | drm/amdgpu: drm_device to amdgpu_device by inline-f (v2) | Luben Tuikov
Get the amdgpu_device from the DRM device by use of an inline function, drm_to_adev(). The inline function resolves a pointer to struct drm_device to a pointer to struct amdgpu_device. v2: Use a typed visible static inline function instead of an invisible macro. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
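The point of a typed static inline over a macro is that the compiler checks the argument and return types. A minimal sketch of the idea, using toy structs: the `toy_` names are hypothetical, and embedding one struct in the other is an assumption made for illustration, since the commit messages do not say how the pointer is resolved.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real struct drm_device and struct amdgpu_device differ. */
struct toy_drm_device {
	int minor;
};

struct toy_amdgpu_device {
	int asic_id;
	struct toy_drm_device ddev; /* embedded by assumption */
};

/* Typed helper: passing anything but a toy_amdgpu_device pointer is diagnosed. */
static inline struct toy_drm_device *toy_adev_to_drm(struct toy_amdgpu_device *adev)
{
	assert(adev != NULL);
	return &adev->ddev;
}

/* Inverse direction: step back from the embedded member to its container. */
static inline struct toy_amdgpu_device *toy_drm_to_adev(struct toy_drm_device *ddev)
{
	assert(ddev != NULL);
	return (struct toy_amdgpu_device *)
		((char *)ddev - offsetof(struct toy_amdgpu_device, ddev));
}
```

A macro doing the same cast would accept any pointer silently; the inline functions turn such misuse into a compiler diagnostic.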
2020-08-24 | drm/amdgpu: refine create and release logic of hive info | Dennis Li
Dynamically create and release the hive info object, which helps the driver support more hives in the future. v2: Save the hive object pointer in adev, to avoid locking xgmi_mutex on every call to amdgpu_get_xgmi_hive. v3: 1. Change the type of the hive object pointer in adev from void* to amdgpu_hive_info*. 2. Remove unnecessary variable initialization. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 | drm/amdgpu: refine codes to avoid reentering GPU recovery | Dennis Li
If other threads already hold the reset lock, recovery's try_lock will fail. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 | drm/amdgpu: revert "fix system hang issue during GPU reset" | Christian König
The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems as this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-07 | drm/amdgpu: unlock mutex on error | Dennis Li
Make sure to unlock the mutex when an error happens. v2: 1. correct syntax error in the commit comments 2. remove change-Id Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27 | drm/amdgpu: fix system hang issue during GPU reset | Dennis Li
When the GPU hangs, the driver has multiple paths into amdgpu_device_gpu_recover; the atomics adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery.
During GPU reset and resume, it is unsafe for other threads to access the GPU, which may cause the GPU reset to fail. Therefore the new rw_semaphore adev->reset_sem is introduced, which protects the GPU from being accessed by external threads during recovery.
v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-entering GPU recovery for the same GPU hang.
v3:
1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all the code, for example: free_mqd must be called outside of dqm_lock;
[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249] dump_stack+0x98/0xd5
[ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833] free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831] ksys_ioctl+0x98/0xb0
[ 1230.204004] __x64_sys_ioctl+0x1a/0x20
[ 1230.205174] do_syscall_64+0x5f/0x250
[ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe
2. remove try_lock and introduce atomic hive->in_reset, to avoid re-entering GPU recovery.
v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove commented-out code in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset
v5:
1. Fix some style issues.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01 | drm/amdgpu: remove unused functions | Nirmoy Das
Remove unused amdgpu_xgmi_hive_try_lock() and smu7_reset_asic_tasks(). Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21 | drm/amdgpu fix incorrect sysfs remove behavior for xgmi | Jack Zhang
Under an xgmi setup, some sysfs nodes fail to be created the second time the kmd driver is loaded, because they were not removed properly on the previous unload.
Changes in this patch:
1. remove sysfs for dev_attr_xgmi_error
2. remove sysfs_link adev->dev->kobj with target name. It only needs to be removed once for an xgmi setup
3. remove sysfs_link hive->kobj with target name
In amdgpu_xgmi_remove_device:
1. amdgpu_xgmi_sysfs_rem_dev_info needs to be run per device
2. amdgpu_xgmi_sysfs_destroy needs to be run on the last node of device.
v2: initialize array with memset
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14 | drm/amdgpu: remove redundant assignment to variable ret | Colin Ian King
The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08 | drm/amdgpu: use node_id and node_size to calcualte dram_base_address | Hawking Zhang
physical_node_id * node_segment_size should be the dram_base_address for current gpu node in xgmi config Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
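The relation stated in this commit message can be written out as a small sketch. The function name, the parameter widths and the overflow check are assumptions for illustration; only the product `physical_node_id * node_segment_size` comes from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: base address of a node's DRAM segment in the hive. */
static uint64_t toy_dram_base_address(uint32_t physical_node_id,
				      uint64_t node_segment_size)
{
	/* Guard against 64-bit overflow of the product (assumed precondition). */
	assert(node_segment_size == 0 ||
	       physical_node_id <= UINT64_MAX / node_segment_size);
	/* Widen the node id before multiplying so the math is done in 64 bits. */
	return (uint64_t)physical_node_id * node_segment_size;
}
```

For example, with a 32 GiB segment (0x800000000 bytes), node 0 starts at 0 and node 2 starts at 0x1000000000.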
2020-04-27 | drm/amdgpu: sw pstate switch should only be for vega20 | Jonathan Kim
Driver steered p-state switching is designed for Vega20 only. Also simplify early return for temporary disable due to SMU FW bug. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 | drm/amdgpu: fix race between pstate and remote buffer map | Jonathan Kim
Vega20 arbitrates pstate at hive level and not device level. Last peer to remote buffer unmap could drop P-State while another process is still remote buffer mapped. With this fix, P-States still needs to be disabled for now as SMU bug was discovered on synchronous P2P transfers. This should be fixed in the next FW update. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01 | drm/amdgpu: added xgmi ras error reset sequence | John Clements
Added a mechanism to clear xgmi ras status in between error queries. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-10 | drm/amdgpu: call ras_debugfs_create_all in debugfs_init | Tao Zhou
and remove each ras IP's own debugfs creation. This is required to fix ras when the driver does not use the drm load and unload callbacks due to ordering issues with the drm device node. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-06 | drm/amdgpu: enable PCS error report on arcturus | Hawking Zhang
add arcturus xgmi/wafl pcs err status group to support PCS error detection and report on arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-06 | drm/amdgpu: add helper funcs to detect PCS error | Hawking Zhang
Starting with vega20, hardware supports run-time detection and reporting of XGMI/WAFL PCS ras errors. Add helper functions to walk through every type of ras error and report any that are found. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 | drm/amdgpu: toggle DF-Cstate to protect DF reg access | Hawking Zhang
The driver needs to take DF out of Cstate before any DF register access; otherwise, the DF register may not be accessible. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26 | drm/amdgpu: move get_xgmi_relative_phy_addr to amdgpu_xgmi.c | Hawking Zhang
centralize all the xgmi related function to amdgpu_xgmi.c Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-06 | drm/amdgpu: move xgmi init/fini to xgmi_add/remove_device call (v2) | Hawking Zhang
For sriov, the psp ip block has to be initialized before the ih block, because the dynamic register programming interface is needed for the vf ih ring buffer. On the other hand, the current psp ip block hw_init function initializes the xgmi session, which actually depends on an interrupt to return the session context. This results in an empty xgmi ta session id and later failures of all the xgmi ta cmds invoked from the vf. xgmi ta session initialization has to be done after the ih ip block hw_init call. To unify xgmi session init/fini for both the bare-metal and sriov virtualization use scenarios, move xgmi ta init to the xgmi_add_device call, and accordingly terminate the xgmi ta session in the xgmi_remove_device call. The existing suspend/resume sequence is not changed. v2: squash in return fix from Nirmoy Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 | drm/amdgpu: Create generic DF struct in adev | Joseph Greathouse
The only data fabric information the adev struct currently contains is a function pointer table. In the near future, we will be adding some cached DF information into adev. As such, this patch creates a new amdgpu_df struct for adev. Right now, it only contains the old function pointer table, but new stuff will be added soon. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 | drm/amd/powerplay: cover the powerplay implementation details V3 | Evan Quan
This can save users much trouble, as they do not actually need to care whether swSMU or the traditional powerplay routine should be used. V2: apply the fixes to vi.c and cik.c also V3: squash in oops fix Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 | drm/amdgpu: Add task barrier to XGMI hive. | Andrey Grodzovsky
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 | drm/amdgpu: fix vega20 pstate status change | Jonathan Kim
vega20 only requires all devices be set to same pstate level for low pstate and not high. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Evan Quan <Evan.Quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 | drm/amdgpu: fix possible pstate switch race condition | Evan Quan
Added lock protection so that p-state switches are guaranteed to be sequential. Also update the hive pstate only when all devices in the hive are in the same state. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
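The second rule in this message, updating the hive-level pstate only once every device agrees, can be sketched as below. All `toy_` names and the fixed node limit are hypothetical, and the lock the commit adds is deliberately left out: in this sketch the caller is assumed to hold it around each call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 4 /* illustrative limit, not taken from the driver */

struct toy_hive_pstate {
	int hive_pstate;               /* last state every device agreed on */
	int dev_pstate[TOY_MAX_NODES]; /* per-device state */
	size_t num_devs;
};

/* Records the device's new state, then updates the hive state only if every
 * device now reports that same state. Returns true when the hive state was
 * updated. The caller is assumed to hold the hive lock. */
static bool toy_set_dev_pstate(struct toy_hive_pstate *h, size_t dev, int pstate)
{
	assert(h != NULL && h->num_devs <= TOY_MAX_NODES && dev < h->num_devs);

	h->dev_pstate[dev] = pstate;
	for (size_t i = 0; i < h->num_devs; i++) {
		if (h->dev_pstate[i] != pstate)
			return false; /* at least one device still differs */
	}
	h->hive_pstate = pstate;
	return true;
}
```

In a two-device hive, switching the first device leaves `hive_pstate` unchanged; it flips only when the second device follows.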
2019-11-06 | drm/amd/powerplay: support xgmi pstate setting on powerplay routine V2 | Evan Quan
Add xgmi pstate setting on powerplay routine. V2: split the change of is_support_sw_smu_xgmi into a separate patch Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 | drm/amdgpu: move xgmi ras fini to xgmi block | Tao Zhou
it's more suitable to put xgmi ras fini in xgmi block Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16 | drm/amdgpu: initialize ras structures for xgmi block (v2) | Hawking Zhang
init ras common interface and fs node for xgmi block v2: remove unnecessary physical node number check before invoking amdgpu_xgmi_ras_late_init Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 | drm/amdgpu: adding xgmi error monitoring | Jonathan Kim
monitor xgmi errors via mc pie status through fica registers. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Kent Russell <Kent.Russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 | amd/powerplay: No SW XGMI dpm for Arcturus rev 2 | Yong Zhao
xgmi dpm is handled by the SMU. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 | drm/amdgpu: skip get/update xgmi topology info when no psp exists | Le Ma
We don't currently have psp support for arcturus so provide an alternative mechanism in the meantime. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 | drm/amdgpu: Hack xgmi topology info when there is no psp fw | Oak Zeng
This is only needed on emulation platforms where psp fw might not be available, to hack xgmi topology info such as hive id and node id. v2: Add offset to hacked hive/node id v3: Don't introduce a new module parameter. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 | drm/amd/doc: Add XGMI sysfs documentation | Tom St Denis
Acked-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 | drm/amdgpu: Update latest xgmi topology info after each device is enumulated | shaoyunl
Adjust the sequence of set/get xgmi topology, so driver can have the latest XGMI topology info for future usage Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 | drm/amdgpu: Implement get num of hops between two xgmi device | shaoyunl
KFD needs to provide the info for the upper level to determine the data path. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-12 | drm/amdgpu: Set proper function to set xgmi pstate | shaoyunl
The driver needs to call SMU to set the xgmi pstate. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-27 | drm/amdgpu: XGMI pstate switch initial support | shaoyunl
The driver votes for a low-to-high pstate switch whenever there is an outstanding XGMI mapping request, and votes for a high-to-low pstate switch when all outstanding XGMI mappings are terminated. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
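The voting rule in this message amounts to counting outstanding mappings and switching on the 0 to 1 and 1 to 0 transitions. A minimal single-threaded sketch follows; the `toy_` names and the two-level enum are hypothetical, and no locking is shown even though concurrent map and unmap calls would need it.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_PSTATE_LOW = 0, TOY_PSTATE_HIGH = 1 };

struct toy_hive_vote {
	unsigned int map_count; /* outstanding XGMI mappings */
	int pstate;             /* current vote */
};

/* First outstanding mapping votes the pstate high. */
static void toy_xgmi_map(struct toy_hive_vote *v)
{
	assert(v != NULL);
	if (v->map_count++ == 0)
		v->pstate = TOY_PSTATE_HIGH;
}

/* Last mapping to go away votes the pstate back to low. */
static void toy_xgmi_unmap(struct toy_hive_vote *v)
{
	assert(v != NULL && v->map_count > 0);
	if (--v->map_count == 0)
		v->pstate = TOY_PSTATE_LOW;
}
```

With two mappings outstanding, the first unmap leaves the vote high; only the second unmap drops it to low.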
2019-03-21 | drm/amdgpu: revert "XGMI pstate switch initial support" | Christian König
This reverts commit 9b638f9751308ae3ae8f28e0c6e9decffd97f5f9. Adding this to the mapping is complete nonsense and the whole implementation looks racy. This patch wasn't thoughtfully reviewed and should be reverted for now. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Liu, Shaoyun <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 | drm/amdgpu: XGMI pstate switch initial support | shaoyunl
The driver votes for a low-to-high pstate switch whenever there is an outstanding XGMI mapping request, and votes for a high-to-low pstate switch when all outstanding XGMI mappings are terminated. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 | drm/amdgpu: Add sysfs entries for xgmi hive v2. | Andrey Grodzovsky
For each device a file xgmi_device_id is created. On the first device a subdirectory named xgmi_hive_info is created. It contains a file named hive_id and symlinks named node 1-4 linking to each device in the hive. v2: Return error codes instead of '-1' and fix a few misspellings. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-05 | drm/amd/amdgpu: fix spelling mistake "matech" -> "match" | Colin Ian King
There is a spelling mistake in a dev_err message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-29 | drm/amdgpu: Show XGMI node and hive message per device only once | shaoyunl
Reduce the repeated node and hive information during XGMI initialization Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 | drm/amd/amdgpu: add missing mutex lock to amdgpu_get_xgmi_hive() (v3) | Tom St Denis
v2: Move locks around in other functions so that this function can stand on its own. Also only hold the hive specific lock for add/remove device instead of the driver global lock so you can't add/remove devices in parallel from one hive. v3: add reset_lock Acked-by: Shaoyun.liu < Shaoyun.liu@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 | drm/amdgpu: Add message print when unable to get valid hive | shaoyunl
Print a message and return -EINVAL when the driver cannot get a valid hive from the hive array on an xgmi configuration. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-18 | drm/amdgpu: correct the return value for error case | Evan Quan
It should not return 0 for error case as '0' is actually a special value for index. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
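The reasoning here is that a function returning an index cannot also use 0 to signal failure, because 0 is a valid index. A minimal sketch of the usual convention follows; `toy_find_node_index` is hypothetical, and the choice of `-EINVAL` is an assumption, since the message does not say which error value the patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of `id` in `node_ids`, or a negative errno value when it
 * is absent. A negative value can never collide with a valid index. */
static int toy_find_node_index(const uint64_t *node_ids, size_t n, uint64_t id)
{
	assert(node_ids != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (node_ids[i] == id)
			return (int)i;
	}
	return -EINVAL; /* assumed error code for illustration */
}
```

Callers can then test `ret < 0` for failure and still treat a result of 0 as the first entry.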
2018-12-04 | drm/amdgpu: Update XGMI node print | Andrey Grodzovsky
amdgpu_xgmi_update_topology is called both on device registration and reset. Fix misleading print since the device is added only once to the hive on registration and not on reset. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-03 | drm/amdgpu: Handle xgmi device removal. | Andrey Grodzovsky
An XGMI hive has some resources allocated on device init which need to be deallocated when the device is unregistered. v2: Remove creation of dedicated wq for XGMI hive reset. v3: Use the gmc.xgmi.supported flag Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>