summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
AgeCommit message (Collapse)Author
2020-03-05drm/amdgpu: fix IB test MCBP bugMonk Liu
1)for gfx IB test we shouldn't insert DE meta data 2)we should make sure IB test finished before we send event 3 to hypervisor otherwise the IDLE from event 3 will preempt IB test, which is not designed as a compatible structure for MCBP Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-28drm/amdgpu: no need to clean debugfs at amdgpuYintian Tao
drm_minor_unregister will invoke drm_debugfs_cleanup to clean all the child node under primary minor node. We don't need to invoke amdgpu_debugfs_fini and amdgpu_debugfs_regs_cleanup to clean agian. Otherwise, it will raise the NULL pointer like below. [ 45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 45.047256] PGD 0 P4D 0 [ 45.047713] Oops: 0002 [#1] SMP PTI [ 45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G W OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 45.050651] RIP: 0010:down_write+0x1f/0x40 [ 45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01 [ 45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246 [ 45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814 [ 45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8 [ 45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00 [ 45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300 [ 45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860 [ 45.059221] FS: 00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000 [ 45.060809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0 [ 45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.065897] Call Trace: [ 45.066426] debugfs_remove+0x36/0xa0 [ 45.067131] amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu] [ 45.068019] amdgpu_debugfs_fini+0x2c/0x50 [amdgpu] [ 45.068756] amdgpu_pci_remove+0x49/0x70 [amdgpu] [ 45.069439] pci_device_remove+0x3e/0xc0 [ 45.070037] device_release_driver_internal+0x18a/0x260 [ 45.070842] driver_detach+0x3f/0x80 [ 45.071325] bus_remove_driver+0x59/0xd0 [ 45.071850] driver_unregister+0x2c/0x40 [ 45.072377] pci_unregister_driver+0x22/0xa0 [ 45.073043] amdgpu_exit+0x15/0x57c [amdgpu] [ 45.073683] __x64_sys_delete_module+0x146/0x280 [ 45.074369] do_syscall_64+0x5a/0x120 [ 45.074916] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: remove all debugfs cleanup/fini code at amdgpu v3: squash in unused variable removal Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-28Merge tag 'amd-drm-next-5.7-2020-02-26' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.7-2020-02-26: amdgpu: - Rework VM update handling in preparation for HMM support - HDCP srm support - PSR fixes - DC watermark fixes - OLED panel support - SR-IOV fixes - BACO fixes - Optimize debugging vram access - RAS fixes - Use BACO for runtime pm - HDCP fixes - XGMI fixes - DDC fixes - DC clock programming optimizations and fixes - PSP fw loading sequence updates - Drop DRIVER_USE_AGP - Remove legacy drm load and unload callbacks amdkfd: - Add runtime pm support radeon: - Drop DRIVER_USE_AGP Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227043142.4075-1-alexander.deucher@amd.com
2020-02-26drm/amdgpu: drop legacy drm load and unload callbacksAlex Deucher
We've moved the debugfs handling into a centralized place so we can remove the legacy load an unload callbacks. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/firmware: move debugfs init into core amdgpu debugfsAlex Deucher
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for firmware. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/regs: move debugfs init into core amdgpu debugfsAlex Deucher
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for register access files. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu/gem: move debugfs init into core amdgpu debugfsAlex Deucher
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for gem. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu: rename amdgpu_debugfs_preempt_cleanupAlex Deucher
to amdgpu_debugfs_fini. It will be used for other things in the future. Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amdgpu: Increase timout on emulator to tenfold instead of twiceYong Zhao
Since emulators are slower, sometime some operations like flushing tlb through FM need more than twice the regular timout of 100ms, so increase the timeout to 1s on emulators. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-21Merge tag 'drm-misc-next-2020-02-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.7: UAPI Changes: - lima: Add support for heap buffers Cross-subsystem Changes: Core Changes: - Implement mode_config mode_valid for memory constrained drivers - Bus format negociation between bridges - Consolidate fake vblank events for drivers without vblank interrupts - drm/bufs: dma_alloc related cleanups - drm/dp_mst: Various fixes - drm/print: New drm_device based print helpers - Thomas is a drm-misc maintainer now! Driver Changes: - DPMS cleanups for atomic drivers - Removal of owner field in SPI tinydrm drivers - Removal of explicit dependency on DT for tinydrm drivers - Conversion to YAML schemas for DT bindings - tidss: New driver - virtio: various reworks and fixes - Our usual dozen or so new panels or bridges Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200210093421.xu4sofldm6wm6xq6@gilmour.lan
2020-02-12drm/amdkfd: refactor runtime pm for bacoRajneesh Bhardwaj
So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2Christian König
This should speed up debugging VRAM access a lot. v2: add HDP flush/invalidate Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07drm/amdgpu: optimize amdgpu_device_vram_access a bit.Christian König
Only write the _HI register when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-07drm/amdgpu/sriov Don't send msg when smu suspendJack Zhang
For sriov and pp_onevf_mode, do not send message to set smu status, because smu doesn't support these messages under VF. Besides, it should skip smu_suspend when pp_onevf_mode is disabled. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27drm/amdgpu: enable GPU reset by default on renoirAlex Deucher
Everything is in place. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-27drm/amdgpu: enable GPU reset by default on NaviAlex Deucher
Has been working fine for a while. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-24drm: Avoid drm_global_mutex for simple inc/dec of dev->open_countChris Wilson
Since drm_global_mutex is a true global mutex across devices, we don't want to acquire it unless absolutely necessary. For maintaining the device local open_count, we can use atomic operations on the counter itself, except when making the transition to/from 0. Here, we tackle the easy portion of delaying acquiring the drm_global_mutex for the final release by using atomic_dec_and_mutex_lock(), leaving the global serialisation across the device opens. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Thomas Hellström <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk
2020-01-22drm/amdgpu: remove unnecessary conversion to boolNirmoy Das
Better clean that up before some automation starts to complain about it Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22drm/amdgpu: add kiq version interface for RREG32/WREG32chen gong
Reading some registers by mmio will result in hang when GPU is in "gfxoff" state.This problem can be solved by GPU in "ring command packages" way. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22drm/amdgpu: provide a generic function interface for reading/writing ↵chen gong
register by KIQ Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-17drm/amdgpu: add the lost mutex_init backPan, Xinhui
Initialize notifier_lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/1016 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16drm/amdgpu: add arcturus to gpu recovery check code pathHawking Zhang
support check if dirver should try gpu recovery for arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14drm/amd/powerplay: cover the powerplay implementation details V3Evan Quan
This can save users much troubles. As they do not actually need to care whether swSMU or traditional powerplay routine should be used. V2: apply the fixes to vi.c and cik.c also V3: squash in oops fix Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07amd/amdgpu/sriov tdr enablement with pp_onevf_modeJack Zhang
Under sriov and pp_onevf mode, 1.take resume instead of hw_init for smc recover to avoid potential memory leak. 2.add return condition inside smc resume function for sriov_pp_onevf_mode and pm_enabled param. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23drm/amdgpu: use true, false for bool variable in amdgpu_device.czhengbin
Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3961:1-19: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3981:1-19: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23drm/amdgpu: Remove unneeded variable 'ret' in amdgpu_device.cMa Feng
Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1036:5-8: Unneeded variable: "ret". Return "0" on line 1079 Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ma Feng <mafeng.ma@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: update VCN1(dual instances) fw types ID and VCN ip block typeJane Jian
Previously there is no VCN1 type ID in psp gfx interface. Also add VCN ip block type unless the reinit after FLR for sriov would fail. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: Switch from system_highpri_wq to system_unbound_wqAndrey Grodzovsky
This is to avoid queueing jobs to same CPU during XGMI hive reset because there is a strict timeline for when the reset commands must reach all the GPUs in the hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: Redo XGMI reset synchronization.Andrey Grodzovsky
Use task barrier in XGMI hive to synchronize ASIC resets across devices in XGMI hive. v2: Return right away with a warning if no xgmi hive, update doc. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: reverts commit ce316fa55ef0f1751276b846a54fb3b835bd5e64.Andrey Grodzovsky
In preparation for doing XGMI reset synchronization using task barrier. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: fix KIQ ring test fail in TDR of SRIOVMonk Liu
issues: MEC is ruined by the amdkfd_pre_reset after VF FLR done fix: amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but there is a limitation to block this sequence: if we do pre_reset() before VF FLR, it would go KIQ way to do register access and stuck there, because KIQ probably won't work by that time (e.g. you already made GFX hang) so the best way right now is to simply remove it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18amd/amdgpu: add sched array to IPs with multiple run-queuesNirmoy Das
This sched array can be passed on to entity creation routine instead of manually creating such sched array on every context creation. v2: squash in missing break fix Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds listNirmoy Das
drm_sched_entity_init() takes drm gpu scheduler list instead of drm_sched_rq list. This makes conversion of drm_sched_rq list to drm gpu scheduler list unnecessary Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu/sriov: Tonga sriov also need load firmware with smuEmily Deng
Fix Tonga sriov load driver fail issue. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewd-by Yintian Tao <Yintian.tao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdgpu: Add CU info print logYong Zhao
The log will be useful for easily getting the CU info on various emulation models or ASICs. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11drm/amdgpu: log when amdgpu.dc=1 but ASIC is unsupportedSimon Ser
This makes it easier to figure out whether the kernel parameter has been taken into account. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Harry Wentland <hwentlan@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-11drm/amd/powerplay: enable pp one vf mode for vega10Yintian Tao
Originally, due to the restriction from PSP and SMU, VF has to send message to hypervisor driver to handle powerplay change which is complicated and redundant. Currently, SMU and PSP can support VF to directly handle powerplay change by itself. Therefore, the old code about the handshake between VF and PF to handle powerplay will be removed and VF will use new the registers below to handshake with SMU. mmMP1_SMN_C2PMSG_101: register to handle SMU message mmMP1_SMN_C2PMSG_102: register to handle SMU parameter mmMP1_SMN_C2PMSG_103: register to handle SMU response v2: remove module parameter pp_one_vf v3: fix the parens v4: forbid vf to change smu feature v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute v6: change skip condition at vega10_copy_table_to_smc Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05drm/amdgpu: clear err_event_athub flag after reset exitLe Ma
Otherwise next err_event_athub error cannot call gpu reset. And following resume sequence will not be affected by this flag. v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05drm/amdgpu: support full gpu reset workflow when ras err_event_athub occursLe Ma
This athub fatal error can be recovered by baco without system-level reboot, so add a mode to use baco for the recovery. Not affect the default psp reset situations for now. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05drm/amdgpu: add concurrent baco reset support for XGMILe Ma
Currently each XGMI node reset wq does not run in parrallel if bound to same cpu. Make change to bound the xgmi_reset_work item to different cpus. XGMI requires all nodes enter into baco within very close proximity before any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit baco respectively. To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved. The case that PSP reset and baco reset coexist within an XGMI hive never exist and is not in the consideration. v2: define use_baco flag to simplify the code for xgmi baco sequence Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-05drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helperLe Ma
This operation is needed when baco entry/exit for ras recovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02drm/amdgpu/sriov: No need the event 3 and 4 nowEmily Deng
As will call unload kms when initialize fail, and the unload kms will send event 3 and 4, so don't need event 3 and 4 in device init. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-02drm/amdgpu: not remove sysfs if not create sysfsYintian Tao
When load amdgpu failed before create pm_sysfs and ucode_sysfs, the pm_sysfs and ucode_sysfs should not be removed. Otherwise, there will be warning call trace just like below. [ 24.836386] [drm] VCE initialized successfully. [ 24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed [ 25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init [ 25.889575] [drm] amdgpu: finishing device. [ 26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110) [ 26.070110] [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed [ 26.200309] [TTM] Finalizing pool allocator [ 26.200314] [TTM] Finalizing DMA pool allocator [ 26.200349] [TTM] Zone kernel: Used memory at exit: 0 KiB [ 26.200351] [TTM] Zone dma32: Used memory at exit: 0 KiB [ 26.200353] [drm] amdgpu: ttm finalized [ 26.205329] ------------[ cut here ]------------ [ 26.205330] sysfs group 'fw_version' not found for kobject '0000:00:07.0' [ 26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256 sysfs_remove_group+0x80/0x90 [ 26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi floppy [ 26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G OE 5.2.0-rc1 #1 [ 26.205370] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 26.205372] RIP: 0010:sysfs_remove_group+0x80/0x90 [ 26.205374] Code: e8 35 b9 ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8 f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 [ 26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282 [ 26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 [ 26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97ad6f817380 [ 26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09: 00000000000002b3 [ 26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12: ffffffffc0e58240 [ 26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15: ffff97ad4db7fff0 [ 26.205380] FS: 00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000) knlGS:0000000000000000 [ 26.205381] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4: 00000000003606f0 [ 26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 26.205385] Call Trace: [ 26.205461] amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu] [ 26.205518] amdgpu_device_fini+0x3b4/0x560 [amdgpu] [ 26.205573] amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu] [ 26.205623] amdgpu_driver_load_kms+0xcd/0x250 [amdgpu] [ 26.205637] drm_dev_register+0x12b/0x1c0 [drm] [ 26.205695] amdgpu_pci_probe+0x12a/0x1e0 [amdgpu] [ 26.205699] local_pci_probe+0x47/0xa0 [ 26.205701] pci_device_probe+0x106/0x1b0 [ 26.205704] really_probe+0x21a/0x3f0 [ 26.205706] driver_probe_device+0x11c/0x140 [ 26.205707] device_driver_attach+0x58/0x60 [ 26.205709] __driver_attach+0xc3/0x140 Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-26drm/amdgpu: move pci handling out of pm opsAlex Deucher
The documentation says the that PCI core handles this for you unless you choose to implement it. Just rely on the PCI core to handle the pci specific bits. Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19drm/amdgpu: disentangle runtime pm and vga_switcherooAlex Deucher
Originally we only supported runtime pm on PX/HG laptops so vga_switcheroo and runtime pm are sort of entangled. Attempt to logically separate them. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19drm/amdgpu: add helpers for baco entry and exitAlex Deucher
BACO - Bus Active, Chip Off Will be used for runtime pm. Entry will enter the BACO state (chip off). Exit will exit the BACO state (chip on). Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2)Alex Deucher
BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off To better match what we are checking for and to align with amdgpu_device_supports_baco. BOCO is used on PowerXpress/Hybrid Graphics systems and BACO is used on desktop dGPU boards. v2: fix typo in documentation Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19drm/amdgpu: add a amdgpu_device_supports_baco helperAlex Deucher
BACO - Bus Active, Chip Off To check if a device supports BACO or not. This will be used in determining when to enable runtime pm. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19drm/amdgpu: put flush_delayed_work at firstYintian Tao
There is one regression from 042f3d7b745cd76aa To put flush_delayed_work after adev->shutdown = true which will make amdgpu_ih_process not response the irq At last, all ib ring tests will be failed just like below [drm] amdgpu: finishing device. [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence fallback timer expired on ring comp_1.0.1 amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110). v2: replace cancel_delayed_work_sync() with flush_delayed_work() Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-19drm/amdgpu: add driver support for JPEG2.0 and aboveLeo Liu
By using JPEG IP block type Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>