Age | Commit message (Collapse) | Author |
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.13:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- %p4cc printk format modifier
- atomic: introduce drm_crtc_commit_wait, rework atomic plane state
helpers to take the drm_commit_state structure
- dma-buf: heaps rework to return a struct dma_buf
- simple-kms: Add plate state helpers
- ttm: debugfs support, removal of sysfs
Driver Changes:
- Convert drivers to shadow plane helpers
- arc: Move to drm/tiny
- ast: cursor plane reworks
- gma500: Remove TTM and medfield support
- mxsfb: imx8mm support
- panfrost: MMU IRQ handling rework
- qxl: rework to better handle resources deallocation, locking
- sun4i: Add alpha properties for UI and VI layers
- vc4: RPi4 CEC support
- vmwgfx: doc cleanup
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
|
|
This is just another feature which is only used by VMWGFX, so move
it into the driver instead.
I've tried to add the accounting sysfs file to the kobject of the drm
minor, but I'm not 100% sure if this works as expected.
v2: fix typo in KFD and avoid 64bit divide
v3: fix init order in VMWGFX
v4: use pdev sysfs reference instead of drm
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zack Rusin <zackr@vmware.com> (v3)
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
|
|
In drm_gem_object_free, it will call funcs of drm buffer obj. So
kfd_alloc should use amdgpu_gem_object_create instead of
amdgpu_bo_create to initialize the funcs as amdgpu_gem_object_funcs.
[ 396.231390] amdgpu: Release VA 0x7f76b4ada000 - 0x7f76b4add000
[ 396.231394] amdgpu: remove VA 0x7f76b4ada000 - 0x7f76b4add000 in entry 0000000085c24a47
[ 396.231408] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 396.231445] #PF: supervisor read access in kernel mode
[ 396.231466] #PF: error_code(0x0000) - not-present page
[ 396.231484] PGD 0 P4D 0
[ 396.231495] Oops: 0000 [#1] SMP NOPTI
[ 396.231509] CPU: 7 PID: 1352 Comm: clinfo Tainted: G OE 5.11.0-rc2-custom #1
[ 396.231537] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS WCD0401N_Weekly_20_04_0 04/01/2020
[ 396.231563] RIP: 0010:drm_gem_object_free+0xc/0x22 [drm]
[ 396.231606] Code: eb ec 48 89 c3 eb e7 0f 1f 44 00 00 55 48 89 e5 48 8b bf 00 06 00 00 e8 72 0d 01 00 5d c3 0f 1f 44 00 00 48 8b 87 40 01 00 00 <48> 8b 00 48 85 c0 74 0b 55 48 89 e5 e8 54 37 7c db 5d c3 0f 0b c3
[ 396.231666] RSP: 0018:ffffb4704177fcf8 EFLAGS: 00010246
[ 396.231686] RAX: 0000000000000000 RBX: ffff993a0d0cc400 RCX: 0000000000003113
[ 396.231711] RDX: 0000000000000001 RSI: e9cda7a5d0791c6d RDI: ffff993a333a9058
[ 396.231736] RBP: ffffb4704177fdd0 R08: ffff993a03855858 R09: 0000000000000000
[ 396.231761] R10: ffff993a0d1f7158 R11: 0000000000000001 R12: 0000000000000000
[ 396.231785] R13: ffff993a0d0cc428 R14: 0000000000003000 R15: ffffb4704177fde0
[ 396.231811] FS: 00007f76b5730740(0000) GS:ffff993b275c0000(0000) knlGS:0000000000000000
[ 396.231840] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 396.231860] CR2: 0000000000000000 CR3: 000000016d2e2000 CR4: 0000000000350ee0
[ 396.231885] Call Trace:
[ 396.231897] ? amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x24c/0x25f [amdgpu]
[ 396.232056] ? __dynamic_dev_dbg+0xcd/0x100
[ 396.232076] kfd_ioctl_free_memory_of_gpu+0x91/0x102 [amdgpu]
[ 396.232214] kfd_ioctl+0x211/0x35b [amdgpu]
[ 396.232341] ? kfd_ioctl_get_queue_wave_state+0x52/0x52 [amdgpu]
Fixes: 246cb7e49a70 ("drm/amdgpu: Introduce GEM object functions")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Changfeng <changzhu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.12:
UAPI Changes:
- Not necessarily one, but we document that userspace needs to force probe connectors.
Cross-subsystem Changes:
- Require FB_ATY_CT for aty on sparc64.
- video: Fix documentation, and a few compiler warnings.
- Add devicetree bindings for DP connectors.
- dma-buf: Update kernel-doc, and add might_lock for resv objects in begin/end_cpu_access.
Core Changes:
- ttm: Warn when releasing a pinned bo.
- ttm: Cleanup bo size handling.
- cma-helper: Remove prime infix, and implement mmap as GEM CMA functions.
- Split drm_prime_sg_to_page_addr_arrays into 2 functions.
- Add a new api to install irq using devm.
- Update panel kerneldoc to inline style.
- Add DP support to drm/bridge.
- Assorted small fixes to ttm, fb-helper, scheduler.
- Add atomic_commit_setup function callback.
- Automatically use the atomic gamma_set, instead of forcing drivers to declare the default atomic version.
- Allow using degamma for legacy gamma if gamma is not available.
- Clarify that primary/cursor planes are not tied to 1 crtc (depending on possible_crtcs).
- ttm: Cleanup the lru handler.
Driver Changes:
- Add pm support to ingenic.
- Assorted small fixes in radeon, via, rockchip, omap2fb, kmb, gma500, nouveau, virtio, hisilicon, ingenic, s6e63m0 panel, ast, udlfb.
- Add BOE NV110WTM-N61, ys57pss36bh5gq, Khadas TS050 panels.
- Stop using pages with drm_prime_sg_to_page_addr_arrays, and switch all callers to use ttm_sg_tt_init.
- Cleanup compiler and docbook warnings in a lot of fbdev devices.
- Use the drmm_vram_helper in hisilicon.
- Add support for BCM2711 DSI1 in vc4.
- Add support for 8-bit delta RGB panels to ingenic.
- Add documentation on how to test vkms.
- Convert vc4 to atomic helpers.
- Use degamma instead of gamma table in omap, to add support for CTM and color encoding/range properties.
- Rework omap DSI code, and merge all omapdrm modules now that the last omap panel is now a drm panel.
- More refactoring of omap dsi code.
- Enable 10/12 bpc outputs in vc4.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/78381a4f-45fd-aed4-174a-94ba051edd37@linux.intel.com
|
|
it could also be insufficient vram that makes
amdgpu_amdkfd_reserve_mem_limit fail.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Required backmerge since we will be based on top of v5.11, and there
has been a request to backmerge already to upstream some features.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
I guess Christian didn't compile test amdkfd.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Fixes: e11bfb99d6ec ("drm/ttm: cleanup BO size handling v3")
Cc: Christian König <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com> (v1)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214191725.3899147-1-daniel.vetter@ffwll.ch
|
|
If vram is used up, display allocate vram evict the KFD BOs to system
memory. KFD schedule restore work to restore BOs back to vram. If
display BOs are pinned in vram, KFD restore work will keep retry, and
may never success.
If restore BO back to vram failed, keep the BO in system memory to
prevent endless retry restore, and GPU mapping will update to system
memory.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.11-2020-11-05:
amdgpu:
- Add initial support for Vangogh
- Add support for Green Sardine
- Add initial support for Dimgrey Cavefish
- Scatter/Gather display support for Renoir
- Updates for Sienna Cichlid
- Updates for Navy Flounder
- SMU7 power improvements
- Modifier support for gfx9+
- CI BACO fixes
- Arcturus SMU fixes
- Lots of code cleanups
- DC fixes
- Kernel doc fixes
- Add more GPU HW client information to page fault error logging
- MPO clock tuning for RV
- FP fixes for DCN3 on ARM and PPC
radeon:
- Expose voltage via hwmon on Sumo APUs
amdkfd:
- Fix unique id handling
- Misc fixes
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105222749.201798-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.11:
UAPI Changes:
- doc: rules for EBUSY on non-blocking commits; requirements for fourcc
modifiers; on parsing EDID
- fbdev/sbuslib: Remove unused FBIOSCURSOR32
- fourcc: deprecate DRM_FORMAT_MOD_NONE
- virtio: Support blob resources for memory allocations; Expose host-visible
and cross-device features
Cross-subsystem Changes:
- devicetree: Add vendor Prefix for Yes Optoelectronics, Shanghai Top Display
Optoelectronics
- dma-buf: Add struct dma_buf_map that stores DMA pointer and I/O-memory flag;
dma_buf_vmap()/vunmap() return address in dma_buf_map; Use struct_size() macro
Core Changes:
- atomic: pass full state to CRTC atomic enable/disable; warn for EBUSY during
non-blocking commits
- dp: Prepare for DP 2.0 DPCD
- dp_mst: Receive extended DPCD caps
- dma-buf: Documentation
- doc: Format modifiers; dma-buf-map; Cleanups
- fbdev: Don't use compat_alloc_user_space(); mark as orphaned
- fb-helper: Take lock in drm_fb_helper_restore_work_fb()
- gem: Convert implementation and drivers to GEM object functions, remove
GEM callbacks from struct drm_driver (expect gem_prime_mmap)
- panel: Cleanups
- pci: Add legacy infix to drm_irq_by_busid()
- sched: Avoid infinite waits in drm_sched_entity_destroy()
- switcheroo: Cleanups
- ttm: Remove AGP support; Don't modify caching during swapout; Major
refactoring of the implementation and API that affects all depending
drivers; Add ttm_bo_wait_ctx(); Add ttm_bo_pin()/unpin() in favor of
TTM_PL_FLAG_NO_EVICT; Remove ttm_bo_create(); Remove fault_reserve_notify()
callback; Push move() implementation into drivers; Remove TTM_PAGE_FLAG_WRITE;
Replace caching flags with init-time cache setting; Push ttm_tt_bind() into
drivers; Replace move_notify() with delete_mem_notify(); No overlapping memcpy();
no more ttm_set_populated()
- vram-helper: Fix BO top-down placement; TTM-related changes; Init GEM
object functions with defaults; Default placement in system memory; Cleanups
Driver Changes:
- amdgpu: Use GEM object functions
- armada: Use GEM object functions
- aspeed: Configure output via sysfs; Init struct drm_driver with
- ast: Reload LUT after FB format changes
- bridge: Add driver and DT bindings for anx7625; Cleanups
- bridge/dw-hdmi: Constify ops
- bridge/ti-sn65dsi86: Add retries for link training
- bridge/lvds-codec: Add support for regulator
- bridge/tc358768: Restore connector support DRM_GEM_CMA_DRIVEROPS; Cleanups
- display/ti,j721e-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
dma-coherent
- display/ti,am65s-dss: Add DT properies assigned-clocks, assigned-clocks-parent and
dma-coherent
- etnaviv: Use GEM object functions
- exynos: Use GEM object functions
- fbdev: Cleanups and compiler fixes throughout framebuffer drivers
- fbdev/cirrusfb: Avoid division by 0
- gma500: Use GEM object functions; Fix double-free of connector; Cleanups
- hisilicon/hibmc: I2C-based DDC support; Use to_hibmc_drm_device(); Cleanups
- i915: Use GEM object functions
- imx/dcss: Init driver with DRM_GEM_CMA_DRIVER_OPS; Cleanups
- ingenic: Reset pixel clock when parent clock changes; support reserved
memory; Alloc F0 and F1 DMA channels at once; Support different pixel formats;
Revert support for cached mmap buffers
on F0/F1; support 30-bit/24-bit/8-bit-palette modes
- komeda: Use DEFINE_SHOW_ATTRIBUTE
- mcde: Detect platform_get_irq() errors
- mediatek: Use GEM object functions
- msm: Use GEM object functions
- nouveau: Cleanups; TTM-related changes; Use GEM object functions
- omapdrm: Use GEM object functions
- panel: Add driver and DT bindings for Novatak nt36672a; Add driver and DT
bindings for YTC700TLAG-05-201C; Add driver and DT bindings for TDO TL070WSH30;
Cleanups
- panel/mantix: Fix reset; Fix deref of NULL pointer in mantix_get_modes()
- panel/otm8009a: Allow non-continuous dsi clock; Cleanups
- panel/rm68200: Allow non-continuous dsi clock; Fix mode to 50 FPS
- panfrost: Fix job timeout handling; Cleanups
- pl111: Use GEM object functions
- qxl: Cleanups; TTM-related changes; Pin new BOs with ttm_bo_init_reserved()
- radeon: Cleanups; TTM-related changes; Use GEM object functions
- rockchip: Use GEM object functions
- shmobile: Cleanups
- tegra: Use GEM object functions
- tidss: Set drm_plane_helper_funcs.prepare_fb
- tilcdc: Don't keep vblank interrupt enabled all the time
- tve200: Detect platform_get_irq() errors
- vc4: Use GEM object functions; Only register components once DSI is attached;
Add Maxime as maintainer
- vgem: Use GEM object functions
- via: Simplify critical section in via_mem_alloc()
- virtgpu: Use GEM object functions
- virtio: Implement blob resources, host-visible and cross-device features;
Support mapping of host-allocated resources; Use UUID APi; Cleanups
- vkms: Use GEM object functions; Switch to SHMEM
- vmwgfx: TTM-related changes; Inline ttm_bo_swapout_all()
- xen: Use GEM object functions
- xlnx: Use GEM object functions
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201027100936.GA4858@linux-uq9g
|
|
General code indentation and alignment changes such as replace spaces
by tabs or align function arguments as per the coding style
guidelines. The patch corrects issues for various amdgpu_*.c files
for this driver. Issue reported by checkpatch script.
Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Bool initialisation should use 'true' and 'false' values instead of 0
and 1.
Modify amdgpu_amdkfd_gpuvm.c to initialise variable is_imported
to false instead of 0.
Issue found with Coccinelle.
Signed-off-by: Sumera Priyadarsini <sylphrenadin@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pull drm updates from Dave Airlie:
"Not a major amount of change, the i915 trees got split into display
and gt trees to better facilitate higher level review, and there's a
major refactoring of i915 GEM locking to use more core kernel concepts
(like ww-mutexes). msm gets per-process pagetables, older AMD SI cards
get DC support, nouveau got a bump in displayport support with common
code extraction from i915.
Outside of drm this contains a couple of patches for hexint
moduleparams which you've acked, and a virtio common code tree that
you should also get via it's regular path.
New driver:
- Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config"
* tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits)
drm/ingenic: Fix bad revert
drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init
drm/amdgpu: Remove warning for virtual_display
drm/amdgpu: kfd_initialized can be static
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
drm/amdgpu: prevent spurious warning
drm/amdgpu/swsmu: fix ARC build errors
drm/amd/display: Fix OPTC_DATA_FORMAT programming
drm/amd/display: Don't allow pstate if no support in blank
drm/panfrost: increase readl_relaxed_poll_timeout values
MAINTAINERS: Update entry for st7703 driver after the rename
Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
drm/amd/display: HDMI remote sink need mode validation for Linux
drm/amd/display: Change to correct unit on audio rate
drm/amd/display: Avoid set zero in the requested clk
drm/amdgpu: align frag_end to covered address space
drm/amdgpu: fix NULL pointer dereference for Renoir
drm/vmwgfx: fix regression in thp code due to ttm init refactor.
drm/amdgpu/swsmu: add interrupt work handler for smu11 parts
drm/amdgpu/swsmu: add interrupt work function
...
|
|
Make use of the new struct_size() helper instead of the offsetof() idiom.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Stop using TTM_PL_FLAG_NO_EVICT.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391617/?series=81973&rev=1
|
|
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".
No PASID type change in uapi although it defines PASID as __u64 in
some places.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.10-2020-09-03:
amdgpu:
- RAS fixes
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support in DC
- Enable plane rotation
- Rework pre-OS vram reservation handling during driver init
- Add standard interface to dump GPU metrics table from SMU
- Rework tiling and tmz state handling in atomic commits
- Pstate fixes
- Add voltage and power hwmon interfaces for renoir
- SW CTF fixes
- S/G display fix for Raven
- Print client strings for vmfaults for vega and newer
- Manual fan control fixes
- Display updates
- Reorg power management directory structure
- Misc bug fixes
- Misc code cleanups
amdkfd:
- Topology fixes
- Add SMI events for thermal throttling and GPU resets
radeon:
- switch from pci_* to dma_* for dma allocations
- PLL fix
Scheduler:
- Clean up priority levels
UAPI:
- amdgpu INFO IOCTL query update for TMZ state
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
- amdkfd SMI event interface updates
https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200903222921.4152-1-alexander.deucher@amd.com
|
|
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.
v2: Use a typed visible static inline function
instead of an invisible macro.
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Sam needs 5.9-rc1 to have dev_err_probe in to merge some patches.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
|
The whole approach wasn't thought through till the end.
We already had a reset lock like this in the past and it caused the same problems like this one.
Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.
This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Backmerging drm-next into drm-misc-next for nouveau and panel updates.
Resolves a conflict between ttm and nouveau, where struct ttm_mem_res got
renamed to struct ttm_resource.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Thomas Gleixner:
"A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in
various situations caused by the lockdep additions to seqcount to
validate that the write side critical sections are non-preemptible.
- The seqcount associated lock debug addons which were blocked by the
above fallout.
seqcount writers contrary to seqlock writers must be externally
serialized, which usually happens via locking - except for strict
per CPU seqcounts. As the lock is not part of the seqcount, lockdep
cannot validate that the lock is held.
This new debug mechanism adds the concept of associated locks.
sequence count has now lock type variants and corresponding
initializers which take a pointer to the associated lock used for
writer serialization. If lockdep is enabled the pointer is stored
and write_seqcount_begin() has a lockdep assertion to validate that
the lock is held.
Aside of the type and the initializer no other code changes are
required at the seqcount usage sites. The rest of the seqcount API
is unchanged and determines the type at compile time with the help
of _Generic which is possible now that the minimal GCC version has
been moved up.
Adding this lockdep coverage unearthed a handful of seqcount bugs
which have been addressed already independent of this.
While generally useful this comes with a Trojan Horse twist: On RT
kernels the write side critical section can become preemtible if
the writers are serialized by an associated lock, which leads to
the well known reader preempts writer livelock. RT prevents this by
storing the associated lock pointer independent of lockdep in the
seqcount and changing the reader side to block on the lock when a
reader detects that a writer is in the write side critical section.
- Conversion of seqcount usage sites to associated types and
initializers"
* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
locking/seqlock, headers: Untangle the spaghetti monster
locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
x86/headers: Remove APIC headers from <asm/smp.h>
seqcount: More consistent seqprop names
seqcount: Compress SEQCNT_LOCKNAME_ZERO()
seqlock: Fold seqcount_LOCKNAME_init() definition
seqlock: Fold seqcount_LOCKNAME_t definition
seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
hrtimer: Use sequence counter with associated raw spinlock
kvm/eventfd: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
vfs: Use sequence counter with associated spinlock
timekeeping: Use sequence counter with associated raw spinlock
xfrm: policy: Use sequence counters with associated lock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
netfilter: conntrack: Use sequence counter with associated spinlock
sched: tasks: Use sequence counter with associated spinlock
...
|
|
We need to allocate that manually now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Link: https://patchwork.freedesktop.org/patch/384330/
|
|
If multiple process share system memory through /dev/shm, KFD allocate
memory should not fail if it reaches the system memory limit because
one copy of physical system memory are shared by multiple process.
Add module parameter no_system_mem_limit to provide user option to
disable system memory limit check at runtime using sysfs or during
driver module init using kernel boot argument. By default the system
memory limit is on.
Print out debug message to warn user if KFD allocate memory failed
because system memory reaches limit.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.
The dma-buf reservation subsystem uses plain sequence counters to manage
updates to reservations. Writer serialization is accomplished through a
wound/wait mutex.
Acquiring a wound/wait mutex does not disable preemption, so this needs
to be done manually before and after the write side critical section.
Use the newly-added seqcount_ww_mutex_t instead:
- It associates the ww_mutex with the sequence count, which enables
lockdep to validate that the write side critical section is properly
serialized.
- It removes the need to explicitly add preempt_disable/enable()
around the write side critical section because the write_begin/end()
functions for this new data type automatically do this.
If lockdep is disabled this ww_mutex lock association is compiled out
and has neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://lkml.kernel.org/r/20200720155530.1173732-13-a.darwish@linutronix.de
|
|
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.
During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.
v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.
v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;
[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249] dump_stack+0x98/0xd5
[ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833] free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831] ksys_ioctl+0x98/0xb0
[ 1230.204004] __x64_sys_ioctl+0x1a/0x20
[ 1230.205174] do_syscall_64+0x5f/0x250
[ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe
2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.
v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset
v5:
1. Fix some style issues.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tukov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.9-2020-07-01:
amdgpu:
- DC DMUB updates
- HDCP fixes
- Thermal interrupt fixes
- Add initial support for Sienna Cichlid GPU
- Add support for unique id on Arcturus
- Major swSMU code cleanup
- Skip BAR resizing if the bios already did id
- Fixes for DCN bandwidth calculations
- Runtime PM reference count fixes
- Add initial UVD support for SI
- Add support for ASSR on eDP links
- Lots of misc fixes and cleanups
- Enable runtime PM on vega10 boards that support BACO
- RAS fixes
- SR-IOV fixes
- Use IP discovery table on renoir
- DC stream synchronization fixes
amdkfd:
- Track SDMA usage per process
- Fix GCC10 compiler warnings
- Locking fix
radeon:
- Default to on chip GART for AGP boards on all arches
- Runtime PM reference count fixes
UAPI:
- Update comments to clarify MTYPE
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701155041.1102829-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
According to Marek a pipeline sync should be inserted for implicit syncs well.
v2: bump the driver version
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.9:
UAPI Changes:
- Add DRM_MODE_TYPE_USERDEF for video modes specified in cmdline.
Cross-subsystem Changes:
- Assorted devicetree binding updates.
- Add might_sleep() to dma_fence_wait().
- Fix fbdev's get_user_pages_fast() handling, and use pin_user_pages.
- Small cleanup with IS_BUILTIN in video/fbdev drivers.
- Fix video/hdmi coding style for infoframe size.
Core Changes:
- Silence vblank output during init.
- Fix DP-MST corruption during send msg timeout.
- Clear leak in drm_gem_objecs_lookup().
- Make newlines work with force connector attribute.
- Fix module refcounting error in drm_encoder_slave, and use new i2c api.
- Header fix for drm_managed.c
- More struct_mutex removal for !legacy drivers:
- Remove gem_free_object()
- Removal of drm_gem_object_put_unlocked().
- Show current->comm alongside pid in debug printfs.
- Add drm_client_modeset_check() + drm_client_framebuffer_flush().
- Replace drm_fb_swab16 with drm_fb_swap that also supports 32-bits.
- Remove mode->vrefresh, and compactify drm_display_mode.
- Use drm_* macros for logging and warnings.
- Add WARN when drm_gem_get_pages is used on a private obj.
- Handle importing and imported dmabuf better in shmem helpers.
- Small fix for drm/mm hole size comparison, and remove invalid entry optimization.
- Add a drm/mm selftest.
- Set DSI connector type for DSI panels.
- Assorted small fixes and documentation updates.
- Fix DDI I2C device registration for MST ports, and flushing on destroy.
- Fix master_set return type, used by vmwgfx.
- Make the drm_set/drop_master ioctl symmetrical.
Driver Changes:
Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4, i915, omap, fbdev/sm712fb, fbdev/pxafb, console/newport_con, msm, virtio, udl, malidp, hdlcd, bridge/ti-sn65dsi86, panfrost.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4 (multiple), i915.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE TX26D202VM0BWA panel.
- Use GEM CMA functions in arc, arm, atmel-hlcdc, fsi-dcu, hisilicon, imx, ingenic, komeda, malidp, mcde, meson, msxfb, rcar-du, shmobile, stm, sti, tilcdc, tve200, zte.
- Remove gem_print_info.
- Improve gem_create_object_helper so udl can use shmem helpers.
- Convert vc4 dt bindings to schemas, and add clock properties.
- Device initialization cleanups for mgag200.
- Add a workaround to fix DP-MST short pulses handling on broken hardware in i915.
- Allow build test compiling arm drivers.
- Use managed pci functions in mgag200 and ast.
- Use dev_groups in malidp.
- Add per pixel alpha support for PX30 VOP in rockchip.
- Silence deferred probe logs in panfrost.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/001cd9a6-405d-4e29-43d8-354f53ae4e8b@linux.intel.com
|
|
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.
The change is generated using coccinelle with the following rule:
// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .
@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)
Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In free memory of gpu path, remove bo from validate_list to make sure
restore worker don't access the BO any more, then unregister bo MMU
interval notifier. Otherwise, the restore worker will crash in the
middle of validating BO user pages if MMU interval notifer is gone.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Releasing the AMDGPU BO ref directly leads to problems when BOs were
exported as DMA bufs. Releasing the GEM reference makes sure that the
AMDGPU/TTM BO is not freed too early.
Also take a GEM reference when importing BOs from DMABufs to keep
references to imported BOs balances properly.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Track GPU VRAM usage on a per process basis and report it through
sysfs.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reduce the mem->lock`s protected code area, no need to protect pr_debug.
This also simplifies error handling.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make the code a bit more readable by using a common
error handling pattern.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Christian König <christian.koenig@amd.com>.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Let format prefixes take care of printing the module name
through pr_fmt and dev_fmt definitions.
Signed-off-by: Aurabindo Pillai <mail@aurabindo.in>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
ALLOC_MEM_FLAGS_* used are the same as the KFD_IOC_ALLOC_MEM_FLAGS_*,
but they are interweavedly used in kernel driver, resulting in bad
readability. For example, KFD_IOC_ALLOC_MEM_FLAGS_COHERENT is not
referenced in kernel, and it functions implicitly in kernel through
ALLOC_MEM_FLAGS_COHERENT, causing unnecessary confusion.
Replace all occurrences of ALLOC_MEM_FLAGS_* with
KFD_IOC_ALLOC_MEM_FLAGS_* to solve the problem.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Given we can query all the asic specific information from amdgpu_gfx_config,
we can make get_tile_config() generic.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need to trigger eviction as the memory mapping will not be used
anymore.
All pt/pd bos share same resv, hence the same shared eviction fence.
Everytime page table is freed, the fence will be signled and that cuases
kfd unexcepted evictions.
v2: squash in 32 bit fix
CC: Christian König <christian.koenig@amd.com>
CC: Felix Kuehling <felix.kuehling@amd.com>
CC: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For unlocked page table updates we need to be able
to sync to fences of a specific VM.
v2: use SYNC_ALWAYS in the UVD code
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
bo_va_list is list_head, so initialize it.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.6-2019-12-11:
amdgpu:
- Add MST atomic routines
- Add support for DMCUB (new helper microengine for displays)
- Add OEM i2c support in DC
- Use vstartup for vblank events on DCN
- Simplify Kconfig for DC
- Renoir fixes for DC
- Clean up function pointers in DC
- Initial support for HDCP 2.x
- Misc code cleanups
- GFX10 fixes
- Rework JPEG engine handling for VCN
- Add clock and power gating support for JPEG
- BACO support for Arcturus
- Cleanup PSP ring handling
- Add framework for using BACO with runtime pm to save power
- Move core pci state handling out of the driver for pm ops
- Allow guest power control in 1 VF case with SR-IOV
- SR-IOV fixes
- RAS fixes
- Support for power metrics on renoir
- Golden settings updates for gfx10
- Enable gfxoff on supported navi10 skus
- Update MAINTAINERS
amdkfd:
- Clean up generational gfx code
- Fixes for gfx10
- DIQ fixes
- Share more code with amdgpu
radeon:
- PPC DMA fix
- Register checker fixes for r1xx/r2xx
- Misc cleanups
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211223020.7510-1-alexander.deucher@amd.com
|
|
Allows us to reduce the overhead while syncing to fences a bit.
v2: also drop adev parameter from the functions
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Pull more drm updates from Dave Airlie:
"Rob pointed out I missed his pull request for msm-next, it's been in
next for a while outside of my tree so shouldn't cause any unexpected
issues, it has some OCMEM support in drivers/soc that is acked by
other maintainers as it's outside my tree.
Otherwise it's a usual fixes pull, i915, amdgpu, the main ones, with
some tegra, omap, mgag200 and one core fix.
Summary:
msm-next:
- OCMEM support for a3xx and a4xx GPUs.
- a510 support + display support
core:
- mst payload deletion fix
i915:
- uapi alignment fix
- fix for power usage regression due to security fixes
- change default preemption timeout to 640ms from 100ms
- EHL voltage level display fixes
- TGL DGL PHY fix
- gvt - MI_ATOMIC cmd parser fix, CFL non-priv warning
- CI spotted deadlock fix
- EHL port D programming fix
amdgpu:
- VRAM lost fixes on BACO for CI/VI
- navi14 DC fixes
- misc SR-IOV, gfx10 fixes
- XGMI fixes for arcturus
- SRIOV fixes
amdkfd:
- KFD on ppc64le enabled
- page table optimisations
radeon:
- fix for r1xx/2xx register checker.
tegra:
- displayport regression fixes
- DMA API regression fixes
mgag200:
- fix devices that can't scanout except at 0 addr
omap:
- fix dma_addr refcounting"
* tag 'drm-next-2019-12-06' of git://anongit.freedesktop.org/drm/drm: (100 commits)
drm/dp_mst: Correct the bug in drm_dp_update_payload_part1()
drm/omap: fix dma_addr refcounting
drm/tegra: Run hub cleanup on ->remove()
drm/tegra: sor: Make the +5V HDMI supply optional
drm/tegra: Silence expected errors on IOMMU attach
drm/tegra: vic: Export module device table
drm/tegra: sor: Implement system suspend/resume
drm/tegra: Use proper IOVA address for cursor image
drm/tegra: gem: Remove premature import restrictions
drm/tegra: gem: Properly pin imported buffers
drm/tegra: hub: Remove bogus connection mutex check
ia64: agp: Replace empty define with do while
agp: Add bridge parameter documentation
agp: remove unused variable num_segments
agp: move AGPGART_MINOR to include/linux/miscdevice.h
agp: remove unused variable size in agp_generic_create_gatt_table
drm/dp_mst: Fix build on systems with STACKTRACE_SUPPORT=n
drm/radeon: fix r1xx/r2xx register checker for POT textures
drm/amdgpu: fix GFX10 missing CSIB set(v3)
drm/amdgpu: should stop GFX ring in hw_fini
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull hmm updates from Jason Gunthorpe:
"This is another round of bug fixing and cleanup. This time the focus
is on the driver pattern to use mmu notifiers to monitor a VA range.
This code is lifted out of many drivers and hmm_mirror directly into
the mmu_notifier core and written using the best ideas from all the
driver implementations.
This removes many bugs from the drivers and has a very pleasing
diffstat. More drivers can still be converted, but that is for another
cycle.
- A shared branch with RDMA reworking the RDMA ODP implementation
- New mmu_interval_notifier API. This is focused on the use case of
monitoring a VA and simplifies the process for drivers
- A common seq-count locking scheme built into the
mmu_interval_notifier API usable by drivers that call
get_user_pages() or hmm_range_fault() with the VA range
- Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen
GntDev drivers to the new API. This deletes a lot of wonky driver
code.
- Two improvements for hmm_range_fault(), from testing done by Ralph"
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap
mm/hmm: make full use of walk_page_range()
xen/gntdev: use mmu_interval_notifier_insert
mm/hmm: remove hmm_mirror and related
drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
drm/amdgpu: Call find_vma under mmap_sem
nouveau: use mmu_interval_notifier instead of hmm_mirror
nouveau: use mmu_notifier directly for invalidate_range_start
drm/radeon: use mmu_interval_notifier_insert
RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
RDMA/odp: Use mmu_interval_notifier_insert()
mm/hmm: define the pre-processor related parts of hmm.h even if disabled
mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
mm/mmu_notifier: add an interval tree notifier
mm/mmu_notifier: define the header pre-processor parts even if disabled
mm/hmm: allow snapshot of the special zero page
|
|
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.
Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU: 1GB
New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Allow KFD applications to use more unpinned system memory through
HMM.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.
Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU: 1GB
New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Convert the collision-retry lock around hmm_range_fault to use the one now
provided by the mmu_interval notifier.
Although this driver does not seem to use the collision retry lock that
hmm provides correctly, it can still be converted over to use the
mmu_interval_notifier api instead of hmm_mirror without too much trouble.
This also deletes another place where a driver is associating additional
data (struct amdgpu_mn) with a mmu_struct.
Link: https://lore.kernel.org/r/20191112202231.3856-13-jgg@ziepe.ca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.
For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.
Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.
Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|