diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-01-10 12:58:46 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-01-10 12:58:46 -0800 |
commit | 8d0749b4f83bf4768ceae45ee6a79e6e7eddfc2a (patch) | |
tree | 069cc92e93982e0b921c09e71df6f7b68b4cbfa2 /drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | |
parent | bf4eebf8cfa2cd50e20b7321dfb3effdcdc6e909 (diff) | |
parent | cb6846fbb83b574c85c2a80211b402a6347b60b1 (diff) |
Merge tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights are support for privacy screens found in new laptops, a
bunch of nomodeset refactoring, and i915 enables ADL-P systems by
default, while starting to add RPL-S support.
vmwgfx adds GEM and support for OpenGL 4.3 features in userspace.
Lots of internal refactorings around dma reservations, and lots of
driver refactoring as well.
Summary:
core:
- add privacy screen support
- move nomodeset option into drm subsystem
- clean up nomodeset handling in drivers
- make drm_irq.c legacy
- fix stack_depot name conflicts
- remove DMA_BUF_SET_NAME ioctl restrictions
- sysfs: send hotplug event
- replace several DRM_* logging macros with drm_*
- move hashtable to legacy code
- add error return from gem_create_object
- cma-helper: improve interfaces, drop CONFIG_DRM_KMS_CMA_HELPER
- kernel.h related include cleanups
- support XRGB2101010 source buffers
ttm:
- don't include drm hashtable
- stop pruning fences after wait
- documentation updates
dma-buf:
- add dma_resv selftest
- add debugfs helpers
- remove dma_resv_get_excl_unlocked
- documentation
- make fences mandatory in dma_resv_add_excl_fence
dp:
- add link training delay helpers
gem:
- link shmem/cma helpers into separate modules
- use dma_resv iteratior
- import dma-buf namespace into gem helper modules
scheduler:
- fence grab fix
- lockdep fixes
bridge:
- switch to managed MIPI DSI helpers
- register and attach during probe fixes
- convert to YAML in several places.
panel:
- add bunch of new panesl
simpledrm:
- support FB_DAMAGE_CLIPS
- support virtual screen sizes
- add Apple M1 support
amdgpu:
- enable seamless boot for DCN 3.01
- runtime PM fixes
- use drm_kms_helper_connector_hotplug_event
- get all fences at once
- use generic drm fb helpers
- PSR/DPCD/LTTPR/DSC/PM/RAS/OLED/SRIOV fixes
- add smart trace buffer (STB) for supported GPUs
- display debugfs entries
- new SMU debug option
- Documentation update
amdkfd:
- IP discovery enumeration refactor
- interface between driver fixes
- SVM fixes
- kfd uapi header to define some sysfs bitfields.
i915:
- support VESA panel backlights
- enable ADL-P by default
- add eDP privacy screen support
- add Raptor Lake S (RPL-S) support
- DG2 page table support
- lots of GuC/HuC fw refactoring
- refactored i915->gt interfaces
- CD clock squashing support
- enable 10-bit gamma support
- update ADL-P DMC fw to v2.14
- enable runtime PM autosuspend by default
- ADL-P DSI support
- per-lane DP drive settings for ICL+
- add support for pipe C/D DMC firmware
- Atomic gamma LUT updates
- remove CCS FB stride restrictions on ADL-P
- VRR platform support for display 11
- add support for display audio codec keepalive
- lots of display refactoring
- fix runtime PM handling during PXP suspend
- improved eviction performance with async TTM moves
- async VMA unbinding improvements
- VMA locking refactoring
- improved error capture robustness
- use per device iommu checks
- drop bits stealing from i915_sw_fence function ptr
- remove dma_resv_prune
- add IC cache invalidation on DG2
nouveau:
- crc fixes
- validate LUTs in atomic check
- set HDMI AVI RGB quant to full
tegra:
- buffer objects reworks for dma-buf compat
- NVDEC driver uAPI support
- power management improvements
etnaviv:
- IOMMU enabled system support
- fix > 4GB command buffer mapping
- close a DoS vector
- fix spurious GPU resets
ast:
- fix i2c initialization
rcar-du:
- DSI output support
exynos:
- replace legacy gpio interface
- implement generic GEM object mmap
msm:
- dpu plane state cleanup in prep for multirect
- dpu debugfs cleanups
- dp support for sc7280
- a506 support
- removal of struct_mutex
- remove old eDP sub-driver
anx7625:
- support MIPI DSI input
- support HDMI audio
- fix reading EDID
lvds:
- fix bridge DT bindings
megachips:
- probe both bridges before registering
dw-hdmi:
- allow interlace on bridge
ps8640:
- enable runtime PM
- support aux-bus
tx358768:
- enable reference clock
- add pulse mode support
ti-sn65dsi86:
- use regmap bulk write
- add PWM support
etnaviv:
- get all fences at once
gma500:
- gem object cleanups
kmb:
- enable fb console
radeon:
- use dma_resv_wait_timeout
rockchip:
- add DSP hold timeout
- suspend/resume fixes
- PLL clock fixes
- implement mmap in GEM object functions
- use generic fbdev emulation
sun4i:
- use CMA helpers without vmap support
vc4:
- fix HDMI-CEC hang with display is off
- power on HDMI controller while disabling
- support 4K@60Hz modes
- support 10-bit YUV 4:2:0 output
vmwgfx:
- fix leak on probe errors
- fail probing on broken hosts
- new placement for MOB page tables
- hide internal BOs from userspace
- implement GEM support
- implement GL 4.3 support
virtio:
- overflow fixes
xen:
- implement mmap as GEM object function
omapdrm:
- fix scatterlist export
- support virtual planes
mediatek:
- MT8192 support
- CMDQ refinement"
* tag 'drm-next-2022-01-07' of git://anongit.freedesktop.org/drm/drm: (1241 commits)
drm/amdgpu: no DC support for headless chips
drm/amd/display: fix dereference before NULL check
drm/amdgpu: always reset the asic in suspend (v2)
drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
drm/amd/display: Fix the uninitialized variable in enable_stream_features()
drm/amdgpu: fix runpm documentation
amdgpu/pm: Make sysfs pm attributes as read-only for VFs
drm/amdgpu: save error count in RAS poison handler
drm/amdgpu: drop redundant semicolon
drm/amd/display: get and restore link res map
drm/amd/display: support dynamic HPO DP link encoder allocation
drm/amd/display: access hpo dp link encoder only through link resource
drm/amd/display: populate link res in both detection and validation
drm/amd/display: define link res and make it accessible to all link interfaces
drm/amd/display: 3.2.167
drm/amd/display: [FW Promotion] Release 0.0.98
drm/amd/display: Undo ODM combine
drm/amd/display: Add reg defs for DCN303
drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
drm/amd/display: Set optimize_pwr_state for DCN31
...
Diffstat (limited to 'drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c')
-rw-r--r-- | drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 195 |
1 files changed, 166 insertions, 29 deletions
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index cb0bf6ffd0e3..3a5b247be738 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -29,6 +29,7 @@ #include "i915_gem_ioctls.h" #include "i915_trace.h" #include "i915_user_extensions.h" +#include "i915_vma_snapshot.h" struct eb_vma { struct i915_vma *vma; @@ -307,11 +308,15 @@ struct i915_execbuffer { struct eb_fence *fences; unsigned long num_fences; +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) + struct i915_capture_list *capture_lists[MAX_ENGINE_INSTANCE + 1]; +#endif }; static int eb_parse(struct i915_execbuffer *eb); static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle); static void eb_unpin_engine(struct i915_execbuffer *eb); +static void eb_capture_release(struct i915_execbuffer *eb); static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) { @@ -990,7 +995,7 @@ static int eb_validate_vmas(struct i915_execbuffer *eb) } if (!(ev->flags & EXEC_OBJECT_WRITE)) { - err = dma_resv_reserve_shared(vma->resv, 1); + err = dma_resv_reserve_shared(vma->obj->base.resv, 1); if (err) return err; } @@ -1043,6 +1048,7 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final) i915_vma_put(vma); } + eb_capture_release(eb); eb_unpin_engine(eb); } @@ -1092,6 +1098,47 @@ static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache) return &i915->ggtt; } +static void reloc_cache_unmap(struct reloc_cache *cache) +{ + void *vaddr; + + if (!cache->vaddr) + return; + + vaddr = unmask_page(cache->vaddr); + if (cache->vaddr & KMAP) + kunmap_atomic(vaddr); + else + io_mapping_unmap_atomic((void __iomem *)vaddr); +} + +static void reloc_cache_remap(struct reloc_cache *cache, + struct drm_i915_gem_object *obj) +{ + void *vaddr; + + if (!cache->vaddr) + return; + + if (cache->vaddr & KMAP) { + struct page *page = i915_gem_object_get_page(obj, cache->page); + + vaddr = kmap_atomic(page); + cache->vaddr = unmask_flags(cache->vaddr) | + (unsigned long)vaddr; + } else { + struct i915_ggtt *ggtt = cache_to_ggtt(cache); + unsigned long offset; + + offset = cache->node.start; + if (!drm_mm_node_allocated(&cache->node)) + offset += cache->page << PAGE_SHIFT; + + cache->vaddr = (unsigned long) + io_mapping_map_atomic_wc(&ggtt->iomap, offset); + } +} + static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer *eb) { void *vaddr; @@ -1356,10 +1403,17 @@ eb_relocate_entry(struct i915_execbuffer *eb, * batchbuffers. */ if (reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION && - GRAPHICS_VER(eb->i915) == 6) { + GRAPHICS_VER(eb->i915) == 6 && + !i915_vma_is_bound(target->vma, I915_VMA_GLOBAL_BIND)) { + struct i915_vma *vma = target->vma; + + reloc_cache_unmap(&eb->reloc_cache); + mutex_lock(&vma->vm->mutex); err = i915_vma_bind(target->vma, target->vma->obj->cache_level, PIN_GLOBAL, NULL); + mutex_unlock(&vma->vm->mutex); + reloc_cache_remap(&eb->reloc_cache, ev->vma->obj); if (err) return err; } @@ -1880,36 +1934,113 @@ eb_find_first_request_added(struct i915_execbuffer *eb) return NULL; } -static int eb_move_to_gpu(struct i915_execbuffer *eb) +#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) + +/* Stage with GFP_KERNEL allocations before we enter the signaling critical path */ +static void eb_capture_stage(struct i915_execbuffer *eb) { const unsigned int count = eb->buffer_count; - unsigned int i = count; - int err = 0, j; + unsigned int i = count, j; + struct i915_vma_snapshot *vsnap; while (i--) { struct eb_vma *ev = &eb->vma[i]; struct i915_vma *vma = ev->vma; unsigned int flags = ev->flags; - struct drm_i915_gem_object *obj = vma->obj; - assert_vma_held(vma); + if (!(flags & EXEC_OBJECT_CAPTURE)) + continue; - if (flags & EXEC_OBJECT_CAPTURE) { + vsnap = i915_vma_snapshot_alloc(GFP_KERNEL); + if (!vsnap) + continue; + + i915_vma_snapshot_init(vsnap, vma, "user"); + for_each_batch_create_order(eb, j) { struct i915_capture_list *capture; - for_each_batch_create_order(eb, j) { - if (!eb->requests[j]) - break; + capture = kmalloc(sizeof(*capture), GFP_KERNEL); + if (!capture) + continue; - capture = kmalloc(sizeof(*capture), GFP_KERNEL); - if (capture) { - capture->next = - eb->requests[j]->capture_list; - capture->vma = vma; - eb->requests[j]->capture_list = capture; - } - } + capture->next = eb->capture_lists[j]; + capture->vma_snapshot = i915_vma_snapshot_get(vsnap); + eb->capture_lists[j] = capture; + } + i915_vma_snapshot_put(vsnap); + } +} + +/* Commit once we're in the critical path */ +static void eb_capture_commit(struct i915_execbuffer *eb) +{ + unsigned int j; + + for_each_batch_create_order(eb, j) { + struct i915_request *rq = eb->requests[j]; + + if (!rq) + break; + + rq->capture_list = eb->capture_lists[j]; + eb->capture_lists[j] = NULL; + } +} + +/* + * Release anything that didn't get committed due to errors. + * The capture_list will otherwise be freed at request retire. + */ +static void eb_capture_release(struct i915_execbuffer *eb) +{ + unsigned int j; + + for_each_batch_create_order(eb, j) { + if (eb->capture_lists[j]) { + i915_request_free_capture_list(eb->capture_lists[j]); + eb->capture_lists[j] = NULL; } + } +} + +static void eb_capture_list_clear(struct i915_execbuffer *eb) +{ + memset(eb->capture_lists, 0, sizeof(eb->capture_lists)); +} + +#else + +static void eb_capture_stage(struct i915_execbuffer *eb) +{ +} + +static void eb_capture_commit(struct i915_execbuffer *eb) +{ +} + +static void eb_capture_release(struct i915_execbuffer *eb) +{ +} + +static void eb_capture_list_clear(struct i915_execbuffer *eb) +{ +} + +#endif + +static int eb_move_to_gpu(struct i915_execbuffer *eb) +{ + const unsigned int count = eb->buffer_count; + unsigned int i = count; + int err = 0, j; + + while (i--) { + struct eb_vma *ev = &eb->vma[i]; + struct i915_vma *vma = ev->vma; + unsigned int flags = ev->flags; + struct drm_i915_gem_object *obj = vma->obj; + + assert_vma_held(vma); /* * If the GPU is not _reading_ through the CPU cache, we need @@ -1990,6 +2121,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) /* Unconditionally flush any chipset caches (for streaming writes). */ intel_gt_chipset_flush(eb->gt); + eb_capture_commit(eb); + return 0; err_skip: @@ -2164,7 +2297,7 @@ static int eb_parse(struct i915_execbuffer *eb) goto err_trampoline; } - err = dma_resv_reserve_shared(shadow->resv, 1); + err = dma_resv_reserve_shared(shadow->obj->base.resv, 1); if (err) goto err_trampoline; @@ -2276,9 +2409,9 @@ static int eb_submit(struct i915_execbuffer *eb) return err; } -static int num_vcs_engines(const struct drm_i915_private *i915) +static int num_vcs_engines(struct drm_i915_private *i915) { - return hweight_long(VDBOX_MASK(&i915->gt)); + return hweight_long(VDBOX_MASK(to_gt(i915))); } /* @@ -3114,7 +3247,7 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence, /* Allocate a request for this batch buffer nice and early. */ eb->requests[i] = i915_request_create(eb_find_context(eb, i)); if (IS_ERR(eb->requests[i])) { - out_fence = ERR_PTR(PTR_ERR(eb->requests[i])); + out_fence = ERR_CAST(eb->requests[i]); eb->requests[i] = NULL; return out_fence; } @@ -3132,13 +3265,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence, } /* - * Whilst this request exists, batch_obj will be on the - * active_list, and so will hold the active reference. Only when - * this request is retired will the batch_obj be moved onto - * the inactive_list and lose its active reference. Hence we do - * not need to explicitly hold another reference here. + * Not really on stack, but we don't want to call + * kfree on the batch_snapshot when we put it, so use the + * _onstack interface. */ - eb->requests[i]->batch = eb->batches[i]->vma; + if (eb->batches[i]->vma) + i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot, + eb->batches[i]->vma, + "batch"); if (eb->batch_pool) { GEM_BUG_ON(intel_context_is_parallel(eb->context)); intel_gt_buffer_pool_mark_active(eb->batch_pool, @@ -3187,6 +3321,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, eb.fences = NULL; eb.num_fences = 0; + eb_capture_list_clear(&eb); + memset(eb.requests, 0, sizeof(struct i915_request *) * ARRAY_SIZE(eb.requests)); eb.composite_fence = NULL; @@ -3273,6 +3409,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, } ww_acquire_done(&eb.ww.ctx); + eb_capture_stage(&eb); out_fence = eb_requests_create(&eb, in_fence, out_fence_fd); if (IS_ERR(out_fence)) { |