summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
AgeCommit message (Collapse)Author
2021-01-08drm/i915/gt: Disable arbitration on no-preempt requestsChris Wilson
If a request is submitted and known to require no preemption, disable arbitration around the batch which prevents the HW from handling a preemption request during the payload. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-7-chris@chris-wilson.co.uk
2021-01-03drm/i915: fix shift warningArnd Bergmann
Randconfig builds on 32-bit machines show lots of warnings for the i915 driver for passing a 32-bit value into __const_hweight64(): drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2584:9: error: shift count >= width of type [-Werror,-Wshift-count-overflow] return hweight64(VDBOX_MASK(&i915->gt)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64' #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w)) Change it to hweight_long() to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210103135158.3591442-1-arnd@kernel.org
2020-12-24drm/i915: clear the gpu reloc batchMatthew Auld
The reloc batch is short lived but can exist in the user visible ppGTT, and since it's backed by an internal object, which lacks page clearing, we should take care to clear it upfront. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com Cc: stable@vger.kernel.org
2020-12-16drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.hChris Wilson
Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control and friends to gen8_engine_cs.h Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
2020-12-16drm/i915: Fix mismatch between misplaced vma check and vma insertChris Wilson
When inserting a VMA, we restrict the placement to the low 4G unless the caller opts into using the full range. This was done to allow usersapce the opportunity to transition slowly from a 32b address space, and to avoid breaking inherent 32b assumptions of some commands. However, for insert we limited ourselves to 4G-4K, but on verification we allowed the full 4G. This causes some attempts to bind a new buffer to sporadically fail with -ENOSPC, but at other times be bound successfully. commit 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page") suggests that there is a genuine problem with stateless addressing that cannot utilize the last page in 4G and so we purposefully excluded it. This means that the quick pin pass may cause us to utilize a buggy placement. Reported-by: CQ Tang <cq.tang@intel.com> Testcase: igt/gem_exec_params/larger-than-life-batch Fixes: 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: CQ Tang <cq.tang@intel.com> Reviewed-by: CQ Tang <cq.tang@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v4.5+ Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk
2020-12-08drm/i915/gem: Drop false !i915_vma_is_closed assertionChris Wilson
Closed vma are protected by the GT wakeref held as we lookup the vma, so we know that the vma will not be freed as we process it for the execbuf. Instead we expect to catch the closed status of the context, and simply allow the close-race on an individual vma to be washed away. Longer term, the GT wakeref protection will be removed by explicit vma.kref tracking. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk
2020-12-04drm/i915/gem: Propagate error from cancelled submit due to context closureChris Wilson
In the course of discovering and closing many races with context closure and execbuf submission, since commit 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup") we started checking that the context was not closed by another userspace thread during the execbuf ioctl. In doing so we cancelled the inflight request (by telling it to be skipped), but kept reporting success since we do submit a request, albeit one that doesn't execute. As the error is known before we return from the ioctl, we can report the error we detect immediately, rather than leave it on the fence status. With the immediate propagation of the error, it is easier for userspace to handle. Fixes: 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup") Testcase: igt/gem_ctx_exec/basic-close-race Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
2020-10-19drm/i915/gem: Support parsing of oversize batchesChris Wilson
Matthew Auld noted that on more recent systems (such as the parser for gen9) we may have objects that are larger than expected by the GEM uAPI (i.e. greater than u32). These objects would have incorrect implicit batch lengths, causing the parser to reject them for being incomplete, or worse. Based on a patch by Matthew Auld. Reported-by: Matthew Auld <matthew.auld@intel.com> Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches") Testcase: igt/gem_exec_params/larger-than-life-batch Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk (cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Avoid mixing integer types during batch copiesChris Wilson
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-10drm/i915: Fix slightly botched merge in __reloc_entry_gpuMaarten Lankhorst
This function should be an int, not a bool. Presumably because we had the same 2 reverts in a slightly different way, git got confused. Thanks to Dan for reporting. :) The conflict is between the 3 reverts in drm-fixes: 4993a8a37808 ("Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"") ad5d95e4d538 ("Revert "drm/i915/gem: Async GPU relocations only"") 20561da3a2e1 ("Revert "drm/i915/gem: Delete unused code"") And the slightly different combined revert in drm-intel-gt-next, but with the same goal: 102a0a9051f4 ("Revert "drm/i915/gem: Async GPU relocations only"") In the merge commit 1f4b2aca794f ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") things went wrong, but the merge commit view now doesn't show any conflict anymore (as git tends to do when the resolution picks one or the other branch). The need to handle other than just true/false error codes in __reloc_entry_gpu was added in the dma_resv locking changes in c43ce12328df ("drm/i915: Use per object locking in execbuf, v12.") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dave Airlie <airlied@redhat.com> [danvet: Explain this entire saga a lot better, adding tons of commit references. Also note that this was merged before full intel-gfx-CI results, only after BAT, since the breakage at the BAT run is already severe enough to block all pre-merge testing.] Fixes: 1f4b2aca794f ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200910111225.2184193-1-maarten.lankhorst@linux.intel.com
2020-09-09Merge tag 'drm-intel-gt-next-2020-09-07' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next (Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added) UAPI Changes: (- Potential implicit changes from WW locking refactoring) Cross-subsystem Changes: (- WW locking changes should align the i915 locking more with others) Driver Changes: - MAJOR: Apply WW locking across the driver (Maarten) - Reverts for 5 commits to make applying WW locking faster (Maarten) - Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris) - Add missing dma_fence_put() for error case of syncobj timeline (Chris) - Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten) - Pin engine before pinning all objects (Maarten) - Rework intel_context pinning to do everything outside of pin_mutex (Maarten) - Avoid tracking GEM context until registered (Cc: stable, Chris) - Provide a fastpath for waiting on vma bindings (Chris) - Fixes to preempt-to-busy mechanism (Chris) - Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris) - Switch to object allocations for page directories (Chris) - Hold context/request reference while breadcrumbs are active (Chris) - Make sure execbuffer always passes ww state to i915_vma_pin (Maarten) - Code refactoring to facilitate use of WW locking (Maarten) - Locking refactoring to use more granular locking (Maarten, Chris) - Support for multiple pinned timelines per engine (Chris) - Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris) - Make active tracking/vma page-directory stash work preallocated (Chris) - Avoid flushing submission tasklet too often (Chris) - Reduce context termination list iteration guard to RCU (Chris) - Reductions to locking contention (Chris) - Fixes for issues found by CI (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com
2020-09-09Backmerge drm-fixes merge into drm-nextDave Airlie
Commit '6f6a73c8b715d595977774d48450a734297ab21f' from Linus' tree The fixes reverts cause a bit of a conflict pain with intel next, start fixing it up here. Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08Revert "drm/i915/gem: Delete unused code"Dave Airlie
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 7ac2d2536dfa71c275a74813345779b1e7522c91. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08Revert "drm/i915/gem: Async GPU relocations only"Dave Airlie
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 9e0f9464e2ab36b864359a59b0e9058fdef0ce47. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-07drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.Maarten Lankhorst
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Pin engine before pinning all objects, v5.Maarten Lankhorst
We want to lock all gem objects, including the engine context objects, rework the throttling to ensure that we can do this. Now we only throttle once, but can take eb_pin_engine while acquiring objects. This means we will have to drop the lock to wait. If we don't have to throttle we can still take the fastpath, if not we will take the slowpath and wait for the throttle request while unlocked. The engine has to be pinned as first step, otherwise gpu relocations won't work. Changes since v1: - Only need to get a throttled request in the fastpath, no need for a global flag any more. - Always free the waited request correctly. Changes since v2: - Use intel_engine_pm_get()/put() to keeep engine pool alive during EDEADLK handling. Changes since v3: - Fix small rq leak. Changes since v4: - Use a single reloc_context, for intel_context_pin_ww(). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-13-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Nuke arguments to eb_pin_engineMaarten Lankhorst
Those arguments are already set as eb.file and eb.args, so kill off the extra arguments. This will allow us to move eb_pin_engine() to after we reserved all BO's. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-12-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Use per object locking in execbuf, v12.Maarten Lankhorst
Now that we changed execbuf submission slightly to allow us to do all pinning in one place, we can now simply add ww versions on top of struct_mutex. All we have to do is a separate path for -EDEADLK handling, which needs to unpin all gem bo's before dropping the lock, then starting over. This finally allows us to do parallel submission, but because not all of the pinning code uses the ww ctx yet, we cannot completely drop struct_mutex yet. Changes since v1: - Keep struct_mutex for now. :( Changes since v2: - Make sure we always lock the ww context in slowpath. Changes since v3: - Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be done on normal unlock path. - Unconditionally release vmas and context. Changes since v4: - Rebased on top of struct_mutex reduction. Changes since v5: - Remove training wheels. Changes since v6: - Fix accidentally broken -ENOSPC handling. Changes since v7: - Handle gt buffer pool better. Changes since v8: - Properly clear variables, to make -EDEADLK handling not BUG. Change since v9: - Fix unpinning fence on pnv and below. Changes since v10: - Make relocation gpu chaining working again. Changes since v11: - Remove relocation chaining, pain to make it work. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-9-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Parse command buffer earlier in eb_relocate(slow)Maarten Lankhorst
We want to introduce backoff logic, but we need to lock the pool object as well for command parsing. Because of this, we will need backoff logic for the engine pool obj, move the batch validation up slightly to eb_lookup_vmas, and the actual command parsing in a separate function which can get called from execbuf relocation fast and slowpath. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-8-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Remove locking from i915_gem_object_prepare_read/writeMaarten Lankhorst
Execbuffer submission will perform its own WW locking, and we cannot rely on the implicit lock there. This also makes it clear that the GVT code will get a lockdep splat when multiple batchbuffer shadows need to be performed in the same instance, fix that up. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-7-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.Maarten Lankhorst
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory eviction. We don't use it yet, but lets start adding the definition first. To use it, we have to pass a non-NULL ww to gem_object_lock, and don't unlock directly. It is done in i915_gem_ww_ctx_fini. Changes since v1: - Change ww_ctx and obj order in locking functions (Jonas Lahtinen) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07Revert "drm/i915/gem: Split eb_vma into its own allocation"Maarten Lankhorst
This reverts commit 0f1dd02295f3 ("drm/i915/gem: Split eb_vma into its own allocation") and also moves all unreserving to a single place at the end, which is a minor simplification. With the WW locking, we will drop all references only at the end when unlocking, so refcounting can now be removed. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-5-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07Revert "drm/i915/gem: Drop relocation slowpath".Maarten Lankhorst
This reverts commit 7dc8f1143778 ("drm/i915/gem: Drop relocation slowpath"). We need the slowpath relocation for taking ww-mutex inside the page fault handler, and we will take this mutex when pinning all objects. We also functionally revert ef398881d27d ("drm/i915/gem: Limit struct_mutex to eb_reserve"), as we need the struct_mutex in the slowpath as well, and a tiny part of 003d8b9143a6 ("drm/i915/gem: Only call eb_lookup_vma once during execbuf ioctl"). Specifically, we make the -EAGAIN handling part of fallback to slowpath again. With this, we have a proper working slowpath again, which will allow us to do fault handling with WW locks held. [mlankhorst: Adjusted for reloc_gpu_flush() changes] Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> [mlankhorst: Removed extra reloc_gpu_flush()] Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-4-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Revert relocation chaining commits.Maarten Lankhorst
This reverts commit 964a9b0f611ee ("drm/i915/gem: Use chained reloc batches") and commit 0e97fbb080553 ("drm/i915/gem: Use a single chained reloc batches for a single execbuf"). When adding ww locking to execbuf, it's hard enough to deal with a single BO that is part of relocation execution. Chaining is hard to get right, and with GPU relocation deprecated, it's best to drop this altogether, instead of trying to fix something we will remove. This is not a completely 1:1 revert, we reset rq_size to 0 in reloc_cache_init, this was from e3d291301f99 ("drm/i915/gem: Implement legacy MI_STORE_DATA_IMM"), because we don't want to break the selftests. (Daniel) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-3-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07Revert "drm/i915/gem: Async GPU relocations only"Maarten Lankhorst
This reverts commit 9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only"), and related commit 7ac2d2536dfa7 ("drm/i915/gem: Delete unused code"). Async GPU relocations are not the path forward, we want to remove GPU accelerated relocation support eventually when userspace is fixed to use VM_BIND, and this is the first step towards that. We will keep async gpu relocations around for now, until userspace is fixed. Relocation support will be disabled completely on platforms where there was never any userspace that depends on it, as the hardware doesn't require it from at least gen9+ onward. For older platforms, the plan is to use cpu relocations only. The igt side is fixed in igt commit 39e9aa1032a4e ("tests/i915: Remove subtests that rely on async relocation behavior"). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-2-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915/gem: Free the fence after a fence-chain lookup failureChris Wilson
If dma_fence_chain_find_seqno() reports an error, it does so in its preamble before it disposes of the input fence. On handling the error, we need to drop the reference to the fence. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 13149e8bafc4 ("drm/i915: add syncobj timeline support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806161056.17593-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Export a preallocate variant of i915_active_acquire()Chris Wilson
Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lockfree lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915/gem: Remove disordered per-file request list for throttlingChris Wilson
I915_GEM_THROTTLE dates back to the time before contexts where there was just a single engine, and therefore a single timeline and request list globally. That request list was in execution/retirement order, and so walking it to find a particular aged request made sense and could be split per file. That is no more. We now have many timelines with a file, as many as the user wants to construct (essentially per-engine, per-context). Each of those run independently and so make the single list futile. Remove the disordered list, and iterate over all the timelines to find a request to wait on in each to satisfy the criteria that the CPU is no more than 20ms ahead of its oldest request. It should go without saying that the I915_GEM_THROTTLE ioctl is no longer used as the primary means of throttling, so it makes sense to push the complication into the ioctl where it only impacts upon its few irregular users, rather than the execbuf/retire where everybody has to pay the cost. Fortunately, the few users do not create vast amount of contexts, so the loops over contexts/engines should be concise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-08-17drm/i915: add syncobj timeline supportLionel Landwerlin
Introduces a new parameters to execbuf so that we can specify syncobj handles as well as timeline points. v2: Reuse i915_user_extension_fn v3: Check that the chained extension is only present once (Chris) v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel) v5: Use BIT_ULL (Chris) v6: Fix issue with already signaled timeline points, dma_fence_chain_find_seqno() setting fence to NULL (Chris) v7: Report ENOENT with invalid syncobj handle (Lionel) v8: Check for out of order timeline point insertion (Chris) v9: After explanations on https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html drop the ordering check from v8 (Lionel) v10: Set first extension enum item to 1 (Jason) v11: Rebase v12: Allow multiple extension nodes of timeline syncobj (Chris) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Co-authored-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v11) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-3-lionel.g.landwerlin@intel.com Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-08-17drm/i915: introduce a mechanism to extend execbuf2Lionel Landwerlin
We're planning to use this for a couple of new feature where we need to provide additional parameters to execbuf. v2: Check for invalid flags in execbuffer2 (Lionel) v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris) v4: Rebase Move array fence parsing in i915_gem_do_execbuffer() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-2-lionel.g.landwerlin@intel.com Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-07-08drm/i915: Move the engine mask to intel_gt_infoDaniele Ceraolo Spurio
Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
2020-07-03drm/i915/gem: Split the context's obj:vma lut into its own mutexChris Wilson
Rather than reuse the common ctx->mutex for locking the execbuffer LUT, split it into its own lock to avoid being taken [as part of ctx->mutex] at inappropriate times. In particular to avoid the inversion from taking the timeline->mutex for the whole execbuf submission in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200703004306.11117-1-chris@chris-wilson.co.uk
2020-07-01drm/i915/gem: Move obj->lut_list under its own lockChris Wilson
The obj->lut_list is traversed when the object is closed as the file table is destroyed during process termination. As this occurs before we kill any outstanding context if, due to some bug or another, the closure is blocked, then we fail to shootdown any inflight operations potentially leaving the GPU spinning forever. As we only need to guard the list against concurrent closures and insertions, the hold is short and merits being treated as a simple spinlock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
2020-06-25Merge drm/drm-next into drm-intel-next-queuedJani Nikula
Catch up with upstream, in particular to get c1e8d7c6a7a6 ("mmap locking API: convert mmap_sem comments"). Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-06-11Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "One sun4i fix and a connector hotplug race The ast fix is for a regression in 5.6, and one of the i915 ones fixes an oops reported by dhowells. core: - fix race in connectors sending hotplug i915: - Avoid use after free in cmdparser - Avoid NULL dereference when probing all display encoders - Fixup to module parameter type sun4i: - clock divider fix ast: - 24/32 bpp mode setting fix" * tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm: drm/ast: fix missing break in switch statement for format->cpp[0] case 4 drm/sun4i: hdmi ddc clk: Fix size of m divider drm/i915/display: Only query DP state of a DDI encoder drm/i915/params: fix i915.reset module param type drm/i915/gem: Mark the buffer pool as active for the cmdparser drm/connector: notify userspace on hotplug after register complete
2020-06-08drm/i915/gem: Mark the buffer pool as active for the cmdparserChris Wilson
If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser") Fixes: 16e87459673a ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk (cherry picked from commit 57a78ca4eceab1ecb0299fba8a10211289329889) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-06-05drm/i915/gem: Delete unused codeChris Wilson
Unused as of commit 9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only"), but left behind. >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:933:21: error: unused function 'unmask_page' [-Werror,-Wunused-function] static inline void *unmask_page(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:938:28: error: unused function 'unmask_flags' [-Werror,-Wunused-function] static inline unsigned int unmask_flags(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:945:33: error: unused function 'cache_to_ggtt' [-Werror,-Wunused-function] static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache) Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200605200357.13069-1-chris@chris-wilson.co.uk
2020-06-05drm/i915/gem: Async GPU relocations onlyChris Wilson
Reduce the 3 relocation paths down to the single path that accommodates all. The primary motivation for this is to guard the relocations with a natural fence (derived from the i915_request used to write the relocation from the GPU). The tradeoff in using async gpu relocations is that it increases latency over using direct CPU relocations, for the cases where the target is idle and accessible by the CPU. The benefit is greatly reduced lock contention and improved concurrency by pipelining. Note that forcing the async gpu relocations does reveal a few issues they have. Firstly, is that they are visible as writes to gem_busy, causing to mark some buffers are being to written to by the GPU even though userspace only reads. Secondly is that, in combination with the cmdparser, they can cause priority inversions. This should be the case where the work is being put into a common workqueue losing our priority information and so being executed in FIFO from the worker, denying us the opportunity to reorder the requests afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk
2020-06-04drm/i915/gem: Mark the buffer pool as active for the cmdparserChris Wilson
If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser") Fixes: 16e87459673a ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
2020-06-03drm/i915: Drop i915_request.i915 backpointerChris Wilson
We infrequently use the direct i915 backpointer from the i915_request, so do we really need to waste the space in the struct for it? 8 bytes from the most frequently allocated struct vs an 3 bytes and pointer chasing in using rq->engine->i915? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
2020-06-02Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "Highlights: - Core DRM had a lot of refactoring around managed drm resources to make drivers simpler. - Intel Tigerlake support is on by default - amdgpu now support p2p PCI buffer sharing and encrypted GPU memory Details: core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling" * tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits) drm/amd/display: Fix potential integer wraparound resulting in a hang drm/amd/display: drop cursor position check in atomic test drm/amdgpu: fix device attribute node create failed with multi gpu drm/nouveau: use correct conflicting framebuffer API drm/vblank: Fix -Wformat compile warnings on some arches drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode drm/amd/display: Handle GPU reset for DC block drm/amdgpu: add apu flags (v2) drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven drm/amdgpu: fix pm sysfs node handling (v2) drm/amdgpu: move gpu_info parsing after common early init drm/amdgpu: move discovery gfx config fetching drm/nouveau/dispnv50: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau/debugfs: fix runtime pm imbalance on error drm/nouveau/nouveau/hmm: fix migrate zero page to GPU drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes() ...
2020-05-25drm/i915/gem: Suppress some random warningsChris Wilson
Leave the error propagation in place, but limit the warnings to only show up in CI if the unlikely errors are hit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200525141957.3061-2-chris@chris-wilson.co.uk
2020-05-19drm/i915/gem: Prefer drm_WARN* over WARN*Pankaj Bharadiya
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504181600.18503-6-pankaj.laxminarayan.bharadiya@intel.com
2020-05-14drm/i915: Drop no-semaphore boostingChris Wilson
Now that we have fast timeslicing on semaphores, we no longer need to prioritise none-semaphore work as we will yield any work blocked on a semaphore to the next in the queue. Previously with no timeslicing, blocking on the semaphore caused extremely bad scheduling with multiple clients utilising multiple rings. Now, there is no impact and we can remove the complication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk
2020-05-13drm/i915/gem: Remove redundant exec_fenceChris Wilson
Since there can only be one of in_fence/exec_fence, just use the single in_fence local. v2: Consolidate lookup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513180937.28992-1-chris@chris-wilson.co.uk
2020-05-07drm/i915: Remove wait priority boostingChris Wilson
Upon waiting a request (when asked), we gave that request a small priority boost, not enough for it to cause preemption, but enough for it to be scheduled next before all equals. We also used that bit to give new clients a small priority boost, similar to FQ_CODEL, such that we favoured short interactive tasks ahead of long running streams. However, this is causing lots of complications with timeslicing where we both want to honour the boost and yet ignore it. Those complications cause unexpected user behaviour (tasks not being timesliced and run concurrently as epxected), and the easiest way to resolve that is to remove the boost. Hopefully, we can find a compromise again if we need to, but in theory timeslicing itself and future more advanced schedulers should give us the interactivity boost we seek. Testcase: igt/gem_exec_schedule/lateslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk
2020-05-04drm/i915/gem: Implement legacy MI_STORE_DATA_IMMChris Wilson
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
2020-05-04drm/i915/gem: Specify address type for chained reloc batchesChris Wilson
It is required that a chained batch be in the same address domain as its parent, and also that must be specified in the command for earlier gen as it is not inferred from the chaining until gen6. Fixes: 964a9b0f611e ("drm/i915/gem: Use chained reloc batches") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
2020-05-01drm/i915/gem: Try an alternate engine for relocationsChris Wilson
If at first we don't succeed, try try again. Not all engines may support the MI ops we need to perform asynchronous relocation patching, and so we end up falling back to a synchronous operation that has a liability of blocking. However, Tvrtko pointed out we don't need to use the same engine to perform the relocations as we are planning to execute the execbuf on, and so if we switch over to a working engine, we can perform the relocation asynchronously. The user execbuf will be queued after the relocations by virtue of fencing. This patch creates a new context per execbuf requiring asynchronous relocations on an unusable engines. This is perhaps a bit excessive and can be ameliorated by a small context cache, but for the moment we only need it for working around a little used engine on Sandybridge, and only if relocations are actually required to an active batch buffer. Now we just need to teach the relocation code to handle physical addressing for gen2/3, and we should then have universal support! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_reloc/basic-spin # snb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk
2020-05-01drm/i915/gem: Use a single chained reloc batches for a single execbufChris Wilson
As we can now keep chaining together a relocation batch to process any number of relocations, we can keep building that relocation batch for all of the target vma. This avoiding emitting a new request into the ring for each target, consuming precious ring space and a potential stall. v2: Propagate the failure from submitting the relocation batch. Testcase: igt/gem_exec_reloc/basic-wide-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk