summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/intel_engine_cs.c
AgeCommit message (Collapse)Author
2019-12-13drm/i915: Introduce new macros for tracingVenkata Sandeep Dhanalakota
New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and GT_TRACE() are introduce to tag device name and engine name with contexts and requests tracing in i915. Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com
2019-12-05drm/i915/gt: Replace I915_READ with intel_uncore_readAndi Shyti
Get rid of the last remaining I915_READ in gt/ and make gt-land the first I915_READ-free happy island. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191205164422.727968-1-chris@chris-wilson.co.uk
2019-11-28drm/i915/selftests: Try to show where the pulse wentChris Wilson
We have a case of a mysteriously absent pulse, so dump the engine details to see if we can find out what happened to it. References: https://bugs.freedesktop.org/show_bug.cgi?id=112405 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128102546.3857140-1-chris@chris-wilson.co.uk
2019-11-25drm/i915/gt: Schedule request retirement when timeline idlesChris Wilson
The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA") is that it disables RC6 while Skylake (and friends) is active, and we do not consider the GPU idle until all outstanding requests have been retired and the engine switched over to the kernel context. If userspace is idle, this task falls onto our background idle worker, which only runs roughly once a second, meaning that userspace has to have been idle for a couple of seconds before we enable RC6 again. Naturally, this causes us to consume considerably more energy than before as powersaving is effectively disabled while a display server (here's looking at you Xorg) is running. As execlists will get a completion event as each context is completed, we can use this interrupt to queue a retire worker bound to this engine to cleanup idle timelines. We will then immediately notice the idle engine (without userspace intervention or the aid of the background retire worker) and start parking the GPU. Thus during light workloads, we will do much more work to idle the GPU faster... Hopefully with commensurate power saving! v2: Watch context completions and only look at those local to the engine when retiring to reduce the amount of excess work we perform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315 References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA") References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
2019-11-11drm/i915: Protect context while grabbing its name for the requestChris Wilson
Inside print_request(), we query the context/timeline name. Nothing immediately protects the context from being freed if the request is complete -- we rely on serialisation by the caller to keep the name valid until they finish using it. Inside intel_engine_dump(), we generally only print the requests in the execution queue protected by the engine->active.lock, but we also show the pending execlists ports which are not protected and so require a rcu_read_lock to keep the pointer valid. [ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915] [ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968 [ 1695.701068] [ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331 [ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [ 1695.701334] Call Trace: [ 1695.701424] dump_stack+0x5b/0x90 [ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915] [ 1695.701964] print_address_description.constprop.7+0x36/0x50 [ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915] [ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915] [ 1695.702947] __kasan_report.cold.10+0x1a/0x3a [ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915] [ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915] [ 1695.704241] print_request+0x82/0x2e0 [i915] [ 1695.704638] ? fwtable_read32+0x133/0x360 [i915] [ 1695.705042] ? write_timestamp+0x110/0x110 [i915] [ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0 [ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110 [ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50 [ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915] [ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915] Fixes: c36eebd9ba5d ("drm/i915/gt: execlists->active is serialised by the tasklet") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
2019-11-07drm/i915/gt: Defer engine registration until fully initialisedChris Wilson
Only add the engine to the available set of uabi engines once it has been fully initialised and we know we want it in the public set. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191107081252.10542-17-chris@chris-wilson.co.uk
2019-10-29drm/i915/gt: Make timeslice duration configurableChris Wilson
Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. However, not all workloads necessarily benefit from timeslicing and in the extreme some sysadmin may want to disable or reduce the timeslicing granularity. The timeslicing mechanism can be compiled out^W^W disabled (but should DCE!) with ./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk
2019-10-29drm/i915: Don't try to place HWS in non-existing mappable regionMichal Wajdeczko
HWS placement restrictions can't just rely on HAS_LLC flag. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-5-matthew.auld@intel.com
2019-10-24drm/i915/gt: Split intel_ring_submissionChris Wilson
Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
2019-10-23drm/i915/gt: Replace hangcheck by heartbeatsChris Wilson
Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk
2019-10-23drm/i915/execlists: Force preemptionChris Wilson
If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. The forced preemption mechanism can be compiled out with ./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk
2019-10-23drm/i915/gt: Try to more gracefully quiesce the system before resetsChris Wilson
If we are doing a normal GPU reset triggered after detecting a long period of stalled work, we can take our time and allow the engines to quiesce. Since we've stopped submission to the engine, and if we wait long enough an innocent context should complete, leaving the engine idle. So by waiting a short amount of time, we should prevent clobbering other users when resetting a stuck context. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Suggested-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-1-chris@chris-wilson.co.uk
2019-10-22drm/i915: Pass intel_gt to intel_engines_initTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-6-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_engines_setupTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-5-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_engines_cleanupTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-4-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_setup_engine_capabilitiesTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-3-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_engines_init_mmioTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-2-tvrtko.ursulin@linux.intel.com
2019-10-18drm/i915: Pass in intel_gt at some for_each_engine sitesTvrtko Ursulin
Where the function, or code segment, operates on intel_gt, we need to start passing it instead of i915 to for_each_engine(_masked). This is another partial step in migration of i915->engines[] to gt->engines[]. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017094500.21831-2-tvrtko.ursulin@linux.intel.com
2019-10-18drm/i915: Make for_each_engine_masked work on intel_gtTvrtko Ursulin
Medium term goal is to eliminate the i915->engine[] array and to get there we have recently introduced equivalent array in intel_gt. Now we need to migrate the code further towards this state. This next step is to eliminate usage of i915->engines[] from the for_each_engine_masked iterator. For this to work we also need to use engine->id as index when populating the gt->engine[] array and adjust the default engine set indexing to use engine->legacy_idx instead of assuming gt->engines[] indexing. v2: * Populate gt->engine[] earlier. * Check that we don't duplicate engine->legacy_idx v3: * Work around the initialization order issue between default_engines() and intel_engines_driver_register() which sets engine->legacy_idx for now. It will be fixed properly later. v4: * Merge with forgotten v2.5. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com
2019-10-16drm/i915/execlist: Trim immediate timeslice expiryChris Wilson
We perform timeslicing immediately upon receipt of a request that may be put into the second ELSP slot. The idea behind this was that since we didn't install the timer if the second ELSP slot was empty, we would not have any idea of how long ELSP[0] had been running and so giving the newcomer a chance on the GPU was fair. However, this causes us extra busy work that we may be able to avoid if we wait a jiffie for the first timeslice as normal. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191016100851.4979-1-chris@chris-wilson.co.uk
2019-10-09drm/i915/gt: execlists->active is serialised by the taskletChris Wilson
The active/pending execlists is no longer protected by the engine->active.lock, but is serialised by the tasklet instead. Update the locking around the debug and stats to follow suit. v2: local_bh_disable() to prevent recursing into the tasklet in case we trigger a softirq (Tvrtko) Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191009160906.16195-1-chris@chris-wilson.co.uk
2019-10-08drm/i915/gt: Give engine->kernel_context distinct timeline lock classesChris Wilson
Assign a separate lockclass to the perma-pinned timelines of the kernel_context, such that we can use them from within the user timelines should we ever need to inject GPU operations to fixup faults during request construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191008185941.15228-1-chris@chris-wilson.co.uk
2019-10-08drm/i915/gt: Flush submission tasklet before waiting/retiringChris Wilson
A common bane of ours is arbitrary delays in ksoftirqd processing our submission tasklet. Give the submission tasklet a kick before we wait to avoid those delays eating into a tight timeout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191008105655.13256-1-chris@chris-wilson.co.uk
2019-10-07drm/i915/gt: Prefer local path to runtime powermanagementChris Wilson
Avoid going to the base i915 device when we already have a path from gt to the runtime powermanagement interface. The benefit is that it looks a bit more self-consistent to always be acquiring the gt->uncore->rpm for use with the gt->uncore. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191007154531.1750-1-chris@chris-wilson.co.uk
2019-09-27drm/i915: check for kernel_contextMatthew Auld
Explosions during early driver init on the error path. Make sure we fail gracefully. [ 9547.672258] BUG: kernel NULL pointer dereference, address: 000000000000007c [ 9547.672288] #PF: supervisor read access in kernel mode [ 9547.672292] #PF: error_code(0x0000) - not-present page [ 9547.672296] PGD 8000000846b41067 P4D 8000000846b41067 PUD 797034067 PMD 0 [ 9547.672303] Oops: 0000 [#1] SMP PTI [ 9547.672307] CPU: 1 PID: 25634 Comm: i915_selftest Tainted: G U 5.3.0-rc8+ #73 [ 9547.672313] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017 [ 9547.672395] RIP: 0010:intel_context_unpin+0x9/0x100 [i915] [ 9547.672400] Code: 6b 60 00 e9 17 ff ff ff bd fc ff ff ff e9 7c ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 41 54 55 53 <8b> 47 7c 83 f8 01 74 26 8d 48 ff f0 0f b1 4f 7c 48 8d 57 7c 75 05 [ 9547.672413] RSP: 0018:ffffae8ac24ff878 EFLAGS: 00010246 [ 9547.672417] RAX: ffff944a1b7842d0 RBX: ffff944a1b784000 RCX: ffff944a12dd6fa8 [ 9547.672422] RDX: ffff944a1b7842c0 RSI: ffff944a12dd5328 RDI: 0000000000000000 [ 9547.672428] RBP: 0000000000000000 R08: ffff944a11e5d840 R09: 0000000000000000 [ 9547.672433] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 9547.672438] R13: ffffffffc11aaf00 R14: 00000000ffffffe4 R15: ffff944a0e29bf38 [ 9547.672443] FS: 00007fc259b88ac0(0000) GS:ffff944a1f880000(0000) knlGS:0000000000000000 [ 9547.672449] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9547.672454] CR2: 000000000000007c CR3: 0000000853346003 CR4: 00000000003606e0 [ 9547.672459] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9547.672464] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9547.672469] Call Trace: [ 9547.672518] intel_engine_cleanup_common+0xe3/0x270 [i915] [ 9547.672567] execlists_destroy+0xe/0x30 [i915] [ 9547.672669] intel_engines_init+0x94/0xf0 [i915] [ 9547.672749] i915_gem_init+0x191/0x950 [i915] Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190927173409.31175-2-matthew.auld@intel.com
2019-09-20drm/i915: Mark i915_request.timeline as a volatile, rcu pointerChris Wilson
The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
2019-09-16drm/i915: Show the logical context ring state on dumpingChris Wilson
Include the active context register state when dumping the engine. Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190915203701.29163-1-chris@chris-wilson.co.uk
2019-08-28drm/i915/selftests: Ignore coherency failures on BroadwaterChris Wilson
We've been ignoring similar coherency issues in IGT for Broadwater, and specifically Broadwater (original gen4) and not, for example, Crestline (same generation as Broadwater, but the mobile variant). Without any means to reproduce locally (I have a 965GM but alas no 965G), fixing will be slow, so tell CI to ignore any failure until we are ready with a fix. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190826133837.6784-1-chris@chris-wilson.co.uk
2019-08-23drm/i915: Refactor instdone loops on new subslice functionsStuart Summers
Refactor instdone loops to use the new intel_sseu_has_subslice function. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-10-stuart.summers@intel.com
2019-08-20drm/i915/tgl: Gen12 render context sizeDaniele Ceraolo Spurio
Re-use Gen11 context size for now. [ Lucas: this is a temporary enabling patch that needs to be confirmed: we need to check BSpec 46255 and recompute ] Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-27-lucas.demarchi@intel.com
2019-08-16drm/i915/execlists: Lift process_csb() out of the irq-off spinlockChris Wilson
If we only call process_csb() from the tasklet, though we lose the ability to bypass ksoftirqd interrupt processing on direct submission paths, we can push it out of the irq-off spinlock. The penalty is that we then allow schedule_out to be called concurrently with schedule_in requiring us to handle the usage count (baked into the pointer itself) atomically. As we do kick the tasklets (via local_bh_enable()) after our submission, there is a possibility there to see if we can pull the local softirq processing back from the ksoftirqd. v2: Store the 'switch_priority_hint' on submission, so that we can safely check during process_csb(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816171608.11760-1-chris@chris-wilson.co.uk
2019-08-15drm/i915: Protect request retirement with timeline->mutexChris Wilson
Forgo the struct_mutex requirement for request retirement as we have been transitioning over to only using the timeline->mutex for controlling the lifetime of a request on that timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-4-chris@chris-wilson.co.uk
2019-08-14drm/i915: Print CCID for all renderCSStuart Summers
Use render class instead of RCS0 when printing CCID. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190813174121.129593-2-stuart.summers@intel.com
2019-08-14drm/i915: Include engine->mmio_base in the debug dumpChris Wilson
Some IGT would like to know the mmio address of each engine so make it available. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190813215707.14703-1-chris@chris-wilson.co.uk
2019-08-12drm/i915/gt: Use the local engine wakeref when checking RING registersChris Wilson
Now that we can atomically acquire the engine wakeref, make use of it when check whether the RING registers are idle. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190812091045.29587-7-chris@chris-wilson.co.uk
2019-08-09drm/i915: Lift timeline into intel_contextChris Wilson
Move the timeline from being inside the intel_ring to intel_context itself. This saves much pointer dancing and makes the relations of the context to its timeline much clearer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk
2019-08-09drm/i915: Push the ring creation flags to the backendChris Wilson
Push the ring creation flags from the outer GEM context to the inner intel_context to avoid an unsightly back-reference from inside the backend. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-3-chris@chris-wilson.co.uk
2019-08-08drm/i915: Defer final intel_wakeref_put to process contextChris Wilson
As we need to acquire a mutex to serialise the final intel_wakeref_put, we need to ensure that we are in process context at that time. However, we want to allow operation on the intel_wakeref from inside timer and other hardirq context, which means that need to defer that final put to a workqueue. Inside the final wakeref puts, we are safe to operate in any context, as we are simply marking up the HW and state tracking for the potential sleep. It's only the serialisation with the potential sleeping getting that requires careful wait avoidance. This allows us to retain the immediate processing as before (we only need to sleep over the same races as the current mutex_lock). v2: Add a selftest to ensure we exercise the code while lockdep watches. v3: That test was extremely loud and complained about many things! v4: Not a whale! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111295 References: https://bugs.freedesktop.org/show_bug.cgi?id=111245 References: https://bugs.freedesktop.org/show_bug.cgi?id=111256 Fixes: 18398904ca9e ("drm/i915: Only recover active engines") Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808202758.10453-1-chris@chris-wilson.co.uk
2019-08-08drm/i915: Allocate kernel_contexts directlyChris Wilson
Ignore the central i915->kernel_context for allocating an engine, as that GEM context is being phased out. For internal clients, we just need the per-engine logical state, so allocate it at the point of use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808110612.23539-1-chris@chris-wilson.co.uk
2019-08-07drm/i915: Rename engines to match their user interfaceChris Wilson
During engine setup, we may find that some engines are fused off causing a misalignment between internal names and the instances seen by users, e.g. (I915_ENGINE_CLASS_VIDEO_DECODE, 1) may be vcs2 in hardware. Normally this is invisible to the user, but a few debug interfaces (and our own internal tracing) use the original HW name not the name the user would expect as formed from their class:instance tuple. Replace our internal name with the uabi name for consistency with, for example, error states. v2: Keep the pretty printing of class name in the selftest Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190807110431.8130-1-chris@chris-wilson.co.uk
2019-08-06drm/i915/gt: Move the [class][inst] lookup for engines onto the GTChris Wilson
To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
2019-08-04drm/i915: Replace struct_mutex for batch pool serialisationChris Wilson
Switch to tracking activity via i915_active on individual nodes, only keeping a list of retired objects in the cache, and reaping the cache when the engine itself idles. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk
2019-08-02drm/i915: Add i915 to i915_inject_probe_failureMichal Wajdeczko
With i915 added to i915_inject_probe_failure we can use dedicated printk when injecting artificial load failure. Also make this function look like other i915 functions that return error code and make it more flexible to return any provided error code instead of previously assumed -ENODEV. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-2-michal.wajdeczko@intel.com
2019-07-19drm/i915: Fix and improve MCR selection logicTvrtko Ursulin
A couple issues were present in this code: 1. fls() usage was incorrect causing off by one in subslice mask lookup, which in other words means subslice mask of all zeroes is always used (subslice mask of a slice which is not present, or even out of bounds array access), rendering the checks in wa_init_mcr either futile or random. 2. Condition in WARN_ON was not correct. It is doing a bitwise and operation between a positive (present subslices) and negative mask (disabled L3 banks). This means that with corrected fls() usage the assert would always incorrectly fail. We could fix this by inverting the fuse bits in the check, but instead do one better and improve the code so it not only asserts, but finds the first common index between the two masks and only warns if no such index can be found. v2: * Simplify check for logic and redability. * Improve commentary explaining what is really happening ie. what the assert is really trying to check and why. v3: * Find first common index instead of just asserting. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: fe864b76c2ab ("drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1 Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-4-tvrtko.ursulin@linux.intel.com
2019-07-19drm/i915: Trust programmed MCR in read_subslice_regTvrtko Ursulin
Instead of re-calculating the MCR selector in read_subslice_reg do the rwm on its existing value and restore it when done. This consolidates MCR programming to one place for cnl+, and avoids re-calculating its default value on older platforms during hangcheck. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-3-tvrtko.ursulin@linux.intel.com
2019-07-19drm/i915: Fix GEN8_MCR_SELECTOR programmingTvrtko Ursulin
fls returns bit positions starting from one for the lsb and the MCR register expects zero based (sub)slice addressing. Incorrent MCR programming can have the effect of directing MMIO reads of registers in the 0xb100-0xb3ff range to invalid subslice returning zeroes instead of actual content. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 1e40d4aea57b ("drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-2-tvrtko.ursulin@linux.intel.com
2019-07-16drm/i915/execlists: Disable preemption under GVTChris Wilson
Preempt-to-busy uses a GPU semaphore to enforce an idle-barrier across preemption, but mediated gvt does not fully support semaphores. v2: Fiddle around with the flags and settle on using has-semaphores for the core bits so that we retain the ability to preempt our own semaphores. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Xiaolin Zhang <xiaolin.zhang@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190709091233.8573-1-chris@chris-wilson.co.uk
2019-07-16drm/i915: Lock the engine while dumping the active requestChris Wilson
We cannot let the request be retired and freed while we are trying to dump it during error capture. It is not sufficient just to grab a reference to the request, as during retirement we may free the ring which we are also dumping. So take the engine lock to prevent retiring and freeing of the request. Reported-by: Alex Shumsky <alexthreed@gmail.com> Fixes: 83c317832eb1 ("drm/i915: Dump the ringbuffer of the active request for debugging") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Alex Shumsky <alexthreed@gmail.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190715080946.15593-6-chris@chris-wilson.co.uk
2019-07-12drm/i915/gt: Use intel_gt as the primary object for handling resetsChris Wilson
Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
2019-07-12drm/i915: Replace "_load" with "_probe" consequentlyJanusz Krzysztofik
Use the "_probe" nomenclature not only in i915_driver_probe() helper name but also in other related function / variable names for consistency. Only the userspace exposed name of a related module parameter is left untouched. Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190712112429.740-4-janusz.krzysztofik@linux.intel.com