drm/i915/gt: Retry RING_HEAD reset until it get sticks
we see an issue where resets fails because the engine resumes from an incorrect RING_HEAD. Since the RING_HEAD doesn't point to the remaining requests to re-run, but may instead point into the uninitialised portion of the ring, the GPU may be then fed invalid instructions from a privileged context, oft pushing the GPU into an unrecoverable hang. If at first the write doesn't succeed, try, try again. v2: Avoid unnecessary timeout macro (Andi) v3: Correct comment format (Andi) v4: Make it generic for all platform as it won't impact (Chris) Link: https://gitlab.freedesktop.org/drm/intel/-/issues/5432 Testcase: igt/i915_selftest/hangcheck Signed-off-by:Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by:
Nitin Gote <nitin.r.gote@intel.com> Reviewed-by:
Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by:
Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015145710.2478599-1-nitin.r.gote@intel.com
Loading
Please sign in to comment