diff options
author | Felix Kuehling <Felix.Kuehling@amd.com> | 2017-09-21 16:26:41 -0400 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2017-09-28 16:03:30 -0400 |
commit | c98171ccf6580407d07a3b5dc8188ce9e1f4f7ca (patch) | |
tree | 191e13e83451ea87c6e4fdc758425b046a0c4bf8 /drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | |
parent | 1bab0fc01b84c1aa8a65a1f1de885e1faab48264 (diff) |
drm/amdgpu: Handle GPUVM fault storms
When many wavefronts cause VM faults at the same time, it can
overwhelm the interrupt handler and cause IH ring overflows before
the driver can notify or kill the faulting application.
As a workaround I'm introducing limited per-VM fault credit. After
that number of VM faults have occurred, further VM faults are
filtered out at the prescreen stage of processing.
This depends on the PASID in the interrupt packet, so it currently
only works for KFD contexts.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 |
1 files changed, 31 insertions, 0 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 8fcc743dfa86..c91d5c7a273d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2682,6 +2682,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, } INIT_KFIFO(vm->faults); + vm->fault_credit = 16; return 0; @@ -2776,6 +2777,36 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm) } /** + * amdgpu_vm_pasid_fault_credit - Check fault credit for given PASID + * + * @adev: amdgpu_device pointer + * @pasid: PASID do identify the VM + * + * This function is expected to be called in interrupt context. Returns + * true if there was fault credit, false otherwise + */ +bool amdgpu_vm_pasid_fault_credit(struct amdgpu_device *adev, + unsigned int pasid) +{ + struct amdgpu_vm *vm; + + spin_lock(&adev->vm_manager.pasid_lock); + vm = idr_find(&adev->vm_manager.pasid_idr, pasid); + spin_unlock(&adev->vm_manager.pasid_lock); + if (!vm) + /* VM not found, can't track fault credit */ + return true; + + /* No lock needed. only accessed by IRQ handler */ + if (!vm->fault_credit) + /* Too many faults in this VM */ + return false; + + vm->fault_credit--; + return true; +} + +/** * amdgpu_vm_manager_init - init the VM manager * * @adev: amdgpu_device pointer |