Base solution for your next web application
Open Closed

Entity Framework concurrent savechanges deadlock #11288


User avatar
0
hongbing.wang created

Product version: 11.2.1 Product type: Angular Product framework type: .net core

This issue may be related to ABP framework. ABP framework version: 7.2

In our project, we can delete a single entity, which contains many children and grandchildren, without any issues. 1.

private async Task DeleteControllerDevicesAsync(int controllerId)
            {
                  // delete credentials
                  await credentialRepository.DeleteAsync(d => d.ControllerId == controllerId);
 
                  // delete user access rights
                  await cagRepository.DeleteAsync(d => d.ControllerId == controllerId);
 
                  // delelet timezone and holiday mapping
                  await tzLinkRepository.DeleteAsync(d => d.ControllerId == controllerId);
 
                  await deviceRepository.DeleteAsync(d => d.ParentId == controllerId);
            }
//delete the entity
await _deviceRepository.DeleteAsync(id);

However, if we delete 4 or more of the same entities at the same time from our Angular client, we got the following exception.

at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func4 operation, Func4 verifySucceeded, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken) at Abp.EntityFrameworkCore.AbpDbContext.SaveChangesAsync(CancellationToken cancellationToken) at Abp.Zero.EntityFrameworkCore.AbpZeroCommonDbContext`3.SaveChangesAsync(CancellationToken cancellationToken) at Abp.EntityFrameworkCore.Uow.EfCoreUnitOfWork.SaveChangesAsync() at Abp.EntityFrameworkCore.Uow.EfCoreUnitOfWork.CompleteUowAsync()

2022-10-03 17:56:35.674 ERR] An exception occurred in the database while saving changes for context type 'umsplus.EntityFrameworkCore.umsplusDbContext'. System.InvalidOperationException: An exception has been raised that is likely due to a transient failure. Consider enabling transient error resiliency by adding 'EnableRetryOnFailure' to the 'UseSqlServer' call. ---> Microsoft.EntityFrameworkCore.DbUpdateException: An error occurred while saving the entity changes. See the inner exception for details. ---> Microsoft.Data.SqlClient.SqlException (0x80131904): Transaction (Process ID 72) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

2022-10-03 17:56:35.689 ERR] An exception has been raised that is likely due to a transient failure. Consider enabling transient error resiliency by adding 'EnableRetryOnFailure' to the 'UseSqlServer' call. System.InvalidOperationException: An exception has been raised that is likely due to a transient failure. Consider enabling transient error resiliency by adding 'EnableRetryOnFailure' to the 'UseSqlServer' call. ---> Microsoft.EntityFrameworkCore.DbUpdateException: An error occurred while saving the entity changes. See the inner exception for details. ---> Microsoft.Data.SqlClient.SqlException (0x80131904): Transaction (Process ID 72) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Tried adding the following to Startup.cs ConfigureServices(IServiceCollection services). It didn’t help.

services.AddDbContext<umsplusDbContext>(options =>
            {
                options.UseSqlServer(_appConfiguration.GetConnectionString("umsplusDb"),
                    sqlServerOptionsAction: sqlOptions =>
                    {
                        sqlOptions.EnableRetryOnFailure();
                    });
            });

Does this mean ABP framework doesn't support EnableRetryOnFailure?

Since EnableRetryOnFailure doesn't help resolve the issue in this case, please sugest other approaches for investigating this issue.

Please note that there is no issue with running the corresponding raw sql to delete the single enity.


10 Answer(s)
  • User Avatar
    0
    ismcagdas created
    Support Team

    Hi,

    As I understand correctly, the problem occurs in DeleteControllerDevicesAsync method. If so, could you also share the outer code block which calls DeleteControllerDevicesAsync ? It would be great if you can share the entire flow starting from appservice or controller.

  • User Avatar
    0
    hongbing.wang created

    Hi ismcagdas,

    Thanks for your quick reponse.

    The app service \webapi\src\umsplus.Application\Entities\DevicesAppService.cs

            [AbpAuthorize(AppPermsExtension.Pages_Devices_Delete, AppPermsExtension.Pages_Administration_CardProgrammers_Delete)]
    		public async Task Delete(EntityDto input)
    		{
    			var umsDevice = await _deviceRepository.FirstOrDefaultAsync((int)input.Id);
    			if (umsDevice != null)
    			{
    				_configUid = umsDevice.DeviceUid;
    
    				if (umsDevice.Type == DeviceType.Controller)
    				{
    					// if a controller/panel is deleted also need to delete its childen device
    					await DeleteControllerDevicesAsync(umsDevice.Id);
    				}
                    else ...
    
    				await DeleteConfigRecord(input.Id);
    			}
    		}
    
    		protected virtual async Task DeleteConfigRecord(int id)
            {
    			_unitOfWorkManager.Current.Completed += async (sender, args) =>
    			{
    				if (!string.IsNullOrEmpty(_configUid))
    				{
    					if (UidHelper.GetDeviceType(_configUid) == DeviceType.Controller)
    					{
    						// stop the comms if it is running
    						await PanelCommsServiceClient.SendControlActionAsync(ControlCommand.StopComms, _configUid);
    					}
    					else ...
    
    				}
    
    				await _auditHelper.SendAuditLog();
    			};
    
    			// look up the gui for deletion
    			var device = await _deviceRepository.FirstOrDefaultAsync(id);
    			if (device != null)
    			{
    				var deviceInput = new CreateOrEditDeviceDto
    				{
    					Type = device.Type,
    					Name = device.Name
    				};
    				await _auditHelper.InitAsync(deviceInput, device, (int)AbpSession.UserId, _deviceRepository, DbActionType.Deleted);
    
    				if (!string.IsNullOrEmpty(device.DeviceUid))
                    {
    					if (string.IsNullOrEmpty(_configUid))
    					{
    						_configUid = device.DeviceUid;
    					}
    				}
                    else
                    {
    					await RaiseDbActionEventAsync(device.Type, DbActionType.Deleted, (int)AbpSession.UserId, device.Name);
    				}
    			}
    
    			await _deviceRepository.DeleteAsync(id);
    		}
    
  • User Avatar
    0
    ismcagdas created
    Support Team

    Hi,

    Does this error happen in DeleteConfigRecord ? As I can see, other parts seem fine.

  • User Avatar
    0
    hongbing.wang created

    It coours in DeleteConfigRecord. (Correction: It should be 'DeleteControllerDevicesAsync' instead of 'DeleteConfigRecord'). See the following deadlock log.

    It's the same process process29ded69d848 which caused the deadlock, even though all the methods are async / await.

    Please note that this issue may be intermittent.

    deadlock-list deadlock victim=process29ded69d848 process-list process id=process29ded69d848 taskpriority=0 logused=848 waitresource=KEY: 5:72057594056409088 (17f6261b4a45) waittime=4922 ownerId=395133 transactionname=user_transaction lasttranstarted=2022-10-06T17:13:32.937 XDES=0x29ded188428 lockMode=RangeS-U schedulerid=10 kpid=23188 status=suspended spid=65 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2022-10-06T17:13:33.027 lastbatchcompleted=2022-10-06T17:13:33.020 lastattention=1900-01-01T00:00:00.020 clientapp=Core Microsoft SqlClient Data Provider hostname=<computerName> hostpid=2836 loginname=<loginName> isolationlevel=read uncommitted (1) xactid=395133 currentdb=5 currentdbname=umsplusDb lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056 executionStack frame procname=adhoc line=2 stmtstart=100 stmtend=180 sqlhandle=0x020000006e9e0b047d943cc87e6ddf83e0d73728ec1c27170000000000000000000000000000000000000000 unknown
    frame procname=unknown line=1 sqlhandle=0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 unknown
    inputbuf (@p0 int,@p1 int,@p2 int,@p3 int)SET NOCOUNT ON; DELETE FROM [UmsDevice] WHERE [Id] = @p0; SELECT @@ROWCOUNT; DELETE FROM [UmsDevice] WHERE [Id] = @p1; SELECT @@ROWCOUNT; DELETE FROM [UmsDevice] WHERE [Id] = @p2; SELECT @@ROWCOUNT; DELETE FROM [UmsDevice] WHERE [Id] = @p3; SELECT @@ROWCOUNT; process id=process29ded6b5848 taskpriority=0 logused=4020 waitresource=KEY: 5:72057594055753728 (ffffffffffff) waittime=4912 ownerId=395132 transactionname=user_transaction lasttranstarted=2022-10-06T17:13:32.937 XDES=0x29ddf9f0428 lockMode=RangeS-U schedulerid=13 kpid=3508 status=suspended spid=63 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2022-10-06T17:13:33.027 lastbatchcompleted=2022-10-06T17:13:33.020 lastattention=1900-01-01T00:00:00.020 clientapp=Core Microsoft SqlClient Data Provider hostname=<computerName> hostpid=2836 loginname=<loginName> isolationlevel=read uncommitted (1) xactid=395132 currentdb=5 currentdbname=umsplusDb lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056 executionStack frame procname=adhoc line=10 stmtstart=364 stmtend=444 sqlhandle=0x020000006e9e0b047d943cc87e6ddf83e0d73728ec1c27170000000000000000000000000000000000000000 unknown
    frame procname=unknown line=1 sqlhandle=0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 unknown
    inputbuf (@p0 int,@p1 int,@p2 int,@p3 int)SET NOCOUNT ON; DELETE FROM [UmsDevice] WHERE [Id] = @p0; SELECT @@ROWCOUNT; DELETE FROM [UmsDevice] WHERE [Id] = @p1; SELECT @@ROWCOUNT; DELETE FROM [UmsDevice] WHERE [Id] = @p2; SELECT @@ROWCOUNT; DELETE FROM [UmsDevice] WHERE [Id] = @p3; SELECT @@ROWCOUNT; resource-list keylock hobtid=72057594056409088 dbid=5 objectname=umsplusDb.dbo.UmsDeviceConfig indexname=IX_UmsDeviceConfig_DeviceId id=lock29dd791f780 mode=RangeX-X associatedObjectId=72057594056409088 owner-list owner id=process29ded6b5848 mode=RangeX-X waiter-list waiter id=process29ded69d848 mode=RangeS-U requestType=wait keylock hobtid=72057594055753728 dbid=5 objectname=umsplusDb.dbo.UmsCAG indexname=IX_UmsCAG_ControllerId id=lock29dd791e600 mode=RangeS-U associatedObjectId=72057594055753728 owner-list owner id=process29ded69d848 mode=RangeS-U waiter-list waiter id=process29ded6b5848 mode=RangeS-U requestType=wait


    Please note that running the following stored peocedure doesn't have any issues. begin tran delete from UmsCredential where controllerid=@controllerId delete from UmsCAG where controllerid=@controllerId delete from UmsTzHolidayLink where controllerid=@controllerId delete from UmsDevice where ParentId=@controllerId delete from UmsDevice where id=@controllerId commit tran

  • User Avatar
    0
    ismcagdas created
    Support Team

    Hi,

    Could you use Domain Events instead of using _unitOfWorkManager.Current.Completed ?

    Using UnitOfWork completed event for such a DB operation may cause those problems randomly. UoW completed event is better for non-db operations.

  • User Avatar
    0
    hongbing.wang created

    Thanks for your suggestion. I will replace '_unitOfWorkManager.Current.Completed += async (sender, args) =>' with a domain event handler. I'm new to domain events. If you have sample code, please share it with me.

  • User Avatar
    0
    ismcagdas created
    Support Team

    Hi,

    Entity Change events are triggered automatically, so you only need to create a class to handle that change. To do that, you basically need to create a class similar to https://aspnetboilerplate.com/Pages/Documents/EventBus-Domain-Events#handling-base-events

  • User Avatar
    0
    hongbing.wang created

    Thank you.

    It seems my issue is similar to Entity framework concurrent savechanges deadlock

    If i start the software twice and hit savechanges at the same time (one client deletes all A houses, the other one all B houses) it also runs into a deadlock. I guess it is not uncommon that clients delete records at the same time that do not have any relationships to each other. My parallel Method just simulates two clients deleting differnt records at the same time. The weird thing is if i only delete the Persons, it works without deadlock, so i guess it has something todo with the cascading delete of houses.

    I think the issue is occured during comitting the changes (SaveChangesAsync) of the transaction rather than after comitting the changes (Completed event). I am not sure whether the domain event handler will help.

    I've made the following changes. It works fine.

    		private async Task DeleteControllerDevicesAsync(int controllerId)
    		{
    #if false
    			// delete credentials
    			await _credentialRepository.DeleteAsync(d => d.ControllerId == controllerId);
    
    			// delete CAGs
    			await _cagRepository.DeleteAsync(d => d.ControllerId == controllerId);
    
    			// delelet timezone and holiday mapping
    			await _tzLinkRepository.DeleteAsync(d => d.ControllerId == controllerId);
    
    			await _deviceRepository.DeleteAsync(d => d.ParentId == controllerId);
    #else
                using (var db = umsplusServiceDbContext.CreateDbContext())
                {
                    await db.Database.ExecuteSqlRawAsync($"delete from UmsCredential where controllerid={controllerId}");
                    await db.Database.ExecuteSqlRawAsync($"delete from UmsCAG where controllerid={controllerId}");
                    await db.Database.ExecuteSqlRawAsync($"delete from UmsTzHolidayLink where controllerid={controllerId}");
                    await db.Database.ExecuteSqlRawAsync($"delete from UmsDevice where ParentId={controllerId}");
                }
    #endif
            }
    

    The following stored procedure also works fine. await db.Database.ExecuteSqlRawAsync($"exec DeleteController {controllerId}");

    It seems Entity Framework is so heavy when deleting entities concurrently. There are too many deadlocks hence errors. It might be caused by cascade deletion.

    It would be good to get EF code working too.

  • User Avatar
    0
    hongbing.wang created

    Hi ismcagdas,

    Sorry there was a mistake in my post. I've corrected it.

    Correction: It should be 'DeleteControllerDevicesAsync' instead of 'DeleteConfigRecord'

    In this case, we may need to investigate something other than UoW Completed event handling.

  • User Avatar
    0
    ismcagdas created
    Support Team

    Hi,

    Yes, it is a different case. Is it possible to share your project with [email protected] or temporarily provide access for one of our team members ?