
Azure Signalr Service issue with Azure App Service when the App service is scaled out #11027


jason@slickpay.com.au created

Setup Details:

  • What is your product version? v11.1.0
  • What is your product type (Angular or MVC)? ASP.NET CORE MVC & jQuery
  • What is product framework type (.net framework or .net core)? .NET Core
  • What is ABP Framework version? 7.1.0

I am looking at scaling out my App Service to a minimum of 2 instances (with a single Azure SignalR Service instance). However, when I do this, clients stop receiving messages consistently. As soon as I scale the App Service back down to 1 instance, everything works fine again.

Please see the diagram below; I want to make sure we understand the issue here:

I had Microsoft check the Azure SignalR Service and run a live trace while the App Service was scaled out to 2 instances. They came back stating that no errors were logged in the SignalR service and that all connections were good. They therefore assume the issue occurs in code on the App Service end, as no response is sent from the server. So the SignalR service itself is working fine.

I also do not see any errors in App Insights when messages are being lost.

Are there any known issues with using ANZ and ABP with the Azure SignalR Service when the App Service is scaled out?


11 Answer(s)
  • 0
    ismcagdas created
    Support Team

    Hi,

    Did you integrate the Azure SignalR service into your project according to this document: https://docs.microsoft.com/en-us/azure/azure-signalr/signalr-quickstart-dotnet-core ?

  • 0
    sedulen created

    Hi @jason@slickpay.com.au

    I am familiar with this problem. This is not specific to the Azure SignalR service, but rather to how the internal ANZ OnlineClientStore works.

    When a client registers with your SignalR Hub, even when using the Azure SignalR service, the ANZ framework stores the ClientId for that connected client in an OnlineClientStore class. By default, this class uses an in-memory Dictionary, which is not shared across your instances.

    So even though the client is technically connected through the Azure SignalR service, messages are sent from server to client, so the server needs to know which connections should receive a newly published message (a typical pub/sub model). When the ANZ notification publishing classes try to send a message to a user, they look that user up in the local OnlineClientStore. The OnlineClientStore on instance B isn't shared with the OnlineClientStore on instance A, so a client that registered on instance A is unknown to instance B, and a notification published from instance B finds no connection to send to.
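
    The failure mode described above can be sketched without any ABP or SignalR code. This is only a minimal model under stated assumptions: IClientStore, InMemoryClientStore, AppInstance and ScaleOutDemo are hypothetical names made up for illustration (they are not ABP types), and the "shared" store is simply one in-memory object standing in for a distributed store.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;

    // Hypothetical stand-in for a client store; NOT the ABP IOnlineClientStore interface.
    public interface IClientStore
    {
        void Add(string connectionId, long userId);
        IReadOnlyList<string> ConnectionsFor(long userId);
    }

    // Process-local store: each instance that creates one gets its own dictionary.
    public sealed class InMemoryClientStore : IClientStore
    {
        private readonly ConcurrentDictionary<string, long> _clients =
            new ConcurrentDictionary<string, long>();

        public void Add(string connectionId, long userId)
        {
            _clients[connectionId] = userId;
        }

        public IReadOnlyList<string> ConnectionsFor(long userId)
        {
            var result = new List<string>();
            foreach (var pair in _clients)
            {
                if (pair.Value == userId)
                {
                    result.Add(pair.Key);
                }
            }
            return result;
        }
    }

    // Hypothetical model of one App Service instance.
    public sealed class AppInstance
    {
        private readonly IClientStore _store;

        public AppInstance(IClientStore store)
        {
            _store = store;
        }

        public void OnClientConnected(string connectionId, long userId)
        {
            _store.Add(connectionId, userId);
        }

        // Returns how many connections a notification for userId would be sent to.
        public int PublishTo(long userId)
        {
            return _store.ConnectionsFor(userId).Count;
        }
    }

    public static class ScaleOutDemo
    {
        // Client registers on instance A, notification is published from instance B,
        // and each instance has its own in-memory store.
        public static int DeliveredWithSeparateStores()
        {
            var a = new AppInstance(new InMemoryClientStore());
            var b = new AppInstance(new InMemoryClientStore());
            a.OnClientConnected("conn-1", 42);
            return b.PublishTo(42);
        }

        // Same scenario, but both instances see the same store.
        public static int DeliveredWithSharedStore()
        {
            var shared = new InMemoryClientStore(); // stands in for a distributed store
            var a = new AppInstance(shared);
            var b = new AppInstance(shared);
            a.OnClientConnected("conn-1", 42);
            return b.PublishTo(42);
        }
    }
    ```

    In this model, DeliveredWithSeparateStores() finds no connection for the user, while DeliveredWithSharedStore() finds the one that was registered on the other instance. That is the reason for replacing the store with a distributed one.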

    To solve this, you need to replace the IOnlineClientStore service with something that supports a distributed store. I have developed code that supports 3 possible distributed stores:

    • Sql
    • Redis
    • Azure Storage Account (Tables)

    SQL
    • Pro - it's transactional and it's super easy to implement.
    • Con - it's in your db, so it's the slowest possible implementation, plus it's potentially transient data being stored in your db.

    Redis
    • Pro - also super simple to implement.
    • Con - could encounter timeouts due to high network I/O, and doesn't really fit with the ANZ PerRequestRedisCache design.
    • Con - even if you try to use KeySpace events within Redis and keep the cache sync'd locally across multiple instances as a local in-memory cache, you could still possibly encounter timing issues. So IMHO this is the least reliable.

    Azure Storage Account (Tables)
    • Pro - also transactional and super easy to implement. No timing issues like Redis, and no transient data bloat in your db.
    • Con - it's 1 more 3rd party service that you have to manage (connectionStrings, security, ...).

    Are you familiar with replacing ANZ core classes on Module initialization? If not, I can try to share some of my code with you.

    @ismcagdas - do you think this is something that the ABP / ANZ framework could benefit from? I'm happy to contribute.

    I hope this helps! -Brian

  • 0
    jason@slickpay.com.au created

    Hi @sedulen,

    Thank you for providing the details above. I have not had to replace any ANZ core classes during module initialization until now! So any code you could provide for the Azure Storage Account (Tables) implementation you mentioned would be very helpful.

    The details provided definitely explain the behaviour I have been experiencing.

    Hi @ismcagdas, to answer your question, yes. Furthermore, when I set up a test app that excludes the ABP/ANZ frameworks, everything works as expected.

    Thanks Jason

  • 0
    ismcagdas created
    Support Team
  • 0
    jason@slickpay.com.au created

    Hi @ismcagdas,

    I have been looking through the ABP RealTime repo today, which seems to be where the issue is. I am confused about why I need to use Redis, when the Azure SignalR Service shouldn't need a Redis backplane. Also, Redis can have latency issues, especially when you have a multi-region Azure setup. Furthermore, the cost is much higher, especially when you need to scale.

    The OnlineClientManager with the in-memory store is not designed for horizontal scalability, which is an issue.

    Do you have another solution other than Redis? This should work out of the box.

    Thanks Jason

  • 0
    sedulen created

    @Jason,

    Here is my AzureTablesOnlineClientStore class. Please note that this was originally written against ABP v4.5.0 and ANZ v6.9.0. I have not tried running this against ANZ v11.1.0.

    using Abp.RealTime;
    using System;
    using System.Collections.Generic;
    using System.Collections.Immutable;
    using System.Linq;
    using Azure.Data.Tables;
    using BrianPieslak.ANZ.Configuration;
    using Azure;
    using Microsoft.Extensions.Configuration;
    using Abp.Runtime.Security;
    
    namespace BrianPieslak.ANZ.Notifications
    {
        public class AzureTablesOnlineClientStore : IOnlineClientStore
        {
            private readonly IAppConfigurationAccessor _configurationAccessor;
    
            public AzureTablesOnlineClientStore(IAppConfigurationAccessor configurationAccessor)
            {
                _configurationAccessor = configurationAccessor;
            }
    
            public void Add(IOnlineClient client)
            {
                // The returned connection id is not needed here, so it is discarded.
                PerformOperation<string>((tableClient) =>
                {
                    var partitionKey = client.TenantId.HasValue ? client.TenantId.Value.ToString().PadLeft(10, '0') : "HOST";
                    var entity = new OnlineClientEntity()
                    {
                        PartitionKey = partitionKey,
                        RowKey = client.ConnectionId,
                        Data = SerializeObject(client)
                    };
                    tableClient.AddEntity<OnlineClientEntity>(entity);
                    return client.ConnectionId;
                });
            }
    
            public bool Remove(string connectionId)
            {
                return TryRemove(connectionId, out IOnlineClient removed);
            }
    
            public bool TryRemove(string connectionId, out IOnlineClient client)
            {
                client = PerformOperation<IOnlineClient>((tableClient) =>
                {
                    var resultsqueryResults = tableClient.Query<OnlineClientEntity>(ent => ent.RowKey == connectionId);
                    OnlineClientEntity entity = null;
                    IOnlineClient result = null;
                    if (resultsqueryResults != null && resultsqueryResults.Count() == 1)
                    {
                        foreach (Page<OnlineClientEntity> page in resultsqueryResults.AsPages())
                        {
                            foreach (OnlineClientEntity qEntity in page.Values)
                            {
                                try
                                {
                                    result = DeserializeObject(qEntity.Data);
                                    entity = qEntity;
                                }
                                catch (Exception)
                                {
                                    //unable to decrypt the record so remove it
                                    tableClient.DeleteEntity(qEntity.PartitionKey, qEntity.RowKey);
                                }
                            }
                        }
                    }
                    if(entity != null)
                    {
                        tableClient.DeleteEntity(entity.PartitionKey, entity.RowKey);
                    }
                    return result;
                });
                return client != null;
            }
    
            public bool TryGet(string connectionId, out IOnlineClient client)
            {
                client = PerformOperation<IOnlineClient>((tableClient) =>
                {
                    var resultsqueryResults = tableClient.Query<OnlineClientEntity>(ent => ent.RowKey == connectionId);
                    IOnlineClient result = null;
                    if (resultsqueryResults != null && resultsqueryResults.Count() == 1)
                    {
                        foreach (Page<OnlineClientEntity> page in resultsqueryResults.AsPages())
                        {
                            foreach (OnlineClientEntity qEntity in page.Values)
                            {
                                try
                                {
                                    result = DeserializeObject(qEntity.Data);
                                }
                                catch (Exception)
                                {
                                    //unable to decrypt the record so remove it
                                    tableClient.DeleteEntity(qEntity.PartitionKey, qEntity.RowKey);
                                }
                            }
                        }
                    }
                    return result;
                });
                return client != null;
            }
    
            public bool Contains(string connectionId)
            {
                var id = PerformOperation<string>((tableClient) =>
                {
                    var results = tableClient.Query<OnlineClientEntity>(ent => ent.RowKey == connectionId);
                    if (results != null && results.Count() == 1)
                        return connectionId;
                    return null;
                });
                return !String.IsNullOrEmpty(id);
            }
    
            public IReadOnlyList<IOnlineClient> GetAll()
            {
                return PerformOperation<IReadOnlyList<IOnlineClient>>((tableClient) =>
                {
                    var resultsqueryResults = tableClient.Query<OnlineClientEntity>();
                    var result = new List<IOnlineClient>();
                    if (resultsqueryResults != null)
                    {
                        foreach (Page<OnlineClientEntity> page in resultsqueryResults.AsPages())
                        {
                            foreach (OnlineClientEntity qEntity in page.Values)
                            {
                                try
                                {
                                    result.Add(DeserializeObject(qEntity.Data));
                                }
                                catch (Exception)
                                {
                                    //unable to decrypt the record so remove it
                                    tableClient.DeleteEntity(qEntity.PartitionKey, qEntity.RowKey);
                                }
                            }
                        }
                    }
                    return result.ToImmutableList();
                });
            }
    
            private T PerformOperation<T>(Func<TableClient, Object> function) where T : class
            {
                var tableClient = GetTableClient();
                return function(tableClient) as T;
            }
            private TableServiceClient GetTableServiceClient()
            {
                var configuration = _configurationAccessor.Configuration;
                string connectionStringName = configuration["App:SignalR:OnlineClientStore:Azure:ConnectionString"];
                string connectionString = configuration.GetConnectionString(connectionStringName);
                if (String.IsNullOrEmpty(connectionString))
                {
                    connectionString = connectionStringName;
                }
    
                return new TableServiceClient(connectionString);
            }
            private TableClient GetTableClient()
            {
                return GetTableClient(GetTableServiceClient());
            }
            private TableClient GetTableClient(TableServiceClient serviceClient)
            {
                var configuration = _configurationAccessor.Configuration;
                string tableName = configuration["App:SignalR:OnlineClientStore:Azure:TableName"];
                serviceClient.CreateTableIfNotExists(tableName);
                return serviceClient.GetTableClient(tableName);
            }
    
            private class OnlineClientEntity : ITableEntity
            {
                public string Data { get; set; }
    
                public string PartitionKey { get; set; }
                public string RowKey { get; set; }
                public DateTimeOffset? Timestamp { get; set; }
                public Azure.ETag ETag { get; set; }
            }
    
            private string SerializeObject(IOnlineClient client)
            {
                var configuration = _configurationAccessor.Configuration;
                var data = Newtonsoft.Json.JsonConvert.SerializeObject(client);
                if (bool.TryParse(configuration["App:SignalR:OnlineClientStore:StoreEncrypted"], out bool storeEncrypted) && storeEncrypted)
                {
                    data = SimpleStringCipher.Instance.Encrypt(data);
                }
                return data;
            }
            private Abp.RealTime.OnlineClient DeserializeObject(string data)
            {
                var configuration = _configurationAccessor.Configuration;
                if (bool.TryParse(configuration["App:SignalR:OnlineClientStore:StoreEncrypted"], out bool storeEncrypted) && storeEncrypted)
                {
                    data = SimpleStringCipher.Instance.Decrypt(data);
                }
                return Newtonsoft.Json.JsonConvert.DeserializeObject<Abp.RealTime.OnlineClient>(data);
            }
        }
    }
    
    

    This class is defined in my ".Core" project.

    Then to use this class, in the Module of your .Web.Core project (mine is: BrianPieslakWebCoreModule), in the PreInitialize method, I have the following:

                // Online client cache.
                // Note: Component below comes from Castle Windsor ("using Castle.MicroKernel.Registration;").
                Type replacementOnlineCacheStore = default;
    
                if (bool.TryParse(_appConfiguration["App:SignalR:OnlineClientStore:Azure:Enabled"], out bool signalRAzureEnabled) && signalRAzureEnabled)
                {
                    replacementOnlineCacheStore = typeof(AzureTablesOnlineClientStore);
                }
    
                if (replacementOnlineCacheStore != default)
                {
                    if (IocManager.IsRegistered<IOnlineClientStore>())
                    {
                        Configuration.ReplaceService(typeof(IOnlineClientStore), replacementOnlineCacheStore, Abp.Dependency.DependencyLifeStyle.Singleton);
                    }
                    else
                    {
                        IocManager.IocContainer.Register(Component.For(typeof(IOnlineClientStore)).ImplementedBy(replacementOnlineCacheStore).LifestyleSingleton());
                    }
                }
    

    This code is slightly more complicated than it needs to be because I support other implementations of IOnlineClientStore, such as Sql & Redis. This could be more simply implemented as an extension method. I guess I just got a little lazy =D

    Then lastly, to drive my configuration via settings, you'd need to add this section to your appsettings.json (and then override it as appropriate in your appsettings.<environment>.json):

    "App": {
        "SignalR": {
          "Azure": {
            "Enabled": true,
            "ConnectionString": "AzureSignalR"
          },
          "OnlineClientStore": {
            "StoreEncrypted": true,
            "Azure": {
              "Enabled": true,
              "ConnectionString": "AzureStorage",
              "TableName": "OnlineClientCache"
            }
          }
        }
    }
    

    The first object, "App:SignalR:Azure", controls whether I'm using the Azure SignalR service, and its ConnectionString attribute references a named connection string in the "ConnectionStrings" section of the configuration file.

    The "App:SignalR:OnlineClientStore" object drives how my PreInitialization code wires up a replacement IOnlineClientStore service, if configured.

    I hope this helps.

    @ismcagdas - feel free to use any/all of this in the ANZ / ABP framework if you want, or I can submit a ticket in github and contribute my code there.

    Cheers! -Brian

  • 0
    jason@slickpay.com.au created

    Hi Brian,

    Thank you for the details shared, much appreciated and very helpful. I will implement something similar, adapting it to the latest version of ABP/ANZ.

    Cheers Jason

  • 0
    jason@slickpay.com.au created

    Hi @ismcagdas,

    Before implementing Brian's solution above, I turned on Redis as you suggested as a quick fix. However, this still didn't fix the problem! When the Azure App Service is scaled out to 2+ instances, message delivery is still inconsistent!

    Cheers Jason

  • 0
    jason@slickpay.com.au created

    Hi Brian,

    After implementing a solution using your approach, everything works as expected in a fully scaled Azure setup.

    Once again, thanks Brian for your input and kindly offered solution.

    Hi @ismcagdas,

    The way the Online Client Manager is set up for SignalR should be revised, as cloud solutions that need to scale out will face this issue. Using a service like Azure SignalR should work out of the box and should not require any additional type of backplane to work in a scaled environment.

  • 0
    japnolt created

    Another possible solution is to use SignalR Groups, although admittedly it requires changes both in the hub and at the point where the message is sent. I have some thoughts about this on this other thread: https://support.aspnetzero.com/QA/Questions/11117/Microsoft-signalR-is-not-working-with-multiple-instances#answer-4522399e-6457-858f-7bdc-3a0504ccce6e

  • 0
    rickfrankel created

    The solution is coming here.
    I use this solution currently.

    https://support.aspnetzero.com/QA/Questions/11027/Azure-Signalr-Service-issue-with-Azure-App-Service-when-the-App-service-is-scaled-out#answer-b7792b56-a93d-8ac9-099a-3a033b15dc28

    However, it causes issues because the Azure Table doesn't get cleaned out and eventually the queries time out, unless you periodically clear the table.
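
    One way to do that periodic clearing is to prune rows by age. The following is only a sketch under stated assumptions: StoredClientRow and OnlineClientCleanup are hypothetical names that are not part of ABP or of the store posted above, and the sketch assumes that any row last written longer ago than maxAge belongs to a connection that no longer exists. A row for a long-lived connection is not rewritten while the client stays connected, so maxAge would need to be longer than the longest connection you expect, or live rows will be pruned too.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical projection of a stored row: only the keys and the Timestamp
    // that the table service maintains on each entity.
    public sealed class StoredClientRow
    {
        public StoredClientRow(string partitionKey, string rowKey, DateTimeOffset? timestamp)
        {
            PartitionKey = partitionKey;
            RowKey = rowKey;
            Timestamp = timestamp;
        }

        public string PartitionKey { get; }
        public string RowKey { get; }
        public DateTimeOffset? Timestamp { get; }
    }

    public static class OnlineClientCleanup
    {
        // Returns the rows last written before (now - maxAge).
        // Rows without a timestamp are kept, because their age is unknown.
        public static IReadOnlyList<StoredClientRow> SelectStale(
            IEnumerable<StoredClientRow> rows, DateTimeOffset now, TimeSpan maxAge)
        {
            var cutoff = now - maxAge;
            return rows
                .Where(r => r.Timestamp.HasValue && r.Timestamp.Value < cutoff)
                .ToList();
        }

        // Calls delete(partitionKey, rowKey) for every stale row and returns the count.
        // With Azure.Data.Tables the callback could wrap TableClient.DeleteEntity,
        // the same call the posted store already uses.
        public static int Prune(
            IEnumerable<StoredClientRow> rows, DateTimeOffset now, TimeSpan maxAge,
            Action<string, string> delete)
        {
            var stale = SelectStale(rows, now, maxAge);
            foreach (var row in stale)
            {
                delete(row.PartitionKey, row.RowKey);
            }
            return stale.Count;
        }
    }
    ```

    Something like this could be run from a scheduled job, passing in the rows read from the table. Taking the delete step as a callback keeps the selection logic easy to test without a storage account.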

    I can see that the team have now implemented their own supported Redis version https://github.com/aspnetboilerplate/aspnetboilerplate/commit/b58d50ad5796da2cf1bf060fe791d33652700ba9

    Hopefully it will come out in the next version.