Scaling Subdomains with Redis and ZeroMQ

You know those nice web apps that allow customised subdomains per customer, e.g. mycompany.webapp.com? Nice to have, sure, but as I discovered, keeping this feature performant in a scalable server farm is non-trivial.

To determine which customer organization a web request is for based on the subdomain you could make a call to the service layer to determine the organization ID for the subdomain, but that is massively inefficient – it is a guaranteed database round-trip per web request. To avoid this a cache server is the answer (such as Redis or memcached). Here is the configuration I use:

[Figure: web tier configuration, with a Redis instance on each web server]

You might wonder why I use a Redis instance per web server instead of just having one shared Redis instance. There are two main reasons for this:

  1. Since this is a per-request cache fetch I want it to be as fast as possible to increase throughput, so accessing Redis on the same server will be faster than a call over the network.
  2. If one Redis instance is used per server then you are more likely to get a cache hit and so minimize the need to go to the database. Note that the configuration is important for this to work as intended (e.g. use one of the LRU options for Redis maxmemory-policy, and configure the load balancer to use the Source IP algorithm or better.)
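
To see why source-IP balancing helps the per-server cache, here is a minimal sketch of the idea. It is not how any particular load balancer implements it; the class name, the method name and the choice of hash are assumptions made purely for illustration. The point is only that the same client address always maps to the same server, so that server's Redis instance is the one that warms up for that customer.

```csharp
using System;
using System.Net;

public static class SourceIpBalancer
{
    // Maps a client IP address to a server index in [0, serverCount).
    // The same address always yields the same index (hypothetical helper).
    public static int PickServer(string clientIp, int serverCount)
    {
        if (serverCount <= 0) throw new ArgumentOutOfRangeException("serverCount");

        byte[] bytes = IPAddress.Parse(clientIp).GetAddressBytes();

        // FNV-1a hash over the address bytes; overflow is intended, hence unchecked
        uint hash = 2166136261;
        unchecked
        {
            foreach (byte b in bytes)
            {
                hash ^= b;
                hash *= 16777619;
            }
        }

        return (int)(hash % (uint)serverCount);
    }
}
```

Calling SourceIpBalancer.PickServer with the same address and server count returns the same index every time, which is the affinity the cache relies on.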

The web app on each server is configured to handle the wildcard domain (*.webapp.com), and a request handler is used to query the Host Header subdomain and determine if there is a cache entry for it. If there is no cache entry, a call to the service layer/database can then be done and stored in the cache. Once it is available, the web server can use this for whatever it likes. A common example is to use different or custom UI themes per organization, and so the web server would return different HTML or CSS links depending on the customer organization. With the necessary details cached on the web server, it can do its UI rendering tasks without having to call the service layer.
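
The lookup described above can be sketched roughly as follows. This is a simplified illustration, not the handler from my application: an in-memory dictionary stands in for the local Redis instance, and the names SubdomainResolver, ExtractSubdomain, GetOrganizationId and loadFromServiceLayer are all made up for the example.

```csharp
using System;
using System.Collections.Concurrent;

public static class SubdomainResolver
{
    // Stand-in for the per-server Redis cache: subdomain -> organization ID
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Returns the subdomain part of a Host header value such as
    // "mycompany.webapp.com:8080", or null if the host is not under baseDomain.
    public static string ExtractSubdomain(string hostHeader, string baseDomain)
    {
        if (string.IsNullOrEmpty(hostHeader)) return null;

        // Drop an optional port
        int colon = hostHeader.IndexOf(':');
        string host = colon >= 0 ? hostHeader.Substring(0, colon) : hostHeader;

        string suffix = "." + baseDomain;
        if (!host.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)) return null;

        string sub = host.Substring(0, host.Length - suffix.Length);
        return sub.Length == 0 ? null : sub.ToLowerInvariant();
    }

    // Cache-aside lookup: use the cached value if present, otherwise call the
    // (slow) service layer once and remember the result.
    public static string GetOrganizationId(string subdomain, Func<string, string> loadFromServiceLayer)
    {
        return Cache.GetOrAdd(subdomain, loadFromServiceLayer);
    }
}
```

In a real handler the second method would read from and write to Redis rather than a dictionary, but the shape is the same: only a cache miss results in a trip to the service layer.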

Sounds good, right? Yes – except for one tiny problem. Anytime a cache is introduced to an architecture it raises the issue of keeping the cache in sync with the authoritative data store (i.e. the database). To put it simply, if a customer changes the theme they want to use and saves that, the web server cache does not yet reflect the change and so will continue to deliver the old theme until the customer data is evicted from the Redis cache. The customer, having changed theme, will probably then keep making requests, refreshing the page, wondering why the theme hasn’t changed – because each request is keeping the old data in the cache! The problem is magnified because a stale cache entry for the customer could, potentially, now exist on more than one Redis instance – the downside of using multiple Redis instances.

But there is a solution. This is the kind of problem messaging is great at solving. Enter ZeroMQ.

The task of saving the customer record update is done by the service (or data) layer, so it has the responsibility of informing any component that needs to know the customer data is now potentially out-of-date. But how does the service layer know what components to send the message to? (There could be multiple web servers on-line caching the data). The answer is that the service layer server does not need to know. It’s not its responsibility. It just needs to send the message – fire and forget.

ZeroMQ is a lightweight messaging option. We could use something like RabbitMQ which can be configured for guaranteed end-to-end messaging, etc., but if the message being sent isn’t mission critical you can decide to trade reliability for performance. ZeroMQ is blazingly fast. MSMQ is another option, but it is slower, and configuring and testing it is more of a pain than using a lightweight, embedded messaging component like ZeroMQ.

To handle the notification to multiple web servers I use the pub-sub messaging model. Basically, one web server instance (the primary server) can be set up as the messaging hub. Yes, it is a single point of failure, but again, these messages aren’t mission critical. You could use a more elaborate message broker set-up with redundancy and message storage but that means trading performance. Let’s look at the ZeroMQ pub-sub implementation in practice.

[Figure: business tier publishing change messages through the pub-sub proxy on the primary web server]

We’ll use the Pub-Sub Proxy Pattern to handle the registration of web servers and forwarding of messages to them. As a web server comes on-line, it connects as a message subscriber to the XPUB socket on the primary web server (which is configured to listen). When a service tier server publishes a change message the NetMQ proxy (or hub) sends the message on to all subscribers. Each subscriber simply checks the contents of the message to see if the customer id is one it is holding in its Redis cache. If so, it refreshes the entry immediately.
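
The subscriber-side check can be sketched independently of the messaging library. The snippet below is an illustration under assumptions: the message is taken to be three UTF-8 frames (topic, organization ID, serialized data), a plain dictionary stands in for the local Redis cache, and OrgChangeHandler and Apply are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class OrgChangeHandler
{
    // Applies a three-frame change message ("ORG", organization ID, serialized data)
    // to the local cache. Returns true only if an existing entry was refreshed.
    public static bool Apply(IList<byte[]> frames, IDictionary<string, string> localCache)
    {
        if (frames == null || frames.Count != 3) return false;
        if (Encoding.UTF8.GetString(frames[0]) != "ORG") return false;

        string orgId = Encoding.UTF8.GetString(frames[1]);

        // Not cached on this server: nothing to refresh, so ignore the message
        if (!localCache.ContainsKey(orgId)) return false;

        localCache[orgId] = Encoding.UTF8.GetString(frames[2]);
        return true;
    }
}
```

Ignoring messages for organizations that are not cached locally keeps each server's cache limited to the customers it actually serves.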

ZeroMQ is a native library (written in C++ with a C API), so you’ve got (at present) two choices for using it in .NET. You can use clrzmq, which is a managed DLL wrapper around the unmanaged ZeroMQ library, or you can use NetMQ, which is a pure C# implementation of the ZeroMQ functionality. At the time of writing NetMQ is not yet considered production ready, so it’s your call which to use – managed code that is not production ready but easier to debug, or unmanaged code that will be harder to debug and is potentially open to memory leaks.

Thankfully, NetMQ has an implementation of the Proxy pattern ready built for us.

Here is a sample of the proxy code. Typically this would be run as a separate process or service on the primary web server, or you could run it as a Task or Thread in the main web app (but there are startup/shutdown issues involved which I won’t go into here.)

private void MessagingProxyTaskFunc()
{
    //Use the common context to create the mq sockets - created earlier and stored on the AppDomain
    NetMQContext cxt = (NetMQContext)AppDomain.CurrentDomain.GetData("WB_NetMQContext");

    using (NetMQSocket frontend = cxt.CreateXSubscriberSocket(), backend = cxt.CreateXPublisherSocket())
    {

        frontend.Bind("tcp://*:9100"); //Receive published messages on this server, port 9100
        backend.Bind("tcp://*:9101");  //Subscribers will connect to this server, port 9101, listening for forwarded messages

        //Create the proxy. It forwards published messages from the frontend to the backend and
        //passes subscription requests back the other way, so no manual receive/send loop is
        //needed here (a second reader on the frontend socket would compete with the proxy).
        NetMQ.Proxy proxy = new NetMQ.Proxy(frontend, backend, null);

        //Start() is expected to block this thread while the proxy runs, which is why this
        //function gets its own Task or process. Check the behaviour in your NetMQ version.
        proxy.Start();
    }
}

Next we need the business service to publish the message when the customer data changes:

public Organization SaveOrganization(Organization org)
{
     //Do data store logic here
     
     ...

     if (hasOrgSubdomainChanged)
     {
          //Get the publisher socket, created when the business service was created using:
          //NetMQSocket socket = cxt.CreatePublisherSocket();
          //socket.Connect("tcp://<Primary Web Server IP Address>:9100");

          NetMQSocket socket = (NetMQSocket)AppDomain.CurrentDomain.GetData("WB_PubSocket");

          //Three frames: the "ORG" topic, the organization ID, and the serialized organization
          NetMQMessage msg = new NetMQMessage();
          msg.Append(new NetMQFrame(Encoding.UTF8.GetBytes("ORG")));
          msg.Append(new NetMQFrame(Encoding.UTF8.GetBytes(org.PublicID)));
          msg.Append(new NetMQFrame(Encoding.UTF8.GetBytes(org.Serialize())));
          socket.SendMessage(msg);
     }

     return org;
}

Finally, the code for the Message Listener on each individual web server. Again, this function needs to run as its own process/thread to avoid blocking and ensure timely response to messages:

private void MessagingTaskFunc()
{
    NetMQContext cxt = (NetMQContext)AppDomain.CurrentDomain.GetData("WB_NetMQContext");

    using (NetMQSocket socket = cxt.CreateSubscriberSocket())
    {
        socket.Subscribe(Encoding.UTF8.GetBytes("ORG")); //Subscriber only listens for certain message header
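        //Connect to the proxy's XPUB port. 127.0.0.1 only works on the primary web server itself;
        //other web servers should use the primary server's address here.
        socket.Connect("tcp://127.0.0.1:9101");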

        while (true) 
        {
           if (taskCancelToken.IsCancellationRequested) break;

           NetMQMessage data = null;
           try
           {
              data = socket.ReceiveMessage(); //This blocks until message received. data is null if interrupted.

              if (data == null) break;
              else
              {                            
                 data.Pop(); //Pop first message frame - will always be "ORG"
                 //Get the next message frames which should contain the ID of organization, and the data
                 NetMQFrame frame = data.Pop();
                 string orgID = Encoding.UTF8.GetString(frame.Buffer);

                 //Check that the organization's ID is one cached in Redis. If so, refresh Redis data using 
                 //last message frame data.

                 ...

               }
           }
           catch
           {
               // Handle subscription receive error gracefully - ensure listener loop keeps running;  
           }
         }
      }
 }

There you have it – an architecture for scalable, synchronized, custom subdomains.

.NET Scalable Server Push Notifications with SignalR and Redis

Modern web applications sometimes need to notify a logged-in user of an event that occurs on the server. Doing so involves sending data to the browser when the event happens. This is not easily achieved with the standard request-response model used by the HTTP protocol. A notification to the browser needs what’s known as “server push” technology. The server cannot “push” a notification unless there is an open, dedicated connection to the client. HTML5 capable client browsers provide the WebSocket mechanism for this, but it is not widely available yet. Most browsers need to mimic push behavior, such as by using a long-polling technique in JavaScript, where the browser makes an AJAX-style request that the server holds open until it has something to send (or a timeout passes), after which the browser immediately makes the next request.

To reduce the complexity of coding for the different browser capabilities the excellent SignalR library is available to use in .NET projects – it allows for the transport mechanisms mentioned, and some others. It automatically selects the best (read: performant) transport for the capabilities of the given browser & server combination. Crucially, it provides a means to configure itself so the developer can optimize it for performance and scalability. Using it for server initiated notifications is a “no-brainer”.

Here’s an example of how to set up such a notification mechanism.

To begin with, install required libraries into the project using NuGet.

PM> Install-Package Microsoft.AspNet.SignalR
PM> Install-Package ServiceStack.Redis
PM> Install-Package Microsoft.AspNet.SignalR.Redis

You can see that Redis is used too. This is to allow for web farm scaling. SignalR uses Redis as a backplane: a message sent from one web server is relayed through Redis to the other web servers, so it reaches the user’s connections no matter which web server they are held on. We will also use Redis to store each user’s list of SignalR connections, so the list is available and synchronized across the farm. This can be achieved (depending on architectural demands) using just one Redis server instance, or by running multiple replicated Redis server instances (this is outside the scope of this example, but it’s easy to set up).

Next configure SignalR to use Redis as the backing store and map the signalr route. This is done as part of RegisterRoutes (Global.asax.cs).

public static void RegisterRoutes(RouteCollection routes)
{
      //Use redis for signalr connections - set redis server connection details here
      GlobalHost.DependencyResolver.UseRedis("localhost", 6379, null, "WBSignalR");

      // Register the default SignalR hubs route: ~/signalr
      // Has to be defined before default route, or it is overridden 
      RouteTable.Routes.MapHubs(new HubConfiguration { EnableDetailedErrors = true });

      //All other mvc routes are defined here            
}

A SignalR Hub subclass is needed to contain the server side code that both the SignalR client and server will use.

public class NotificationHub : Hub
{
}

We also use this class to keep the server aware of the open SignalR connections and – more importantly – which connections relate to which user. The events on the Hub class allow us to keep this up-to-date connection list.

There’s a lot to consider in the code for this class. The full code can be downloaded – NotificationHub.cs. Let’s look at it piece-by-piece.

The first thing is the nested ConnectionDetail class that is used to store the details of the connection in Redis.

[ProtoBuf.ProtoContract]
public class ConnectionDetail
{
    public ConnectionDetail() { }

    [ProtoBuf.ProtoMember(1)]
    public string ConnectionId { get; set; }

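    public override bool Equals(object obj)
    {
        if (obj == null) return false;
        if (obj.GetType() != this.GetType()) return false;

        //string.Equals copes with a null ConnectionId on either side
        return string.Equals(((ConnectionDetail)obj).ConnectionId, this.ConnectionId);
    }

    //Keep GetHashCode consistent with Equals
    public override int GetHashCode()
    {
        return ConnectionId == null ? 0 : ConnectionId.GetHashCode();
    }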
}

This class only has one property – the SignalR ConnectionId string. It is better to use a class instead of just the connection id string because we can extend it to store other detail about the connection that later on might affect what message we send, or how it should be treated on the client. For example we could record and store the type of browser associated with the connection (mobile, etc.)

The Equals implementation is needed to check if the connection object is already part of the user’s connection collection or not.

To store the connection detail object in Redis it will be serialized to a byte array using protocol buffers – hence the ProtoBuf attributes. Protocol buffers are a highly performant way of serializing/deserializing data. If you’re not familiar with protobuf-net, you really should check it out.

Next, we use the ServiceStack.Redis client to make all calls to Redis to store the list of connections per user. This is fairly trivial to set-up.

private RedisClient client;

public NotificationHub()
{
    client = new RedisClient();   //Default connection - localhost:6379
}

The connection to Redis is made when we want to add or remove a connection from the user’s connection list. Two methods provide that functionality – AddNotificationConnection and RemoveNotificationConnection. They are very similar, so I’ll just explain the first one.

public void AddNotificationConnection(string username, string connectionid)
{
    string key = String.Format("{0}:{1}", REDIS_NOTIF_PREFIX, username);

       client.Watch(key);
       try
       {
            List<ConnectionDetail> list = new List<ConnectionDetail>();
            byte[] data = client.Get(key);
            if (data != null)
            {
                using (MemoryStream input = new MemoryStream(data))
                {
                    list = ProtoBuf.Serializer.Deserialize<List<ConnectionDetail>>(input);
                }
            }
            ConnectionDetail cdetail = new ConnectionDetail() { ConnectionId = connectionid };
            if (!list.Contains(cdetail))
            {
                list.Add(cdetail);
            }
            using (MemoryStream output = new MemoryStream())
            {
                ProtoBuf.Serializer.Serialize<List<ConnectionDetail>>(output, list);
                data = output.ToArray();
            }

            using (var t = client.CreateTransaction())
            {
                t.QueueCommand(c => c.Set(key, data));
                //Commit returns false if the watched key was changed by another client
                //in the meantime; in that case this update is not applied.
                t.Commit();
            }
        }
        finally
        {
            client.UnWatch();
        }
}

The code looks for data in Redis under a unique key which is a combination of the constant prefix and the username. It is keyed this way because we can do a fast key lookup, then watch and update a small block of data in a transaction, and so keep the operation atomic, maintaining integrity of the user’s connection list in an environment where the user could open a new connection via a different web server at any time. Note that WATCH is an optimistic check rather than a lock: if another client changes the key before the transaction commits, the commit fails instead of overwriting the other change. Keying it on one user, rather than storing a list of connections for all users under one key, also avoids creating contention bottlenecks at scale.
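
Because a watched transaction can fail when two servers update the same user at once, a retry loop is a natural extension. The sketch below shows the read, modify, conditional-write pattern with an in-memory ConcurrentDictionary standing in for Redis. It is an analogy rather than ServiceStack.Redis code, and OptimisticUpdate, AddConnection, GetConnections and maxAttempts are names invented for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class OptimisticUpdate
{
    // Stand-in for Redis: key -> immutable snapshot of the user's connection ids
    private static readonly ConcurrentDictionary<string, string[]> Store =
        new ConcurrentDictionary<string, string[]>();

    // Adds connectionId to the list stored under userKey, retrying if another
    // writer changed the list between our read and our write.
    public static bool AddConnection(string userKey, string connectionId, int maxAttempts = 5)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            // Read the current value (the WATCH + GET step)
            string[] current;
            bool exists = Store.TryGetValue(userKey, out current);

            if (exists && Array.IndexOf(current, connectionId) >= 0) return true; // already stored

            var updated = new List<string>(exists ? current : new string[0]);
            updated.Add(connectionId);
            string[] next = updated.ToArray();

            // Conditional write (the EXEC step): only succeeds if the value we read
            // is still the one in the store
            bool committed = exists
                ? Store.TryUpdate(userKey, next, current)
                : Store.TryAdd(userKey, next);

            if (committed) return true;
        }

        return false; // gave up after maxAttempts conflicts
    }

    public static string[] GetConnections(string userKey)
    {
        string[] current;
        return Store.TryGetValue(userKey, out current) ? current : new string[0];
    }
}
```

With Redis the equivalent would be to repeat the watch, read and commit sequence while the commit reports failure, up to some sensible limit.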

Next, we use the connection events of the Hub class to maintain the user’s list, e.g.:

public override Task OnConnected()
{
    string Username = GetConnectionUser();

    if (Username != null)
    {
        AddNotificationConnection(Username, Context.ConnectionId);
    }

    return base.OnConnected();
}

It’s fairly simple – the ConnectionId is taken from the Hub Context object and stored. The main issue here is how to get the user name associated with the connection. The usual HttpContext.User is not available in the SignalR Hub implementation. SignalR uses OWIN for its HTTP pipeline, not the usual MVC pipeline, and one of the consequences of this is that SignalR does not load the session (based on the session cookie). However, the browser cookies are sent with the SignalR request. In this case, I use FormsAuthentication in the web application, so the user’s name is stored encrypted in the ticket when the user logs in. GetConnectionUser gets this data from the FormsAuthentication cookie.

private string GetConnectionUser(){
    if (Context.RequestCookies.ContainsKey(FormsAuthentication.FormsCookieName))
    {
        string cookie = Context.RequestCookies[FormsAuthentication.FormsCookieName].Value;

        FormsAuthenticationTicket ticket = FormsAuthentication.Decrypt(cookie);
        return ticket.UserData;
    }

    return null;
}

The final piece of the Hub code is the function that actually sends the message to the user’s client browser sessions. It will invoke the corresponding receiveNotification function in JavaScript on the client.

public bool SendNotificationToUser(string username, string message){

    List<ConnectionDetail> list = GetNotificationConnections(username);
           
    foreach(ConnectionDetail detail in list){
        Clients.Client(detail.ConnectionId).receiveNotification(message);
    }

    //Report whether the user had any open connections to notify
    return list.Count > 0;
}

To test this, we will call it from a controller action from a test page.

[Figure: notification test page]

[HttpPost]
public ActionResult NotfTest(string touser, string message)
{            
    var hubConnection = new HubConnection("http://localhost/SignalR.Notification.Sample");
    IHubProxy hubProxy = hubConnection.CreateHubProxy("NotificationHub");
    try
    {
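        //Async call, 2s wait - should use await in C# 5
        if (!hubConnection.Start().Wait(2000))
        {
            throw new TimeoutException("Could not connect to the SignalR hub within 2 seconds.");
        }

        //Invoke is also asynchronous; wait for it to finish before the connection is stopped below
        hubProxy.Invoke("SendNotificationToUser", new object[] { touser, message }).Wait(2000);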
    }
    finally
    {
        hubConnection.Stop();
    }
    return View("NotfTestSent");
}

The call is made by the server creating a SignalR hub connection of its own and then sending a request to the Hub’s SendNotificationToUser function (similar to an RPC call).

That’s all the server side code, now for the client side.

To use the client side features of SignalR, we need to include the signalr javascript file, and the server-side generated hubs javascript.

How you want to display the notification in the browser is application-dependent, and so up to you. For this, I use the jQuery qTip plugin to show it as a tooltip pop-up.

<html>
<head>
    <!-- Add Script includes -->
    <script src="http://cdnjs.cloudflare.com/ajax/libs/qtip2/2.1.1/jquery.qtip.min.js" type="text/javascript"></script>
    <script src="@Url.Content("~/Scripts/jquery.signalR-1.1.2.js")" type="text/javascript"></script>
    <script src="@Url.Content("~/signalr/hubs")" type="text/javascript"></script>
</head>

Near the end of the HTML page (or near the end of the template page HTML), some JavaScript defines the client-side receiveNotification function that handles the display of the message, and makes the connection to the hub once the page has loaded.

<script type="text/javascript">

    //Run once the page has loaded
    $(function () {

        // Declare a proxy to reference the server-side signalr hub class. 
        var notfHub = $.connection.notificationHub;

        //Link a client-side function to the server hub event.
        //This must be done before the connection is started, otherwise the client is not subscribed to the hub.
        notfHub.client.receiveNotification = function (message) {

            //Use qtip library to show a tooltip message
            $('#message-icon').qtip({
                content: {
                    text: message,
                    title: 'Notification',
                    button: true
                },
                position: {
                    at: 'top right',
                    my: 'bottom left'
                },
                show: {
                    delay: 0,
                    ready: true,
                    effect: function (offset) {
                        $(this).fadeIn(250);
                    }
                }
            }).show();
        };

        //Make a connection to the server hubs
        $.connection.hub.start();

    });
</script>

Voila. Server side push notification to any number of users, no matter how many places each is logged-in, and whatever browser they use.

[Figure: notification pop-up shown in the browser]