
Exploring WebSockets and Evergreen II: Long-Lived Connections

If you’ll recall from the previous post [http://blog.esilibrary.com/2013/01/14/exploring-websockets-and-evergreen/], one benefit of Websockets is that they allow “messages to be passed back and forth while keeping the connection open”.  (http://en.wikipedia.org/wiki/WebSocket)

Fair warning, this post may get a little dry.  My goal is to make an informed decision, so I’m digging deep. 

The current Evergreen HTTP translator facilitates a series of many short conversations.  When a client wishes to issue a series of requests to the translator, each request is a new conversation.  Each conversation requires a round of TCP Setup and TCP Teardown.  Imagine a conversation with a friend where every sentence started with “hello, …” and finished with “…, goodbye”.  “Hello, my name is Bill, goodbye.  Hello, I like cheese, goodbye.  Hello, where’s the beef, goodbye?”

For reference: http://en.wikipedia.org/wiki/File:Tcp_state_diagram_fixed.svg

In contrast, a Websocket connection acts as a single long-running conversation.  (“Hello, my name is Bill.  I like cheese.  Where’s the beef?  Goodbye.”)  The diagram below shows how TCP Setup and Teardown affect two requests via the OpenSRF Translator vs a Websocket gateway.

In short, the translator requires more TCP Setup and Teardown as more requests are added.  This both increases the amount of network traffic and amplifies the effect of network latency on the client.
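
To make the contrast concrete, here is a rough browser-side sketch.  The endpoint path, URL, and payload handling are made up for illustration and are not the actual OpenSRF translator or gateway API.

    // Illustrative sketch only; the endpoint paths and payload handling are
    // hypothetical, not the real OpenSRF protocol.

    // Translator style: each request is its own HTTP conversation, which
    // (keepalive aside; more on that below) pays for TCP setup and teardown.
    function translatorRequest(payload: string, onDone: (body: string) => void): void {
      const xhr = new XMLHttpRequest();
      xhr.open("POST", "/osrf-http-translator");
      xhr.onload = () => onDone(xhr.responseText);
      xhr.send(payload);
    }

    // Websocket style: one long-lived conversation shared by every request.
    const socket = new WebSocket("wss://example.org/osrf-websocket"); // hypothetical URL
    function websocketRequest(payload: string): void {
      socket.send(payload); // no per-request TCP setup or teardown
    }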

A Theoretical Example

Let’s assume TCP Setup and Teardown each add one additional network round-trip per affected request.  (In reality, TCP Teardown can be more complicated.)  If the minimum possible network packet round-trip time between the client and server is 20ms (it’s 24ms to my development server in GA), then the baseline minimum cost for a traditional Evergreen translator request would be 60ms.

20ms (setup) + 20ms (request/response) + 20ms (teardown) = 60ms

In contrast, a typical request over a websocket connection, which does not require TCP setup and teardown, would cost at minimum 20ms.

This does not mean, of course, that websocket requests take ⅓ of the time overall.  The request/response packet delivery time will normally dwarf the TCP setup/teardown, since it contains larger payloads and requires time for the Evergreen server to produce a response.

The round-trip time for a more realistic request/response might be 250ms.  Running such a request through the translator would take 20 + 250 + 20 = 290ms.  Via websockets, it would be 250ms total, since the 20ms for setup and the 20ms for teardown are not needed.  Delivery via Websockets thus takes 25/29 as long (250ms vs 290ms), about a 14% reduction.  Similarly, the reduction for a 500ms request would be about 7%, and for a 1-second request about 4%.
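
For anyone who wants to check that arithmetic, here is the same back-of-the-envelope calculation written out in code.  The only inputs are the assumed 20ms round-trip and the example request times above.

    // Back-of-the-envelope check of the figures above.  Assumes one 20ms
    // round-trip each for TCP setup and teardown.
    const ROUND_TRIP_MS = 20;

    function translatorCost(requestMs: number): number {
      return ROUND_TRIP_MS + requestMs + ROUND_TRIP_MS; // setup + request/response + teardown
    }

    function websocketCost(requestMs: number): number {
      return requestMs; // the connection is already open
    }

    for (const requestMs of [250, 500, 1000]) {
      const saved = translatorCost(requestMs) - websocketCost(requestMs); // always 40ms here
      const reduction = (saved / translatorCost(requestMs)) * 100;
      console.log(`${requestMs}ms request: saves ${saved}ms, a ${reduction.toFixed(1)}% reduction`);
      // prints roughly 13.8%, 7.4%, and 3.8%
    }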

In addition to saving time, use of Websockets in the above examples reduces the number of network packets by 4 (two round trips).  Granted, they are small packets that individually have minimal effect on throughput, but in large numbers (e.g. hundreds of staff clients), the effect would be amplified.

One factor that tempers the latency advantage of Websockets over the translator is the Apache keepalive interval.  Out of the box, Evergreen uses a 1-second Apache keepalive.  This means that HTTP requests, including translator requests, that occur within one second of each other are piggy-backed on the same socket connection.  Each keepalive session requires only one TCP setup and teardown, so piggy-backed translator requests are very similar to websocket requests in this regard.
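
For reference, a 1-second keepalive looks roughly like the following in Apache terms.  The exact directives and values in a given Evergreen install may differ, so treat this as a sketch rather than the shipped configuration.

    # Sketch of a 1-second keepalive; not necessarily the stock Evergreen settings.
    KeepAlive On
    KeepAliveTimeout 1        # close an idle connection after 1 second
    MaxKeepAliveRequests 100  # how many requests may share one connection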

What percentage of translator requests are piggy-backed on an existing socket via keepalive?  Certainly it’s high, possibly a majority, and it depends on the interface.  Most interfaces load with a burst of requests.  Most of those requests, after the first few (one per allowable server connection, but I’ll get to that in a moment), will be piggy-backed.  After page load, though, most subsequent requests are not, since they occur after the 1-second timeout and thus require their own TCP setup/teardown.

To answer that question with any level of certainty, we must finally consider the maximum number of allowed server connections.  XMLHttpRequest enforces a limit on the number of open connections (and thereby the number of active requests) allowed.  The purpose is to prevent flooding the server with requests, which can cause the number of Apache processes to spike.  In the staff client, I believe the limit defaults to 8.  (It can be raised, but it would require having more Apache resources on hand.)  This means that for a batch of more than 8 requests, which happens on occasion, particularly during the initial staff client load, all requests that don’t fall into the first round of 8 are forced to wait until a slot opens before they can be sent to the Evergreen server.

In contrast, given the full-duplex nature of Websockets, there is no limit to the number of requests a client can send over a Websocket connection.  Each request is immediately delivered to the Evergreen server, regardless of the number of active requests already in flight.
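
As a sketch of what that could look like in practice, many requests can be in flight over the one connection at the same time, matched back to their callers by an identifier when the responses arrive.  The message shape and the “id” field below are assumptions for illustration, not the gateway’s actual protocol.

    // Illustration only: the message shape and "id" field are assumptions,
    // not the actual gateway protocol.
    const socket = new WebSocket("wss://example.org/osrf-websocket"); // hypothetical URL
    const pending = new Map<number, (response: unknown) => void>();
    let nextId = 0;

    function send(service: string, method: string, params: unknown[]): Promise<unknown> {
      return new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        // Every request goes out immediately; there is no 8-connection cap to wait on.
        socket.send(JSON.stringify({ id, service, method, params }));
      });
    }

    socket.onmessage = (event) => {
      const message = JSON.parse(event.data);
      const resolve = pending.get(message.id);
      if (resolve) {
        pending.delete(message.id);
        resolve(message);
      }
    };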

As a side note, I could imagine throttling requests to some degree in a Websocket gateway so that clients didn’t maliciously (or otherwise) blast thousands of requests at once.  Such a limit would be considerably larger than 8 and certainly higher than any number of simultaneous requests a client would use in practice.

A Slightly Less Theoretical Example

Consider an Evergreen interface that starts with a burst of 10 requests, followed by 3 sparsely placed requests.  Today, using the Translator, this would look like:

1. Requests 1 through 8 all require TCP setup, since the staff client knows it’s allowed to open 8 connections.

2. Requests 9 and 10 wait in line to use one of the open connections, but do not require their own TCP setup once a connection frees up.

3. When requests 1 through 10 are complete, the 8 connections are closed.  This results in 8 instances of TCP Teardown.

4. Requests 11, 12, and 13, assuming they each occur independently (i.e. non-piggy-backed), each require their own TCP Setup and Teardown.

Contrast that with a websocket implementation.

1. The socket is already open, having been opened earlier in the process.

2.  All requests are sent to the server as soon as the client issues them.

3.  There is no TCP Setup or Teardown for any of the requests.

For anyone counting, if we continue to assume 40ms of combined TCP Setup and Teardown overhead per connection, the websocket approach shaved 440ms (11 * 40) from the communication and required 22 (11 * 2) fewer network packet round-trips (in other words, 44 fewer individual network packets).
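
To double-check that count, here is a tiny model of the translator behavior described above.  It only tallies TCP setups and teardowns under the same simplifying assumptions; the burst size, the 8-connection cap, and the 20ms round-trip are carried over from the example.

    // Counts connection overhead for the example: a burst of 10 requests through
    // an 8-connection pool, followed by 3 independent (non-piggy-backed) requests.
    const ROUND_TRIP_MS = 20;
    const CONNECTION_CAP = 8;
    const burstRequests = 10;
    const independentRequests = 3;

    // The burst opens at most CONNECTION_CAP connections; each is set up once and
    // torn down once, no matter how many requests piggy-back on it.
    const burstConnections = Math.min(burstRequests, CONNECTION_CAP);

    // Each later, independent request pays for its own connection.
    const totalConnections = burstConnections + independentRequests; // 8 + 3 = 11

    const extraRoundTrips = totalConnections * 2;    // 22: one setup + one teardown each
    const extraMs = extraRoundTrips * ROUND_TRIP_MS; // 440ms
    console.log(`${totalConnections} connections, ${extraRoundTrips} extra round-trips, ${extraMs}ms of overhead`);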

In reality, not every one of those 440ms would be felt by the user.  Some occur in parallel and some occur at times when the user is between tasks.  Many would be noticed by the client, though, and all would have some effect on the network at large.  Certainly the cumulative effect would be remarkable.

There’s a lot going on here.  To me, the benefits of Websockets are obvious.  However, it’s not enough that they be better.  Applying a new core technology to a mature project like Evergreen, with many installations in the wild, requires that the new technology not only be better, but be worth the effort in the long run.

So, stay tuned.  In my next post, theory meets practice.  I’ll discuss code I’ve developed to implement a proof-of-concept Websocket gateway and client library, as well as ways we could use the code for real-world testing.