Thursday, March 15, 2012

break the data: pump_databroker

Today I'm going to show a simple model for caching endpoint data (with an optional pubsub system for later in-page updates), a "databroker", so that a single page doesn't have to make multiple calls to the same endpoint.

This is one of those ideas where I think: maybe someone has already coded a better implementation of this, and I should embrace it and renounce my "NIH: Not Invented Here" ways. (Like the way I was using a bastardized version of cgi-lite.pl for CGI parsing in my Perl scripts long after better alternatives like CGI-lite.pm were around.) On the other hand, this implementation has pretty good visibility, a minimum number of "moving parts", and is nicely scoped down to the problems at hand.

At work, we were already doing a good job of using wrapper functions for jQuery's ajax calls, jsonGet and jsonGetSynch (for that nasty synchronous stuff). The main thing these functions did, besides simplifying the syntax, was enable a "keep alive" function, so that we could keep polling for notifications, but then stop polling after a given interval (so that the user's session wouldn't live forever just because they had the page open). So one design criterion was that the databroker 

So what does our databroker do? It stores the responses from ajax calls in a hashmap, keyed by the endpoint URL. (It also automatically adds a timestamp to each request to avoid browser caching issues.) If a request for a URL comes in and the databroker already has a cached response, that response is handed back immediately, rather than making another call. And if a pile of requests to a (possibly slow/long-running) URL comes in at once, only one actual call is made, and then all the callbacks get called, passed the same cached data.
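
To make the cache-plus-queue idea concrete, here is a minimal sketch of it. This is not the actual pump_databroker source: makeBroker, the `transport` parameter (standing in for a wrapper like jsonGet), and the `_ts` parameter name are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not the real pump_databroker code.
// transport(url, onSuccess, onError) stands in for an ajax wrapper like jsonGet.
function makeBroker(transport) {
  var cache = {};    // url -> cached response
  var pending = {};  // url -> callbacks waiting on an in-flight call

  function getJson(url, callback, errcallback) {
    if (cache.hasOwnProperty(url)) {
      callback(cache[url]);          // cached: answer without a server call
      return;
    }
    if (pending[url]) {              // call already in flight: just queue up
      pending[url].push({ ok: callback, err: errcallback });
      return;
    }
    pending[url] = [{ ok: callback, err: errcallback }];
    // A timestamp parameter (name assumed) keeps the browser from caching the GET.
    var sep = url.indexOf("?") === -1 ? "?" : "&";
    transport(url + sep + "_ts=" + Date.now(), function (res) {
      var waiting = pending[url];
      delete pending[url];
      cache[url] = res;              // keyed by the original URL, not the stamped one
      waiting.forEach(function (w) { w.ok(res); });
    }, function (err) {
      var waiting = pending[url];
      delete pending[url];           // nothing cached, so a later call can retry
      waiting.forEach(function (w) { if (w.err) { w.err(err); } });
    });
  }

  return { getJson: getJson };
}
```

Passing the transport in keeps the sketch independent of jQuery; in a real page it would presumably be the existing jsonGet wrapper, so the keep-alive behavior comes along for free.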

There's also an optional "pubsub" (publish/subscribe) model. A request can specify a channel/tag it wants to "subscribe" to, and then when other operations update related data they can call "refresh" on that tag, and the subscribed callbacks are called with fresh data from the server.
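
The subscribe/refresh part can be sketched on its own, too. Again, this is an illustration under assumptions rather than the real code: makeTaggedBroker and `transport` are made-up names, caching is left out for brevity, and errors during a refresh are simply ignored.

```javascript
// Hypothetical sketch of the tag-based pubsub idea (no caching here).
// transport(url, onSuccess, onError) stands in for an ajax wrapper like jsonGet.
function makeTaggedBroker(transport) {
  var subscribers = {};  // tag -> [{ url, callback }]

  function getJson(url, callback, errcallback, tag) {
    if (tag !== undefined) {
      // Remember who wants to hear about this tag, and which URL feeds them.
      (subscribers[tag] = subscribers[tag] || []).push({ url: url, callback: callback });
    }
    transport(url, callback, errcallback);
  }

  function refresh(tag) {
    var byUrl = {};  // group subscribers so each URL is re-fetched only once
    (subscribers[tag] || []).forEach(function (s) {
      (byUrl[s.url] = byUrl[s.url] || []).push(s.callback);
    });
    Object.keys(byUrl).forEach(function (url) {
      transport(url, function (res) {
        byUrl[url].forEach(function (cb) { cb(res); });
      });
    });
  }

  return { getJson: getJson, refresh: refresh };
}
```

Note that the tag is just a string chosen by the caller; nothing ties it to the URL being fetched, so whatever code changes the data only needs to know the tag to trigger the refresh.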

The implementation is here. It creates a global variable global_databroker. The syntax for the call is global_databroker.getJson(url, callback, errcallback, tag). errcallback and tag are optional. There's also a getJsonSynch for synchronous calls, and that refresh() call. (I think the behavior might be a bit off for the errcallback relative to the pubsub system.)

Because I am a bad person I did not write unit tests. I did write a test script, but since I haven't yet instrumented the calls, it relies on using Firebug or the like to see when calls are made; the "ccdebug" function in it prints to the console.

<script>
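$(document).ready(function () {
  // 0: profile, subscribed to "sub"; pass ccdebug itself (not ccdebug()) as the errcallback
  global_databroker.getJson("/rest/user/profile", function (res) {
    ccdebug("call profile 0 sub");
  }, ccdebug, "sub");
  // 1: ecommerce account, subscribed to "sub"
  global_databroker.getJson("/rest/ecommerce/account", function (res) {
    ccdebug("call ecommerce 1 sub");
  }, undefined, "sub");
  // 2: same URL, not subscribed; should share the request made for 1
  global_databroker.getJson("/rest/ecommerce/account", function (res) {
    ccdebug("call ecommerce 2 nosub");
  });
});

// 3: fired later from a button; should be answered from the cache
function FORE() {
  global_databroker.getJson("/rest/ecommerce/account", function (res) {
    ccdebug("call ecommerce 3 sub");
  }, undefined, "sub");
}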
</script>
<input type="button" value="refresh sub" onClick="global_databroker.refresh('sub')">
<input type="button" value="do ecommerce 3 with sub" onClick='FORE();'>
In the page I embedded these comments:
<pre>
So in firebug, you should see:
in "ready" 3 requests, with 2 to ecommerce account queued as one

0 and 1 subscribe w/ keyword "sub", so clicking "refresh sub"
causes the calls to be repeated, and 0 and 1 get called back

click "do ecommerce 3 with sub" -- you see it immediately gets the info 
without another call to the server

then if you hit refresh sub, 0 1 and 3 all get called back
</pre>
That's that! I think it works pretty well, and the implementation is reasonably compact. 


(In the scheme I cribbed this from, the developer thought it would be good to automagically use the REST URL as the tag (i.e. maybe just have a simple getJsonSubscribe() call). This would be especially cool if there was a 1:1 mapping between GET and POST URLs: then anything that POSTs information could call any subscriber! In practice, though, there isn't that 1:1 map. Not only that, there's often a disconnect where a POST URL ends up changing data retrieved by a totally different GET. That's why I like a generic tag system to decouple the input and the output to the server.)


Comments welcome. 
