UPDATE: Scriptable object cache is now available Cacheismo
Over the last few months, I have become a fan of Lua, especially because of the coroutine support. In fact more than Lua, I have become great fan of Kahlua, the java implementation of the Lua Language. The first product of this fascination was a Lua Http Proxy. This is what it looks like:
function proxy(context, httpRequest)
local request = httpRequest
local request1 = NewLuaHttpRequest("HTTP/1.1", request:getMethod(), request:getUri())
local request2 = NewLuaHttpRequest("HTTP/1.1", request:getMethod(), request:getUri())
local requests = {request1, request2}
request1:setHeader("Host","10.10.8.76")
request2:setHeader("Host","10.10.8.76")
local responses = luaSendRequestAndGetResponseInParallel(context, requests)
local finalResponse = responses[1]
finalResponse:setContent(responses[1]:getContent() .. responses[2]:getContent())
luaSendResponse(context, finalResponse)
end
end
All this code does is that when it gets a request, it makes two parallel requests and then sends back concatenated response. Fairly simple ? No. To the naked eye this is a single threader blocking code, but when you combine this with the power of Lua coroutines you get a proxy which is as simple to write as single threaded blocking code, but underneath you have the full power of java non blocking IO. You can probably have 100K concurrent connections running this code concurrently using may be 4 or 8 threads. That is where simplicity meeds speed and speed meets power or flexibility. It will be trivial to build upon this to build a node.js alternative which is much much easier to code without loosing the speed and gaining platform independence in bonus.
Now to the main topic Scriptable Object Cache. As I mentioned in the last post about the java memcache, I wanted to add some Lua to it. The result was a cache which could contain object instead of bytes. This is where redis fits in, but what if you wanted to store a object specific to your needs. Here is how to implement Set functionality.
function new()
local hashset = {}
return hashset
end
function put(hashset, key)
hashset[key]=1
return "STORED"
end
function exists(hashset, key)
if (hashset[key] == 1) then
return "EXISTS"
end
return "NOT_FOUND"
end
function delete(hashset, key)
hashset[key] = nil
return "DELETED"
end
function count(hashset)
return #hashset
end
function getall(hashset)
local result = ""
for k,v in pairs(hashset) do
result = result .. k .. " "
end
return result
end
function union(hashset, hashset2)
local newhashset = {}
for k,v in pairs(hashset) do
newhashset[k] = 1
end
for k,v in pairs(hashset2) do
newhashset[k] = 1
end
return getall(newhashset)
end
function intersection(hashset, hashset2)
local toIter
local toLookup
local newHashSet = {}
if (#hashset > #hashset2) then
toIter = hashset2
toLookup = hashset
else
toIter = hashset
toLookup = hashset2
end
for k,v in pairs(toIter) do
if (toLookup[k] == 1) then
newHashSet[k] = 1
end
end
return getall(newHashSet)
end
function __init__(context)
context:registerConstructor("new")
context:registerMethod("exists")
context:registerMethod("put")
context:registerMethod("count")
context:registerMethod("delete")
context:registerMethod("getall")
context:registerParamsAsKeyMethod("union")
context:registerParamsAsKeyMethod("intersection")
end
Once you code or copy this the following interface exists:
> new set1 # creates a new set
> invoke set1 put a
> invoke set1 put b
> new set2
> invoke set2 put b
> invoke set2 put c
> invoke set1 union set2
I haven't written the client, so don't know what is the performance, but I wouldn't be surprised if it runs at about 50% the speed of memcached. But wait what about the speed benefits because of not having to do get and update? And by the way how do you update? Using CAS in a loop?
What we have done here is inverted the responsibility. Instead of doing stuff in the code and putting it in the cache, what we can do now is do the stuff atomically in the cache itself.
Benefits:
- Define you own objects using Lua scripts
- Base code is java so runs on any platform and OS
- No CAS in loop. CAS only tells you things have gone wrong. Here you have the flexibility of doing the right thing in the first place.
- No need to define key patterns to store stuff using different keys. Define your object instead. Define its methods and what they return. As a bonus all stuff on a given object is atomic.
- Performance. Updating a 1000 entry list in memcached would need getting the object, updating it and store using CAS. If fails repeat. Now just define a single function which does the update and you are done.
This is not complete yet. I plan to add persistence and simplify the interfaces a bit. Stay tuned.