Thursday, March 17, 2011

Speeding up Kahlua

Earlier post: Scriptable Object Cache

The first implementation of the Lua-based HTTP proxy was fairly slow. On a 1.6 GHz quad-core Intel processor, it could proxy about 1K requests per second. After the changes described below, the final number was about 15K messages per second with 5 KB requests and 5 KB responses.

The first problem was LuaTableImpl, which is very slow. The problem is that a Lua table is both a hash map and a list, with a single iterator. One of the methods in the Lua table interface is:


Object next(Object key);

Hack number one is to use the default Java HashMap instead of the built-in LuaTableImpl. Care is needed to make sure that "array" access and "next" still work.
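To make the "next" problem concrete, here is a minimal sketch of a HashMap-backed table. It is not the Kahlua code: the class name HashMapLuaTable and its methods are made up for illustration, and it assumes numeric keys should be normalized to Double so that 1 and 1.0 address the same "array" slot. The cached iterator keeps a normal pairs-style loop cheap; any other call to next falls back to a linear scan.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in for a Lua table backed by java.util.HashMap. */
final class HashMapLuaTable {
    private final Map<Object, Object> map = new HashMap<>();
    // Cached iteration state so that a sequential traversal is O(1) per step.
    private Iterator<Object> iterator;
    private Object lastKey;

    /** Assumption: numbers are stored as Double so 1 and 1.0 share a slot. */
    private static Object normalize(Object key) {
        return (key instanceof Number)
                ? Double.valueOf(((Number) key).doubleValue())
                : key;
    }

    public void rawset(Object key, Object value) {
        Object k = normalize(key);
        if (value == null) {
            // Assigning nil removes the entry; a removal invalidates the iterator.
            if (map.remove(k) != null) iterator = null;
        } else if (map.put(k, value) == null) {
            // A brand-new key is a structural change; overwriting is not.
            iterator = null;
        }
    }

    public Object rawget(Object key) {
        return map.get(normalize(key));
    }

    /** Returns the key that follows the given key, or null at the end. */
    public Object next(Object key) {
        if (key == null) {
            iterator = map.keySet().iterator(); // start a new traversal
        } else {
            Object k = normalize(key);
            if (iterator == null || !k.equals(lastKey)) {
                // Slow path: the caller is not continuing the cached traversal,
                // so scan from the start until the given key is found.
                iterator = map.keySet().iterator();
                boolean found = false;
                while (iterator.hasNext()) {
                    if (iterator.next().equals(k)) { found = true; break; }
                }
                if (!found) throw new IllegalArgumentException("invalid key to next");
            }
        }
        lastKey = iterator.hasNext() ? iterator.next() : null;
        return lastKey;
    }
}
```

One limitation of this sketch: removing an entry in the middle of a traversal makes the following call to next fail, whereas Lua itself allows clearing fields while iterating.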

Problem number two is the overhead of creating the runtime. In my case every request needs its own runtime, a LuaState object associated with that request. This was very expensive. Here is what needs to be done to create a LuaState and set it up for the proxy code to execute.

String     fileName = "/luaTestCode/test.lua";
LuaState   state    = new LuaState(System.out);
File       luaFile  = new File(fileName);
LuaClosure closure  = LuaCompiler.loadis(new FileInputStream(luaFile),
    luaFile.getName(), state.getEnvironment());
state.call(closure, null);
LuaConverterManager manager = new LuaConverterManager();
LuaNumberConverter.install(manager);
LuaTableConverter.install(manager);
LuaJavaClassExposer exposer = new LuaJavaClassExposer(state, manager);
exposer.exposeClass(LuaContext.class);
exposer.exposeClass(LuaHttpRequest.class);
exposer.exposeClass(LuaHttpResponse.class);

Object proxy = state.getEnvironment().rawget("proxy_function");
Object value = state.call(proxy, null);


As you can see, there is a file read, followed by compilation, followed by some reflection code that sets up the Java functions in the Lua runtime. All of this is expensive and was being done for every message. To avoid it, I used a couple of hacks.

The first was to avoid compiling again and again. This one was simple. The output of the compilation step is a LuaClosure. I changed the code so that instead of a LuaClosure I get the LuaPrototype, which is the real output of compilation. A prototype can be reused to create as many LuaClosure objects as needed without paying the price of file access and compilation.

The second hack was to avoid the setup cost of the LuaState object. Digging into the code revealed that a LuaState (without the stack) is nothing but the environment, which is a LuaTable. So I added a clone method to the LuaTable implementation and used it to cheaply construct new LuaStates that have all the Java classes correctly exposed. The new code looked something like this.

public class LuaContext {
    private LuaState     state;
    private LuaClosure   closure;
    private Object       coroutine;

    // Built once and shared: the base state with all Java classes exposed.
    private static final LuaState            BASE;
    private static final LuaConverterManager MANAGER;
    private static final LuaJavaClassExposer EXPOSER;
    static {
        BASE    = new LuaState(System.out);
        MANAGER = new LuaConverterManager();
        LuaNumberConverter.install(MANAGER);
        LuaTableConverter.install(MANAGER);
        EXPOSER = new LuaJavaClassExposer(BASE, MANAGER);
        EXPOSER.exposeClass(LuaContext.class);
        EXPOSER.exposeClass(LuaHttpRequest.class);
        EXPOSER.exposeClass(LuaHttpResponse.class);
    }

    public LuaContext(LuaPrototype proto)
            throws IOException {
        // clone() is the method I added; it copies the prepared environment.
        this.state         = BASE.clone();
        // Reuse the compiled prototype; only the closure is new per request.
        this.closure       = new LuaClosure(proto, state.getEnvironment());
        this.state.call(this.closure, null);
        this.coroutine     = this.state.getEnvironment().rawget("proxy_function");
    }
}

That is it. After adding support for storing state (Lua objects), locks, and some basic JSON, I was able to write some cool proxy features in Lua itself.
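The two hacks above boil down to "compile once, copy a prepared environment per request". Here is a small sketch of that pattern on its own. It does not use Kahlua: Prototype, compileOnce, and newEnvironment are hypothetical stand-ins, and a plain HashMap plays the role of the Lua environment table.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: Prototype and the environment map are stand-ins, not Kahlua types. */
final class PrototypeReuseSketch {
    /** Stand-in for the immutable result of compilation (LuaPrototype in the post). */
    static final class Prototype {
        final String name;
        final String source;
        Prototype(String name, String source) {
            this.name = name;
            this.source = source;
        }
    }

    private static final Map<String, Prototype> COMPILED = new HashMap<>();
    private static final Map<String, Object> BASE_ENV = new HashMap<>();
    static int compileCount = 0; // exposed only so the effect of caching is visible

    static {
        // Done once: stands in for the expensive step of exposing Java classes.
        BASE_ENV.put("LuaHttpRequest", "exposed-class-placeholder");
        BASE_ENV.put("LuaHttpResponse", "exposed-class-placeholder");
    }

    /** Compiles a script the first time it is seen and reuses the result afterwards. */
    static synchronized Prototype compileOnce(String name, String source) {
        Prototype p = COMPILED.get(name);
        if (p == null) {
            compileCount++; // a real implementation would read and compile here
            p = new Prototype(name, source);
            COMPILED.put(name, p);
        }
        return p;
    }

    /** Per-request environment: a shallow copy of the prepared base environment. */
    static Map<String, Object> newEnvironment() {
        return new HashMap<>(BASE_ENV);
    }

    public static void main(String[] args) {
        Prototype first  = compileOnce("test.lua", "function proxy_function() end");
        Prototype second = compileOnce("test.lua", "function proxy_function() end");
        Map<String, Object> env = newEnvironment();
        env.put("requestLocal", 42); // stays private to this request's copy
        System.out.println(first == second);
        System.out.println(newEnvironment().containsKey("requestLocal"));
    }
}
```

Note that a shallow copy shares any nested objects with the base environment, so a script that mutates a shared nested table would affect other requests. Whether that matters depends on what the scripts are allowed to touch.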

Here is how I could handle redirects on the server side:

function getServerAndPath(response)
    local newURL = response:getHeader("Location")
    local server = nil
    local path   = nil
    if (newURL ~= nil) then
        local h1, h2 = string.find(newURL, "http://")
        local s1, s2 = string.find(newURL, "/", h2+1)
        server       = string.sub(newURL, h2+1, s1-1)
        path         = string.sub(newURL, s1)
    end
    return server, path
end


function isRedirectResponseCode(code)
    if (code == 301) then return true end
    if (code == 302) then return true end
    if (code == 303) then return true end
    if (code == 307) then return true end
    return false
end

function proxy(context, httpRequest)
    local targetServer = "yahoo.com"
    local path         = httpRequest:getUri()
    local responseCode = 301
    local httpResponse = nil

    while (isRedirectResponseCode(responseCode)) do
        httpRequest:setHeader("Host", targetServer)
        httpRequest:setUri(path)
        httpResponse = luaSendRequestAndGetResponse(context, httpRequest)
        responseCode = httpResponse:getStatus()
        print ("got response with status " .. responseCode)
        targetServer, path = getServerAndPath(httpResponse)
    end
    luaSendResponse(context, httpResponse)
end
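The string handling in getServerAndPath only works for an absolute http:// Location that has a path after the host. If that parsing were moved to the Java side of the proxy, java.net.URI could do the work. The following is a sketch under that assumption: RedirectTarget and parse are hypothetical names, not part of the proxy described here.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper: splits a Location header into server and path. */
final class RedirectTarget {
    final String server; // host, plus ":port" when a port is given
    final String path;   // path plus query string, never empty

    private RedirectTarget(String server, String path) {
        this.server = server;
        this.path = path;
    }

    /** Returns null when the header is missing, malformed, or not an absolute URL. */
    static RedirectTarget parse(String location) {
        if (location == null) return null;
        URI uri;
        try {
            uri = new URI(location.trim());
        } catch (URISyntaxException e) {
            return null;
        }
        String host = uri.getHost();
        if (host == null) return null; // relative redirect or unsupported form
        String server = (uri.getPort() == -1) ? host : host + ":" + uri.getPort();
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) path = "/"; // e.g. "http://example.com"
        if (uri.getRawQuery() != null) path = path + "?" + uri.getRawQuery();
        return new RedirectTarget(server, path);
    }

    public static void main(String[] args) {
        RedirectTarget t = parse("http://www.yahoo.com/news?page=2");
        System.out.println(t.server + " " + t.path);
    }
}
```

A caller would stop following redirects when parse returns null instead of failing on an unexpected URL form.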


Here is another example, this time doing least-connections server load balancing.

-- simple lb implementation 

function newLB() 
  local  object      = {}
  object.lock        = NewLuaLock()
  object.connections = {}
  return object
end

function addServer(object, serverName) 
    object.lock:lock()
    if (object.connections[serverName] == nil) then
        object.connections[serverName] = 0
    end
    object.lock:unlock()
end

function findLeastUsedServer(object)
    local min    = 64 * 1024
    local result = nil

    object.lock:lock()
    for k,v in pairs(object.connections) do
        if (v < min) then
            min    = v
            result = k
        end
    end
    -- result stays nil when no server qualifies; indexing with nil is an error
    if (result ~= nil) then
        object.connections[result] = min + 1
    end
    object.lock:unlock()
    return result
end

function decreaseServerCount(object, server)
    object.lock:lock()
    if (object.connections[server] ~= nil) then
        if (object.connections[server] > 0) then
            object.connections[server] = object.connections[server] - 1
        end
    end
    object.lock:unlock()
end

--------------------------------------------------------------

function getLBResource()
    local objectMap  = getObjectMap()
    local lbResource = objectMap:get("lb")

    if (lbResource == nil) then
        lbResource = newLB()
        addServer(lbResource, "google.com")
        addServer(lbResource, "yahoo.com")
        objectMap:put("lb", lbResource)
    end
    return lbResource
end


function proxy(context, httpRequest)
    local lbResource = getLBResource()
    local server     = findLeastUsedServer(lbResource)
    httpRequest:setHeader("Host", server)
    local response   = luaSendRequestAndGetResponse(context, httpRequest)
    decreaseServerCount(lbResource, server)
    luaSendResponse(context, response)
end
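For comparison, the same least-connections bookkeeping can be sketched in plain Java. This is not what the proxy runs: LeastConnectionsBalancer, acquire, and release are made-up names, and synchronized methods stand in for the explicit Lua lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical Java counterpart of the Lua load balancer above. */
final class LeastConnectionsBalancer {
    // Server name -> number of requests currently in flight.
    private final Map<String, Integer> connections = new LinkedHashMap<>();

    synchronized void addServer(String name) {
        connections.putIfAbsent(name, 0);
    }

    /** Picks the server with the fewest connections and counts the new one. */
    synchronized String acquire() {
        String best = null;
        int min = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : connections.entrySet()) {
            if (e.getValue() < min) {
                min = e.getValue();
                best = e.getKey();
            }
        }
        if (best != null) {
            connections.put(best, min + 1);
        }
        return best; // null when no server has been added
    }

    /** Gives the connection back; the count never drops below zero. */
    synchronized void release(String server) {
        connections.computeIfPresent(server, (k, v) -> v > 0 ? v - 1 : v);
    }

    public static void main(String[] args) {
        LeastConnectionsBalancer lb = new LeastConnectionsBalancer();
        lb.addServer("google.com");
        lb.addServer("yahoo.com");
        String server = lb.acquire();
        try {
            System.out.println("sending request to " + server);
        } finally {
            lb.release(server); // runs even if sending the request fails
        }
    }
}
```

The try/finally in main is the one design point worth copying: in the Lua version, a failure inside luaSendRequestAndGetResponse would skip decreaseServerCount and leave the counter too high.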

The final example shows how to filter all the junk information out of the Twitter timeline and cache the result for a while at the proxy.

function proxy(context, httpRequest)
    local globalCache     = getHttpResponseCache()
    local twitterResponse = globalCache:get("timeline")

    if (twitterResponse == nil) then
        local twitterRequest = NewLuaHttpRequest("HTTP/1.1", "GET", "/1/statuses/public_timeline.json")
        twitterRequest:setHeader("Host", "api.twitter.com")
        twitterResponse = luaSendRequestAndGetResponse(context, twitterRequest)

        local jsonArray    = NewLuaJSONArrayFromText(twitterResponse:getContent())
        local newJsonArray = NewLuaJSONArray()

        for i=0,(jsonArray:length()-1) do
            local current = jsonArray:getJSONObjectAtIndex(i)
            local text    = current:getString("text")
            newJsonArray:putString(text)
        end
        twitterResponse:setContent(newJsonArray:toString())
        -- cache for five minutes
        globalCache:put("timeline", 300, twitterResponse)
    end
    luaSendResponse(context, twitterResponse)
end
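The script above relies on a cache whose put takes a time-to-live in seconds. The post does not show how that cache is built, so the following is only a guess at the shape of such an object: TtlCache is a hypothetical name, and the injectable clock is an assumption added to make expiry easy to test.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical cache with per-entry expiry, mirroring put(key, seconds, value). */
final class TtlCache<V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final LongSupplier clock; // returns the current time in milliseconds

    TtlCache() {
        this(System::currentTimeMillis);
    }

    TtlCache(LongSupplier clock) {
        this.clock = clock;
    }

    void put(String key, long ttlSeconds, V value) {
        long expiresAt = clock.getAsLong() + ttlSeconds * 1000L;
        entries.put(key, new Entry<>(value, expiresAt));
    }

    /** Returns the cached value, or null when it is missing or has expired. */
    V get(String key) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        if (clock.getAsLong() >= e.expiresAtMillis) {
            entries.remove(key, e); // drop only the stale entry we looked at
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        TtlCache<String> cache = new TtlCache<>();
        cache.put("timeline", 300, "[\"filtered tweets\"]");
        System.out.println(cache.get("timeline"));
    }
}
```

Expired entries are removed lazily on lookup, which is enough for a handful of keys like "timeline"; a cache with many keys would also want a periodic sweep.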


Wednesday, March 09, 2011

Twitter Business Model

How will Twitter make money?

Two things they did are:

  • New website 
  • Promoted Tweets
Why promoted tweets? Yes, for money, but more importantly because they show up in every application that uses the Twitter API. Twitter apps don't run in Twitter's context; Twitter runs in the apps' context. Facebook, on the other hand, provides the context in which Facebook apps run. The important implication is who owns the screen real estate. In Facebook's case, it is Facebook. In Twitter's case, it is the Twitter apps. Hence any advertising mechanism from Twitter has to be part of the "content" served by the Twitter API. But even this doesn't work if clients can filter content.

So Twitter started by making people remove "twitter" and "tweet" from their app and website names. They purchased a few of the clients, set up revenue sharing with some others, and then added promoted tweets. All of this, along with the new website, is an effort to own the screen real estate, because fundamentally that is what you need if you want to show advertisements.

Here are some ideas on how Twitter can make money.
  • Speed - How much time does it take for a sent tweet to be displayed on the followers' screens? This could be immediate, or it could be delayed, possibly so much that it never reaches the intended person. Some people would pay for speed in sending, some would pay for speed in receiving, and some might pay for slowing it down.
  • Are all tweets equal? And is chronological order the best way of reading tweets? If someone can save my time by filtering and sorting them for me, I could pay for that. It could be the most "active" tweets (replies) or the most liked tweets (retweets), or it could be a person doing the filtering. Keep me connected but don't waste my time. Maybe organize them into tweetshots: collections of related tweets on a topic, from a person, or during a time interval. Outsource this problem, say via a Twitter proxy interface, and let developers innovate inside the Twitter platform instead of outside. Another way to expose the same functionality could be virtual users: @politics could be a tweet channel that filters tweets related to politics. Basically, instead of having a hard link between producer and consumer, make it a soft link.
  • Make Twitter a marketplace. Paid tweets: subscribe to Seth Godin's tweets for $1 per month. Maybe some people would find it worth the price. Similarly, Twitter could have tweets as advertisements, so I as a subscriber can sell my attention for some money. Let's say I price my attention at 20 cents per tweet. Anyone whom I am not following can send me a tweet by paying me 20 cents. My attention is cheap, but the CEO of a company might price theirs at $100.
  • Twitter Analytics - Instead of putting a single shortened URL in the tweet, auto-generate a different URL for each user. This gives Twitter a way to monitor exactly who read it and when, and what else they are reading.
  • Most broadcast media end up attracting spam: email, groups, etc. Twitter is in a unique position where it controls the medium and has power over the complete ecosystem. No one can tweet me until I follow them. Email sucks at this. A communication medium with a single point of control can solve the spam problem.
  • Many people use Twitter to complain in public. The purpose of the complaint is not to bring down the company's reputation but to make the company listen. Reputation management in social media is now almost a buzzword. Twitter can become the unified CRM for companies. #complaint >apple could be slowed down, giving the company a chance to handle the complaint before letting it spread. People would love it if it made communicating with companies easy, and companies would love it if they could manage the complaint instead of risking their reputation.
  • Twitter for computers. Twitter is a communication medium for people, but the same technology could be used for structured messages (JSON?). Twitter for rentals (need a place or want to rent out a place). Twitter for products (price changes and discounts). I guess the point is that instead of launching websites, providing APIs, or sending emails to other companies, some companies would benefit from just publishing that information and letting other companies consume it in a standard way. So if you are recruiting, just tweet it to a recruiting topic with a standard interface and let all the recruiters in the world help you. Or if you need a quote for 100 Lenovo laptops with 4 GB of RAM, just tweet it. In short, global structured-message pub/sub, using a standard Twitter client.
I like the company and would love to see it make money.