Sunday, April 08, 2012

Two Favorite Patterns in C

The first one is for error handling.


#define IfTrue(x, level, format, ... )          \
  do {                                          \
    if (!(x)) {                                 \
      LOG(level, format, ##__VA_ARGS__);        \
      goto OnError;                             \
    }                                           \
  } while (0)


This is the simplest way of simulating try/catch in C. Yes, it uses goto, which is widely considered bad programming practice, but it makes C code beautiful and understandable. Take a look at the code below for creating a server connection. These code samples are from cacheismo.

connection_t connectionServerCreate(u_int16_t port, char* ipAddress, connectionHandler_t* handler) {
    connectionImpl_t* pC = ALLOCATE_1(connectionImpl_t);
    IfTrue(pC, ERR, "Error allocating memory");
    pC->fd = socket(AF_INET, SOCK_STREAM, 0);
    /* socket() returns -1 on failure; 0 is a valid descriptor */
    IfTrue(pC->fd >= 0, ERR, "Error creating new socket");
    {
        int flags = fcntl(pC->fd, F_GETFL, 0);
        IfTrue(flags != -1, ERR, "Error reading socket flags");
        IfTrue(fcntl(pC->fd, F_SETFL, flags | O_NONBLOCK) == 0,
               ERR, "Error setting non blocking");
    }
    memset((char*) &pC->address, 0, sizeof(pC->address));
    pC->address.sin_family        = AF_INET;
    pC->address.sin_addr.s_addr   = INADDR_ANY;
    pC->address.sin_port          = htons(port);
    if (ipAddress) {
        pC->address.sin_addr.s_addr = inet_addr(ipAddress);
    }
    IfTrue(bind(pC->fd, (struct sockaddr *) &pC->address, sizeof(pC->address)) == 0, ERR, "Error binding");
    IfTrue(listen(pC->fd, DEFAULT_BACKLOG) == 0, ERR, "Error listening");
    pC->isServer = 1;
    pC->CH = handler;
    goto OnSuccess;
OnError:
    /* single cleanup path shared by every failed check above */
    if (pC) {
        connectionClose(pC);
        pC = 0;
    }
OnSuccess:
    return pC;
}

The code is linear. This avoids multiple exit points and repetitive error handling code. Less nesting of "if" blocks makes the code easy to follow. Error handling and cleanup happen at the end and are shared by every possible error in the function, which also means less code.
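To try the pattern outside cacheismo, here is a minimal self-contained sketch. Everything in it apart from the IfTrue idea is an assumption made for illustration: LOG is a stand-in that prints to stderr (not the real cacheismo logger), and record_t, recordCreate and recordDelete are hypothetical names. It relies on the GNU `##__VA_ARGS__` extension, which gcc and clang accept.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in logger: an assumption, not the cacheismo LOG macro. */
#define LOG(level, format, ...) \
    fprintf(stderr, "[%s] " format "\n", #level, ##__VA_ARGS__)

#define IfTrue(x, level, format, ...)            \
    do {                                         \
        if (!(x)) {                              \
            LOG(level, format, ##__VA_ARGS__);   \
            goto OnError;                        \
        }                                        \
    } while (0)

/* Hypothetical type used only for this illustration. */
typedef struct {
    char*  name;
    int*   values;
    size_t count;
} record_t;

/* Safe on NULL and on partially built records. */
void recordDelete(record_t* pR) {
    if (pR) {
        free(pR->name);
        free(pR->values);
        free(pR);
    }
}

record_t* recordCreate(const char* name, size_t count) {
    record_t* pR = calloc(1, sizeof(record_t));
    IfTrue(pR, ERR, "Error allocating record");
    IfTrue(name && name[0], ERR, "Record needs a non-empty name");
    IfTrue(count > 0, ERR, "Invalid count %zu", count);
    pR->name = malloc(strlen(name) + 1);
    IfTrue(pR->name, ERR, "Error allocating name");
    strcpy(pR->name, name);
    pR->values = calloc(count, sizeof(int));
    IfTrue(pR->values, ERR, "Error allocating values");
    pR->count = count;
    goto OnSuccess;
OnError:
    /* one cleanup path, no matter which check failed */
    recordDelete(pR);
    pR = NULL;
OnSuccess:
    return pR;
}
```

Because the struct is zeroed by calloc, the cleanup function can free whatever was allocated so far without knowing which step failed. That is what lets a single OnError label serve every check.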

The second pattern I use often is opaque objects.

typedef void* chunkpool_t;


chunkpool_t  chunkpoolCreate(u_int32_t maxSizeInPages);
void         chunkpoolDelete(chunkpool_t chunkpool);
void*        chunkpoolMalloc(chunkpool_t chunkpool, u_int32_t size);
void         chunkpoolFree(chunkpool_t  chunkpool, void* pointer);

Almost every type is opaque. What does it accomplish? Freedom. Freedom to change the implementation of the objects, because the rest of the code only uses functions to access the object and doesn't know how the object is actually implemented. This also forces me to think hard about what the minimal interface for accessing this object should be, because it is painful to keep writing new methods. I use this for almost all objects, except objects whose only job is to be containers of data with no functionality.
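What the other side of such an interface might look like can be sketched with a much smaller object. This is not the chunkpool implementation; counter_t, counterImpl_t and the three functions are hypothetical names invented for the example. The first half is what would go in a header, the second half is what would stay private in the .c file.

```c
#include <stdlib.h>

/* ---- public side: what would live in the header ---- */
typedef void* counter_t;   /* hypothetical, in the style of chunkpool_t */

counter_t counterCreate(int start);
void      counterDelete(counter_t counter);
int       counterNext(counter_t counter);

/* ---- private side: what would live in the .c file ---- */
typedef struct {
    int value;
} counterImpl_t;

counter_t counterCreate(int start) {
    counterImpl_t* pC = malloc(sizeof(counterImpl_t));
    if (pC) {
        pC->value = start;
    }
    return pC;   /* callers only ever see a void* */
}

void counterDelete(counter_t counter) {
    free(counter);
}

/* Returns the current value, then advances the counter. */
int counterNext(counter_t counter) {
    counterImpl_t* pC = counter;
    return pC->value++;
}
```

Callers can only go through counterCreate, counterNext and counterDelete, so the struct layout can change without touching them. One trade-off of using void* is that the compiler cannot stop you from passing the wrong kind of handle; declaring an incomplete struct type in the header instead would keep the opacity and add that check.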

I do use function pointers when they make sense, but that would be a topic for another post. Writing high performance software is fun, but making sure it is easy to code and easy to change makes the journey pleasant.

Friday, April 06, 2012

The Debt Of Humanity

We are all in debt. I don't mean financial debt, your home loan and stuff like that. I mean the debt of humanity. What would our chances of survival have been if we had been born a million years ago? Death during labor, infections, lack of food and shelter. Instead of fighting with each other, we chose to live together and developed a language to talk. The rest is history. We have taken the concept of being together from a few families to villages, towns, cities, nations, and now we are almost at the point in time when all of humanity is considered one big family. And the reason is simple - nations also fight and don't know how to live together.
No matter how much we feel for our country, the truth is that the vaccines that saved us were invented by someone else, and the language that we speak, which runs a multi-billion-dollar BPO industry, is not ours. Bangalore is the Silicon Valley of India, but the computers, the programming languages and the OS we use were not invented here.
We are much more deeply connected today than we were yesterday, but our ability to see these connections has diminished over time. I am not talking about Facebook friends, but about those who work at Facebook to make it possible. Those who work at Google to make search simpler and to democratize the mobile OS. I am talking about the people who make our cars and those who make sure you get petrol/diesel at the station. The ones who run the refineries, the ones who dig oil out of wells and the ones who build the pipelines. The ones who build the roads and those who build the equipment to build the roads. The ones who invest their lifetimes researching life-saving drugs, the ones who ensure we have electricity in our homes. The list is endless.
Everyone on the planet is in some way making life easier for the rest of us. Whether they realize it or not is debatable. Whether we realize it or not is also debatable. But I do feel that we would never have been here without the rest of us.

The Best

We know what the best is (product/service/decision/policy/whatever). Actually, we have known the best all our lives. Sony makes the best TVs. Ferrari is the fastest car, the iPhone is the best phone, land and gold are the best investments, and so on.

At some point in the past, the best TV was theater; there was no fastest car, only fastest horses; there was no phone. People with more gold were probably robbed, and people with more land ended up killing each other to get more land.


Beware of the future. The best is yet to come.

Saturday, March 17, 2012

Frequency of Choice

I have talked about this before, but the topic is so close to my heart that I wanted to give it a dedicated post.

As far as I understand it, the crux of capitalism is choice. In economics, the word choice is replaced by market. What is a market? A market is where consumers exercise choice. If consumers don't have a choice, it is not a market. The assumption is that capitalism thrives on competition. Competition creates choice. Consumers will choose the best products at the lowest prices, forcing companies to innovate and reduce prices. The best will survive.

This is all true and then not quite true. Two main problems:

  • Most people don't like to think. Even if they can, the world is complex enough that figuring out what is best for them is close to impossible. Eventually it comes down to either brand or price, because they make the decision simple.
  • Frequency of choice. Since this is all I want to talk about, I will use the next paragraph.

We are good at stuff we do often, the old "practice makes perfect" thing. We buy petrol, vegetables, groceries, etc. almost every day. Price changes are felt, a drop in quality is noticed. But then there are things that we don't do often. Things like joining a new job, getting married, buying a car or home, taking a loan, choosing a college, getting the home painted, buying a TV or refrigerator or AC, casting our vote, choosing a laptop or OS, choosing an email client, signing up on a social network, etc. Many of these choices are irreversible, or if not irreversible, then altering our choice is very expensive. This is where capitalism fails miserably, because it is no longer about choosing from alternatives but about the choice of altering our choice. For products with short life spans, like vegetables or toothpaste, altering a choice is not expensive. Vegetables will last a few days, toothpaste a few weeks, and you can choose a better product next time. But with products that last years or decades, or in some cases lifetimes, it is the ability to alter the choice that is required, not the choice among products.

Specifics:
 
Consider the home loan business in India. Floating rates have been around for a long time now. What they float on is unknown, and once you take the loan you realize that the "unknown" is the whim of the bank. Usually your floating home loan interest rate will increase by 20%-40% within a few months of taking the loan, and now there is no choice. Well, there is a choice to switch to another home loan, but only if you pay 2%-4% of your home loan value as switching charges. This is as monopolistic as it gets, and we call it capitalism, the mecca of markets and choice. Even banks don't know if they are giving the customer a good or bad interest rate, so how can the customer decide whether he is getting a good deal, and whether that deal is good enough for the next 20 years? No one can. The only way I can know if I am getting a good deal is if I can switch my home loan at any moment I desire. That is what will make it a market.

The same happens when switching jobs (notice period), casting a vote (5 year gap), buying a car (10% value drop when it gets out of the showroom) and in many other places. In computing, the advent of SaaS-based companies has started filling this gap by giving customers a monthly choice to continue using a product, where once there was the difficult choice of finding the best product up front. Amazon EC2 gives the choice to use machines by the hour and an OS by the hour. I think governments which call themselves followers of capitalism have missed a point. It is not the choice alone that matters; it is the frequency of the choice that is at the core of efficient markets.

Tuesday, March 06, 2012

Threadpool and the task queue

Every architecture makes way for a threadpool and a task queue. Multiple threads wait on the queue, always ready to pick up the next task and execute it. Once it is implemented, the next job is tuning it. How many threads? What is the size of the queue? Does the queue block or throw an error when full? Is there a retry handler?

Before you start worrying about this, ask a simple question. How much time does it take to execute the task? If it is not at least a couple of orders of magnitude greater than the time it takes to do a context switch, don't bother with the threadpool/queue; just execute it right there, on your current thread.

Here is why:
  • The task queue has a lock. The more threads there are and the more often the queue is accessed, the more contention and the more time it takes to submit a task. There is an extra context switch just to acquire the lock. Basically you are doing serialization before getting to parallelism here. More threads + more tasks => more time per insert. Think of it like talking to a customer care executive (CCE). You do lots of IO using the IVR and finally reach the CCE, and the guy, instead of answering your questions, connects you to another guy and you need to explain the problem once again. That is pretty much how a context switch works. If you need to talk for 10-20 minutes, it might be worth it, but if all it takes is a few seconds of conversation, it just wastes time.
  • Once the task is submitted, it needs to wake up some thread. That is a context switch, and it costs time.
  • By the time this new thread wakes up, because of the lock and the time elapsed, most of the variables it needs are out of cache... more time. Read up on lock semantics for the JVM.
  • How do you do error handling from the task? Extra code, extra states.

You can avoid all this by executing the task inline, as a normal function call. It will run faster. It is easy to write and debug. The assumption here is that the task really takes a short time to execute and is mostly CPU intensive. A webserver using a threadpool is understandable. A single request might need to do file IO, access some locked resources, possibly make multiple database queries. These are the kinds of things that make sense in a threadpool: things that are complex enough to be simplified by using a new/dedicated "thread of execution". For other things, a function call is the most efficient.
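The rule of thumb above can be written down as a small dispatch helper. This is only a sketch under stated assumptions: the 5000 ns context switch cost is a placeholder to be replaced by a measurement on your own machine, the factor of 100 stands for "a couple of orders of magnitude", and worthOffloading, runTask, submit_fn and incrementTask are hypothetical names. The pool itself is not shown; the caller passes in whatever function enqueues work on it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed figures for illustration only; measure on your own hardware. */
#define CONTEXT_SWITCH_NS 5000ULL
#define MIN_RATIO         100ULL   /* "a couple of orders of magnitude" */

typedef void (*task_fn)(void* arg);

/* Hypothetical hook: enqueues the task on the caller's pool and
   returns false if the pool refused it (for example, queue full). */
typedef bool (*submit_fn)(task_fn fn, void* arg);

/* A task is worth handing off only if it dwarfs the switch cost. */
bool worthOffloading(uint64_t estimatedTaskNs) {
    return estimatedTaskNs >= CONTEXT_SWITCH_NS * MIN_RATIO;
}

/* Returns true if the task ran inline on the current thread,
   false if it was handed to the pool. */
bool runTask(task_fn fn, void* arg, uint64_t estimatedTaskNs, submit_fn submit) {
    if (submit != NULL && worthOffloading(estimatedTaskNs) && submit(fn, arg)) {
        return false;
    }
    fn(arg);   /* plain function call: no lock, no wakeup, warm cache */
    return true;
}

/* Tiny example task: increments the int that arg points to. */
void incrementTask(void* arg) {
    ++*(int*)arg;
}
```

With these numbers, a task estimated at 1 microsecond always runs inline, while one estimated at 1 millisecond goes to the pool if a submit function is supplied and accepts it. Falling back to the inline call when the pool refuses also keeps error handling on the calling thread.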