Tuesday, October 20, 2009

Thoughts on life


  • Life is as much a learning experience as it is an unlearning experience.
  • Some problems cannot be solved. They just become irrelevant with time.
  • No matter how stupid people appear to us or how bad the choices they make, they have as much right to be what they are as we have to be what we are.
  • One of the most interesting gifts life has given us is its complexity.
  • Some things are worth spending time on, some are not, but only time can tell the difference.
  • Giving up is not bad; neither are compromises. They are just hard choices to make when they are the only choices left.

Zen and the art of filing bugs

This is a somewhat old post; I am just reposting it here.

Finding bugs in software is the primary job of every QA engineer. What is a bug? Any unacceptable behavior of the software is termed a bug. This is a very generic definition, and it is meant to be so. Thus, one crucial aspect of being able to tell whether some behavior is a bug is to fully understand what the acceptable behavior of the software is.

It is unfortunate that we don’t yet have any metrics for the quality of a bug. I don’t have any new metrics myself, but I would like to express my views on the quality of bugs. In the rest of this document I will try to explain why using open and fixed bug counts as a metric of software quality and tester effectiveness is not only wrong but also hampers the overall growth of the company.

Non Bugs

Consider a simple car. It is not a bug if the car cannot go faster than 200 km/h, but it is definitely a bug if that happens with a Ferrari. What is acceptable from a sports car is not the same as what is acceptable from a simple car. Developing this understanding is crucial for any tester, so that he can contribute to the software quality.

Finding and filing “bugs” which are not bugs has two costs.
1. The tester wastes his time and energy in finding and filing it.
2. The developer wastes his time either trying to fix or mask it, or convincing the tester that it is not a bug.

From a company perspective, it is not a good situation if the majority of tester and developer time is wasted on such “non bugs”. The basic cause of such situations is a lack of proper documentation. It is of the utmost importance that the “acceptability” of any software is defined with reasonable accuracy, so that anyone in the organization can tell “bugs” from “non bugs” without much effort. As the percentage of such bugs increases, more and more tester and developer time is wasted that could have been devoted to developing the product and finding actual bugs. This problem typically happens when a new tester joins a team and, in his quest to be productive from day one, ends up filing such bugs.

Duplicate Bugs a.k.a Bug Bloat

Another class of bugs that impacts the company’s growth, just like “non bugs”, is duplicate bugs. Unfortunately, these cannot be resolved by proper documentation. A duplicate bug is a bug that is a side effect of some other, actual bug. The two bugs might have totally different test cases and may present totally different behaviors. On the face of it they might seem totally unrelated. But when the actual bug is fixed, the side effect is automatically fixed.

The developer is the only person who can establish whether a given bug is a duplicate, because a duplicate bug does not duplicate the behavior of the original bug; it is a duplicate because of implementation characteristics. If a developer says that a given bug is a duplicate, it would be good if the tester accepted that hypothesis and verified it when the actual bug is fixed. In general, this is not what happens.

Consider a car with a faulty battery that gets discharged as soon as the car is stopped. A tester can easily file the following bugs:
1. After the car is stopped it doesn’t start.
2. After the car is stopped we cannot switch on the lights.
3. After the car is stopped the horn doesn’t work.
4. After the car is stopped the wiper doesn’t work.
5. After the car is stopped the music system doesn’t work.

And maybe lots more bugs. If the tester has no understanding of what a battery is and how it is related to the functioning of the other subsystems in the car, he can keep on filing as many bugs as he wants. They are all duplicates and will go away only when the battery is fixed.

The reuse of code in the form of functions and data structures, and the general dependencies between subsystems, create ample opportunities for such bugs to exist in software. I won’t say they are the same bug. From a user’s perspective they are all different bugs. But from the developer’s perspective they are all the same, and hence the concept of a duplicate bug. From the tester’s perspective it might be a great opportunity for increasing the bug count, but at the end of the day, it is the company that suffers from such “bug bloat”. Why?

1. Tester tests and opens N number of bugs. Instead of one unit of time he spends N units of time.
2. The developer goes through each of the bugs and their logs, talks with the tester, tries to convince him that these are duplicates, and marks them as duplicates, only to find that they have been reopened and reassigned. At the end of the day he fixes just one bug and marks all the others as fixed.
3. Tester tests for all the bugs and finds that all of them are fixed.

Again, we see that valuable tester and developer time is wasted on communication and on following the process, while as far as the product is concerned, really only one bug got fixed. The company definitely loses, because the same time could have been spent on finding and fixing the real bugs, on improving the product, and on adding other features to it.

It is highly unfortunate that there is no silver bullet for solving this issue. It can be minimized if the tester understands, or tries to understand, such dependencies. We could say that if a developer says a bug is a duplicate, it is a duplicate, no questions asked. That would be the way of trust. The problem is that we might miss some bugs, because the developer only “thinks” they are duplicates. On the other hand, we can use fact as the basis and file every bug as it exists. If we take that path, we end up decreasing our productivity whenever the bugs actually are duplicates.

If we have more automated tests than ad hoc tests, we will be in a good position to solve such problems. When a bug is discovered in a particular component, the developer can go through the list of tests and tell which other test cases will fail because of this bug. This would be one way to minimize “bug bloat”.

Currently our test plans lack specificity. Every test case looks at the system as a whole, without any consideration of the individual components of the system. We can move a step beyond black box testing and start exploring “grey” box testing. When I say grey box, I only mean that test cases are aware of the components they are going to test. I don’t mean that we shouldn’t do black box testing.

All I mean is that each test case should have a well defined purpose, and it should be known which components are being tested by each test case. What I want to convey is that if we know about the system, its components and the interdependencies between them, we can come up with a reasonable ordering of the test cases, which will help us find bugs faster, and find them in the test suite of the component responsible for the bug. To conclude:
1. Test cases shouldn’t just have an id; they should have a context and a purpose.
2. The order in which test cases are executed should not be the order of their ids in the test plan, but an ordering developed by discussing it with the developer. A component owner must be able to tell which other components must pass their “sanity” tests before his component is tested, and he should be able to tell the order in which the test cases of his component should be run. This could be further enhanced by capturing the interdependencies between the actual test cases in a component. Given the number of test cases in a component, it might make sense to further segregate the test cases of a component into various classes and define “component test case class” level dependencies.
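One way to turn such owner-supplied dependencies into an execution order is a topological sort. The sketch below assumes Python 3.9 or later for the standard `graphlib` module; the component names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical components. Each one maps to the components whose sanity
# tests must pass before it is tested.
DEPENDS_ON = {
    "battery": set(),
    "lights": {"battery"},
    "starter": {"battery"},
    "engine": {"starter"},
}


def component_test_order(depends_on):
    """Return the components ordered so that dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


order = component_test_order(DEPENDS_ON)
print(order)  # "battery" always comes first; ties may come out in any order
```

If the owners accidentally describe a circular dependency, `static_order()` raises `graphlib.CycleError`, which is a useful signal that the dependency information needs another discussion.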

Again, I am not saying that every test case should or will fall nicely into such a structure. Testing the flow of messages while configuration changes are being made would be one such example, where the success of the test case depends on the correct working of all the subsystems. Clearly, all such test cases should be run only after all the components pass their sanity tests. Further, we should first check configuration changes that impact one component at a time, before checking configuration changes that impact more than one component.

With a system as complex as ours, we need to make sure that our testing strategy is smart enough to reduce the time it takes to find and fix bugs. I believe that it is important that we minimize “bug bloat”, because it hampers the productivity of the team in a big way.

Defining the Quality of Bug

Bugs have quality. Given that the purpose of QA is to increase the quality of the product, I propose that the quality of a bug is directly proportional to the increase in product quality achieved by fixing it, and inversely proportional to the time spent on fixing it. A high quality test case (bug) is one which makes it obvious to the developer what needs to be done to fix it.

This definition assumes that the code is well written and doesn’t have any design flaws, and hence that a bug arises from a developer oversight somewhere in the code. This is valid most of the time.

Given this definition, the time spent in fixing a bug could come from two places:
1. The complexity of the test case used in finding the bug. If the test case is too complex and involves too many components, it is hard to find out the root cause.
2. The details specified while filing the bug. Bugs with missing or irrelevant logs, an incorrect summary, or no correct specification of the system state, etc., add to the time spent in analyzing the bug, and analysis takes the biggest chunk of the time spent in fixing a bug.

When filing a bug, the tester must try to find the minimal set of simple steps required to reproduce it. A bug which says that the system crashes when I run all my test cases in a loop with 100 threads is a very low quality bug if the real issue is that one particular test case was leaking buffers.

Testers often assume that bugs found with a “complex test scenario” are good quality bugs. This is true if and only if the “complex test scenario” is the only manifestation of the bug; it is a very low quality bug if the same result can be produced by doing something very simple. Many times the “complex test scenario” is the only known manifestation simply because the tester never tried anything simpler.
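Looking for a simpler manifestation can be partly mechanized. The sketch below is a simple greedy reduction (not a full delta debugging algorithm): it drops one step at a time and keeps the shorter scenario whenever the failure still reproduces. The step names and the `still_fails` check are hypothetical stand-ins for actually running the steps against the system.

```python
def minimize_steps(steps, still_fails):
    """Greedily drop steps while the failure still reproduces.

    steps: list of step names, in execution order.
    still_fails: callable that runs a list of steps and returns True
                 if the bug still shows up.
    """
    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the shorter scenario
    return current


# Hypothetical bug: the crash only needs "open_session" followed by "leaky_call".
def still_fails(steps):
    return (
        "open_session" in steps
        and "leaky_call" in steps
        and steps.index("open_session") < steps.index("leaky_call")
    )


full_scenario = ["configure", "open_session", "send_traffic", "leaky_call", "collect_stats"]
print(minimize_steps(full_scenario, still_fails))
# ['open_session', 'leaky_call']
```

Each call to `still_fails` costs one run of the scenario, so this only pays off when reproducing the bug is cheap compared to the analysis time it saves.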

The quality of the bugs directly determines the quality of the product. Any tester can literally stop the growth of the product by filing too many low quality bugs, because developers will end up spending most of their time analyzing bugs rather than fixing them.

Low quality testing has the power to jeopardize product development.
Maintaining the quality of bugs is very important, and for the success of the company it is important that we strive for it.

Filing Bugs

As described in the last section, the filing of a bug plays an important part in determining the quality of the bug and hence the quality of the product.
Filing essentially means capturing enough information about the bug that the developer can start fixing it as soon as he looks at it. This is the ideal, and we can strive for it. It may not be reached for various reasons.
1. The bug doesn’t provide sufficient and definite steps to reproduce it, or the steps are so complex that it takes a long time to execute them and find the root cause.
2. The bug doesn’t have the required log files.
3. The bug doesn’t provide a stack trace for crashes.
4. The bug doesn’t capture the details of the environment in which it was seen.
5. The bug doesn’t capture the details of the operation that was performed.
6. The bug doesn’t occur when the system is in a clean state.
7. The bug doesn’t capture the history of the operations performed.
8. The bug doesn’t have a purpose or intention with respect to what component, feature or behavior is being tested.
9. The bug is filed without testing “softer” versions of the test case. The bug essentially describes a “point” in the “test space”, without exploring in any direction around the “test point”.
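The list above can be turned into a simple completeness check that runs before a bug is submitted. This is only a sketch: the `BugReport` class and its field names are hypothetical and do not correspond to any particular bug tracker.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    """Hypothetical bug report covering the items listed above."""
    summary: str = ""
    steps_to_reproduce: list = field(default_factory=list)
    log_files: list = field(default_factory=list)
    stack_trace: str = ""
    environment: str = ""
    component_under_test: str = ""
    simpler_variants_tried: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of the fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]


report = BugReport(
    summary="Crash on restart",
    steps_to_reproduce=["start", "stop", "start"],
)
print(report.missing_fields())
# ['log_files', 'stack_trace', 'environment', 'component_under_test', 'simpler_variants_tried']
```

Not every field applies to every bug (a stack trace only makes sense for crashes), so a check like this is better used as a reminder than as a hard gate.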

There are too many wrong ways to file a bug. But the one thing that the tester must keep in mind while filing is: “I want this to get fixed as soon as possible.” He should then try to provide as much information as possible to make the developer’s task as easy as possible. Obviously there are trade-offs. It shouldn’t be the case that the tester spends, say, 10 hours trying to provide information about the bug which the developer might have deduced in a few minutes. Go the extra mile and find out from the developer what he wants in order to make the bug easy to fix, but avoid what is unnecessary and what is difficult.

To conclude, the real job of testing is not to find bugs, as is generally believed, but to get bugs fixed. By having a structured approach to testing, we can find and fix bugs faster and improve the quality of the product in less time, which gives us more time to add new features to the product. We have a lot to gain by controlling the quality of our testing and a lot to lose by not doing it.

Lots of literature is available on making complex systems but none on testing complex systems. The methods of testing simple software or simple systems, when applied to complex systems, cause as many catastrophes as making complex systems using simple software methodologies does. In the following sections, I will list some of the insights that I have and that I believe can simplify the process of finding and fixing bugs in complex systems.

The core concept that I will exploit is that a complex system is generally built from simple components.

What is a component? A component is an encapsulated entity which provides a specific functionality to the system. It could be a library used by one or more processes in the system. It could be an executable which provides some functionality at runtime. It could be a kernel module. It could be the kernel itself. For that matter, it could be the complete OS or any other facility provided by the OS. It could also be a thread in a process.

This definition can be applied repeatedly to any complex system to further subdivide it into its constituents. At the lowest level of this spectrum are the utility routines (assuming we are not testing the default libraries and system calls). At this lowest level, the routines can be tested using unit tests. Why do we test routines using unit tests? Routines can be tested independently, without depending upon anything else in the system. In fact, the system is directly dependent on the correct functioning of each of the routines in each of its components and libraries. Though unit testing is done in a very controlled environment, it lays the foundation on which the whole system stands.

1. Core libraries should have unit tests.
2. Unit testing should be extended to test the routines in an environment which is as close to the real system as possible. For example, if the library is intended to be used from multiple threads, it makes sense to test it in a multithreaded environment. If the routine could be called from multiple threads of multiple processes, it should be tested for that behavior.
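As a sketch of the second point, here is a unit test that exercises a routine from several threads at once. The `Counter` class is a hypothetical stand-in for a library routine that is meant to be thread safe; only the standard `threading` and `unittest` modules are used.

```python
import threading
import unittest


class Counter:
    """Hypothetical library class that is meant to be used from several threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:
            self._value += 1

    @property
    def value(self):
        return self._value


def run_concurrent_increments(counter, threads=8, per_thread=1000):
    """Call counter.increment() from several threads and wait for all of them."""
    def worker():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter.value


class CounterThreadingTest(unittest.TestCase):
    def test_increment_from_many_threads(self):
        # No increment may be lost, whatever the thread interleaving.
        self.assertEqual(run_concurrent_increments(Counter()), 8 * 1000)
```

Saved in a test module, this runs with `python -m unittest`. A passing run does not prove the routine is thread safe, but a failing run is a cheap and early signal that it is not.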

Software re-use is overrated

- Don't write your own web server.
- Eclipse is a good platform for UI applications, use it.
- Tons of languages are available, use them; don't invent a new one.

You might have reasons not to follow the advice above, and I am sure that in that case you know what you are doing. For most people this advice is good enough. This is reuse of software, and it is more or less a must. Some people end up writing a Utils class and want everyone to use it. Actually, everyone writes their own. It is alright... write your own.

I have started comparing the speed of software development with disk read/write speed. When you do lots of seeks here and there, you can only do about 200-300 per second. But when you write continuously, you can do maybe 50 MB/s. Software reuse has a similar cost. It increases the number of "seeks" you have to do, and the limit is tight. The more you need to think about when reusing code, the more complexity you end up introducing, and the next change needs even more "seeks". When trying to reuse code, think about the size of the reuse. If the code you are trying to reuse will save you, say, 10K lines of code, do it. It is worth it. Maybe even change those 10K lines of code so that they are more reusable. But saving 100 or 500 lines of code will definitely cause more trouble than help. I have seen people take something which is maybe 1K lines of code in a single file and split it into 20 files of 50 lines each. As the number of classes (concepts, terminology) increases, the code becomes more reusable, because each of these classes can be used independently, but its complexity also increases: it is harder to maintain, things which could be private are now public, and instead of 1 file we now have 20, which is harder to understand. By decreasing the complexity of each file, we end up introducing the complexity of 20 concepts. This is the kind of reuse which is uncalled for.
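To put rough numbers on the disk analogy, here is a small back-of-the-envelope calculation. The seek rate and the sequential rate are the figures quoted above; the 4 KB transferred per seek is my assumption and not a measured value.

```python
SEEKS_PER_SECOND = 250          # middle of the 200-300 range quoted above
SEQUENTIAL_MB_PER_SECOND = 50   # sequential figure quoted above
BLOCK_KB = 4                    # assumption: data transferred per seek


def random_throughput_mb_per_s(seeks_per_second, block_kb):
    """Throughput in MB/s when every block costs one seek."""
    return seeks_per_second * block_kb / 1024


random_mb = random_throughput_mb_per_s(SEEKS_PER_SECOND, BLOCK_KB)
slowdown = SEQUENTIAL_MB_PER_SECOND / random_mb
print(f"random: {random_mb:.2f} MB/s, sequential is about {slowdown:.0f}x faster")
# random: 0.98 MB/s, sequential is about 51x faster
```

Under these assumptions the gap is roughly a factor of fifty, which is the scale of difference the analogy is pointing at.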

When thinking about reuse, think substantial and think simple. If it is not substantial and it is not simple, chances are it is not going to be reused. So make your life easier by designing against reuse. By taking reuse out, you have better control of the specification and requirements, and the code will be minimal and complete. By not reusing other people's code, and by doing nothing to make your own code reusable, you are doing a great service to the people who will maintain this code. You are making it simple and to the point, and it will also be fast, possibly less buggy and easier to test.

Free Software - How to make money

Short answer: you can't. Period.

Medium answer: if you are not a software company, you can make money from free software. For example, Google is not a software company; it doesn't sell software, it makes money from ads. Facebook is also not a software company; it also makes money from ads. Microsoft is a software company; it makes money by selling software. TCS, Wipro and HCL are all software services companies. So in short, if your business is not software, free software is for you. Use it to decrease the cost of your business, release it to kill anyone making software of that kind, or use it to get goodwill and mind-share in the market, to get "free developers", or to define the "standards" in your business area.

Long answer: with open source, selling software is no longer an option. So throw away the idea of making software and start by thinking about the business. SaaS, a web API or website, an iPhone app, an Android app, a Facebook app, etc. are some of the viable options where you end up making software but not selling it, and yet can make money. What the API market gives you is freedom from the GPL, a single place of control and the freedom to charge if you can. It is better than writing free software, hosting it, giving it away for free and then putting a "donate" button on the website. If you are successful, make parts of your software free or open source, and make it useless without your service.
This is just one way; there could be millions more. Take software services, for example. The LGPL is a boon for software services companies: they can get almost everything for free, then customize it and make tons of money. The problem is that not everybody can do it. Any CIO who loves his job will pay 100 million to TCS but not 100K to some XYZ who claims he can do the same job. TCS provides peace of mind with redundant staff, long term maintenance contracts, project planning and 24*7 support. Moreover, a CIO doesn't want to deal with 10 different services vendors, maybe just 1 or 2. And it is easy for established players to provide all this, given the open source and free nature of the software.
Web companies like Google, Yahoo, Facebook, etc. are great supporters of free software. It helps them, and they don't even need to worry about the GPL. Mozilla made money by making Google the default search engine in its browser.

Google extends deal with Mozilla
91% of the Mozilla Revenue comes from Google

The point is that all these companies make money from a business which is built on software but is not selling software. It is hard to believe that these companies would have existed without free software. The software business has changed in the last 10 years. Since the "bursting of the internet bubble" the industry has come a long way, to a point where people are now actually making money. It is high time that philanthropic free software got a makeover. The GPL was written in 1989.

Top 50 software companies in 1989

That is 20 years ago. It appears to me that the context in which it was written was the following:
1) Make software open (so that the user can change it and know what it is doing) and free (so that more people use it).
2) Use the GPL to make sure that people don't make money by selling modifications made to the open source.

20 years ago, this was a great thought. Note the "user". It was written with software that people use for personal purposes in mind. It was written with the idea that people need "software" to use a computer in mind. Both these things have changed. Instead of "users" we now have companies, and instead of selling software, companies now sell "services" or "ads", and maybe tomorrow something else.

In today's world what would make more sense would be something which has the following properties:
1) It should cover both software and hosted services.
2) It should cover both open and free.
3) Instead of being free, it should be driven more by volume and profit. For example, Microsoft, with BizSpark, provides free Microsoft software to startups which have less than 1m in revenue and are less than three years old. Another example is Google App Engine, which is free up to a given usage. I think software or a service which is open/free should also provide a model which makes it financially viable for people to contribute to it.

The reason I would like it not to be "free at all scales" is that otherwise business models simply make a mockery of the system. The GPL is tied to a business model, and that business model is no longer relevant. Hence the need to delink the GPL from the business model. Secondly, I want open source to create money for the people who created it. I think software is much more relevant today than it was 20 years ago, and the people who make it should get their fair share for the work they have done. In fact, I want open source to be the "business/development" model where people who are good at writing software can do just that and make money without thinking about how to market it or create a business from it, and where people who are good at marketing and creating businesses from software can do just that, but make sure that they pay the people who made it possible.

Sunday, October 11, 2009

Free Software Take 2

I already talked about why I think free software is a bad idea. Now we turn the tables. Open source and free software are used abundantly. I love Firefox, Thunderbird, OpenOffice, Eclipse, etc., and they save money. So many startups would not have been possible without open source and free software. It fuels innovation, as developers don't have to reinvent the wheel. These are all good things, but what about the people who contributed to building it? The idea sounds a bit communist: from each according to his ability, to each according to his needs.

I ended up searching on Google, and this is what I found:

http://www.freesoftwaremagazine.com/node/1707
http://www.getgnulinux.org/linux/misunderstanding_free_software/

The next search was on how to make money from open source.