Tuesday 14 September 2010

Code coverage - when is it sufficient?

We write unit test cases to verify the functionality delivered by individual parts of the code. We call this white box testing. Often, though, unit test cases are missing for some parts of the code, or not all paths are thoroughly tested. Code coverage analysis provides overall and detailed statistics about the current coverage of unit testing.
These statistics are important for finding gaps and adding test cases to improve overall quality.

How is it measured?

Code coverage tools usually provide overall and in-depth coverage details by package, class, method and line of Java classes. The reporting tools can also highlight the lines of code covered and not covered by the unit tests. These tools collect data at runtime during test execution to produce such reports.
 
OVERALL COVERAGE SUMMARY:
 
[class, %]       [method, %]      [block, %]          [line, %]            [name]
98% (121/123)!   64% (312/489)!   81% (15500/19172)   77% (2633.6/3441)!   all classes
 
OVERALL STATS SUMMARY:
 
total packages: 1
total classes:  123
total methods:  489
total executable files: 31
total executable lines: 3441
 
COVERAGE BREAKDOWN BY PACKAGE:
 
[class, %]      [method, %]     [block, %]          [line, %]         [name]
98% (121/123)!  64% (312/489)!  81% (15500/19172)   77% (2633.6/3441)! default package

Ref: EMMA quickstart
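To make the runtime data collection described above more concrete, here is a minimal hand-instrumented sketch in Java. It is only an analogy: real coverage tools typically inject equivalent probes automatically rather than by hand, and the names used here (ProbeDemo, hits, abs, coveragePercent) are hypothetical and do not come from any tool.

```java
import java.util.BitSet;

/** Hand-instrumented sketch of how a coverage tool might record executed blocks. */
public class ProbeDemo {
    // One bit per probe; this array of flags is the "runtime data" (hypothetical layout).
    public static final int PROBE_COUNT = 3;
    public static final BitSet hits = new BitSet(PROBE_COUNT);

    // Method under test, with a probe at the start of each block.
    public static int abs(int x) {
        hits.set(0);            // probe: method entry
        if (x < 0) {
            hits.set(1);        // probe: negative branch
            return -x;
        }
        hits.set(2);            // probe: non-negative path
        return x;
    }

    // Percentage of probes that were hit at least once.
    public static double coveragePercent() {
        return 100.0 * hits.cardinality() / PROBE_COUNT;
    }

    public static void main(String[] args) {
        abs(5); // a "test run" that exercises only the non-negative path
        System.out.println("blocks covered: " + hits.cardinality() + "/" + PROBE_COUNT);
    }
}
```

After the single call in main, two of the three probes have been hit; the negative branch shows up as uncovered until some test calls abs with a negative argument.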

From the code coverage reports, it is very easy to pinpoint classes, methods and lines not covered by the unit tests. With a well-integrated CI tool like Hudson, it is possible to mark the build as failed in case the code coverage falls below a certain benchmark.
But it is also easy to fool code coverage tools with dummy test cases, which show a high coverage percentage even though the actual quality of the test cases is insufficient. We will try to find out how much code coverage is sufficient.
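The benchmark rule mentioned above can be sketched as a small check. This is an illustration only: the class name CoverageGate, its methods and the 80% benchmark are assumptions, and how a CI server such as Hudson is actually wired to a coverage tool is not shown here. The figures are taken from the line column of the sample report.

```java
/** Sketch of a coverage gate: fail when line coverage drops below a benchmark. */
public class CoverageGate {
    // Line coverage as a percentage, given covered and total executable lines.
    public static double percent(double covered, double total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0 * covered / total;
    }

    // True when the measured coverage meets the benchmark.
    public static boolean passes(double covered, double total, double minPercent) {
        return percent(covered, total) >= minPercent;
    }

    public static void main(String[] args) {
        double covered = 2633.6;   // covered lines, from the sample report
        double total = 3441;       // executable lines, from the sample report
        double minPercent = 80.0;  // assumed benchmark, not part of the report
        if (!passes(covered, total, minPercent)) {
            System.err.println("Line coverage below benchmark");
            System.exit(1); // a non-zero exit code usually makes a build step fail
        }
    }
}
```

With the report's figures (2633.6 of 3441 lines, shown as 77% in the report) the check fails against an 80% benchmark, so the process exits with a non-zero status.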

How much code coverage is sufficient?

From the code coverage reports, we see percentages for classes, methods, blocks and lines ranging from 64% to 98%. But when do we say the coverage is sufficient? Is it 64%, 80% or 100%? Is even 100% test coverage sufficient?
The trick is that even 50% test coverage can be sufficient if the test cases are well written, while 100% coverage can still be misleading:
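Code:
public int add(int x, int y){
  return x+y;
}

Test case:
@Test
public void testAdd(){
  add(4,5);          // the result is never checked
  assertTrue(true);  // always passes
}

Notes: The coverage is 100%, but the test is not sufficient: it passes no matter what add() returns.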
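By contrast, a test is only meaningful when its outcome depends on the value returned by the code under test. The sketch below shows the idea in plain Java so that it runs without a test framework; expectEquals is a hypothetical helper standing in for JUnit's assertEquals, and the class name AddCheck is likewise an assumption.

```java
/** The same add() method, but every check now depends on its result. */
public class AddCheck {
    public static int add(int x, int y) {
        return x + y;
    }

    // Throws if the actual value differs; stands in for JUnit's assertEquals.
    static void expectEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        expectEquals(9, add(4, 5));   // typical case
        expectEquals(0, add(-3, 3));  // mixed signs
        // int addition wraps around on overflow in Java
        expectEquals(Integer.MIN_VALUE, add(Integer.MAX_VALUE, 1));
        System.out.println("all checks passed");
    }
}
```

The line coverage of add() is the same 100% as with the dummy test, but these checks would fail if add() returned anything other than the sum.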

Ways to ensure sufficient code coverage

I believe ensuring good test coverage, and thus good quality, lies more in the mindset of developers, which for a large group is achieved through well-defined process and practice. Here are my tips for good code coverage:
  • Follow TDD: Write a test case before writing your code. If you find this difficult, you probably have not understood the business functionality well. Never write code or fix an issue without a proper test case.
  • Peer Review: Who is your harshest critic? Find him (or her) and get your code reviewed. The first few rounds of harsh feedback will shape your test cases and code well.
  • Run Code Coverage Tools Often: Don't leave the code coverage reports for the end. Run the tools often, and on small parts (each class). Fixing gaps early, while the code is fresh in memory, helps you add quality test cases.
  • Remove unused code: Feel free to delete unwanted code that is not covered by test cases.


Sunday 5 September 2010

"Dirty Code" - who cleans the mess? and how?

During the early years of my stint as a developer, if I found some part of the code dirty or messy, I would delete the entire code and rewrite it, of course without realizing how much extra effort was wasted in the process. But somehow, I used to get satisfaction from seeing pieces of code that were readable and understandable.

When producing code, we produce a lot of good code along with some dirty code: patches, code not conforming to standards or designs, and unused code. Over time the dirty code heaps up, producing what Robert C Martin calls Legacy Code.

In this article we will discuss how dirty code is typically produced, how to prevent it and how to clean it. But one thing to keep in mind: once the code is dirty, it is difficult to clean again due to lack of time, budget and willingness, and the sheer size of the clean-up.

The art of producing dirty code

When we write (or change) code, along with clean code we also leave some dirty code. We write code when we add a feature, fix a bug, try to improve the design or optimize resource usage. The structure of the code changes, and some functionality is added to or removed from the product. Each of these changes is a gate through which code garbage can enter. The garbage may come from an inexperienced developer (e.g. a developer implementing code in a language for the first time), from lack of conformity to standards in a project and, most importantly, from lack of sufficient code coverage. We as developers have to understand that producing good quality code goes hand in hand with producing good quality test cases.

The art of clean-up

Now that we have produced some garbage code, we need to learn how to clean up the mess. It is slow and tedious, but given sufficient attention and effort it is worth it. The first clean-up task is to identify the garbage code. This can be partly automatic and partly manual. Certain development tools integrated with IDEs and continuous integration tools can detect anomalies related to code coverage and code reviews.
It is good to start by quickly adding tests for code that lacks sufficient coverage, and then to fix the automatic code review comments.
Now comes the harder part: manual code review to find sections of code that do not conform to standards and design principles. The best way is to take some small sections and identify patterns of non-conformance, then refactor immediately. It is good to run this exercise as a "pilot" with one team and slowly roll it out to all development teams.

Finally, prevention is better than cure...

Now the easier part (or the tougher one, depending on how you see it). Following simple development guidelines improves code quality significantly. When you change any code (even a single line), ensure it is unit tested. Make sure the code is readable without any external documentation. Check that your IDE has the plugins for automatic code reviews correctly installed and configured. Make your build fail on any code review issue.
If you are new to the project or new to the technology, get your code reviewed by someone more experienced. And if you are the first one in your project working with this technology, try to lay out standards. A little precaution will help us keep the garbage away and appreciate our clean code.


Must read

Book: Clean Code