Tuesday 14 September 2010

Code coverage - when is it sufficient?

We write unit test cases to verify the functionality delivered by parts of the code. We call this white box testing. But it often happens that unit test cases are missing for some part of the code, or that not all the paths are thoroughly tested. Code coverage analysis provides overall and detailed statistics about the current coverage of unit testing.
These statistics are important for finding gap areas and adding more test cases to improve overall quality.

How is it measured?

Code coverage tools usually provide overall and in-depth coverage details by package, class, method and line of Java classes. The reporting tools can also highlight the lines of code covered and not covered by the unit tests. These tools collect runtime data from test executions to produce such reports.
 
OVERALL COVERAGE SUMMARY:
 
[class, %]       [method, %]      [block, %]          [line, %]            [name]
98% (121/123)!   64% (312/489)!   81% (15500/19172)   77% (2633.6/3441)!   all classes
 
OVERALL STATS SUMMARY:
 
total packages: 1
total classes:  123
total methods:  489
total executable files: 31
total executable lines: 3441
 
COVERAGE BREAKDOWN BY PACKAGE:
 
[class, %]      [method, %]     [block, %]          [line, %]         [name]
98% (121/123)!  64% (312/489)!  81% (15500/19172)   77% (2633.6/3441)! default package

Ref: Emma quickstart

From the code coverage reports, it is very easy to pinpoint classes, methods and lines not covered by the unit tests. With a well-integrated CI tool like Hudson, it is possible to mark the build as failed in case the code coverage falls below a certain benchmark.
But it is also easy to fool the code coverage tools with dummy test cases, which may show a high coverage percentage even though the actual quality of the test cases is not sufficient. We will try to find out how much code coverage is sufficient.
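To make the percentages in the report and the idea of a build benchmark concrete, here is a minimal sketch in plain Java. The class name CoverageGate and the 70% threshold are assumptions made for illustration; this is not how Emma or Hudson implement the check, only the arithmetic behind it.

```java
public class CoverageGate {

    /** Coverage as a percentage of covered items out of the total. */
    public static double percent(double covered, double total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0 * covered / total;
    }

    /** True when the measured coverage reaches the given threshold (in percent). */
    public static boolean passes(double covered, double total, double thresholdPercent) {
        return percent(covered, total) >= thresholdPercent;
    }

    public static void main(String[] args) {
        // Method and line counts are taken from the Emma summary above.
        double methods = percent(312, 489);
        double lines = percent(2633.6, 3441);
        System.out.println("method coverage (rounded): " + Math.round(methods) + "%");
        System.out.println("line coverage (rounded):   " + Math.round(lines) + "%");

        // 70% is an assumed benchmark chosen for this sketch, not a recommendation.
        System.out.println("methods reach 70%: " + passes(312, 489, 70.0));
        System.out.println("lines reach 70%:   " + passes(2633.6, 3441, 70.0));
    }
}
```

With the numbers from the report, the method coverage falls short of the assumed benchmark while the line coverage reaches it, which is the kind of result a CI job could use to fail the build.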

How much of code coverage is sufficient?

From the code coverage reports, we see percentages for each class, method and package varying from 64% to 98%. But when do we say that the code coverage is sufficient? Is it 64%, 80% or 100%? Is even 100% test coverage sufficient?
The trick is that even 50% test coverage can be sufficient if the test cases are well written, while 100% coverage can be misleading:

Code:
public int add(int x, int y){
  return x+y;
}

Test case:
@Test
public void add(){
  add(4,5);
  assertTrue(true);
}

Notes: The coverage would be 100%, but the test is not sufficient because it never checks the result of add(4,5).
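The difference between a dummy test and a meaningful one can be sketched as follows. This is plain Java without JUnit so that it runs on its own; the class AddTestSketch, the deliberately wrong brokenAdd and the two check methods are hypothetical names introduced only for this illustration.

```java
import java.util.function.IntBinaryOperator;

public class AddTestSketch {

    // The method under test, as in the example above.
    public static int add(int x, int y) {
        return x + y;
    }

    // A deliberately wrong variant, used to see which test style notices the bug.
    public static int brokenAdd(int x, int y) {
        return x - y;
    }

    // Mirrors the dummy test: it executes the code (so coverage is 100%)
    // but asserts nothing about the result.
    public static boolean dummyTest(IntBinaryOperator op) {
        op.applyAsInt(4, 5);
        return true;
    }

    // Checks actual results, including zero and a negative operand.
    public static boolean meaningfulTest(IntBinaryOperator op) {
        return op.applyAsInt(4, 5) == 9
                && op.applyAsInt(0, 0) == 0
                && op.applyAsInt(-3, 3) == 0;
    }

    public static void main(String[] args) {
        System.out.println("dummy test on add:            " + dummyTest(AddTestSketch::add));
        System.out.println("dummy test on brokenAdd:      " + dummyTest(AddTestSketch::brokenAdd));
        System.out.println("meaningful test on add:       " + meaningfulTest(AddTestSketch::add));
        System.out.println("meaningful test on brokenAdd: " + meaningfulTest(AddTestSketch::brokenAdd));
    }
}
```

Both styles give the same coverage figure for the method, but only the second one fails when the implementation is wrong.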

Ways to ensure sufficient code coverage

I believe that ensuring good test coverage, and thus good quality, lies mostly in the mindset of developers, which for a large group is achieved through well-defined process and practice. Here are my tips for good code coverage:
  • Follow TDD : Ensure you write a test case before writing your code. If you find this difficult, you have probably not understood the business functionality well. Never code or fix an issue without a proper test case.
  • Peer Review : Who is your greatest critic? Find him (or her) and get your code reviewed. The harshness of the first few rounds will shape your test cases and code well.
  • Run Code Coverage Tools Often : Don't keep the code coverage reports for the end. Run them often and for small parts (each class). Fixing gaps early, while the memory is fresh, helps you add quality test cases.
  • Remove unused code : Feel free to delete unwanted code that is not covered by test cases.



Sunday 5 September 2010

"Dirty Code" - who cleans the mess? and how?

During the early years of my stint as a developer, if I found some part of the code dirty or messy, I would delete the entire code and rewrite it, of course without realizing how much extra effort was wasted in the process. But somehow, I used to get satisfaction from seeing pieces of code that were readable and understandable.

When producing code, we produce much good code along with some dirty code: patches, code that does not conform to standards or designs, and unused code. Over time the dirty code heaps up, producing what Robert C Martin calls Legacy Code.

In this article we will discuss how dirty code is typically produced, how to prevent it and how to clean it. But one thing we need to keep in mind: once the code is dirty, it's difficult to clean it again due to lack of time, budget and willingness, and the sheer size of the clean-up.

The art of producing dirty code

When we write (or change) code, along with clean code we also leave some dirty code behind. We write code when we add a feature, fix a bug, try to improve the design or optimize resource usage. The structure of the code changes, and some functionality is added to or removed from the product. These gateways for introducing changes also leave room for introducing code garbage. Such garbage can come from an inexperienced developer (e.g. a developer implementing code in a language for the first time), from a lack of conformity to standards in a project and, most importantly, from a lack of sufficient code coverage. We as developers have to understand that producing good-quality code goes hand in hand with producing good-quality test cases.

The art of clean-up

Now that we have produced some garbage code, we need to learn how to clean up that mess. It is slow and tedious, but given sufficient attention and effort it is worth it. The first task of the clean-up is to identify the garbage code. This can be partly automatic and partly manual. Certain development tools integrated with IDEs and continuous integration tools can detect anomalies related to code coverage and code reviews.
It is good to quickly add tests where code coverage is insufficient, and then to fix the automatic code review comments.
Now comes the harder part: manual code review to find sections of code that do not conform to standards and design principles. The best way is to pick some small sections, identify the patterns of non-conformance and then refactor immediately. It's good to do this exercise as a "pilot" for one team and slowly roll it out to the entire development organization.

Finally, prevention is better than cure...

Now the easier part (or the tougher one, depending on how you see it). Following simple development guidelines improves code quality significantly. When you change any code (even a single line), ensure it is unit tested. Make sure the code is readable without any external documentation. Check that your development IDE has the plugins for automatic code reviews correctly installed and configured. Make sure your build fails for any code review issue.
If you are new to the project or new to the technology, get your code reviewed by someone more experienced. And if you are the first one in your project working in this technology, try to lay out standards. A little precaution helps us keep the garbage cleaned up and appreciate our clean code.


Must read

Book : Clean Code

Saturday 28 August 2010

Does open source mean 'free' usage?

If you are planning to use open source software in your project, have you considered the following?
Are you sure it's free? Are you sure that if you use this software in your product you can keep your source closed? Are you sure you do not need to pay at all? Can you freely distribute your final product bundled with the open source software?
To answer these questions, we have to understand software licensing. In this article, we will try to answer "Is all open source software free?", "Does free mean totally free?" and "If the software is not free, do I have other options to use it?". We will start with the different kinds of software licenses and understand the significance of each of them.

BSD License

The BSD license allows free usage of copyrighted code. There are two types of BSD licenses currently available: the "New BSD License" and the "Simplified BSD License/FreeBSD License".
The New BSD License is a 3-clause license allowing redistribution and use in source and binary forms, with or without modifications. Redistribution is unlimited as long as the copyright notices and the license's disclaimers of warranty are maintained. Also, the names of the contributors or the organization may not be used to endorse a derived work without specific permission.
The Simplified BSD License is a 2-clause license that omits the non-endorsement clause.

GNU General Public License or GPL

The GPL is the most popular license for free software projects (around 65% by some counts). It's a copyleft license, which essentially means that a modified or extended work (a version using the original code) needs to use the same license as the original work. This mandate makes code or software that uses GPL work free as well. When someone distributes a GPL'd work plus their own modifications, the requirements for distributing the whole work cannot be any greater than the requirements that are in the GPL. One is allowed to make private modified versions without any obligation to divulge the modifications, as long as the modified software is not distributed to anyone else.

GNU Lesser General Public License or LGPL

It was designed as a compromise between the strong-copyleft GNU General Public License (GPL) and permissive licenses such as the BSD licenses and the MIT License. The LGPL places copyleft restrictions on the program itself but does not apply these restrictions to other software that merely links with the program. The LGPL is primarily used for software libraries, although it is also used by some stand-alone applications, most notably Mozilla and OpenOffice.org, and sometimes for media as well.

Affero General Public License or AGPL

The GNU Affero General Public License is a free, copyleft license for software and other kinds of works, specifically designed to ensure cooperation with the community in the case of network server software. The Affero General Public License is the GNU GPL V2 plus "one additional feature", specifically covering the distribution of application programs through web services or computer networks.

Apache License

The Apache License allows the user of the software the freedom to use the software for any purpose, to distribute it, to modify it, and to distribute modified versions of the software, under the terms of the license. It requires preservation of the copyright notice and disclaimer, and it allows use of the source code for the development of proprietary software as well as free and open source software. In every licensed file, any original copyright, patent, trademark and attribution notices in redistributed code must be preserved, and in every licensed file that is changed, a notification must be added stating that changes have been made to that file.

Dual License

Dual or multi licensing is commonly done to support free software business models in a commercial environment. In this scenario, one option is a proprietary software license, which allows the possibility of creating proprietary applications derived from it, while the other license is a copyleft free software/open-source license, thus requiring any derived work to be released under the same license.
Some examples of software using dual licensing are MySQL, jQuery, Mozilla Firefox, Perl and Ruby.

Summary table

License: BSD
  • Nature: Copyright
  • Redistribution: Allowed without limit for source and binary, with or without modifications.
  • Conditions: Allows free usage of copyrighted software. The 3-clause license requires that contributors' names must not be used for endorsement, while the 2-clause license omits this.
  • Software using this license: FreeBSD, Mac OS X

License: GPL
  • Nature: Copyleft
  • Redistribution: Allowed, but the modified or extended work must carry the same or lesser restrictions.
  • Conditions: Mandates that any modified or derived work carries the same or lesser restrictions.
  • Software using this license: ExtJs

License: LGPL
  • Nature: Copyleft
  • Redistribution: Allowed with restriction.
  • Conditions: A compromise between GPL and BSD. The restriction applies only to the program itself, not to the entire software that links with it.
  • Software using this license: Mozilla, OpenOffice.org, Hibernate

License: AGPL
  • Nature: Copyleft
  • Redistribution: GPL 2.0 for web applications.

License: Apache License
  • Nature: Copyright
  • Redistribution: Allows the user of the software the freedom to use the software for any purpose, to distribute it, to modify it, and to distribute modified versions of the software.
  • Conditions: Does not require modified versions of the software to be distributed using the same license.
  • Software using this license: All Apache Software Foundation (ASF) software, Google Web Toolkit (GWT), Spring


Thursday 26 August 2010

ExtJs - best among current Javascript RIA frameworks?

Ext JS is a cross-browser JavaScript library for building rich, interactive web applications using techniques such as Ajax, DHTML and DOM scripting.

The current version (3.0) adds support for REST and is hosted on the Sencha web site. Its licensing has changed several times and has become stricter over time.

Pros
  • Excellent out-of-box Widgets
    • Wow! That’s what your expression would be when you explore the ExtJs widgets. ExtJS offers a superset of widgets, from a simple label, textbox and buttons to complex grids, drag-and-drop panels etc. It also provides a remoting demo for invoking remote server methods.
  • Good API documentation
    • It has quite good documentation with tutorials, samples and user community.
  • Active and currently most adopted javascript RIA framework
  • Good code quality/readability
  • Inline Editing
    • Ext JS makes it simple to edit tickets inline. Providing inline editing of tickets without it is out of the scope of what we can accomplish this term. Without it, we would probably end up using a modal dialog box to create and edit ticket details.
Cons
  • The generated code
    • Footprint - The library is 500 KB in size (it can be reduced to 150 KB using mod_gzip). The loading time is high for a home page on the web.
    • CSS – very easy to get lost. It is difficult to find the correct class names.
    • HTML – full of divs and overly complex generated code. Difficult to debug even with Firebug.
  • Dual Licensing and not free for closed source applications
    • Modified GPL 3.0 licensed. Free for open source applications but paid for commercial closed source applications.
  • Customization
    • Using ExtJs leads us to expect a rich, desktop-like GUI, but customization is not easily achievable.
  • Verbosity
    • Loading even simple things requires several lines of code, where plain HTML or jQuery would be simpler.
  • Debugging and Error reporting
    • Debugging is not very easy. Among JavaScript frameworks, only GWT has somewhat better debugging, in hosted mode.
  • No bookmarking or search engine indexing
    • It is not possible for the user to bookmark a certain page. Since the objects are rendered by DOM manipulation, the page cannot be indexed by search engines.
  • Learning time
    • Needs quite experienced developers.

Wednesday 25 August 2010

Google Web ToolKit (GWT) - (Java => Javascript really worth?)

Google Web Toolkit (GWT) is an open source set of tools created by Google for web developers, allowing them to create and maintain rich and complex JavaScript front-end applications in Java. It is licensed under the Apache Licence V 2.0.
In a J2EE Architecture, GWT emphasizes reusable, efficient solutions to recurring Ajax challenges, namely Asynchronous Remote Procedure Call, history management, bookmarking, internationalization and cross-browser portability.
The major GWT components include:
  • GWT Java-to-JavaScript Compiler
    • Translates the Java programming language to the JavaScript programming language.
  • GWT Hosted Web Browser
    • Allows the developers to run and execute GWT applications in hosted mode (the app runs as Java in the JVM without compiling to JavaScript). It is commonly used for debugging.
  • JRE emulation library 
    • JavaScript implementations of the commonly used classes in the Java standard class library (such as most of the java.lang package classes and a subset of the java.util package classes)
  • GWT Web UI class library
    • A set of custom interfaces and classes for creating widgets.

Pros

  • Java solution to build web GUI and Ajax-enabled applications
    • This means you can use features such as debugging, refactoring and unit testing for your UI in the same way as you do on the server side.
    • In a typical web or Ajax app, one has to know how to code in HTML/DHTML, JavaScript, JSP/ASP, JSTL etc. With GWT it is all Java. Developers do not need to know JavaScript, CSS or DOM. The GWT Java-to-JavaScript compiler compiles the client-side Java code into JavaScript and HTML.
    • Sharing the same language between the client and server (ability to use a shared Java package)
    • Deal only with POJOs – no JSON/XML/DOM stuff. Can leverage typical OO design patterns.
    • Can use complex Java on the client
      • Turned into JavaScript, but you still use String, array, Math class, ArrayList, HashMap, custom classes, etc.
    • Supports refactoring and promotes reusability
  • Communication between browser and Server well handled
    • Can send complex Java types to and from the server
    • Data gets serialized across network
  • Development environment support
    • Developed UI components can be unit tested and reused
      • One can write client and server side JUnit tests like any other java app.
    • Tooling
      • Integration with Eclipse and IntelliJ. One can develop a GWT app like developing any other java app.
      • Can test within Eclipse without installing a server
      • Using a statically typed language (Java) to develop the client side of the app makes it possible to catch various problems even before the code is compiled (IDEs and static analysis tools are available).
    • Provides GWT hosted mode, an environment that enables debugging.
      • You can run your Ajax-enabled application in hosted mode. This allows you to use your IDE's debugging facilities to test and debug the application.
      • In hosted mode you can make changes in Java on the fly and just hit "refresh" in the hosted mode browser.
      • As with any other Java app, you can debug into both client-side and server-side code from the IDE. There is no need for separate tools for different browsers (IE vs Firefox).
      • Debugging tools like any other Java app (can set breakpoints and debug the app in hosted mode)
    • Full IDE-based Java support for development/debugging
  • Handles browser incompatibilities in processing Ajax.
    • The GWT Java-to-JavaScript compiler generates browser-compliant JavaScript code through DOM abstraction, saving developers from having to code for browser incompatibilities.
  • Performance
    • Code size (JavaScript footprint) is much smaller and execution speed is much better.
    • Time to deliver is much shorter and fixing issues is much faster than in typical JSP/JavaScript apps.
  • Open Source and Apache Licenced
    • Many free widgets, Support for many AJAX widgets.
  • Support by major company : Developed and supported by Google
    • Good documentation
    • Has good community support
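The asynchronous remote call style mentioned above, where plain Java objects travel between client and server and the result arrives through a callback, can be sketched in plain Java. Everything in this block (Callback, Ticket, TicketServiceAsync, FakeTicketService) is a hypothetical stand-in written for illustration and not part of the GWT API; GWT's own AsyncCallback has a similar onSuccess/onFailure shape.

```java
import java.util.ArrayList;
import java.util.List;

public class AsyncRpcSketch {

    /** Stand-in callback type; not the GWT class itself. */
    public interface Callback<T> {
        void onSuccess(T result);
        void onFailure(Throwable caught);
    }

    /** A plain Java object that client and server code could share. */
    public static class Ticket {
        public final int id;
        public final String title;

        public Ticket(int id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    /** Hypothetical async service: no return value, the result arrives via the callback. */
    public interface TicketServiceAsync {
        void findTickets(String owner, Callback<List<Ticket>> callback);
    }

    /** In-process fake standing in for the server; a real call would cross the network. */
    public static class FakeTicketService implements TicketServiceAsync {
        @Override
        public void findTickets(String owner, Callback<List<Ticket>> callback) {
            if (owner == null || owner.isEmpty()) {
                callback.onFailure(new IllegalArgumentException("owner is required"));
                return;
            }
            List<Ticket> tickets = new ArrayList<>();
            tickets.add(new Ticket(1, "Fix login for " + owner));
            callback.onSuccess(tickets);
        }
    }

    /**
     * Calls the service and collects what the callback reported.
     * Returning the list directly only works here because the fake calls back
     * immediately; with a real asynchronous call the UI would be updated
     * inside the callback instead.
     */
    public static List<String> titlesFor(TicketServiceAsync service, String owner) {
        final List<String> out = new ArrayList<>();
        service.findTickets(owner, new Callback<List<Ticket>>() {
            @Override
            public void onSuccess(List<Ticket> result) {
                for (Ticket t : result) {
                    out.add(t.title);
                }
            }

            @Override
            public void onFailure(Throwable caught) {
                out.add("error: " + caught.getMessage());
            }
        });
        return out;
    }

    public static void main(String[] args) {
        TicketServiceAsync service = new FakeTicketService();
        System.out.println(titlesFor(service, "alice"));
        System.out.println(titlesFor(service, ""));
    }
}
```

The point of the shape is that the client code only deals with typed Java objects and a success or failure callback; serialization across the network is left to the framework.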

Cons

  • GUI code is written in Java and the GWT compiler generates JavaScript
    • The advantage of writing GUI code in Java has its own disadvantages. Even for a small modification in the UI, the whole cycle of writing Java and compiling has to be followed. The process gets slower as the size of the application increases.
    • GUI developers are usually specialized, so the Java developers have to get accustomed to UI work.
    • Generated JavaScript code has its drawbacks. Even though you don't need to know JavaScript to use GWT, you might sometimes need to examine the generated JavaScript code, for example if the browser displays a JavaScript error. For developers who are unfamiliar with JavaScript, understanding the code that GWT generates can be difficult. Also, because GWT generates the JavaScript code, you lose fine-grained control over the processing.
    • Just because one can write in Java doesn't mean all Java APIs are supported. This is not a restriction of Java but of JavaScript. Since GWT compiles the Java code into JavaScript, only features that are supported in JavaScript can be implemented in the client-side code in GWT.
    • There are some syntax quirks around passing complex Java objects between client and server. For example, Collections have to be defined in the Javadoc.
    • Since generated code is javascript/html, one needs knowledge of css/ html for setting styles to the rendered page elements – can be set in separate css file or can be set directly on the Element.
  • Learning curve of nonstandard approach and initial mindset shift
    • A fundamentally different strategy from all other Ajax environments makes evaluation and management buy-in harder.
    • Most Ajax environments do JavaScript on the client and have a choice for the server. GWT is based entirely around Java.
    • You never put direct JavaScript in your HTML. Instead, you use JSNI to wrap JavaScript in Java. Very powerful in the long run, but hard to get used to at first.
    • Need to design presentation layer architecture very carefully. When building large complex web application with GWT you end up with a huge number of classes. To be able to maintain and extend application code you will have to use some architectural patterns for building GUI (like HMVC, PAC etc).
    • Concept of modules can be confusing.
    • Jumping between secure and unsecured modes can be quirky, i.e. you have to move between web apps because of the app context.
  • GWT compiler restrictions
    • Currently only Java 1.4 syntax and a subset of the Java core packages are supported (though this is going to change in GWT 1.5).
    • GWT compilation is rather slow compared to the standard Java compiler.
  • Development cycle (deploy, test, adjust, redeploy) time
    • When the code has been deployed to Google's 'hosted mode' browser, many code changes can be tested within a matter of seconds, by 'refreshing' the webapp. Think of it as automatic hot redeploy. Deploying the code to that browser currently takes around a minute, which wouldn't be a problem, if it weren't for the fact that not all code changes can be tested without a redeploy. A minute per development cycle is just too long, even if it affects only 10% of the cycles.
  • Cumbersome deployment
    • Clumsy and poorly documented process to deploy on a regular Java-based Web server.
  • Web indexing
    • Web indexing of JavaScript is difficult. Developers often need to create an HTML-only version of the app just to allow search engines to index it.


Tuesday 24 August 2010

Why this blog?

Recently, one of my colleagues has been evaluating several options to finalize the technical architecture and tools for their project. This is what we do as architects: evaluate and choose.


In this blog, I will write about the pros and cons of different technical tools and frameworks used in J2EE projects.