19 02 2010
Code Metrics: Lines of Code (LoC)
I have started writing a series of blog posts about code metrics. I plan to introduce different metrics, explain their meaning, and introduce tools you can use to measure them. Where possible I will also show you how to use each metric. The first metric is the simplest one: Lines of Code (LoC).
Lines of Code shows how many lines of source code there are in your application, namespace, class or method. LoC can be used to:
- check the size of code units. If a method is longer than 20 lines of code, it may be too complex and hard to understand. You can also find large classes that should be split into smaller classes.
- estimate the size of a project. You have to understand that LoC is a useful estimation characteristic only under certain conditions, and only until you find a better estimation method (be quick to find one). You can use the LoC of one application to estimate another if the two are similar in logic, requirements and functionality. Even then you must be very careful when using this metric.
My suggestion is to use LoC to monitor the size of your code units. When it comes to software estimation, you can probably find better methods.
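To make the 20-line rule of thumb concrete, here is a minimal sketch of such a size check. It is written in Python and inspects Python source (assuming Python 3.8 or newer), simply because the standard library can parse it; it is not one of the .NET tools this series is about. The names `oversized_functions` and `MAX_LINES` are made up for this illustration.

```python
import ast

MAX_LINES = 20  # assumed threshold, taken from the rule of thumb above


def oversized_functions(source, max_lines=MAX_LINES):
    """Return (name, line_count) for every function longer than max_lines."""
    tree = ast.parse(source)
    result = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Physical span of the function: from its 'def' line to its last line.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                result.append((node.name, length))
    return result
```

Calling `oversized_functions(open("module.py").read())` on a file of your own (the file name here is only a placeholder) gives you a list of candidates for refactoring. Note that this counts physical lines, so blank lines and comments inside a function also add to its length.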
LoC is not a linear estimation characteristic
There is a very good book about software estimation: Software Estimation: Demystifying the Black Art by Steve McConnell. Before using LoC as a silver bullet, I suggest you come back down to earth and read this book. Here is one example from it:
[Using software industry productivity averages], the 10,000 LOC system would require 13.5 staff months. If effort increased linearly, a 100,000 LOC system would require 135 staff months. But it actually requires 170 staff months.
To get a better idea of the difference between the linear and the real estimate, take a look at the following chart.
I think an error of 35 staff months is a pretty horrible experience for any budget, isn't it?
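The arithmetic behind that gap is easy to reproduce. The sketch below uses only the figures quoted above; the function name `linear_estimate` is hypothetical, and the code shows the naive extrapolation, not McConnell's estimation model.

```python
# Figures quoted from the book excerpt above.
SMALL_LOC = 10_000
SMALL_EFFORT = 13.5          # staff months for the 10,000 LOC system
LARGE_LOC = 100_000
ACTUAL_LARGE_EFFORT = 170.0  # staff months actually required


def linear_estimate(loc, ref_loc=SMALL_LOC, ref_effort=SMALL_EFFORT):
    """Naive estimate: assume effort grows in direct proportion to size."""
    return ref_effort * loc / ref_loc


linear = linear_estimate(LARGE_LOC)       # 135 staff months
gap = ACTUAL_LARGE_EFFORT - linear        # 35 staff months
relative_gap = gap / linear               # share of the linear estimate

print(f"linear: {linear}, actual: {ACTUAL_LARGE_EFFORT}, gap: {gap}")
print(f"the linear estimate is exceeded by {relative_gap:.0%}")
```

In other words, the system is ten times bigger, but the effort is about 12.6 times bigger, so the linear estimate falls short by roughly a quarter.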
Productivity cannot be measured by LoC
One of the classic mistakes is using LoC to measure programmer productivity. It is nonsense. One complex algorithm may take only about 100 lines of code, yet the time needed to make it work may equal the time needed for a system that has 10,000 lines of code or even more. The difference in complexity is huge. For example, writing an ASP.NET MVC application is pretty easy and straightforward compared to the algorithm I mentioned.
Also, how can a programmer who wrote 1,000 lines of code be more effective than a programmer who wrote 20 lines and achieved the same or even better functionality? I see one more danger here: why should programmers write effective and easy-to-manage code if they are rewarded for writing much less effective and far longer code? Measuring productivity this way makes strong professionals look like dead weight on the team. Do you really want to disrespect or even lose your main workhorses?
Types of LoC
There are two LoC metrics, and they differ in how they are measured:
- logical LoC contains only lines of executed code; definitions, namespace imports and so on are not considered executed code. As Patrick Smacchia points out in his blog post How do you count your number of Lines Of Code (LOC)?, logical LoC is the better characteristic because it does not depend on coding style and language.
- physical LoC contains all the lines of code and is measured by parsing the source code files. Take a look at Hackles to get a better idea of physical LoC.
It turns out that logical LoC, however you measure it, is way better than physical LoC because it contains less noise.
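The difference between the two counts can be sketched in a few lines. This is only a rough approximation under my own assumptions: it works on Python source, it treats every statement except imports and function or class definitions as "executed code", and the names `physical_loc` and `approx_logical_loc` are hypothetical. Real tools for .NET define and measure logical LoC in their own way.

```python
import ast

# Statement types treated here as definitions or imports, not executed code.
NOT_EXECUTED = (ast.Import, ast.ImportFrom,
                ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def physical_loc(source):
    """Count every line of the file, including blanks and comments."""
    return len(source.splitlines())


def approx_logical_loc(source):
    """Count statements, skipping imports and definitions (an approximation)."""
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.stmt) and not isinstance(node, NOT_EXECUTED)
    )


sample = '''import math

# Area of a circle.
def area(radius):

    result = math.pi * radius ** 2
    return result
'''

print(physical_loc(sample), approx_logical_loc(sample))
```

For this sample the physical count is 7 and the approximate logical count is 2. Removing the blank lines and the comment changes the first number but not the second, which is exactly the "less noise" argument. One caveat of this sketch is that a docstring would be counted as a statement.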
LoC is a good metric for measuring the size of code units. It can also be used as an estimation metric, but only within very narrow and restrictive limits. It is something you can use when you start estimating, but you have to drop it as soon as you find a more exact estimation method. You cannot use LoC to measure the progress of a project or the productivity of programmers, so don't even think about it. 🙂