Points of View — Source Lines of Code as a Measure of Performance

Glen Alleman at Herding Cats has a post on the use of source lines of code (SLOC) as a measure in project management.  He expresses the opinion that SLOC is an important measure for determining cost and schedule, a critical success factor, in what he narrowly defines as Software Intensive Systems.  Such systems are described as those that are development intensive and consist largely of embedded code.  The Wikipedia definition of an embedded system is as follows:  “An embedded system is a computer system with a dedicated function within a larger mechanical or electrical system, often with real-time computing constraints. It is embedded as part of a complete device often including hardware and mechanical parts. Embedded systems control many devices in common use today.”  Setting up what can only be described as a strawman argument, he asserts of the criticism of SLOC’s effectiveness: “It’s one of those irrationally held truths that has been passed down from on high by those NOT working in the domains where SLOC is a critical measure of project and system performance.”

Hmmm.  I…don’t…think…so.  I must respectfully disagree with my colleague’s generalization.

What are we measuring when we measure SLOC?  And which SLOC measure are we using?

Oftentimes we are measuring an estimate of what we think the number of effective and executable lines of code needed to achieve the desired functionality will be, given the language in which we are developing (an estimate usually based, in real life, on the systematic use of the Wild-Assed Guess, or “WAG”).  No doubt there are parametric data sets, usually based on a static code environment, that will tell us the range of SLOC that should yield a release, given certain assumptions.  But this is systems estimation, not project management and execution.  Estimates are useful in the systems engineering process for sizing and anticipating effort through methods such as COCOMO, SEER-SEM, and other estimating models, in a very specific subset of projects where the technology is well defined and the code base mature and static, and with very specific limitations.
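To make the point concrete, here is a minimal sketch (in Python, purely illustrative) of the Basic COCOMO relationships, using the published coefficients for the “embedded” mode.  Note that the only input is the estimated size in thousands of SLOC (which is precisely the WAG under discussion), so the output can be no better than that guess.

# Minimal sketch of the Basic COCOMO relationships (Boehm, 1981): effort and
# schedule are driven entirely by an estimated size in thousands of SLOC (KSLOC).
# The model is only as good as the KSLOC guess that is fed into it.

COEFFICIENTS = {
    # mode: (a, b, c, d) where Effort = a * KSLOC**b and Schedule = c * Effort**d
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(ksloc, mode="embedded"):
    """Return (effort in person-months, schedule in months) for an estimated size."""
    a, b, c, d = COEFFICIENTS[mode]
    effort = a * ksloc ** b
    schedule = c * effort ** d
    return effort, schedule

if __name__ == "__main__":
    # A 20% swing in the size estimate produces a larger swing in estimated effort.
    for ksloc in (50, 60):
        effort, months = basic_cocomo(ksloc, "embedded")
        print(f"{ksloc} KSLOC -> {effort:.0f} person-months over {months:.1f} months")

Running the sketch shows that a 20% swing in the size guess produces an even larger swing in estimated effort, which is one more reason these models belong to systems estimation rather than to project execution.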

SLOC will not, by itself, provide an indication of a working product.  It is, instead, part of the data stream in the code development production process.  What this means is that the data must be further refined to determine effectiveness before it can serve as a true critical success factor.  Robert Park, at the Software Engineering Institute (SEI) at Carnegie Mellon University, effectively summarizes the history of SLOC and the difficulties in defining and applying it.  Even among supporters of the metric, a number of papers, such as the one by Nguyen, Deeds-Rubin, Tan, and Boehm of the Center for Systems and Software Engineering at the University of Southern California, articulate the difficulty of specifying a counting standard.
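A toy illustration of the counting-standard problem (the rules below are deliberately simplified assumptions, not any published standard): the same small fragment of code yields three different “SLOC” figures depending on whether we count physical lines, exclude blanks and comments, or count logical statements.

# Toy illustration of why SLOC needs a counting standard: the same source text
# yields different counts under different (deliberately simplified) rules.

SAMPLE = '''\
// swap two ints
void swap(int *a, int *b) {
    int t = *a;

    *a = *b; *b = t;   // two statements on one physical line
}
'''

def physical_lines(text):
    """Every physical line, including blanks and comments."""
    return len(text.splitlines())

def noncomment_nonblank(text):
    """Physical lines, excluding blanks and whole-line comments."""
    lines = [ln.strip() for ln in text.splitlines()]
    return sum(1 for ln in lines if ln and not ln.startswith("//"))

def rough_logical_statements(text):
    """Crude 'logical SLOC': count statement terminators."""
    return text.count(";")

if __name__ == "__main__":
    for rule in (physical_lines, noncomment_nonblank, rough_logical_statements):
        print(f"{rule.__name__}: {rule(SAMPLE)}")

Multiply that ambiguity across languages, code generators, and counting tools, and the comparability problem the USC paper describes becomes obvious.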

The Software Technology Support Center at Hill Air Force Base, in its GSAM 3.0, has this to say about SLOC:

Source lines-of-code are easy to count and most existing software estimating models use SLOCs as the key input.  However, it is virtually impossible to estimate SLOC from initial requirements statements.  Their use in estimation requires a level of detail that is hard to achieve (i.e., the planner must often estimate the SLOC to be produced before sufficient detail is available to accurately do so.)
Because SLOCs are language-specific, the definition of how SLOCs are counted has been troublesome to standardize.  This makes comparisons of size estimates between applications written in different programming languages difficult even though conversion factors are available.

What I have learned (through actual experience, having come from the software domain first as a programmer and then as a program manager) is that there is a lot of variation in the elegance of produced code.  When we use the term “elegance” we are not using a woo-woo term to obscure meaning.  It is a useful term that connotes both simplicity and effectiveness.  For example, in C programming language environments (and their successors), the difference in SLOC between a good developer and a run-of-the-mill hack who recycles code by cut-and-paste can be 20% or more.  We find evidence of this variation in the details underlying the high rate of software project failure noted in my previous posts and in my article on Black Swans at AITS.org.  A 20% difference in executable code translates not only into cost and schedule performance; the manner in which the code is written also translates into qualitative differences in the final product, such as its scalability and sustainability.
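As a contrived illustration of what that variation looks like in practice (Python rather than C, and the data are invented), the two functions below produce identical results; one recycles the same logic by cut-and-paste, the other expresses it once and keeps working when the inputs grow.

# Two functionally identical ways to total quarterly sales by region.
# The cut-and-paste version repeats the same logic four times; the refactored
# version expresses it once. Same output, very different line counts.

SALES = {
    "north": [10, 12, 9, 14],
    "south": [8, 7, 11, 10],
    "east":  [15, 13, 12, 16],
    "west":  [9, 11, 10, 8],
}

def totals_copy_paste(sales):
    north_total = sales["north"][0] + sales["north"][1] + sales["north"][2] + sales["north"][3]
    south_total = sales["south"][0] + sales["south"][1] + sales["south"][2] + sales["south"][3]
    east_total = sales["east"][0] + sales["east"][1] + sales["east"][2] + sales["east"][3]
    west_total = sales["west"][0] + sales["west"][1] + sales["west"][2] + sales["west"][3]
    return {"north": north_total, "south": south_total,
            "east": east_total, "west": west_total}

def totals_refactored(sales):
    # One line of logic, and it still works when a fifth region or quarter appears.
    return {region: sum(quarters) for region, quarters in sales.items()}

if __name__ == "__main__":
    assert totals_copy_paste(SALES) == totals_refactored(SALES)
    print(totals_refactored(SALES))

The shorter version is not merely fewer lines; it is also the one that scales and is easier to sustain, which is the qualitative difference referred to above.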

But more to the point, our systems engineering practices seem to contribute to suboptimization.  An example of this was articulated by Steve Ballmer in the movie Triumph of the Nerds where he voiced the very practical financial impact of the SLOC measure:

In IBM there’s a religion in software that says you have to count K-LOCs, and a K-LOC is a thousand lines of code.  How big a project is it?  Oh, it’s sort of a 10K-LOC project.  This is a 20K-LOCer.  And this is 50K-LOCs. And IBM wanted to sort of make it the religion about how we got paid.  How much money we made off OS/2 how much they did.  How many K-LOCs did you do?  And we kept trying to convince them – hey, if we have – a developer’s got a good idea and he can get something done in 4K-LOCs instead of 20K-LOCs, should we make less money?  Because he’s made something smaller and faster, less K-LOC. K-LOCs, K-LOCs, that’s the methodology.  Ugh!  Anyway, that always makes my back just crinkle up at the thought of the whole thing.

Thus, it is not that SLOC is not a metric to be collected; it is just that, given developments in software technology, and especially the introduction of fourth-generation programming languages, SLOC has a place, and that place is becoming less and less significant.  Furthermore, the institutionalization of SLOC may represent a significant barrier to technological innovation, preventing us from leveraging the advantages provided by Moore’s Law.  In technology, such bureaucratization is the last thing that is needed.

Mo’Better Risk — Tournaments and Games of Failure Part II

My last post discussed economic tournaments and games of failure and how they describe the success and failure of companies, with a comic example for IT start-up companies.  Glen Alleman at his Herding Cats blog has a more serious response, handily rebutting those who believe that #NoEstimates, Lean, Agile, and other cult-like fads can overcome the bottom line; that is, that they offer a method to reduce inherent risk and drive success.  As Glen writes:

“It’s about the money. It’s always about the money. Many want it to be about them or their colleagues, or the work environment, or the learning opportunities, or the self actualization.” — Glen Alleman, Herding Cats

Perfectly good products and companies fail all the time.  Oftentimes the best products fail to win the market, or do so only fleetingly.  Just think of the rolls of the dead (or walking dead) over the years: Novell, WordPerfect, VisiCalc, Harvard Graphics; the list can go on and on.  Thus, one point where I would deviate from Glen is that it is not always EBITDA.  If that were true, then neither Facebook nor Amazon would be around today.  We see tremendous payouts to companies with promising technologies, acquired for outrageous sums of money though they have yet to make a profit.  But for every one of these there are many others that see the light of day for a moment and then flicker out of existence.

So what is going on, and how does this inform our knowledge of project management?  For the measure of our success is time and money, in most cases.  Obviously not all cases.  I’ve given two cases of success that appeared to be failure in previous posts to this blog: the M1A1 Tank and the ACA.  The reason these “failures” were misdiagnosed is that the agreed-upon measure(s) of success were incorrect.  Knowing this difference, and where and how it applies, is important.

So how do tournaments and games of failure play a role in project management?  I submit that the lesson to be drawn from these observations is that certain types of behaviors are encouraged that tend to “bake” certain risks into our projects.  In high tech we know that there will be a thousand failures for every success, but it is important to keep the players playing; at least it is in the interest of the acquiring organization to do so, and it is in the public interest in many cases as well.  We also know that most IT projects, both contracted out and organic, tend by most measures to realize a high rate of failure.  But if you win an important contract or secure an important project, the rewards can be significant.

The behavior that is reinforced in this scenario on the part of the competing organization is to underestimate the cost and time involved in the effort; that is, the so-called “bid to win.”  On the acquiring organization’s part, contracting officers lately have been all too happy to award contracts at prices they know to be too low (and normally outside the competitive range), even though they realize those prices are significantly below the independent estimate.  Thus “buying in” introduces a significant risk that is hard to overcome.

Other behaviors that we see, given the project ecosystem, are a bias toward optimism and requirements instability.

In the first case, the bias toward optimism, we often hear project and program managers dismiss bad news because it is “looking in the rear view mirror.”  We are “exploring,” we are told, and so the end state will not be dictated by history.  We often hear a version of this meme in cases where those in power wish to avoid accountability.  “Mistakes were made” and “we are focused on the future” are attempts to change the subject and avoid the reckoning that will come.  In most cases, however, particularly in project management, the motivations are not dishonest but, instead, sociological and psychological.  People who build things (engineers in general, software coders, designers, etc.) tend to be an optimistic lot.  In very few cases will you find one of them who will refuse to take on a challenge.  How many times have we presented a challenge to someone with these traits and heard the refrain “I can do that”?  This form of self-delusion can be both an asset and a risk.  Who but an optimist would take on any technically challenging project?  But this is also the trait that will keep people working to the bitter end in a failure that places the entire enterprise at risk.

I have already spent some bits in previous posts on the instability of requirements, but this is part and parcel of the traits that we see within this framework.  Our end users determine that, given how things are going, we really need additional functionality, features, or improvements prior to the product rollout.  Our technical personnel determine that for “just a bit more effort” they can achieve a higher level of performance or add capabilities at marginal or tradeoff cost.  In many cases, given the realization that the acquisition was a buy-in, project and program managers allow great latitude in accepting as changes items that were assumed to be in the original scope.

There is a point where one or more of these factors is “baked” into the course that the project will take.  We can delude ourselves into believing that we can change the trajectory of the system through the application of methods (Agile, Lean, Six Sigma, PMBOK, etc.), but, in the end, if we exhaust our resources without a road map for how to do this, we will fail.  Our systems must be powerful and discrete enough to note the trend that is “baked in” due to factors in the structure and architecture of the effort being undertaken.  This is the core risk that must be managed in any undertaking.  A good example that applies to a complex topic like Global Warming was recently illustrated by Neil deGrasse Tyson in the series Cosmos, in a segment in which he walks a dog along a beach:

In this example Dr. Tyson is climate and the dog is the weather.  In our own analogy, Dr. Tyson can be the trajectory of the system, with the dog representing the “noise” of periodic indicators and activity around the effort.  We often spend a lot of time and effort (which I would argue is largely unproductive) on influencing these transient conditions in simpler systems rather than on the core inertia of the system itself.  That is where the risk lies.  Thus, not all indicators are the same.  Some measure transient anomalies that have nothing to do with changing the core direction of the system; others are more valuable.  These latter indicators are the ones that we need to cultivate and develop, and they reside in an initial measurement of the inherent risk of the system, based largely on its architecture, that is antecedent to the start of the work.
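A rough sketch of the distinction, using synthetic data (the numbers are invented purely for illustration): the month-to-month movement of an indicator (the dog) can look encouraging even while a simple fit over the whole series (the walker) shows the underlying trajectory eroding.

# Synthetic illustration of trend versus noise in a performance indicator.
# Month-to-month movement (the dog) wanders; a least-squares slope over the
# whole series (the walker) exposes the underlying trajectory.
import random

random.seed(1)

months = list(range(24))
# Underlying trajectory: an index eroding by 0.01 per month, plus transient
# noise large enough to make any single month look like an improvement.
indicator = [1.0 - 0.01 * m + random.gauss(0, 0.03) for m in months]

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

monthly_deltas = [b - a for a, b in zip(indicator, indicator[1:])]
improving_months = sum(1 for d in monthly_deltas if d > 0)

print(f"months that looked like an improvement: {improving_months} of {len(monthly_deltas)}")
print(f"fitted trend per month: {slope(months, indicator):+.4f}")

The point is not the particular fit; it is that the indicators worth cultivating are the ones that expose the trajectory rather than the transient wobble around it.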

This is not to say that we can do nothing about the trajectory.  A simpler system can be influenced more easily.  We cannot recover the effort already expended, which is why even historical indicators are important: they inform our future expectations and, if we pay attention to them, keep us grounded in reality.  Even in the case of Global Warming we can change, though gradually, what will be a disastrous result if we allow things to continue on their present course.  In a deterministic universe we can influence outcomes based on the contingent probabilities presented to us over time.  Thus, we will know whether we have handled the core risk of the system by focusing on these better indicators as the effort progresses.  This will affect its trajectory.

Of course, a more direct way of modifying these risks is to make systemic adjustments.  Do we really need a tournament-based system as it exists, and is the waste inherent in accepting so much failure really necessary?  What would the alternative look like?

Livin’ on a Prayer — The Importance of Plan B

Glen Alleman over at Herding Cats has a great presentation up, based on the Shackleton Expedition, on the importance of handling risk by having a Plan B.  This is an important point, and one that goes against the oft-heard assertion, particularly in software development, that we are “exploring,” that our systems are evolutionary, that we are delivering value one increment at a time.

Murphy’s Law of combat operations states that “No OPLAN ever survives initial contact.”  This experience is in line with Eisenhower’s comment that “When preparing for battle, I have always found that plans are useless but planning is indispensable.”  What these observations mean is that once a plan is tested by reality (the reality of combat in the examples given), almost nothing will go according to it.

As part of operational planning, the staff identifies risks, alternatives, and contingencies.  Everyone in the planning process is made familiar with these alternatives: Plan B, Plan C, and so on.  We may “think” that the plan we have chosen is the best one, based as it is on certain assumptions and the 80% solution.  But in the midst of an operation no one can anticipate everything.  Knowing, however, that what one is seeing is very much in line with one of the alternative scenarios developed during the planning process allows us to initiate that alternative plan.  Rather than having to throw everything out, including the progress that has been made, or having to improvise from zero, we have a basis for making well-informed decisions based on the alternatives.  To do otherwise is folly and may lead to defeat in the case of military operations.  For more workaday situations, like project management, to do otherwise is folly and may lead to project failure.

This is why our community should be following the ACA roll-out, regardless of the surrounding politics, as I stated in a previous blog post.  The ACA program is a fascinating real-life and highly visible experiment in program and project management.  Much of the publicity in the press focused on the roll-out of the federal government’s website.  But that focus fails to distinguish two important concepts: that a program is not the same as a project, and that the website was not the entire project for the initial enrollment period of the ACA.  It is, to quote Macbeth, “…a tale told by an idiot, full of sound and fury, signifying nothing.”

The reason for my unsympathetic assessment of the critics of the roll-out is that they are using the wrong measures of success; the measure that matters is the number of people ultimately enrolled (now projected to be about 7.8 million for the website alone, and between 14 and 20 million overall).  There was a Plan B and a Plan C.  The facilitators turned out to be very important in the early stages and represented an effective Plan B.  Adjustments to the enrollment period also allowed for some flexibility, giving the digital systems time to recover, and provided a Plan C.

There will be more detailed postmortems as the players begin to publish once the dust has settled.  Early controversy within the IT community has focused on whether the failure rested with Agile or with Waterfall.  I think this is a false debate, since no software methodology can credibly claim to inherently handle risk, or to rise to the level of a project management method.  I think the real issues of interest to IT professionals will be testing and recovery: the former because early reports were that testing was insufficient, and the latter because the recovery was remarkably fast, which undermines the credibility of the critics of the testing.