Ace of Base(line) — A New Paper on Building a Credible PMB

Glen Alleman, a leading consultant in program management (who also has a blog that I follow), Tom Coonce of the Institute for Defense Analyses, and Rick Price of Lockheed Martin have jointly published a new paper in the College of Performance Management’s Measurable News entitled “Building A Credible Performance Measurement Baseline.”

The elements of their proposal for constructing a credible PMB, from my initial reading, are as follows:

1.  Rather than a statement of requirements, decision-makers should first conduct a capabilities gap analysis to determine the units of effectiveness and performance.  This ensures that they have a good idea of what “done” looks like, and that performance measurements aren’t disconnected from these essential elements of programmatic success.

2.  Following from item 1 above, the technical plan and the programmatic plan should always be in sync.

3.  Earned value management is but one of many methods for assessing programmatic performance in its present state.  At least that is how I interpret what they are saying, because later in their paper they propose a way to ensure that EVM does not stray from the elements that define technical achievement.  But EVM in itself is not the be-all and end-all of performance management, and it fails in many ways to anticipate where the technical and programmatic plans diverge.

4.  All work in achieving the elements of effectiveness and performance is first constructed and given structure in the WBS.  Thus, the WBS ties together all elements of the project plan.  In addition, technical and programmatic risk must be assessed at this stage, rather than further down the line after the IMS has been constructed.

5.  The Integrated Master Plan (IMP) is constructed to incorporate the high-level work plans that are manifested through major programmatic events and milestones.  It is through the IMP that EVM is then connected to the technical performance measures that affect the assessment of work package completion reflected in the detailed Integrated Master Schedule (IMS).  This not only establishes the importance of the IMP in linking the technical and programmatic plans, but also makes the IMP an essential artifact that has all too often been seen as optional.  That probably explains why so many project managers are “surprised” when they construct aircraft that can’t land on the deck of a carrier, or satellites that can’t communicate in orbit, even though they are well within the tolerance bands of cost and schedule variances.

6.  Construct the IMS taking into account the technical, qualitative, and quantitative risks associated with the events and milestones identified in the IMP.  Construct risk mitigation/handling where possible, and set aside both cost and schedule margin for irreducible uncertainties and management reserve (MR) for reducible risks, keeping in mind that margin sits within the PMB while MR sits above the PMB but within the CBB.  Furthermore, schedule margin should be transitioned from a deterministic to a probabilistic basis, constructing sufficient margin to protect essential activities (see the sketch following this list).  Cost margin in work packages should be constructed in the same manner, based on probabilistic models that determine the chances of reducing a risk until it reaches the point of irreducibility.  Once again, all of these elements tie back to the WBS.

7.  Cost and schedule margin are not the same as slack or float.  Margin is reserve.  Slack or float is equivalent to overruns and underruns.  The practical challenge here will be getting the oversight agencies to leave margin alone.  All too often it is viewed as “free” money to be harvested.

8.  Cost, schedule, and technical performance measurement, tied together at the elemental level of work, informing each other as a cohesive set of interrelated indicators, and tied back to the WBS, is the only valid method of ensuring accurate project performance measurement and the basis for programmatic success.
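To make item 6 concrete, below is a minimal sketch of what moving schedule margin from a deterministic to a probabilistic basis might look like.  The task durations, the triangular distributions, and the 80% confidence target are all illustrative assumptions of mine, not figures from the paper.

```python
import random

# Hypothetical protected path: three serial tasks feeding a key IMP event.
# (optimistic, most likely, pessimistic) durations in days -- illustrative only.
tasks = [(8, 10, 15), (18, 20, 30), (4, 5, 9)]

# Deterministic single-point estimate: sum of the most-likely durations.
deterministic = sum(most_likely for _, most_likely, _ in tasks)

def simulate_path(tasks):
    """One Monte Carlo trial: sample each task duration and sum the path."""
    return sum(random.triangular(low, high, mode) for low, mode, high in tasks)

trials = sorted(simulate_path(tasks) for _ in range(10_000))
p80 = trials[int(0.80 * len(trials))]  # 80th-percentile path duration

# Probabilistic schedule margin: the buffer that protects the IMP event at
# the chosen confidence level, carried as margin inside the PMB.
margin = p80 - deterministic
print(f"Deterministic duration: {deterministic} days")
print(f"80% confidence duration: {p80:.1f} days")
print(f"Schedule margin to set aside: {margin:.1f} days")
```

The same pattern would apply to cost margin in work packages: replace durations with cost distributions and read the margin off the percentile that matches the program’s risk tolerance.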

Most interestingly, the authors conclude with a simplified historical case showing how their method proves out as both a common-sense and completely reasonable approach: the Wright brothers’ proof of concept for the U.S. Army in 1908.  The historical documents in that case show that the Army had constructed elements of effectiveness and performance in determining whether it would purchase an airplane from the brothers.  All measures of project success and failure would be assessed against those elements, which combined cost, schedule, and technical achievement.  I was particularly intrigued that the weight of the aircraft was part of the assessment, a common point of argument from critics of the use of technical performance, and the paper demonstrates how the Wright brothers actually assessed and mitigated the risk associated with that measure of performance over time.

My initial impression of the paper is that it is a significant step forward in bringing together the practical lessons learned from both the successes and failures of project performance.  Their recommendations are a welcome corrective to many of the deficiencies implicit in our project management systems and procedures.

I also believe that, by making technical performance an integral part of constructing the project artifacts, their approach is superior to the one I initially proposed in 1997, which assumed that TPM would always be applied as an additional process informing cost and schedule at the end of each assessment period.  I look forward to hearing the presentation at the next Integrated Program Management Conference, at which I will attempt some live blogging.

Frame by Frame: Framing Assumptions and Project Success or Failure

When we wake up in the morning we enter the day with a set of assumptions about ourselves, our environment, and the world around us.  So too when we undertake projects.  I’ve just returned from the latest NDIA IPMD meeting in Washington, D.C., and the most intriguing presentation there was given by Irv Blickstein, regarding a RAND root cause analysis of major program breaches.  In short, under the Nunn-McCurdy amendment, first passed in 1982, a major breach occurs when a major defense program exceeds its projected baseline cost by more than 15%.

The issue of what constitutes programmatic success and failure has generated a fair amount of discussion among the readers of this blog.  The report, which is linked above, is full of useful information regarding Major Defense Acquisition Program (MDAP) breaches under Nunn-McCurdy, but for purposes of this post readers should turn to page 83.  In setting up a project (or program), project/program managers must make a set of assumptions regarding the “uncertain elements of program execution” centered around cost, technical performance, and schedule.  These assumptions are what the authors refer to as “framing assumptions.”

A framing assumption is one for which there are signposts along the way to determine whether an assumption regarding the project/program has changed over time.  Thus, according to the authors, the precise definition of a framing assumption is “any explicit or implicit assumption that is central in shaping cost, schedule, or performance expectations.”  An interesting aspect of their perspective and study is that the three-legged stool of program performance relegates risk to serving as a method that informs the three key elements of program execution, not as one of the three elements.  I have engaged in several conversations over the last two weeks regarding this issue.  Oftentimes the question goes: can’t we incorporate technical performance as an element of risk?  Short answer: no, you can’t (or shouldn’t).  Long answer: risk is a set of methods for overcoming the implicit invalidity of the single-point estimates found in too many of the systems we use (estimates-at-complete, estimates-to-complete, and the various indices found in earned value management), as well as a means of incorporating qualitative environmental factors not otherwise categorizable; it is not an element essential to defining the end-item application being developed and produced.  Looked at another way, if you are writing a performance specification, then performance is a key determinant of program success.
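To illustrate the “long answer,” here is a minimal sketch of why a single-point estimate-at-complete can mislead.  The BAC, earned value, and actual cost figures are invented, and the set of plausible future efficiencies is an assumption of mine; the point is only that one CPI yields one EAC, while admitting uncertainty about future efficiency yields a range.

```python
import statistics

# Illustrative numbers only (not from any real program).
bac = 1_000_000.0   # budget at complete
ev  = 250_000.0     # budgeted cost of work performed to date
ac  = 300_000.0     # actual cost of work performed to date

cpi = ev / ac                # cumulative cost performance index
point_eac = bac / cpi        # the classic single-point EAC

# The single-point EAC silently assumes the current CPI holds forever.
# Treating future efficiency as uncertain -- here, a handful of plausible
# future CPIs -- turns one number into a range of outcomes.
future_cpis = [0.75, 0.80, 0.83, 0.90, 0.95]
eacs = [ac + (bac - ev) / f for f in future_cpis]  # EAC = AC + (BAC - EV) / CPI_future

print(f"Point EAC:  {point_eac:,.0f}")
print(f"EAC range:  {min(eacs):,.0f} to {max(eacs):,.0f}")
print(f"Median EAC: {statistics.median(eacs):,.0f}")
```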

Additional criteria for a framing assumption are also provided in the RAND study.  The assumptions must be determinative, that is, the consequences of the assumption being wrong significantly affect the program in an essential way.  They must be unmitigable, that is, the consequences of the assumption being wrong are unavoidable.  They must be uncertain, that is, whether they are right or wrong cannot be determined in advance.  They must be independent, not dependent on another event or series of events.  Finally, they must be distinctive, setting the program apart from other efforts.
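Read together, the five criteria work as a screening test: a candidate assumption qualifies as a framing assumption only if all five hold.  The encoding and the example assumption below are my own hypothetical illustration, not part of the RAND study.

```python
from dataclasses import dataclass

@dataclass
class CandidateAssumption:
    """A program assumption screened against the five RAND criteria."""
    text: str
    determinative: bool  # being wrong significantly affects the program
    unmitigable: bool    # consequences of being wrong are unavoidable
    uncertain: bool      # cannot be proven right or wrong in advance
    independent: bool    # not dependent on another event or series of events
    distinctive: bool    # sets the program apart from other efforts

    def is_framing_assumption(self) -> bool:
        # All five criteria must hold for the assumption to "frame" the program.
        return all([self.determinative, self.unmitigable, self.uncertain,
                    self.independent, self.distinctive])

# Hypothetical example, not drawn from the study:
candidate = CandidateAssumption(
    text="Engine X will reach maturity before airframe integration",
    determinative=True, unmitigable=True, uncertain=True,
    independent=True, distinctive=True,
)
print(candidate.is_framing_assumption())  # True
```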

RAND then applied the framing assumption methodology to a number of programs.  The latest NDIA meeting was an opportunity to provide an update of conclusions based on the work first done in 2013.  What the researchers found was that framing assumptions should be kept at a high level, developed early in a program’s life cycle, and reviewed on a regular basis to determine their continuing validity.  They also found that a program breached the threshold when a framing assumption became invalid.  Project and program managers, as well as requirements personnel, have at least intuitively known this for quite some time.  Over the years, this is the reason given for the requirements changes and contract modifications over the course of development that result in cost, performance, and schedule impacts.

What is different about the RAND study is that it outlines a practical process for making these determinations early enough for a project/program to be adjusted to changing circumstances.  For example, the framing assumptions of each MDAP in the study could be boiled down to four or five, which are easily tested against reality during the milestone and other reviews held over the course of a program.  This is particularly important given the lengthened time-frames of major acquisitions from development to production.

Looking at these results, my own observation is that this is a useful tool for identifying needed course corrections before problems manifest themselves in cost and schedule impacts, particularly given that leadership at PARCA has been stressing agile acquisition strategies.  The goal here, it seems, is to allow for course corrections before the inertia of the effort leads to failure or, more likely, the development and deployment of an end item that does not entirely meet the needs of the Defense Department.  (That such “disappointments” often far outstrip the capabilities of our adversaries is a topic for a different post.)

I think the jury is still out on whether course corrections, given the inertia of work and effort already expended by the point at which a framing assumption would be tested as invalid, can ever truly be offsetting to the point of avoiding a breach, unless we then rebrand the existing effort as a new program once it has modified its structure to account for new framing assumptions.  Study after study has shown that project performance is pretty well baked in at the 20% mark.  For MDAPs, much of the front-loaded effort in technology selection and application has been made by then.  After all, systems require inputs, and to change a system requires more inputs, not fewer, to overcome the inertia of all of the previous effort, not to mention work in progress.  This is basic physics, whether we are dealing with physical systems or complex adaptive (economic) systems.

Certainly, more efficient technology that affects the units of measurement within program performance can result in cost savings or avoidance, but that is usually not the case.  There is a bit of magical thinking here: that commercial technologies will provide a breakthrough to allow for such a positive effect.  This is an ideological idea not borne out by reality.  The fact is that most of the significant technological breakthroughs we have seen over the last 70 years–from the microchip to the internet and now to drones–have resulted from public investments, sometimes in public-private ventures, sometimes in seeded technologies that are then released into the public domain.  The purpose of most developmental programs is to invest in R&D to organically develop technologies (utilizing the talents of the quasi-private A&D industry) or provide economic incentives to incorporate technologies that do not currently exist.

Regardless, the RAND study has identified an important concept in determining the root causes of overruns.  It seems to me that a formalized process of identifying framing assumptions should be applied at the inception of the program.  The majority of the assessments that test the framing assumptions should then be made prior to the 20% mark, as measured by program schedule and effort.  It is easier and more realistic to overcome the bow-wave of effort at that point than further down the line.

Note: I have modified the post to clarify my analysis of the “three-legged stool” of program performance in regard to where risk resides.

I’ve Got Your Number — Types of Project Measurement and Services Contracts

Glen Alleman reminds us at his blog that we measure things for a reason, and that our measures fall into three general types: measures of effectiveness, measures of performance, and key performance parameters.

Understanding the difference between these types of measurement is key, I think, to defining what we mean by such terms as integrated project management and in understanding the significance of differing project and contract management approaches based on industry and contract type.

For example, project management focused on commodities, with their price volatility, emphasizes schedule and resource management. Cost performance (earned value), where it exists, is measured by time in lieu of volume- or value-based performance. I have often been engaged in testy conversations in which those involved in commodity-based PM insist that they have been using Earned Value Management (EVM) for as long as the U.S.-based aerospace and defense industry (though the methodology was born in the latter). But when one scratches the surface, the details of how value and performance are determined differ markedly, and so they should, given the different business environments in which enterprises in each of these industries operate.

So what is the difference between these measures? Borrowing from Glen’s categories, I would like to posit a simple definitional model as follows:

Measures of Effectiveness – qualitative measures of achievement against the goals of the project;

Measures of Performance – quantitative measures against a plan or baseline in execution of the project plan; and

Key Performance Parameters – the minimally acceptable thresholds of achievement in the project or effort.
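Since the three categories differ in how they measure rather than in what they measure, a toy model may help fix the distinction. The classes and example values below are hypothetical illustrations of the definitions above, not anything drawn from Glen’s post. Note how the same underlying quantity can appear both as a performance measure and as a KPP, which is exactly where the overlap discussed next comes from.

```python
from dataclasses import dataclass

@dataclass
class MeasureOfEffectiveness:
    """Qualitative: achievement against the goals of the project."""
    goal: str
    assessment: str  # e.g., an operator's judgment, a survey result

@dataclass
class MeasureOfPerformance:
    """Quantitative: actual value against a planned baseline."""
    name: str
    planned: float
    actual: float

    def variance(self) -> float:
        return self.actual - self.planned

@dataclass
class KeyPerformanceParameter:
    """Threshold: the minimally acceptable value for the effort."""
    name: str
    threshold: float
    actual: float

    def acceptable(self) -> bool:
        return self.actual >= self.threshold

# Hypothetical values for illustration only.
mop = MeasureOfPerformance("range_km", planned=500.0, actual=470.0)
kpp = KeyPerformanceParameter("range_km", threshold=450.0, actual=470.0)
print(mop.variance())    # -30.0: behind plan (a performance measure)
print(kpp.acceptable())  # True: still above the minimum (a KPP)
```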

As you may guess, there is sometimes overlap and confusion regarding the category into which a particular measurement falls. This confusion has been exacerbated by efforts to define key performance indicators (KPIs) by industry, giving the impression that measures are exclusive to a particular activity. While this is sometimes the case, it is not always so.

So when we talk of integrated project management we are not accepting that any particular method of measurement has primacy over the others, nor subsumes them. Earned Value Management (EVM) and schedule performance are clearly performance measures. Qualitative measures oftentimes assess achievement of the technical aspects of the end-item application being produced. This is not the same as technical performance measurement (TPM), which measures technical achievement against a plan and is therefore a performance measure. Technical achievement may inform our performance measurement systems, and it is best if it does. It may also inform our key performance parameters, since exceeding a minimally acceptable threshold obviously helps us to determine success or failure in the end. The difference is the method of measurement. In a truly integrated system the measurement of one element informs the others. At present, however, these systems tend to be stove-piped.

It becomes clear, then, that approaches vary by industry, as in the EVM example above, and, in an example that I have seen most recently, by contract type. This insight is particularly important because all too often EVM is viewed as being synonymous with performance measurement, which it is not. Services contracts require structure in measurement as much as R&D-focused production contracts do, particularly because they increasingly take up a large part of an enterprise’s resources. But EVM may not be appropriate to them.

So for our notional example, let us say that we are responsible for managing an entity’s IT support organization. There are types of equipment (PCs, tablet computers, smartphones, etc.) that must be kept operational based on the importance of the end user. These items of hardware use firmware and software that must be updated and managed. Our contract establishes minimal operational parameters that allow us to determine if we are at least meeting the basic requirements and will not be terminated for cause. The contract also provides incentives to encourage us to exceed the minimums.

The sites we support are geographically dispersed. We have to maintain a help desk, but we also must have people who can come onsite and provide direct labor to set up new systems or fix existing ones, with sites and personnel supported within a particular time-frame: one hour, two hours, twenty-four hours, etc.

In setting up our measurement systems, the standard practice is to start with the key performance parameters. Typically we will also measure response times by site and personnel level, record our help desk calls, and track qualitative aspects of the work: How helpful is the help desk? Do calls get answered at the first contact? Are our personnel friendly and courteous? What kinds of hardware and software problems do we encounter? We collect our data from a variety of one-off and specialized sources, and then we generate reports from these systems. Many times we will focus on those that allow us to determine whether the incentive will be paid.

Among all of this data we may be able to discern certain things: if the contract is costing more or less than we anticipated, if we are fulfilling our contractual obligations, if our personnel pools are growing or shrinking, if we are good at what we do on a day-to-day basis, and if it looks as if our margin will be met. But what these systems do not do is allow us to operate the organization as a project, nor do they allow us to make adjustments in a timely manner.

Only through integration and aggregation can we know, for example: how the demand for certain services is affecting our resource demands by geographical location and level of service; where, on a real-time basis, we need to make adjustments in personnel and training; whether we are losing or achieving our margin by location, labor type, equipment type, and hardware versus software; what our balance sheets look like (by location, by equipment type, by software type, etc.); whether there is a learning curve; and whether we can make intermediate adjustments to achieve the incentive thresholds before the result is written in stone. Having this information also allows us to manage expectations, factually inform perceptions, and improve customer relations.
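As a minimal sketch of the kind of rollup meant here, assume we can pull individual ticket records from the one-off sources into one place. The record layout, site names, and response threshold below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical service tickets: (site, labor_type, response_hours, cost, billed)
tickets = [
    ("Norfolk",   "onsite",   1.5, 420.0, 600.0),
    ("Norfolk",   "helpdesk", 0.2,  35.0,  50.0),
    ("San Diego", "onsite",   3.0, 510.0, 600.0),
    ("San Diego", "onsite",   0.8, 390.0, 600.0),
]
response_threshold_hours = 2.0  # contractual response target (illustrative)

by_site = defaultdict(lambda: {"count": 0, "late": 0, "cost": 0.0, "billed": 0.0})
for site, labor, hours, cost, billed in tickets:
    row = by_site[site]
    row["count"] += 1
    row["late"] += hours > response_threshold_hours
    row["cost"] += cost
    row["billed"] += billed

for site, row in by_site.items():
    margin = (row["billed"] - row["cost"]) / row["billed"]
    on_time = 1 - row["late"] / row["count"]
    # The rollup managers never get from stove-piped reports: responsiveness
    # and margin together, by location, while there is still time to adjust.
    print(f"{site}: on-time {on_time:.0%}, margin {margin:.0%}")
```

The same dimensions (labor type, equipment type, hardware versus software) would simply be additional grouping keys in a real system.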

What is clear from this example is that “not doing EVM” does not make measurement easy, nor does it imply simplification or the absence of measurement. Instead, understanding the nature of the work allows us to identify the measures, within their proper categories, that need to be applied by contract type and/or industry. So while EVM may not apply to services contracts, we know that certain new aggregations do apply.

For many years we have intuitively known that construction and maintenance efforts are more schedule-focused, that oil and gas exploration is more resource- and risk-focused, and that aircraft, satellites, and ships are more performance-focused. I would posit that now is the time for us to quantify and formalize the commonalities and differences. This also makes an integrated approach not simply a “nice to have” capability, but an essential one in managing our enterprises and the projects within them.

Note: This post was updated to correct grammatical errors.