Measure for Measure — Must Read: Dave Gordon Is Looking for Utilitarian Metrics at AITS.org

Dave Gordon, at his AITS.org blog, deals with the issue of metrics and what makes them utilitarian, that is, “actionable.”  Furthermore, at his Practicing IT Project Management blog he challenges those in the IT program management community to share real-life examples.  The issue of measures and whether they pass the “so-what?” test is an important one, since chasing the wrong ones, and drawing improper conclusions from them, is a waste of money and effort at best, and can lead one to make very bad business decisions at worst.

In line with Dave’s challenge, listed below are the types of metrics (or measures) that I often come across.

1.  Measures of performance.  This type of metric is characterized by actual performance against a goal for a physical or functional attribute of the system being developed.  It can be measured across time as one of the axes, but the ultimate benchmark is the requirement or goal itself.  Technical performance measurements (TPM) often fall into this category, though I have seen instances where TPM is listed as its own category.  I would argue that such separation is artificial.

2.  Measures of progress.  This type of metric is time-based, typically measured against a schedule or plan.  Measurements of schedule variance in terms of time, or of expenditure rates against a budget, often fall into this category.

3.  Measures of compliance.  This type of metric measures systemic conditions that must be met; failure to meet them indicates a fatal error in the integrity of the system.

4.  Measures of effectiveness.  This type of metric tracks achievement of the operational objectives of the project, usually specified under particular conditions.

5.  Measures of risk.  This type of metric quantifies the effects of qualitative, systemic, and inherent risk.  Oftentimes qualitative and quantitative risk are treated separately, based on the means of identification and whether that means is recorded directly or indirectly.  But, in reality, they measure different aspects and causes of the same phenomenon.

6.  Measures of health.  This type of metric measures the relative health of a system against a set of criteria.  In medicine there is a set of routine measures for biological subjects.  Measures of health distinguish themselves from measures of compliance in that any variation, while indicative of a possible problem, is not necessarily fatal.  Thus, a range of acceptable indicators, or even some variation within the indicators, can be acceptable.  While these measures may point to a system issue, borderline readings may simply warrant additional investigation.
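
To make the distinction between measures of compliance and measures of health concrete, here is a minimal sketch in Python; the class names, thresholds, and example values are mine for illustration, not drawn from any particular standard.

```python
from dataclasses import dataclass

@dataclass
class ComplianceMeasure:
    """A systemic condition that must be met; failure is fatal."""
    name: str
    condition_met: bool

    def evaluate(self) -> str:
        return "pass" if self.condition_met else "FATAL: system integrity violated"

@dataclass
class HealthMeasure:
    """An indicator with an acceptable range; variation is not necessarily fatal."""
    name: str
    value: float
    low: float          # bottom of the acceptable range
    high: float         # top of the acceptable range
    borderline: float   # width of the band that warrants a closer look

    def evaluate(self) -> str:
        if self.low <= self.value <= self.high:
            # Borderline readings are acceptable but may warrant investigation.
            near_edge = (self.value - self.low < self.borderline
                         or self.high - self.value < self.borderline)
            return "acceptable (investigate)" if near_edge else "acceptable"
        return "out of range: possible system issue"

# Illustrative usage with made-up values:
print(ComplianceMeasure("baseline integrity check", condition_met=True).evaluate())
print(HealthMeasure("schedule performance index", value=0.93,
                    low=0.90, high=1.10, borderline=0.05).evaluate())
```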

In any project management system there are correct and incorrect ways of constructing these measures.  The basis for determining whether they are correct, I think, is whether the resulting metric possesses materiality and traceability to a particular tangible state or criterion.  According to Dave and others, a test of a good metric is whether it is “actionable.”  This is certainly a desirable characteristic, but I would suggest it is not a necessary one, and that it is already contained within materiality and traceability.

For example, some metrics are simply indicators that suggest further investigation; others suggest an action only when viewed in combination with other measures.  There is no doubt that the universe of “qualitative” measures is shrinking as we gain access to bigger and better data that provide us with quantification.  Furthermore, as stochastic and other mathematical tools develop, we will have access to more sophisticated means of measurement.  But for the present there will continue to be some non-quantifiable measures, if only because, with experience, we learn that there are dimensions of the behavior of complex adaptive systems over time that are yet to be fully understood, much less measured.

I also do not mean for this to be an exhaustive list.  Others that have some overlap with what I’ve listed come to mind, such as measures of efficiency (different from effectiveness and performance in some subtle ways), measures of credibility or fidelity (which have some overlap with measures of compliance and health, but really point to a measurement of measures), and measures of learning or adaptation, among others.

Days of Future Passed — Legacy Data and Project Parametrics

I’ve had a lot of discussions lately on data normalization, including being asked what constitutes normalization when dealing with legacy data, specifically in the field of project management.  A good primer can be found at About.com, but there are also very good older papers on the web from various university IS departments.  The basic principles of data normalization consist of finding a common location in the database for each value, reducing redundancy, properly establishing relationships among the data elements, and providing flexibility so that the data can be properly retrieved and further processed into intelligence in such a way that the objects produced possess significance.
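
As a rough illustration of what those principles look like in practice, here is a minimal sketch assuming a hypothetical flattened legacy extract; the table layout, labels, and values are invented.

```python
import sqlite3

# A hypothetical flattened legacy extract: every row repeats the project
# name, and the same metric label appears with inconsistent spellings.
legacy_rows = [
    ("Proj Alpha", "SPI", "0.95", "2015-01"),
    ("Proj Alpha", "spi", "0.97", "2015-02"),
    ("Proj Beta",  "CPI", "1.02", "2015-01"),
]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized schema: each value lives in one place, redundancy is reduced,
# and the relationships among data elements are made explicit.
cur.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE metric  (id INTEGER PRIMARY KEY, label TEXT UNIQUE);
    CREATE TABLE observation (
        project_id INTEGER REFERENCES project(id),
        metric_id  INTEGER REFERENCES metric(id),
        period     TEXT,
        value      REAL,
        UNIQUE (project_id, metric_id, period)
    );
""")

for proj, label, value, period in legacy_rows:
    label = label.upper()  # harmonize inconsistent legacy labels
    cur.execute("INSERT OR IGNORE INTO project (name) VALUES (?)", (proj,))
    cur.execute("INSERT OR IGNORE INTO metric (label) VALUES (?)", (label,))
    cur.execute("""INSERT INTO observation
                   SELECT p.id, m.id, ?, ?
                   FROM project p, metric m
                   WHERE p.name = ? AND m.label = ?""",
                (period, float(value), proj, label))

# The normalized form can now be flexibly retrieved and trended.
for row in cur.execute("""SELECT p.name, m.label, o.period, o.value
                          FROM observation o
                          JOIN project p ON p.id = o.project_id
                          JOIN metric  m ON m.id = o.metric_id
                          ORDER BY p.name, o.period"""):
    print(row)
```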

The reason why answering this question is so important is that our legacy data is of such a size and complexity that it falls into the broad category of Big Data.  The condition of the data itself varies widely in terms of quality and completeness.  Without understanding the context, interrelationships, and significance of the elements of the data, the empirical approach to project management is threatened, since our ability to use this data to establish trends and perform parametric analysis is limited.

A good paper that deals with this issue was authored by Alleman and Coonce, though it was limited to Earned Value Management (EVM).  I would argue that EVM, especially in the types of industries in which the discipline is used, is already pretty well structured.  The challenge is in the other areas that are probably of more significance in getting a fuller understanding of what is happening in the project: schedule, risk, and technical performance measures.

In looking at the Big Data that has been normalized to date–and I have participated with others in putting a significant dent in this area–it is apparent that processes in these other areas lack discipline, consistency, completeness, and veracity.  When we normalize data from sub-specialties that have experienced an erosion in the enforcement of standards of quality and consistency, technology becomes a driver for process improvement.

A greybeard in IT project management once said to me (and I am not long in joining that category): “Data is like water, the more it flows downstream the cleaner it becomes.”  What he meant is that the more that data is exposed in the organizational stream, the more it is questioned and becomes a part of our closed feedback loop: constantly being queried, verified, utilized in decision making, and validated against reality.  Over time, more sophisticated and reliable statistical methods can be applied to the data, especially performance data of one sort or another: methods that take periodic volatility into account in trending and provide us with a means of ensuring credibility in using the data.
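
As a simple illustration of the kind of statistical treatment that takes periodic volatility into account, here is a minimal sketch using an exponentially weighted moving average over invented monthly cost performance index readings; the smoothing factor and the data are illustrative only.

```python
def ewma(values, alpha=0.3):
    """Exponentially weighted moving average: dampens period-to-period
    volatility so the underlying trend is easier to see."""
    smoothed = []
    current = values[0]
    for v in values:
        current = alpha * v + (1 - alpha) * current
        smoothed.append(round(current, 3))
    return smoothed

# Hypothetical monthly CPI readings with noisy periods.
cpi = [1.02, 0.97, 1.05, 0.91, 0.99, 0.88, 0.93, 0.90]
print(ewma(cpi))  # the smoothed series drifts downward, flagging a real trend
```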

In my last post on Four Trends in Project Management, I posited that the question wasn’t more or less data, but utilization of data in a more effective manner, and identification of what is significant and therefore “better” data.  I recently heard this line repeated back to me as an argument against providing data.  That conclusion is a misreading of what I was proposing.  Reporting data at one level in today’s environment is no more work than reporting at any other level of a project hierarchy.  So cost is no longer a valid objection to data submission (unless, of course, the one taking that position is willing to admit to deficiencies in their IT systems or the unreliability of their data).

Our projects must be measured against the framing assumptions in which they were first formed, as well as the established measures of effectiveness, measures of performance, and measures of technical achievement.  In order to view these factors one must have access to data originating from a variety of artifacts: the Integrated Master Schedule, the Schedule and Cost Risk Analysis, and the systems engineering/technical performance plan.  I would propose that project financial execution metrics are also essential in getting a complete, integrated, view of our projects.

There may be other supplemental data that is necessary as well.  For example, the NDIA Integrated Program Management Division has proposed a revision to what is known as the Integrated Baseline Review (IBR).  For the uninitiated, this is a process in which both the supplier and government customer project teams come together, review the essential project artifacts that underlie project planning and execution, and gain a full understanding of the project baseline.  The reporting systems that identify the data to be reported against the baseline are identified and verified at this review.  But artifacts are also submitted here that contain data relevant to the project and worthy of continuing assessment, obviating the need for manual assessments and reviews down the line.

We don’t yet know the answer to these data issues, and won’t until all of the data is normalized and analyzed.  Then the wheat can be separated from the chaff, and a more precise set of data can be identified for submittal, normalized, and placed in an analytical framework to give us more precise and timely information, so that project stakeholders can make decisions in handling any risks that manifest themselves during the window in which they can be handled (or make the determination that they cannot be handled).  As the farmer says in the Chinese proverb:  “We shall see.”

Ace of Base(line) — A New Paper on Building a Credible PMB

Glen Alleman, a leading consultant in program management (who also has a blog that I follow), Tom Coonce of the Institute for Defense Analyses, and Rick Price of Lockheed Martin have jointly published a new paper in the College of Performance Management’s Measurable News entitled “Building A Credible Performance Measurement Baseline.”

The elements of their proposal for constructing a credible PMB, from my initial reading, are as follows:

1.  Rather than beginning with a statement of requirements, decision-makers should first conduct a capabilities gap analysis to determine the units of effectiveness and performance.  This ensures that program management decision-makers have a good idea of what “done” looks like, and that performance measurements aren’t disconnected from these essential elements of programmatic success.

2.  Following from item 1 above, the technical plan and the programmatic plan should always be in sync.

3.  Earned value management is but one of many methods for assessing programmatic performance in its present state.  At least that is how I interpret what they are saying, because later in their paper they propose a way to ensure that EVM does not stray from the elements that define technical achievement.  But EVM in itself is not the be-all and end-all of performance management–and it fails in many ways to anticipate where the technical and programmatic plans diverge.

4.  All work toward achieving the elements of effectiveness and performance is first constructed and given structure in the WBS.  Thus, the WBS ties together all elements of the project plan.  In addition, technical and programmatic risk must be assessed at this stage, rather than further down the line after the IMS has been constructed.

5.  The Integrated Master Plan (IMP) is constructed to incorporate the high-level work plans that are manifested through major programmatic events and milestones.  It is through the IMP that EVM is then connected to the technical performance measures that affect the assessment of work package completion, which will be reflected in the detailed Integrated Master Schedule (IMS).  This establishes not only the importance of the IMP in ensuring the linkage of technical and programmatic plans, but also makes the IMP an essential artifact–one that has all too often been seen as optional.  That neglect probably explains why so many project managers are “surprised” when they construct aircraft that can’t land on the deck of a carrier, or satellites that can’t communicate in orbit, while remaining well within the tolerance bands of cost and schedule variances.

6.  Construct the IMS taking into account the technical, qualitative, and quantitative risks associated with the events and milestones identified in the IMP.  Construct risk mitigation/handling where possible, and set aside both cost and schedule margin for irreducible uncertainties and management reserve (MR) for reducible risks, keeping in mind that margin is within the PMB while MR is above the PMB but within the CBB.  Furthermore, schedule margin should be transitioned from a deterministic one to a probabilistic one, constructing sufficient margin to protect essential activities (see the sketch after this list).  Cost margin in work packages should be constructed in the same manner, based on probabilistic models that determine the chances of a risk remaining reducible before it reaches the point of irreducibility.  Once again, all of these elements tie back to the WBS.

7.  Cost and schedule margin are not the same as slack or float.  Margin is reserve.  Slack or float is equivalent to overruns and underruns.  The practical issue here is going to be getting the oversight agencies to leave margin alone.  All too often it is viewed as “free” money to be harvested.

8.  Cost, schedule, and technical performance measurement, tied together at the elemental level of work–informing each other as a cohesive set of indicators that are interrelated–and tied back to the WBS, is the only valid method of ensuring accurate project performance measurement and the basis for programmatic success.
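
As a rough illustration of the probabilistic schedule margin described in item 6, here is a minimal Monte Carlo sketch; the triangular task estimates, the 80% protection level, and the trial count are my own assumptions, not the authors’ prescribed method.

```python
import random

random.seed(1)  # repeatable illustration

def simulate_path_duration(tasks):
    """Draw one realization of a critical path from triangular estimates."""
    return sum(random.triangular(low, high, mode) for low, mode, high in tasks)

# Hypothetical critical-path tasks: (optimistic, most likely, pessimistic) days.
tasks = [(10, 12, 20), (5, 6, 10), (15, 18, 30)]

trials = sorted(simulate_path_duration(tasks) for _ in range(10_000))
deterministic = sum(mode for _, mode, _ in tasks)  # single-point estimate: 36 days
p80 = trials[int(0.80 * len(trials))]              # 80th-percentile duration

# Probabilistic margin: protect essential activities to a chosen confidence
# level rather than relying on the single-point (deterministic) estimate.
print(f"deterministic estimate: {deterministic} days")
print(f"80% confidence duration: {p80:.1f} days")
print(f"schedule margin to hold: {p80 - deterministic:.1f} days")
```

Here the margin to hold is the difference between the chosen-confidence duration and the deterministic estimate, which is the sense in which margin protects essential activities against irreducible uncertainty.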

Most interestingly, in conclusion the authors present, as a simplified case, an historical example of how their method proves out as both a common-sense and completely reasonable approach: the Wright brothers’ proof of concept for the U.S. Army in 1908.  The historical documents in that case show that the Army had constructed elements of effectiveness and performance in determining whether it would purchase an airplane from the brothers.  All measures of project success and failure would be assessed against those elements–which combined cost, schedule, and technical achievement.  I was particularly intrigued that the weight of the aircraft was part of the assessment–a common point of argument from critics of the use of technical performance–and the paper demonstrates how the Wright brothers actually assessed and mitigated the risk associated with that measure of performance over time.

My initial impression of the paper is that it is a significant step forward in bringing together the practical lessons learned from both the successes and failures of project performance.  Their recommendations are a welcome corrective to many of the deficiencies implicit in our project management systems and procedures.

I also believe that, as an integral part of the process of constructing the project artifacts, it is a superior approach to the one I initially proposed in 1997, which assumed that TPM would always be applied as an additional process informing cost and schedule at the end of each assessment period.  I look forward to hearing the presentation at the next Integrated Program Management Conference, at which I will attempt some live blogging.

I’ve Got Your Number — Types of Project Measurement and Services Contracts

Glen Alleman reminds us at his blog that we measure things for a reason, and that such measures fall into three general types: measures of effectiveness, measures of performance, and key performance parameters.

Understanding the difference between these types of measurement is key, I think, to defining what we mean by such terms as integrated project management, and to understanding the significance of differing project and contract management approaches based on industry and contract type.

For example, project management focused on commodities, with their price volatility, emphasizes schedule and resource management.  Cost performance (earned value), where it exists, is measured by time in lieu of volume- or value-based performance.  I have often been engaged in testy conversations in which those involved in commodity-based PM insist that they have been using Earned Value Management (EVM) for as long as the U.S.-based aerospace and defense industry (though the methodology was born in the latter).  But when one scratches the surface, the detailed approaches to how value and performance are determined are markedly different–and so they should be, given the different business environments in which enterprises in each of these industries operate.

So what is the difference in these measures? In borrowing from Glen’s categories, I would like to posit a simple definitional model as follows:

Measures of Effectiveness – are qualitative measures of achievement against the goals of the project;

Measures of Performance – are quantitative measures against a plan or baseline in execution of the project plan;

Key Performance Parameters – are the minimally acceptable thresholds of achievement in the project or effort.

As you may guess, there is sometimes overlap and confusion regarding which category a particular measurement falls into.  This confusion has been exacerbated by efforts to define key performance indicators (KPIs) by industry, giving the impression that measures are exclusive to a particular activity.  While this is sometimes the case, it is not always so.

So when we talk of integrated project management, we are not accepting that any particular method of measurement has primacy over the others, nor that it subsumes them.  Earned Value Management (EVM) and schedule performance are clearly performance measures.  Qualitative measures oftentimes gauge achievement of technical aspects of the end item being produced.  This is not the same as technical performance measurement (TPM), which measures technical achievement against a plan–a performance measure.  Technical achievement may inform our performance measurement systems–and it is best if it does.  It may also inform our Key Performance Parameters, since exceeding a minimally acceptable threshold obviously helps us determine success or failure in the end.  The difference is the method of measurement.  In a truly integrated system the measurement of one element informs the others.  For the moment, these systems tend to be stove-piped.
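
As one illustration of how technical achievement might inform a performance measurement system, here is a minimal sketch that caps a work package’s earned value at its technical percent achieved; the capping rule and the numbers are mine for illustration, not a prescribed method.

```python
def tpm_informed_earned_value(budget, claimed_pct, tech_achieved_pct):
    """Cap earned value by technical achievement: work is not 'earned'
    beyond what the technical performance measure can support."""
    effective_pct = min(claimed_pct, tech_achieved_pct)
    return budget * effective_pct

# Hypothetical work package: 90% claimed complete, but the TPM (say, weight
# reduction achieved vs. planned) supports only 70% technical achievement.
print(tpm_informed_earned_value(200_000.0, 0.90, 0.70))  # 140000.0: EV is capped
```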

It becomes clear, then, that the variation in approaches differs by industry, as in the example on EVM above, and–in an example that I have seen most recently–by contract type. This insight is particularly important because all too often EVM is viewed as being synonymous with performance measurement, which it is not. Services contracts require structure in measurement as much as R&D-focused production contracts, particularly because they increasingly take up a large part of an enterprise’s resources. But EVM may not be appropriate.

So for our notional example, let us say that we are responsible for managing an entity’s IT support organization. There are types of equipment (PCs, tablet computers, smartphones, etc.) that must be kept operational based on the importance of the end user. These items of hardware use firmware and software that must be updated and managed. Our contract establishes minimal operational parameters that allow us to determine if we are at least meeting the basic requirements and will not be terminated for cause. The contract also provides incentives to encourage us to exceed the minimums.

The sites we support are geographically dispersed.  We have to maintain a help desk, but we must also have people who can come onsite and provide direct labor to set up new systems or fix existing ones–and the sites and personnel must be supported within particular time frames: one hour, two hours, twenty-four hours, etc.

In setting up our measurement systems, the standard practice is to start with the key performance parameters.  Typically we will also measure response times by site and personnel level, record our help desk calls, and track qualitative aspects of the work: How helpful is the help desk?  Do calls get answered at the first contact?  Are our personnel friendly and courteous?  What kinds of hardware and software problems do we encounter?  We collect our data from a variety of one-off and specialized sources, and then we generate reports from these systems.  Many times we will focus on those that allow us to determine whether the incentive will be paid.

Among all of this data we may be able to discern certain things: if the contract is costing more or less than we anticipated, if we are fulfilling our contractual obligations, if our personnel pools are growing or shrinking, if we are good at what we do on a day-to-day basis, and if it looks as if our margin will be met. But what these systems do not do is allow us to operate the organization as a project, nor do they allow us to make adjustments in a timely manner.

Only through integration and aggregation can we know, for example: how the demand for certain services is affecting our resource demands by geographical location and level of service; where, on a real-time basis, we need to make adjustments in personnel and training; whether we are losing or achieving our margin by location, labor type, equipment type, and hardware vs. software; what our balance sheets look like (by location, equipment type, software type, etc.); whether there is a learning curve; and whether we can make intermediate adjustments to achieve the incentive thresholds before the result is written in stone.  Having this information also allows us to manage expectations, factually inform perceptions, and improve customer relations.
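
As a rough illustration of the kind of integration and aggregation described above, here is a minimal sketch over invented service tickets; the sites, labor types, and figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical service records: (site, labor_type, cost, revenue).
tickets = [
    ("Norfolk",   "onsite",   450.0,  600.0),
    ("Norfolk",   "helpdesk",  40.0,   90.0),
    ("San Diego", "onsite",   800.0,  750.0),
    ("San Diego", "helpdesk",  85.0,  120.0),
]

totals = defaultdict(lambda: [0.0, 0.0])  # (site, labor) -> [cost, revenue]
for site, labor, cost, revenue in tickets:
    totals[(site, labor)][0] += cost
    totals[(site, labor)][1] += revenue

# Aggregated view: margin by location and labor type, available in time to act.
for (site, labor), (cost, revenue) in sorted(totals.items()):
    pct = (revenue - cost) / revenue * 100
    print(f"{site:<10} {labor:<9} margin: {pct:6.1f}%")
```

Even this toy aggregation surfaces the kind of signal described above: one site’s onsite labor is running at a negative margin while its help desk remains profitable.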

What is clear from this example is that “not doing EVM” does not make measurement easy, nor does it imply simplification or the absence of measurement.  Instead, understanding the nature of the work allows us to identify the measures, within their proper categories, that need to be applied by contract type and/or industry.  So while EVM may not apply to services contracts, we know that certain new aggregations do apply.

For many years we have intuitively known that construction and maintenance efforts are more schedule-focused, that oil and gas exploration is more resource- and risk-focused, and that aircraft, satellites, and ships are more performance-focused.  I would posit that now is the time for us to quantify and formalize the commonalities and differences.  This also makes an integrated approach not simply a “nice to have” capability, but an essential one in managing our enterprises and the projects within them.

Note: This post was updated to correct grammatical errors.