For many years since the publication of my various papers on technical performance measurement, I have been asked to update my perspectives. Over the years I largely declined, mostly because I had nothing of importance to add to the conversation. I had staked out what I believed to be a reasonable method of integration between the measurement of technical achievement in human effort and the manner in which the value of that achievement could be documented, along with a reasonable model of technical risk to inform us of our ability to achieve success in the next increment of our technical baseline. A little background may be helpful.
The development of the model was a collaborative one. I was the project manager of a team at the Naval Air Systems Command (NAVAIR) that had spent several years attempting to derive “value” from technical achievement, including technical failure. (More on this later.) The team had gone down many blind alleys and had attempted various methods of integrating technical performance measurement into earned value metrics. Along the way, however, I grew dissatisfied with both the methods and the lack of progress in the project. Thus, the project team changed course, with the expected turnover of personnel, some of whom did not agree with the change. As the project manager I directed my team to end the practice of NIH–“Not Invented Here”–and to scour the literature to determine what the efforts already directed at the problem could teach us. I was also given the resources of a team of mathematicians, physicists, and statisticians familiar with the issues regarding systems engineering who were organized into a reserve unit at NAVAIR, and was assisted by Matt Goldberg of Lurie-Goldberg risk fame, who was then working at the Institute for Defense Analyses (IDA).
Among the published literature was a book that had been recommended to me by senior engineering personnel at the (at that time) newly formed Lockheed Martin Corporation. The book was Technical Risk Management by Jack V. Michaels, who also collaborated on a book called Design to Cost. Both of these works, but in particular the former one, were influential in my selected approach to technical performance management. I was also strongly influenced by the work of Daniel Dennett at Tufts in his philosophical and technical observations of strong AI. It seemed (and seems) to me that the issue of measuring both the intrinsic and practical value of technical achievement requires an understanding of several concepts: of cognition that includes the manner in which we define both progress and learning; the concept of systems, their behavior and the manner in which they respond to stimulus–thus, the evolutionary nature of human systems and the manner that they adapt; the fallacy of reification in measurement and the manner that we overcome it; an understanding of technical risk in the systems engineering process; an understanding of the limitations in measurement of those technical planning systems both in terms of fidelity and horizon; and the way the universe works on the level of the system being developed, given the limitations of the inputs, outputs, and physics involved in its development.
While our understanding needed to be interdisciplinary and comprehensive, the solution needed to be fairly simple and understandable. It needed to pass what I called the “So-What? Test.” That is, if an addition of complexity improved accuracy, was there a good answer to the question “so what?” when looking at the differences in the results? If there was not, the addition failed the test. I have applied this test to software development and other efforts. For example, is it really worth the added complexity to add that additional button for some marginal advantage if its absence makes no difference in the performance and acceptance of the product?
In the end I selected an approach that I felt was coherent and directed the team to apply the approach to several retrospective analyses, as well as one live demonstration project. In every case the model demonstrated that it would have been a better predictor of project performance than cost and schedule indicators alone. In addition, integration of technical achievement, which, after all, was one of the cornerstones of the WBS approach in its adoption in the Department of Defense in the early ’90s, improved the early warning capabilities in predicting the manifestation of technical risk, which was then reflected in both cost and schedule performance.
I then published several papers on our findings in collaboration with my team, but then decided to publish my own perspectives separately at the end of my role in the project, which is linked in the first paragraph of this post. Greatly assisting me with criticism and suggestions during the writing of my paper was Jim Henderson, a colleague whom I respect greatly and who was a senior cost analyst at NAVAIR at the time. He resisted my efforts to credit his assistance at the time for reasons of his own, but I suspect that he would not object now, and it is only fitting that I give him the credit that he is due in influencing my own thinking, though I take full responsibility for the opinions and conclusions expressed in it.
Underlying this approach were several artifacts, new to some, that were deemed essential to it. First among these was the establishment of a technical performance baseline. The purpose of this baseline is to drive an assessment of current capabilities and then to break down the effort into increments that involve assessments of progress and testing. These increments should be tied to the WBS and the resources associated with the system being developed. Second was the realization that technical achievement was best assessed in increments of short duration through a determination of the risk involved in successfully achieving the next milestone in the technical performance baseline. This approach was well documented and in use in systems engineering technical risk assessments (TRAs) and, as such, would not cause major changes to a process that was reliable and understood. Finally, and (as it turns out) most controversially, was the manner in which we “informed” cost performance in deriving value from the TRA. This last portion of the model was, admittedly, its most contingent portion, though well within the accepted norms of assessment.
It is at the point of applying the value of technical achievement that we still find the most resistance, which is why I have decided to reenter the conversation. What is the value of failure? For example, did SpaceX derive no value from its Falcon 9 rocket exploding? Here is the video:
From the perspective of an outside non-technical observer, the program seems to have suffered a setback, but not so fast. SpaceX later released an announcement that indicated that it was reviewing the data from the failed rocket, which had detected an anomaly and self-destructed. Thus, what we know is that the failed test has caused a two-week delay (at least). Additional resources will need to be expended to study the data from the failure. The data will be used to review the established failure modes and routines that the engineers developed. There is both value and risk associated with these activities. Some of them increase risk in the achievement of the next planned milestone, and some handle and mitigate future risks. Additional time and resources–perhaps anticipated or perhaps not–will actually be expended as a result of the test.
So how would the model I put forth–we’ll call it the PEO(A) Model for the organization that sponsored it–handle this scenario? First we would assess the risk associated with achieving the next milestone based on the expert opinion of the engineers for the system. The scale is a simple one–10, 50, 90, or 100%. We trace the WBS elements and schedule activities associated with the system and adjust them appropriately, showing the schedule delay and assigning additional resources. We inform earned value by the combination of risk and resources. That is, if our chance of achieving our next milestone is 50%, which is about the level of chance, then the resources dedicated to the effort may need to be increased by an appropriate amount, and that increase is reflected in the figure for earned value.
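To make the mechanics concrete, here is a minimal sketch in Python of how a risk assessment might “inform” earned value. The four levels come from the scale described above; everything else is an assumption made for illustration. In particular, the simple multiplication, the `Milestone` record, and the dollar figure are hypothetical and are not the published PEO(A) formula.

```python
from dataclasses import dataclass

# The four assessment levels named in the text: 10, 50, 90, 100%.
RISK_LEVELS = (0.10, 0.50, 0.90, 1.00)


@dataclass
class Milestone:
    """Hypothetical record for one increment of the technical baseline."""
    name: str
    claimed_value: float      # value earned to date under normal EVM rules
    chance_of_success: float  # engineers' assessed chance of achieving the next milestone


def risk_informed_ev(m: Milestone) -> float:
    """Scale the claimed value by the assessed chance of success.

    Assumption: a plain multiplication. It can only reduce the claimed
    value, and a 100% assessment leaves full credit intact.
    """
    if m.chance_of_success not in RISK_LEVELS:
        raise ValueError(f"assessment must be one of {RISK_LEVELS}")
    return m.claimed_value * m.chance_of_success


# Illustrative numbers only: a failed test drops the assessment to 50%.
after_failed_test = Milestone("next flight test", 1_000_000.0, 0.50)
print(risk_informed_ev(after_failed_test))
```

Under this assumption the adjustment never adds value; any benefit from being ahead of plan would still have to show up through the normal cost and schedule measures.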
There are two main objections to this approach.
The first objection encountered challenges the engineers’ assessment as being “subjective.” This objection is invalid on its face. Expert opinion is both a valid and accepted approach in systems engineering and it seems there is a different bias that underlies the objection. For example, I have seen this objection raised where the majority of a project is being managed using percent complete as an earned value method, which is probably the most subjective and inaccurate method applied, oftentimes by Control Account Managers (CAMs) that are removed one or two levels (or, at least, degrees of separation) from the actual work. This goes back to the occasional refrain that I often heard during program reviews by the PM: “How do I make that indicator green?” Well, there are only two direct ways with variations: game the system, or actually perform the work to the level of planned achievement and then accurately measure progress. I suspect the fear here is that the engineers will be too accurate in their assessments, and so the fine game of controlling the indicators to avoid being micromanaged and making enough progress to support the indicators is undermined. Changing this occasionally encountered mindset is a discussion for a different blog post.
There is certainly room for improvement in using this method. Since we are dealing with a time-phased technical performance plan with short duration milestones, providing results based on an assessment of achieving the next increment, we can always apply Monte Carlo simulation at our present position to get the most likely range of probable outcomes. Handicapping the handicapper is an acceptable way of assessing future performance with a great deal of fidelity. Thus, the model proposed could be adjusted to incorporate the refinement. I am not certain, however, if it would pass the “so-what?” test. It would be interesting to find out.
The second objection encountered challenges the maximum achievement at 100%. I find this objection both amusing at one level and understandable given the perspective of the individuals raising the objection. First the amusing (I hope) response: in this universe the best you can ever do is 100%. You can’t “give 110%,” you can’t “achieve 150%,” etc. This sounds catchy in trendy books on management, in get rich quick schemes, and is heard too often from narcissists in business. But in reality the best you can do is 100%–and if you’ve lived long enough with your feet on the ground you know that 100% is a rare level of achievement that exists in limited, well-defined settings. In risk, 100% probability is even rarer, and so when we say that our chance of achieving the next level in a technical performance plan is 100%, we are saying that all risks have been eliminated and/or mitigated to zero.
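As a rough illustration of that refinement, the sketch below runs a Monte Carlo simulation over the remaining milestones, treating each expert assessment as the probability of achieving that milestone on the first attempt. The milestone list, the fixed slip per missed milestone, and the function names are all hypothetical and chosen only to show the shape of the calculation.

```python
import random


def simulate_slip(chances, slip_weeks_per_miss=2.0, trials=10_000, seed=1):
    """Return sorted total schedule slips (in weeks), one per simulated run.

    Assumptions: milestones are independent, and every missed milestone
    costs the same fixed slip. Both are simplifications for illustration.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        # A milestone is missed when the random draw falls at or above its chance.
        misses = sum(1 for p in chances if rng.random() >= p)
        totals.append(misses * slip_weeks_per_miss)
    totals.sort()
    return totals


def percentile(sorted_values, q):
    """Nearest-rank style lookup into an already sorted list (0 <= q <= 1)."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


# Hypothetical plan: assessments for the next four milestones.
slips = simulate_slip([0.90, 0.50, 0.90, 1.00])
for q in (0.10, 0.50, 0.90):
    print(f"P{int(q * 100)} slip in weeks: {percentile(slips, q)}")
```

Whether the spread between the low and high percentiles changes any decision is exactly the question the “so-what?” test would ask of this refinement.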
The serious part of this objection, though, surrounds the financial measurement of technical performance, to wit: what if we are ahead of plan? In that case, the argument goes, we should be able to show more than 100% of achievement. While understandable, this objection demonstrates a misunderstanding of the model. The assessment from one milestone to the next is one based on risk. The 100% chance of achievement allows the project to claim full credit for all effort up to that point. The actual expenditures and the resulting savings in resources and schedule, documented through normal cost and schedule performance measurement, will reflect the full effect of being on plan or ahead of plan. So while it is true that technical performance measurement can only, in the words of one critic, “take away performance,” this does not pass the “so-what?” test–at least not in its practical application.
What the objection does do is expose the underlying weakness and uncertainty inherent in any model that concerns itself with deriving “value” from success and failure in developmental efforts. It is a debate that has raged in systems engineering circles for quite some time, not to mention the expansive definition of value during discussions of investment and its return. But that is not what the PEO(A) model of technical performance measurement (TPM) was about. The intent of proposing a working model of technical achievement was geared specifically to defining our terms and aims with precision, and then to come up with a model to meet those aims. It measures “value” only insofar as the value is determined by the resources committed to achieving the increment in time. The base “value” of a test failure, as in the SpaceX example, is derived from the assessment of technical risk in the wake of the postmortems.
It should not be controversial to advocate for finding a way of integrating technical performance into project management measures, particularly in their role in determining predictive outcomes. We can spend a great deal of time and money building a jet that, in the end, cannot land on an aircraft carrier, a satellite that cannot communicate after it achieves orbit, a power plant that cannot achieve full capacity, or a ship that cannot operate in the conditions intended, if we do not track the technical specifications that define the purpose of the system being developed and deployed. I would argue that technical achievement is the strategic center of measurements upon which all others are constructed. The PEO(A) model is one model that has been proposed and has been in use since its introduction in 1997. I would not argue that it cannot be improved or that alternatives could not achieve an equal or better result.
In the days before satellite and GPS navigation became the norm, when leaving and entering port–to ensure that we avoided shoal water and stayed within the channel–ships would fix their position by taking bearing measurements on a set of known geographical points. Lines were drawn from the points of measurement and where they intersected was the position of the ship. This could be achieved by sight, radar, or radio. What the navigation team was doing was determining the position of the ship in a three dimensional world and representing it on a two dimensional chart. The lines formed in the sighting are called lines of position (LOP). Two lines of position are sufficient to establish a “good” fix, but we know that errors are implicit due to variations in observation, inconvenient angles, the drift of the ship, and the ellipse error in all charts and maps (the earth not being flat). Thus, at least three points of reference are usually the minimum acceptable from a practical perspective in establishing an “excellent” fix. The more the merrier. As an aside, ships and boats still use this method to verify their positions–even redundantly–since depending too heavily on technology that could fail could be the difference between a routine day and tragedy.
This same principle applies to any measurement of progress in human endeavor. In project management we measure many things. Oftentimes they are measured independently. But if they do nothing to note the position of the project in space and time–to allow it to establish a “fix” on where it is–then they are of little use. The points must converge in order to achieve value in the measurement. Cost, schedule, qualitative risk, technical performance, and financial execution are our bearing measurements. The representation of these measurements on a common chart–the points of integration–will determine our position. Failing to complete this last essential task is like being adrift in a channel.
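For readers who like to see the geometry, the fix described above can be sketched in a few lines of Python. This is a flat-chart approximation that ignores the chart error mentioned in the text. The coordinate convention (x east, y north, bearings in degrees clockwise from north, taken from the ship to the landmark) and the function name are assumptions made for the example. With three or more bearings the lines rarely meet at a single point, so the sketch takes the least-squares point closest to all of them.

```python
import math


def fix_from_bearings(sightings):
    """Estimate the ship's (x, y) from [((landmark_x, landmark_y), bearing_deg), ...].

    Each sighting defines a line of position through the landmark. The ship
    lies where n . P = n . L, with n perpendicular to the bearing direction.
    The 2x2 normal equations give the least-squares intersection.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for (lx, ly), bearing_deg in sightings:
        theta = math.radians(bearing_deg)
        a1, a2 = math.cos(theta), -math.sin(theta)  # normal to the line of position
        b = a1 * lx + a2 * ly
        s11 += a1 * a1
        s12 += a1 * a2
        s22 += a2 * a2
        t1 += a1 * b
        t2 += a2 * b
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("lines of position are parallel; no fix possible")
    x = (s22 * t1 - s12 * t2) / det
    y = (s11 * t2 - s12 * t1) / det
    return x, y


# Hypothetical landmarks on a chart, with the bearings observed from the ship.
sightings = [((2.0, 10.0), 0.0), ((10.0, 3.0), 90.0), ((7.0, 8.0), 45.0)]
print(fix_from_bearings(sightings))
```

Adding a third or fourth sighting does not change the method; it only gives the least-squares step more lines to average over, which is the practical meaning of “the more the merrier.”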