Technical Foul — It’s Time for TPI in EVM

For more than 40 years the discipline of earned value management (EVM) has gone through a number of changes in its descriptions, governance, and procedures.  During that same time its community has resisted improvements to its methodology, as well as changes that extend its value by taking into account other methods that either augment its usefulness or potentially provide more utility in performance management.  This has been especially the case where it is suggested that EVM is just one of many methodologies that contribute to this assessment under a more holistic approach.

Instead, it has been asserted that EVM is the basis for integrated project management.  (I disagree–and solely on the evidence that if it were so, then project managers would participate more fully in its organizations and conferences.  This would then pose the problem that PMs might propose changes to EVM that, well…default to the second sentence in this post.)  As evidence one need only mention the resistance to such recent developments as earned schedule, technical performance, and risk–most especially risk based on Bayesian analysis.

Some of this resistance is understandable.  First, it took quite a long time just to reach consensus on the application of EVM, though its principles and methods are based on simple and well-proven statistical methods.  Second, the industries in which EVM has been accepted are sensitive to risk, and so a bureaucracy of practitioners has grown to ensure both consensus and compliance with accepted methods.  Third, the community of EVM practitioners consists mostly of cost analysts, trained in simple accounting, arithmetic, and statistical methodology.  It is thus a normal human bias to assume that the path of one’s previous success is the way to future success, even as our understanding of the design space (reality) that we inhabit has been enhanced through new knowledge.  Fourth, there is a lot of data that applies to project management, and the EVM community is only now learning how this other data affects our understanding of measuring project performance and the probability of reaching project goals in rolling out a product.  Finally, there is the less defensible reason that a lot of people and firms have built careers that depend on maintaining the status quo.

Our ability to integrate disparate datasets is accelerating on a yearly basis thanks to digital technology, and the day when we achieve integration of all relevant factors in project and enterprise performance is inevitable.  To be frank, I am personally engaged in such projects and am assisting organizations in moving in this direction today.  Regardless, we can advance the discipline of performance management now by pulling down low-hanging fruit.  The most reachable, in my opinion, is technical performance measurement.

The literature of technical performance has come quite a long way, thanks largely to the work of the Institute for Defense Analyses (IDA) and others, particularly the National Defense Industrial Association through the publication of its predictive measures guide.  This has been a topic of interest to me since its study was part of my duties back when I was still wearing a uniform.  Those early studies resulted in a paper that proposed a method of integrating technical performance, earned value, and risk.  A fairly comprehensive overview of the literature and guidance for technical performance can be found in this presentation by Glen Alleman and Tom Coonce given at EVM World in 2015.  It must be mentioned that Rick Price of Lockheed Martin also contributed greatly to this literature.

Keep in mind what is meant when we decide to assess technical performance within the context of R&D.  It is an assessment against expected or specified:

a.  Measures of Effectiveness (MoE)

b.  Measures of Performance (MoP), and

c.  Key Performance Parameters (KPP)

The opposition from the project management community to widespread application of this methodology took three forms.  First, it was argued, the method used to adjust the value of work earned (and thus CPI) seemed always to have a negative impact.  Second, there are technical performance factors that transcend the WBS, and so it is hard to properly adjust the individual control accounts based on the contribution of technical performance.  Third, some performance measures defy an assessment of value in a time-phased manner.  The most common example has been tracking the weight of an aircraft, to which virtually every component contributes.

Let’s take these in order.  But lest one think that this perspective is an artifact from 1997: just a short while ago, in the A&D community, the EVM policy office at DoD attempted to apply a somewhat modest proposal of ensuring that technical performance was included as an element in EVM reporting.  Note that the EIA-748 standard states this clearly and has done so for quite some time.  Regardless, the same three core objections were raised in comments from the industry.  This caused me to ask some further in-depth questions, and my revised perspective follows below.

The first condition occurred, in many cases, due to optimism bias in registering earned value, which often arises when a single point estimate of percent complete is made by a limited population of experts contributing to an assessment of the element.  Fair enough, but as you can imagine, it’s not a message that a PM wants to hear or will necessarily accept or admit, regardless of the merits.  There are more than enough pathways to second-guessing and testing selection bias at other levels of reporting.  Glen Alleman’s Herding Cats post of 12 August provides a very good list of the systemic reasons for program failure.

Another factor is that the initial methodology did skew toward more pessimistic results.  This was not entirely apparent at the time because the statistical methods applied did not make it clear.  But, to critique that first proposal, which was the result of contributions from IDA and other systems engineering technical experts, the 10-50-90 method of assessing probability along the bandwidth of the technical performance baseline was too inflexible.  The graphic that we proposed is as follows, and one can see that, while it was “good enough,” when rolled up it could introduce some bias that required adjustment.

TPM Graphic


Note that this range around 50% can be interpreted as equivalent to the bandwidth found in the presentation given by Alleman and Coonce (as well as the Predictive Measures Guide), though the intent here was to perform an assessment based on a simplified means of handicapping the handicappers–or, more accurately, performing a probabilistic assessment of expert opinion.  The method of performing Bayesian analysis to achieve this had not yet matured for such applications, so we proposed a simple approach that our practitioners could understand and that still met the criteria of a valid method.  The reason for the difference in the graphic resides in the fact that the original assessment did not view this time-phasing as a continuous process, but rather as an assessment at critical points along the technical baseline.
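To make the handicapping concrete, here is a minimal sketch of how 10-50-90 expert assessments might be collapsed into a single expected value at each assessment point.  The weights follow Swanson’s rule, one common three-point approximation; the TPM names and values are purely illustrative and not drawn from the original study.

```python
# Minimal sketch: collapsing 10-50-90 expert assessments of achievement
# against the technical performance baseline into a single expected value.
# Weights follow Swanson's rule (0.3/0.4/0.3), one common three-point
# approximation; the TPM names and values are purely illustrative.

def expected_achievement(p10: float, p50: float, p90: float) -> float:
    """Swanson's-rule approximation of the mean from a 10-50-90 estimate."""
    return 0.3 * p10 + 0.4 * p50 + 0.3 * p90

# pessimistic, most likely, and optimistic fraction of the baseline achieved
assessments = {
    "antenna gain": (0.55, 0.70, 0.80),
    "power output": (0.60, 0.75, 0.90),
}

for tpm, (p10, p50, p90) in assessments.items():
    print(f"{tpm}: expected achievement = {expected_achievement(p10, p50, p90):.2f}")
```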

From a practical perspective, however, the banding proposed by Alleman and Coonce takes into account the noise that will be experienced during the development life cycle, and so corrects the slight skew toward pessimism.  We’ll leave aside for the moment how we determine the bands and, thus, acceptable noise as we track along our technical baseline.

The second objection is valid only insofar as the alignment of work-related indicators varies from project to project.  For example, some legs of the WBS tree go down nine levels and others go down five, based on the complexity of the work and the organizational breakdown structure (OBS).  Thus, where we peg the control account (CA) and work package (WP) levels within each leg of the tree becomes relative.  Do the schedule activities have a one-to-one or many-to-one relationship with the WP level in all legs?  Or is the CA level the lowest level at which the alignment can be made in certain legs?

Given that planning begins with the contract spec and (ideally) proceeds from IMP –> IMS –> WBS –> PMB in continuity, we will be able to determine the contributions of TPM to each WBS element at the appropriate level.

This then leads us to another objection, which is that not all organizations bother with developing an IMP.  That is a topic for another day, but whether or not such an artifact is created formally, one must achieve in practice the purpose of the IMP in order to get from contract spec to IMS on any effort sufficiently complex to warrant CPM scheduling and EVM.

The third objection is really a child of the second objection.  There very well may be TPMs, such as weight, with so many contributors that distributing the impact would both dilute the visibility of the TPM and present a level of arbitrariness in distribution that would render its tracking useless.  (Note that I am not saying that the impact cannot be distributed because, given modern software applications, this can easily be done in an automated fashion after configuration.  My concern is in regard to visibility on a TPM that could render the program a failure).  In these cases, as with other indicators that must be tracked, there will be high level programmatic or contract level TPMs.

So where do we go from here?  Alleman and Coonce suggest adjusting the formula for BCWP, where P is informed by technical risk.  The predictive measures guide takes a similar approach and emphasizes the systems engineering (SE) domain in arriving at an assessment of the impact on reported EVM element performance.  The recommendation of the 1997 project that I headed in assignments across Navy and OSD was to inform performance with a risk assessment of probable achievement at each discrete performance milestone.  What all of these studies share, along with standard industry practice using SE principles, is an intermediate assessment, informed by risk, of a technical performance index against a technical performance baseline.
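As a rough illustration of the idea–not the published Alleman and Coonce formula, nor the 1997 method–here is a sketch in which claimed earned value is discounted by a technical performance index before computing CPI.  The figures and the multiplicative adjustment are assumptions for demonstration only.

```python
# Hypothetical illustration (not the published Alleman/Coonce formula, nor
# the 1997 method): discount the earned value claimed against a control
# account by a technical performance index (TPI) before computing CPI.

def adjusted_bcwp(bcwp_claimed: float, tpi: float) -> float:
    """Earned value informed by technical achievement, capped at the claim."""
    return bcwp_claimed * min(tpi, 1.0)

bcwp = 120_000.0   # value claimed from percent complete
acwp = 110_000.0   # actual cost of work performed
tpi = 0.85         # index of technical achievement against the baseline

cpi_reported = bcwp / acwp
cpi_informed = adjusted_bcwp(bcwp, tpi) / acwp
print(f"CPI as reported: {cpi_reported:.2f}")
print(f"CPI informed by technical achievement: {cpi_informed:.2f}")
```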

So let’s explore this part of the equation more fully.

Given that MoEs, MoPs, and KPPs are identified for the project, different methods of determining progress apply.  Progress can be tracked through a very simple set of TPMs that, through the acquisition or fabrication of compliant materials, meet contractual requirements.  These are contract-level TPMs.  Depending on contract type, achievement of these KPPs may result in either financial penalties or financial reward.  Then there are the R&D-dependent MoEs, MoPs, and KPPs that require more discrete time-phasing and ties to the physical completion of work documented through the WBS structure.  As with EVM’s measurement of the value of work, our index of physical technical achievement can be determined through various methods: current EVM methods, Monte Carlo simulation of technical risk, 10-50-90 risk assessment, Bayesian analysis, etc.  All of these methods are designed to militate against selection bias and the inherent limitations of small sample size and, hence, extreme subjectivity.  Still, expert opinion is a valid method of assessment and (in cases where it works) better than a WAG or a coin flip.
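For instance, a Monte Carlo assessment of a single technical parameter against a “not-to-exceed” KPP threshold might look like the following minimal sketch; the triangular distribution and its parameters are illustrative placeholders, not data from any real program.

```python
# Minimal sketch: Monte Carlo estimate of the probability that a technical
# parameter meets a "not-to-exceed" KPP threshold.  The triangular
# distribution and its parameters are illustrative placeholders.
import random

random.seed(1)

def probability_of_meeting_kpp(low, mode, high, threshold, trials=50_000):
    """Share of simulated outcomes at or below the not-to-exceed threshold."""
    hits = sum(random.triangular(low, high, mode) <= threshold for _ in range(trials))
    return hits / trials

# e.g., a predicted parameter with a not-to-exceed threshold of 0.70
p = probability_of_meeting_kpp(low=0.62, mode=0.68, high=0.78, threshold=0.70)
print(f"Probability of meeting the KPP: {p:.2%}")
```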

Taken together, these TPMs can be used to determine the technical achievement of the project or program over time, along with a financial assessment of the future work needed to bring it back in line.  These elements can be weighted, as suggested by Coonce, Alleman, and Price, through an assessment of relative risk to project success.  Some of these TPIs will apply to particular WBS elements at various levels (since their efforts are tied to specific activities and schedules via the IMS), while the most important project- and program-level TPMs are reflected at that level.
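A simple risk-weighted rollup along those lines might look like the sketch below; the TPM names, index values, and weights are hypothetical.

```python
# Minimal sketch: risk-weighted rollup of individual TPIs into a
# program-level index.  Names, index values, and weights are hypothetical.

tpms = [
    # (name, tpi, relative risk weight)
    ("radar detection range", 0.90, 0.5),
    ("airframe weight",       0.80, 0.3),
    ("software throughput",   0.95, 0.2),
]

program_tpi = sum(tpi * w for _, tpi, w in tpms) / sum(w for _, _, w in tpms)
print(f"Program-level TPI: {program_tpi:.2f}")   # 0.88 with these figures
```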

What about double counting?  A comparison of the aggregate TPIs and the aggregate CPI and SPI will determine the fidelity of the WBS to technical achievement.  Furthermore, a proper baseline review will ensure that double counting doesn’t occur.  If the element can be accounted for within the reported EVM elements, then it need not be tracked separately by a TPI.  Only those TPMs that cannot be distributed or that represent such overarching risk to project success need be tracked separately, with an overall project assessment made against MR or any reprogramming budget available that can bring the project back into spec.

My last post on project management concerned the practices at what was called Google X.  There, incentives are given to teams that identify an unacceptably high level of technical risk that will fail to pay off within the anticipated planning horizon.  If the A&D and DoD community is to become more nimble in R&D, it needs the tools to apply such long-established concepts as Cost-As-An-Independent-Variable (CAIV) and Agile methods (without falling into the bottomless pit of unsupported assertions by the more cultish adherents, such as the elimination of estimating and performance tracking).

Even with EVM, the project and program management community needs a feel for where their major programmatic efforts are in terms of delivery and deployment, looking at the entire logistics and life-cycle system.  The TPI can be the logic check on whether to push ahead, to finish the low-risk items remaining in R&D and move to first-item delivery, or to take the lessons learned from the effort, terminate the project, and incorporate those elements into the next-generation project or related components or systems.  This aligns with the concept of project alignment with framing assumptions as an early indicator of continued project investment at the corporate level.

No doubt, existing information systems, many built using 1990s technology and limited to line-and-staff functionality, do not provide the ability to do this today.  Of course, these same systems do not take into account a whole plethora of essential information regarding contract and financial management: from the tracking of CLINs/SLINs, to work authorization and change order processing, to the flow of funding from TAB to PMB/MR and from PMB to CA/UB/PP, to contract incentive threshold planning–and the list goes on.  What this argues for is innovation, and rewarding those technology solutions that take a more holistic approach to project management within its domain as a subset of program, contract, and corporate management–and that do so without some esoteric promise of results at some point in the future after millions of dollars of consulting, design, and coding.  The first company or organization that does this will reap the rewards.

Furthermore, visibility equals action.  Diluting essential TPMs within an overarching set of performance metrics may have the effect of hiding them and failing to properly identify, classify, and handle risk.  Including TPI as an element at the appropriate level will provide necessary visibility to get to the meat of those elements that directly impact programmatic framing assumptions.

Technical Ecstasy — Technical Performance and Earned Value

As many of my colleagues in project management know, I wrote a series of articles on the application of technical performance risk in project management back in 1997, one of which made me an award recipient from the institution now known as Defense Acquisition University.  Over the years various researchers and project organizations have asked me if I have any additional thoughts on the subject and the response up until now has been: no.  From a practical standpoint, other responsibilities took me away from the domain of determining the best way of recording technical achievement in complex projects.  Furthermore, I felt that the field was not ripe for further development until there were mathematics and statistical methods that could better approach the behavior of complex adaptive systems.

But now, after almost 20 years, there is an issue that has been nagging at me since publication of the results of the project studies that I led from 1995 through 1997.  It is this: the complaint by project managers, in resisting the application of any measurement of technical achievement and its integration with cost performance, that the best anyone can do is 100%.  “All TPM can do is make my performance look worse,” was the complaint.  One would think this observation would not face opposition, especially from such an engineering-dependent industry, since, at least in this universe, the best you can do is 100%.*  But, of course, we weren’t talking about the same thing, and I have heard this refrain again at recent conferences and meetings.

To be honest, in our recommended solution in 1997, we did not take things as far as we could have.  It was always intended to be the first but not the last word regarding this issue.  And there have been some interesting things published about this issue recently, which I noted in this post.

In the discipline of project management in general, and among earned value practitioners in particular, the performance being measured oftentimes exceeds 100%.  But therein lies the difference.  What is being measured as exceeding 100% is progress against a time-based and fiscally-based linear plan.  Most of the physical world doesn’t act this way, nor can it be measured this way.  When measuring the attributes of a system or component against a set of physical or performance thresholds, linearity against a human-imposed plan oftentimes goes out the window.

But a linear progression can be imposed on the development toward the technical specification.  So the next question is how we measure progress along the development curve and over its duration.

The short answer, without repeating a summary of the research (which is linked above), is through risk assessment, and the method that we used back in 1997 was a distribution curve that determined the probability of reaching the next step in the technical development.  This was based on well-proven systems engineering techniques that had been used in industry for many years, particularly at pre-Lockheed Martin Martin Marietta.  Technical risk assessment, even using simplistic 0-50-80-100 curves, provides a good approximation of probability and risk between each increment of development, though now there are more robust models–for example, the use of Bayesian methodology, which introduces mathematical rigor into statistics, as outlined in this post by Eliezer Yudkowsky.  (As an aside, I strongly recommend his blogs for anyone interested in the cutting edge of rational inquiry and AI.)
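As an illustration of the Bayesian approach–a minimal sketch, not the method we used in 1997–a Beta prior over the probability of achieving the next technical milestone can be updated with the pass/fail record of the relevant demonstrations.  The prior and test counts below are assumptions for illustration only.

```python
# Minimal sketch: Bayesian update of the probability of achieving the
# next technical milestone.  The Beta prior encodes expert opinion; the
# pass/fail counts are illustrative demonstration results.

alpha_prior, beta_prior = 8.0, 2.0   # prior mean of 0.80 from expert judgment
passes, failures = 3, 2              # demonstrations observed to date

alpha_post = alpha_prior + passes
beta_post = beta_prior + failures

prior_mean = alpha_prior / (alpha_prior + beta_prior)
posterior_mean = alpha_post / (alpha_post + beta_post)

print(f"Prior probability of achieving the milestone:  {prior_mean:.2f}")
print(f"Posterior probability after the test evidence: {posterior_mean:.2f}")
```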

So technical measurement is pretty well proven.  But the issue that then presents itself (and presented itself in 1997) is how to derive value from technical performance.  Value is a horse of a different color.  The two bugaboos that were presented as impassable roadblocks were weight and test failure.

Let’s take weight first.  On one of my recent trips I found myself seated in an Embraer E-jet.  These are fairly small aircraft, especially compared to conventional commercial aircraft, and are lightweight.  As such, they rely on a proper distribution and balance of weight, especially if one finds oneself at 5,000 feet above sea level with the long runway shut down, a 10-20 mph crosswind, and a mountain range rising above the valley floor in the direction of takeoff.  So the flight crew, when the cockpit noted a weight disparity, shifted baggage from belly stowage to the overhead compartments in the main cabin.  What was apparent is that weight is not an ad hoc measurement.  The aircraft’s weight distribution and tolerances are documented–and can be monitored as part of operations.

When engineering an aircraft, each component is assigned its weight.  Needless to say, weight is then allocated and measured as part of the development of the aircraft’s subsystems.  One would not measure the overall weight of the aircraft or end item without ensuring that the components and subsystems conform to their weight allocations.  The overall weight limitation of an aircraft will vary depending on mission and use.  For a commercial-type passenger airplane built to take off and land on modern runways, weight limitations are not as rigorous.  If the aircraft in question is going to take off and land on a carrier deck at sea, then weight limitations become more critical.  (Side note:  I also learned these principles in detail while serving on active duty at NAS Norfolk and working with the Navy Air Depot there.)  Aside from aircraft, weight is important in a host of other items–from laptops to ships.  In the latter case, with which I am also intimately familiar, weight is important in balancing the ship and its ability to make way in the water (and perform its other missions).

So given that weight is an allocated element of performance within subsystem or component development, we gain several useful bits of information.  First, we can aggregate and measure the weight of the entire end item to track whether we are meeting its limitations.  Second, we can perform trade-offs.  If a subsystem or component can be made with a lighter material or more efficiently weight-wise, then we have more leeway (maybe) somewhere else.  Conversely, if we need weight for balance and the component or subsystem is too light, we need to figure out how to add weight or ballast.  So measuring and recording weight is not a problem.  Finally, by allocating it we tie a key technical specification directly to the work, avoiding subjectivity.
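A minimal sketch of that allocation logic follows: allocated weight budgets are tracked by subsystem, aggregated against the end-item limit, and the remaining margin is exposed for trade-offs.  All figures are illustrative, not drawn from any real aircraft.

```python
# Minimal sketch: tracking an allocated weight budget by subsystem,
# aggregating estimates against the end-item limit, and exposing the
# margin available for trade-offs.  All figures are illustrative.

WEIGHT_LIMIT_KG = 12_000.0

subsystems = {
    # name: (allocated kg, current estimated kg)
    "airframe":   (5_500.0, 5_650.0),
    "propulsion": (3_200.0, 3_100.0),
    "avionics":   (1_400.0, 1_450.0),
    "payload":    (1_600.0, 1_550.0),
}

for name, (allocated, estimate) in subsystems.items():
    print(f"{name}: margin against allocation = {allocated - estimate:+.0f} kg")

total_estimate = sum(estimate for _, estimate in subsystems.values())
print(f"End-item margin against limit = {WEIGHT_LIMIT_KG - total_estimate:+.0f} kg")
```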

So how do we show value?  We do so by applying the same principles as any other method of earned value.  Each item of work is covered by a Work Breakdown Structure (WBS), which is tied (hopefully) to an Integrated Master Schedule (IMS).  A Performance Measurement Baseline (PMB) is applied to the WBS (or sometimes through a resource-loaded IMS).  If we have properly constructed our Integrated Master Plan (IMP) prior to the IMS, we should have clearly tied the technical measures to that structure.  I acknowledge that not every program performs an IMP, but stating so is really an acknowledgement of a clear deficiency in our systems, especially for complex R&D programs.  Since our work is measured in short increments against a PMB, we can achieve 100% of a technical specification and still be ahead of plan for the WBS elements involved.

It’s not as if the engineers in our industrial activities and aerospace companies have never designed a jet aircraft or some other item before.  Quite a bit of expertise and engineering know-how transfers from one program to the next.  There is a learning curve.  The more information we collect in that regard, the more effective that curve.  Hence my emphasis in recent posts on data.

For testing, the approach is the same.  A test can fail, that is, a rocket can explode on the pad or suffer some other mishap, but the components involved will succeed or fail based on the after-action report.  At that point we will know, through allocation of the test results, where we are in terms of technical performance.  While rocket science is involved in the item’s development, recording technical achievement is not rocket science.

Thus, while our measures of effectiveness, measures of performance, measures of progress, and technical performance will determine our actual achievement against a standard, our fiscal assessment of value against the PMB can still reflect whether we are ahead of schedule and below budget.  What it takes is an understanding of how to allocate more rigorous measures to the WBS that are directly tied to the technical specifications.  To do otherwise is to build a camel when a horse was expected or–as has been recorded in real life in previous programs–to build a satellite that cannot communicate, a Navy aircraft that cannot land on a carrier deck, a ship that cannot fight, and a vaccine that cannot be delivered and administered in the method required.  We learn from our failures, and that is the value of failure.


*There are colloquial expressions that allow for 100% to be exceeded, such as exceeding 100% of the tolerance of a manufactured item or system, which essentially means to exceed its limits and, therefore, to break it.

The Monster Mash — Zombie Ideas in Project and Information Management

Just completed a number of meetings and discussions among thought leaders in the area of complex project management this week, and I was struck by a number of zombie ideas in project management, especially related to information, that just won’t die.  The use of the term zombie idea is usually attributed to the Nobel economist Paul Krugman from his excellent and highly engaging (as well as brutally honest) posts at the New York Times, but for those not familiar, a zombie idea is “a proposition that has been thoroughly refuted by analysis and evidence, and should be dead — but won’t stay dead because it serves a political purpose, appeals to prejudices, or both.”

The point, to a techie–or anyone engaged in intellectual honesty–is that these ideas are often posed in a question-begging form; that is, they advance invalid assumptions in the asking or the telling.  Most often they take the form of the assertive half of the same coin as “when did you stop beating your wife?”-type questions.  I’ve compiled a few of these for this post, and it is important to understand the purpose for doing so.  It is not to take individuals to task or to bash non-techies–who have a valid reason to ask basic questions based on what they’ve heard–but to address propositions put forth by people who should know better based on their technical expertise or experience.  Furthermore, knowing and understanding technology and its economics is essential today to anyone operating in the project management domain.

So here are a few zombies that seem to be most common:

a.  More data equals greater expense.  I dealt with this issue in more depth in a previous post, but it’s worth repeating here:  “When we inform Moore’s Law by Landauer’s Principle, that is, that the energy expended in each additional bit of computation becomes vanishingly small, it becomes clear that the difference in cost in transferring a MB of data as opposed to a KB of data is virtually TSTM (“too small to measure”).”  The real reason why we continue to deal with this assertion is both political in nature and based in social human interaction.  People hate oversight and they hate to be micromanaged, especially to the point of disrupting the work at hand.  We see behavior, especially in regulatory and contractual relationships, where the reporting entity plays the game of “hiding the button.”  This behavior is usually justified by pointing to examples of dysfunction, particularly on the part of the checker, where information submissions lead to the abuse of discretion in oversight and management.  Needless to say, while such abuse does occur, no one has yet pointed quantitatively to data (as opposed to anecdote) showing how often this happens.

I would hazard to guess that virtually anyone with some experience has had to work for a bad boss, where every detail and nuance is microscopically interrogated to the point where it becomes hard to make progress on the task at hand.  Such individuals, who have been advanced under the Peter principle, must, no doubt, be removed from such positions.  But this happens in any organization, whether in private enterprise–especially in places where there is no oversight, checks and balances, means of appeal, or accountability–or in government, and it is irrelevant to the assertion.  The expense item being described is bad management, not excess data.  Thus, such assertions are based on the antecedent assumption of bad management, which goes hand-in-hand with…

b. More information is the enemy of efficiency.  This is the other half of the economic argument that more data equals greater expense.  I should add that where the conflict over these issues has been engaged, some unjustifiably high figure is usually given for the cost of the additional data–one certainly not supported by the high-tech economics cited above.  Another aspect of both of these perspectives comes from the conception among non-techies that more data and information is equivalent to pre-digital effort, especially in conceptualizing the work that often went into human-readable reports.  This is really an argument that supports shifting the focus from fixed report-formatting functionality in software based on limited data to complete data, which can be formatted and processed as necessary.  If the right and sufficient information is provided up front, then additional questions and interrogatories that demand supplemental data and information–with the attendant multiplication of data streams and data islands that truly do add cost and drive inefficiency–are at least significantly reduced, if not eliminated.

c.  Data size adds unmanageable complexity.  This was actually put forth by another software professional–and no doubt the non-techies in the room would have nodded their heads in agreement (particularly given a and b above) if opposing expert opinion hadn’t been offered.  Without putting too fine a point on it, a techie saying this to an open forum is equivalent to whining that your job is too hard.  This will get you ridiculed at development forums, where you will be viewed as an insufferable dilettante.  Digitized technology for well over 40 years has been operating under the phenomenon of Moore’s Law.  Under this law, computational and media storage capability roughly doubles every two years, with the doubling period often cited as somewhere between 12 and 24 months.  Thus, what was considered big data in, say, 1997, when NASA researchers first coined the term, is not considered big data today.  No doubt, what is considered big data this year will not be considered big data two years from now.  Thus, the term itself is relative and may very well become archaic.  The manner in which data is managed–its rationalization and normalization–is important in successfully translating disparate data sources, but the assertion that big is scary is simply fear-mongering because you don’t have the goods.
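A quick worked calculation shows why the term is relative; assuming a roughly two-year doubling period between 1997 and 2015:

```python
# Worked arithmetic: what a roughly two-year doubling period implies for
# how relative the label "big data" is between 1997 and 2015.
years = 2015 - 1997
doubling_period = 2
growth_factor = 2 ** (years / doubling_period)
print(f"Capability growth over {years} years: ~{growth_factor:.0f}x")  # ~512x
```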

d.  Big data requires more expensive and sophisticated approaches.  This flows from item c above and is often self-serving.  Scare stories abound, often using big numbers that sound scary.  All data that has a common use across domains has to be rationalized at some point if it comes from disparate sources, and there are a number of efficient software techniques for accomplishing this.  Furthermore, support for agnostic APIs and common industry standards, such as the UN/CEFACT XML, takes much of the rationalization and normalization work out of a manual process.  Yet I have consistently seen suboptimized methods put forth that essentially require an army of data scientists and coders to engage in brute-force data mining–a methodology that has been around for almost 30 years, except that now it carries the moniker of big data.  Needless to say, this approach is probably the most expensive and slowest out there.  But then, the motivation for its use by IT shops is usually based in rice-bowl and resource politics.  This is flimflam–an attempt to revive an old zombie under a new name.  When faced with such assertions, see Moore’s Law and keep looking for the right answer.  It’s out there.

e.  Performance management and assessment is an unnecessary “regulatory” expense.  This one keeps coming up as part of a broader political agenda beyond just project management.  I’ve discussed in detail the issues of materiality and prescriptiveness in regulatory regimes here and here, and have addressed the obvious legitimacy of organizations establishing such regimes in fiduciary, contractual, and governmental environments.

My usual response to the assertion of expense is to simply point to the unregulated derivatives market largely responsible for the financial collapse, and the deep economic recession that followed once the housing bubble burst.  (And, aside from the cost of human suffering and joblessness, the expenses related to TARP.)  So we know how well the deregulation of banking went.  Even after the Band-Aid of Dodd-Frank, the situation probably requires a bit more vigor, and should include the ratings agencies as well as the real estate market.  But here is the fact of the matter: such expenses cannot be monetized as additive, because “regulatory” expenses usually represent an assessment of the day-to-day documentation, systems, and procedures required when performing normal business operations and due diligence in management.  I attended an excellent presentation last week where the speaker, tasked with finding unnecessary regulatory expenses, admitted as much.

Thus, what we are really talking about is an expense that is an essential prerequisite to entry in a particular vertical, especially where monopsony exists as a result of government action.  Moral hazard, then, is defined by the inherent risk assumed by contract type, and should be assessed on those terms.  Given that the current trend is to raise thresholds, the question is going to be–in the government sphere–whether public opinion will be as forgiving in a situation where moral hazard assumes $100M in risk when things head south, as they do with regularity in project management.  The way to reduce that moral hazard is through sufficiency of submitted data.  Thus, we return to my points in a and b above.

f.  Effective project assessment can be performed using high level data.  It appears that this view has its origins in both self-interest and a type of anti-intellectualism/anti-empiricism.

In the former case, the bias is usually based on the limitations of either individuals or the selected technology in providing sufficient information.  In the latter case, the argument results in a tautology that reinforces the fallacy that absence of evidence proves evidence of absence.  Here is how I have heard the justification for this assertion: identifying emerging trends in a project does not require that either trending or lower level data be assessed.  The projects in question are very high dollar value, complex projects.

Yes, I have represented this view correctly.  Aside from questions of competency, I think the fallacy here is self-evident.  Study after study (sadly not all online, but performed within OSD at PARCA and IDA over the last three years) has demonstrated that high-level data averages out and masks indicators of risk manifestation that could have been detected by looking at data at the appropriate level: the intersection of work and assigned resources.  In plain language, this requires integration of the cost and schedule systems, with risk first being noted through consecutive schedule performance slips.  When combined with technical performance measures and effective identification of qualitative and quantitative risk tied to schedule activities, the early warning comes two to three months (and sometimes more) before the risk is reflected in the cost measurement systems.  You’re not going to do this with an Excel spreadsheet.  But, for reference, see my post Excel is not a Project Management Solution.
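A minimal sketch of that early-warning logic, flagging work packages with consecutive schedule performance slips, might look like this; the work package names and SPI series are hypothetical.

```python
# Minimal sketch: flagging risk manifestation at the work-package level
# from consecutive schedule performance slips, before it appears in the
# cost measurement systems.  Names and SPI series are hypothetical.

def consecutive_slips(spi_series, threshold=1.0):
    """Length of the current run of periods with SPI below the threshold."""
    run = 0
    for spi in spi_series:
        run = run + 1 if spi < threshold else 0
    return run

work_packages = {
    "WP 1.2.3 integration lab": [1.02, 0.97, 0.94, 0.91],
    "WP 1.4.1 harness design":  [1.01, 1.00, 0.99, 1.03],
}

for wp, series in work_packages.items():
    run = consecutive_slips(series)
    if run >= 3:
        print(f"Early warning: {wp} has slipped {run} consecutive periods")
```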

It’s time to kill the zombies with facts–and to behead them once and for all.

Second Foundation — More on a General Theory of Project Management

In ending my last post on developing a general theory of project management, I introduced the concept of complex adaptive systems (CAS) and posited that projects and their ecosystems fall into this specific category of systems theory.  I also posited that it is through the tools of CAS that we will gain insight into the behavior of projects.  The purpose is not only to identify commonalities in these systems across economic market verticals frequently asserted to be irreconcilable, but also to identify regularities and the proper math for determining the behavior of these systems.

A brief overview of some of the literature is in order so that we can define our terms, since CAS is a protean term that has evolved with its application.  Aside from the essential work at the Santa Fe Institute, some of which I linked in my last post on the topic, I would first draw your attention to an overview of CAS by Serena Chan at MIT.  Ms. Chan wrote her paper in 2001, and so her perspective in one important way has proven to be limited, which I will shortly address.  Ms. Chan correctly defines complexity, and I will leave it to the reader to go to the link above to read the paper.  The meat of her paper is her definition of CAS through its characteristics.  These are: distributed control, connectivity, co-evolution, sensitive dependence on initial conditions, emergence, distance from equilibrium, and existence in a state of paradox.  She then posits some tools that may be useful in studying the behavior of CAS and concludes with an odd section on the application of CAS to engineering systems, positing that engineering systems cannot be CAS because they are centrally controlled and hence do not exhibit emergence (non-preprogrammed behavior).  She interestingly uses the example of the internet as her proof.  In the year 2015, I don’t think one can seriously make this claim.  Even in 2001 such an assertion would have been specious, for it had been ten years since the passage of the High Performance Computing and Communication Act of 1991 (also called the Gore Bill), which commercialized ARPANET.  (Yes, he really did have a major hand in “inventing” the internet as we know it.)  It was also eight years after the introduction of Mosaic.  Thus the internet, like many engineering systems requiring collaboration and human interaction, falls under the rubric of CAS as defined by Ms. Chan.

The independent consultant Peter Fryer, at his Trojan Mice blog, adds a slightly different spin to identifying CAS.  He asserts that CAS properties are emergence, co-evolution, suboptimality, requisite variety, connectivity, simple rules, iteration, self-organization, edge of chaos, and nested systems.  My only quibble with many of these stated characteristics is that they seem to be slightly overlapping and redundant, splitting hairs without adding to our understanding.  They also tend to be covered by the larger definitions of systems theory and complexity.  Perhaps it’s worth delineating them within CAS because they provide specific avenues through which to study these types of systems.  We’ll explore this in future posts.

An extremely useful book on CAS is by John H. Miller and Scott E. Page under the rubric of the Princeton Studies in Complexity entitled Complex Adaptive Systems: An Introduction to Computational Models of Social Life.  I strongly recommend it.  In the book Miller and Page explore the concepts of emergence, self-organized criticality, automata, networks, diversity, adaptation, and feedback in CAS.  They also recommend mathematical models to study and assess the behavior of CAS.  In future posts I will address the limitations of mathematics and its inability to contribute to learning, as opposed to providing logical proofs of observed behavior.  Needless to say, this critique will also discuss the further limitations of statistics.

Still, given these stated characteristics, can we state categorically that a project organization is a complex adaptive system?  After all, people attempt to control the environment, there are control systems in place, work and organizations are oftentimes organized according to the expenditure of resources, there is a great deal of planning, and feedback occurs on a regular basis.  Is there really emergence and diversity in this kind of environment?  I think so.  The reason is the one obvious factor that is measured despite the best efforts to exert control–efforts which in reality consist of the actions of multiple agents: the presence of risk.  We think we have control of our projects, but in reality we can only exert so much control.  Oftentimes we move the goalposts to define success.  This is not necessarily a form of cheating, though sometimes it can be viewed in that context.  The goalposts change because in human CAS we deal with the concept of recursion and its effects.  Risk and recursion are sufficient to land project efforts clearly within the category of CAS.  Furthermore, that projects clearly fall within the definition of CAS follows below.

A clear and comprehensive definition can be found in an extremely useful paper on CAS from a practical standpoint, published in 2011 by Keith L. Green of the Institute for Defense Analyses (IDA) and entitled Complex Adaptive Systems in Military Analysis.  Borrowing from A. S. Elgazzar, of the mathematics departments of El-Arish, Egypt and Al-Jouf King Saud University in the Kingdom of Saudi Arabia, and A. S. Hegazi of the Mathematics Department, Faculty of Science at Mansoura, Egypt–both of whom have contributed a great deal of work on the study of biological immune systems as complex adaptive systems–Mr. Green states:

A complex adaptive system consists of inhomogeneous, interacting adaptive agents.  Adaptive means capable of learning.  In this instance, the ability to learn does not necessarily imply awareness on the part of the learner; only that the system has memory that affects its behavior in the environment.  In addition to this abstract definition, complex adaptive systems are recognized by their unusual properties, and these properties are part of their signature.  Complex adaptive systems all exhibit non-linear, unpredictable, emergent behavior.  They are self-organizing in that their global structures arise from interactions among their constituent elements, often referred to as agents.  An agent is a discrete entity that behaves in a given manner within its environment.  In most models or analytical treatments, agents are limited to a simple set of rules that guide their responses to the environment.  Agents may also have memory or be capable of transitioning among many possible internal states as a consequence of their previous interactions with other agents and their environment.  The agents of the human brain, or of any brain in fact, are called neurons, for example.  Rather than being centrally controlled, control over the coherent structure is distributed as an emergent property of the interacting agents.  Collectively, the relationships among agents and their current states represent the state of the entire complex adaptive system.

No doubt, this definition can be viewed as having a specific biological bias.  But when applied to the artifacts and structures of more complex biological agents–in our case, people–we can clearly see that the tools we use must be broader than those focused on a specific subsystem that possesses the attributes of CAS.  It calls for an interdisciplinary approach that utilizes not only mathematics, statistics, and networks, but also insights from the physical and computational sciences, economics, evolutionary biology, neuroscience, and psychology.  In understanding the artifacts of human endeavor we must be able to overcome recursion in our observations.  It is relatively easy for an entomologist to understand the structures of ant and termite colonies–and the insights they provide into social insects.  It has been harder, particularly in economics and sociology, for the scientific method to be applied in a similarly detached and rigorous manner.  One need only look to the perverse examples of Spencer’s Social Statics and Murray and Herrnstein’s The Bell Curve as but two examples where selection bias, ideology, class bias, and racism have colored such attempts regarding more significant issues.

It is my intent to avoid bias by focusing on the specific workings of what we call project systems.  My next posts on the topic will focus on each of the signatures of CAS and the elements of project systems that fall within them.

Talking (Project Systems) Blues: A Foundation for a General Theory

As with those of you who observe the upcoming Thanksgiving holiday, I find myself suddenly in a state of non-motion and, as a result, with feet firmly on the ground, able to write a post.  This is a preface to pointing out that the last couple of weeks have been both busy and productive in a positive way.

Among the events of the last two weeks was a meeting of project management professionals focused on the aerospace and defense vertical at the Integrated Program Management Workshop.  This vertical, unlike other areas of project management, is characterized by a highly structured approach that involves a great deal of standardization.  Most often, the people involved operate in a market where the public sector plays a strong role in defining the environment in which that market operates.  Furthermore, the major suppliers tend to be limited in number, and so both oligopolistic and monopolistic competition define the market space.

Within this larger framework, however, is a set of mid-level and small firms engaged in intense competition to provide both supplies and services to the limited set of large suppliers.  As such, they operate within the general framework of the larger environment defined by public sector procedures, laws, and systems, but within those constraints act with a great deal of freedom, especially in acting as a conduit to commercial and innovative developments from the private sector.

Furthermore, since many technologies originate within the public sector (the internet and microchips, among other examples since the middle of the 20th century), the layer of major suppliers and mid-level to small businesses also acts as a conduit for introducing such technologies to the larger private sector.  Thus, the relationship is a mutually reinforcing one.

Given the nature of this vertical and its various actors, I’ve come upon the common refrain that it is unique in its characteristics and, as such, acts as a poor analogue for other project management systems.  Dave Gordon, for example, a well-respected expert in IT projects, has, in commenting on previous posts, expressed some skepticism about my suggestion that there may be commonalities across the project management discipline regardless of vertical or end-item development.  I have promised a response and a dialogue and, given recent discussions, I think I have a path forward.

I would argue, instead, that the nature of the aerospace and defense (A&D) vertical provides a perfect control for determining the strength of commonalities.  My contention is that because larger and less structured economic verticals do not have the same ability to control the market environment and its mechanisms, their largely chaotic condition presents barriers to identifying possible commonalities.  Thus, unlike in other social sciences, we are not left with real-time experimentation absent a control group.  Both non-A&D and A&D verticals provide controls for each other, given enough precision in identifying the characteristics being measured.

But we need a basis, a framework, for identifying commonalities.  As such, our answers will be found in systems theory.  This is not a unique or new observation, but for the purpose of outlining our structure it is useful to state the basis of the approach.  For those of you playing along at home, the seminal works in this area are Norbert Wiener’s Cybernetics: or Control and Communication in the Animal and the Machine (1948) and Ludwig von Bertalanffy’s General System Theory (1968).

But we must go beyond basic systems theory in its formative stage.  Projects are a particular type of system: a complex system.  Even beyond that, they go one more step, because they are human systems that display learning, both individually in their parts and in the aggregate.  As such, these are complex adaptive systems, or CAS.  They exist in a deterministic universe, as all CAS do, but are non-deterministic within the general boundaries of that larger physical world.

The main thought leaders of CAS are John H. Holland, as in this 1992 paper in Daedalus, and Murray Gell-Mann with his work at the Santa Fe Institute.  The literature is extensive and this is just the start, including the work of Kristo Ivanov and the concepts coming out of his Hypersystems: A Base for Specification of Computer-Supported Self-Learning Social Systems.

It is upon this basis–especially the manner in which the behavior of CAS can be traced and predicted–that we will be able to establish the foundation of a general theory of project management systems.  I’ll be vetting ideas over the coming weeks regarding this approach, with some suggestions on real-world applicability and methodologies across project domains.

New Directions — Fourth Generation apps, Agile, and the New Paradigm

The world is moving forward and Moore’s Law is accelerating in interesting ways on the technology side, which opens new opportunities, especially in software.  In the past I have spoken of the flexibility of Fourth Generation software, that is, software that doesn’t rely on structured hardcoding but instead is focused on the data, to deliver information to the user in more interesting and essential ways.  I work in this area for my day job, and so using such technology has tipped over more than a few rice bowls.

The response from entrenched incumbents, and from those in the industry using similar technological approaches focused on “tools” capabilities, has been to declare vices as virtues.  Hard-coded applications that require long-term development, built on proprietary file and data structures, are, they declare, the right way to do things.  “We provide value by independently developing IP based on customer requirements,” they say.  It sounds very reasonable, doesn’t it?  Only one problem: you have to wait–oh–a year or two to get that chart or graph you need, to refresh that user interface, or to expand functionality, and you will almost never be able to leverage the latest capabilities afforded by the doubling of computing capability every 12 to 24 months.  The industry is already filled with outmoded, poorly supported, and obsolete “tools.”  Guess it’s time for a new one.

The motivation behind such assertions, of course, is to slow things down.  Not possessing the underlying technology to provide more, better, and more powerful functionality to the customer more quickly and more flexibly based on open systems principles–that is, dealing with data in an agnostic manner–they use their position to try to keep disruptive entrants from leaving them far behind.  This is done, especially in the bureaucratic complexities of A&D and DoD project management, through professional organizations that are used as thinly disguised lobbying opportunities by software suppliers, such as the NDIA, or through appeals to contracting rules that they hope will undermine the introduction of new technologies.

All of these efforts, of course, are blowing into the wind.  The economics of the new technologies is too compelling for anyone to last long in their job by partying like it’s still 1997 under the first wave of software solutions targeted at data silos and stove-piped specialization.

The new paradigm is built on Agile and those technologies that facilitate that approach.  In case my regular readers think that I have become one of the Cultists, bowing before the Manifesto That May Not Be Named, let me assure you that is not the case.  The best articulation of Agile that I have read recently comes from Neil Killick, with whom I have expressed some disagreement over the #NoEstimates debate and the more cultish aspects of Agile in past posts, but who published an excellent post back in July entitled “12 questions to find out: Are you doing Agile Software Development?”

Here are Neil’s questions:

  1. Do you want to do Agile Software Development? Yes – go to 2. No – GOODBYE.
  2. Is your team regularly reflecting on how to improve? Yes – go to 3. No – regularly meet with your team to reflect on how to improve, go to 2.
  3. Can you deliver shippable software frequently, at least every 2 weeks? Yes – go to 4. No – remove impediments to delivering a shippable increment every 2 weeks, go to 3.
  4. Do you work daily with your customer? Yes – go to 5. No – start working daily with your customer, go to 4.
  5. Do you consistently satisfy your customer? Yes – go to 6. No – find out why your customer isn’t happy, fix it, go to 5.
  6. Do you feel motivated? Yes – go to 7. No – work for someone who trusts and supports you, go to 2.
  7. Do you talk with your team and stakeholders every day? Yes – go to 8. No – start talking with your team and stakeholders every day, go to 7.
  8. Do you primarily measure progress with working software? Yes – go to 9. No – start measuring progress with working software, go to 8.
  9. Can you maintain pace of development indefinitely? Yes – go to 10. No – take on fewer things in next iteration, go to 9.
  10. Are you paying continuous attention to technical excellence and good design? Yes – go to 11. No – start paying continuous attention to technical excellence and good design, go to 10.
  11. Are you keeping things simple and maximising the amount of work not done? Yes – go to 12. No – start keeping things simple and writing as little code as possible to satisfy the customer, go to 11.
  12. Is your team self-organising? Yes – YOU’RE DOING AGILE SOFTWARE DEVELOPMENT!! No – don’t assign tasks to people and let the team figure out together how best to satisfy the customer, go to 12.

Note that even in software development based on Agile you are still “provid(ing) value by independently developing IP based on customer requirements.”  Only you are doing it faster and more effectively.

Now imagine a software technology that is agnostic to the source of data; that does not require a staff of data scientists, development personnel, and SMEs to care for and feed it; that allows multiple solutions to be released from the same technology; that allows for integration and cross-data convergence to gain new insights based on Knowledge Discovery in Databases (KDD) principles; and that provides shippable, incremental solutions every two weeks–or as often as the organization can absorb them–responsively enough to meet multiple needs of the organization at any one time.

This is what is known as disruptive value.  There is no stopping this train.  It is the new paradigm and it’s time to take advantage of the powerful improvements in productivity, organizational effectiveness, and predictive capabilities that it provides.  This is the power of technology combined with a new approach to “small” big data, or structured data, that is effectively normalized and rationalized to the point of breaking down proprietary barriers, hewing to the true meaning of making data–and therefore information–both open and accessible.

Furthermore, such solutions using the same data streams produced by the measurement of work can also be used to evaluate organizational and systems compliance (where necessary), and effectiveness.  Combined with an effective feedback mechanism, data and technology drive organizational improvement and change.  There is no need for another tool to layer with the multiplicity of others, with its attendant specialized training, maintenance, and dead-end proprietary idiosyncrasies.  On the contrary, such an approach is an impediment to data maximization and value.

Vices are still vices even in new clothing.  Time to come to the side of the virtues.

The Water is Wide — Data Streams and Data Reservoirs

I’ll have an article that elaborates on some of the ramifications of data streams and data reservoirs on AITS.org, so stay tuned there.  In the meantime, I’ve had a lot of opportunities lately, in a practical way, to focus on data quality and approaches to data.  There is some criticism in our industry about using metaphors to describe concepts in computing.

Like any form of literature, however, there are good and bad metaphors.  Opposing them in general, I think, is contrarian posing.  Metaphors, after all, often allow us to discover insights into an otherwise opaque process, clarifying in our mind’s eye what is being observed through the process of deriving similarities to something more familiar.  Strong metaphors allow us to identify analogues among the phenomena being observed, providing a ready path to establishing a hypothesis.  Having served this purpose, we can test that hypothesis to see if the metaphor serves our purposes in contributing to understanding.

I think we have a strong set of metaphors in the case of data streams and data reservoirs.  So let’s define our terms.

Traditionally, a data stream in communications theory is a set of data packets transmitted in sequence.  For the purposes of systems theory, a data stream is data that is submitted between two entities either sequentially in real time or on a regular periodic basis.  A data reservoir is just what it sounds like.  Streams can be diverted to feed a reservoir, which collects data for a specific purpose.  The reservoir is thus a repository of all data from the selected streams, and from any alternative streams, including legacy data.  The usefulness of the metaphors is found in the way in which we treat these data.

So, for example, data streams in practical terms in project and business management are the artifacts that represent the work that is being performed.  This can be data relating to planning, production, financial management and execution, earned value, scheduling, technical performance, and risk for each period of measurement.  This data, then, requires real time analysis, inference, and distribution to decision makers.  Over time, this data provides trending and other important information that measures the inertia of the efforts in providing leading and predictive indicators.
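To make the metaphor concrete, here is a minimal Python sketch of the relationship just described, under the assumption of hypothetical names (PeriodRecord, bcws, bcwp, acwp) rather than any particular schema: a stream accepts period-by-period submissions for real time analysis, while a reservoir accumulates the history of selected streams for trending.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical record for one period of measurement; the field names below
# (bcws, bcwp, acwp) are illustrative, not a mandated schema.
@dataclass
class PeriodRecord:
    period: str                # e.g. "2015-06"
    source: str                # e.g. "earned_value", "schedule", "technical_performance"
    values: Dict[str, float]

class DataStream:
    """Data submitted between two entities sequentially or on a regular periodic basis."""
    def __init__(self, name: str):
        self.name = name
        self.records: List[PeriodRecord] = []

    def submit(self, record: PeriodRecord) -> PeriodRecord:
        self.records.append(record)
        return record          # available immediately for real time analysis and distribution

class DataReservoir:
    """A finite repository fed by selected streams, retained for trending and lessons learned."""
    def __init__(self):
        self.history: List[PeriodRecord] = []

    def divert(self, stream: DataStream) -> None:
        self.history.extend(stream.records)   # append the stream to the historical record

    def trend(self, metric: str) -> List[float]:
        return [r.values[metric] for r in self.history if metric in r.values]

# Usage: an earned value stream feeds the reservoir period by period.
ev_stream = DataStream("earned_value")
ev_stream.submit(PeriodRecord("2015-05", "earned_value", {"bcws": 100.0, "bcwp": 95.0, "acwp": 102.0}))
ev_stream.submit(PeriodRecord("2015-06", "earned_value", {"bcws": 210.0, "bcwp": 198.0, "acwp": 215.0}))

reservoir = DataReservoir()
reservoir.divert(ev_stream)
print(reservoir.trend("bcwp"))   # [95.0, 198.0]
```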

Efficiencies can be realized by identifying duplication in data streams, especially if the data being provided into the streams is derived from a common dataset.  Streams can be modified to expand the data that is submitted, eliminating alternative streams that add little value on their own, that is, streams that are stovepiped and suboptimized contrary to the maximum efficiency of the system.
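As a small illustration of spotting that duplication, the sketch below counts fields that appear in more than one stream definition; the stream names and field lists are made up for the example.

```python
from collections import Counter

# Hypothetical stream definitions: which fields each stream carries.
streams = {
    "cost":      {"bcws", "bcwp", "acwp", "control_account"},
    "schedule":  {"bcws", "control_account", "start_date", "finish_date"},
    "financial": {"acwp", "control_account", "commitments"},
}

# Fields carried by more than one stream are candidates for consolidation
# if they are derived from a common dataset.
field_counts = Counter(field for fields in streams.values() for field in fields)
duplicates = {field for field, count in field_counts.items() if count > 1}
print(duplicates)   # e.g. {'bcws', 'acwp', 'control_account'}
```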

In the case of data reservoirs, what these contain is somewhat different from the large repositories of metadata that must be mined.  On the contrary, a data reservoir contains a finite set of data, since what is contained in the reservoir is derived from the streams.  As such, these reservoirs contain the essential historical information needed to derive parametrics, as well as sufficient data from which to derive organizational knowledge and lessons learned.  Rather than processing data in real time, the handling of data reservoirs is done to append the historical record of existing efforts, providing a fuller picture of performance and trending, and of closed-out efforts that can inform systems approaches to similar future efforts.  While not quite fitting into the category of Big Data, such reservoirs can probably best be classified as Small Big Data.

Efficiencies from the streams into the reservoir can be realized if the data can be further definitized through the application of structured schemas, combined with flexible Data Exchange Instructions (DEIs) that standardize the lexicon, allowing for both data normalization and rationalization.  Still, there may be data that is not incorporated into such schemas, especially if the legacy metadata predates the schema specified for the applicable data streams.  In this case, data rationalization must be undertaken, combined with standard APIs, to provide consistency and structure to the data.  Even here, because the set is finite and the data is specific to a system that uses a fairly standard lexicon, such rationalization will yield valid results.
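As an illustration of the rationalization step, the sketch below maps hypothetical legacy field names onto a made-up standard lexicon; the aliases and canonical terms are assumptions for the example, not an actual DEI or schema.

```python
# A minimal sketch of data rationalization under an assumed standard lexicon.
STANDARD_LEXICON = {
    "bcws": "budgeted_cost_work_scheduled",
    "bcwp": "budgeted_cost_work_performed",
    "acwp": "actual_cost_work_performed",
}

# Hypothetical legacy field names that predate the schema.
LEGACY_ALIASES = {
    "planned_value": "bcws",
    "earned_value": "bcwp",
    "actual_cost": "acwp",
}

def rationalize(record: dict) -> dict:
    """Map legacy field names onto the standard lexicon; pass unknown fields through untouched."""
    normalized = {}
    for key, value in record.items():
        canonical = LEGACY_ALIASES.get(key, key)                     # resolve legacy aliases
        normalized[STANDARD_LEXICON.get(canonical, canonical)] = value
    return normalized

legacy_record = {"planned_value": 100.0, "actual_cost": 102.0, "cam": "J. Smith"}
print(rationalize(legacy_record))
# {'budgeted_cost_work_scheduled': 100.0, 'actual_cost_work_performed': 102.0, 'cam': 'J. Smith'}
```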

Needless to say, applications that are agnostic to data and that provide on-the-fly flexibility in UI configuration by calling standard operating environment objects (also known as fourth generation software) have the greatest applicability to this new data paradigm.  This is because they most effectively leverage both the flexibility in the evolution of the data streams needed to reach maximum efficiency, and the lessons learned that are derived from the integration of data that was previously walled off from the complementary data that would identify and clarify systems interdependencies.

 

Measure for Measure — Must Read: Dave Gordon Is Looking for Utilitarian Metrics at AITS.org

Dave Gordon at his AITS.org blog deals with the issue of metrics and what makes them utilitarian, that is, “actionable.”  Furthermore, at his Practicing IT Project Management blog he challenges those in the IT program management community to share real life examples.  The issue of measures, and whether they pass the “so-what?” test, is an important one, since chasing the wrong ones, and drawing improper conclusions from them, is a waste of money and effort at best, and can lead one to make very bad business decisions at worst.

In line with Dave’s challenge, listed below are the types of metrics (or measures) that I often come across.

1.  Measures of performance.  This type of metric is characterized by actual performance against a goal for a physical or functional attribute of the system being developed.  It can be measured across time as one of the axes, but the ultimate benchmark against which it is measured is the requirement or goal.  Technical performance measurements often fall into this category, though I have seen instances where TPMs are listed in their own category.  I would argue that such separation is artificial.

2.  Measures of progress.  This type of metric is time-based, oftentimes measured against a schedule or plan.  Measurements of schedule variances in terms of time, or of expenditure rates against a budget, often fall into this category.

3.  Measures of compliance.  This type of metric measures systemic conditions that must be met; failure to meet them indicates a fatal error in the integrity of the system.

4.  Measures of effectiveness.  This type of metric tracks against those measures related to the operational objectives of the project, usually specified under particular conditions.

5.  Measures of risk.  This type of metric measures quantitatively the effects of qualitative, systemic, and inherent risk.  Oftentimes qualitative and quantitative risk are separated based on the means of identification and on whether that means is recorded directly or indirectly.  But, in reality, they are measuring different aspects and causes of the same phenomenon.

6.  Measures of health.  This type of metric measures the relative health of a system against a set of criteria.  In medicine there is a set of routine measures for biological subjects.  Measures of health distinguish themselves from measures of compliance in that any variation, while indicative of a possible problem, is not necessarily fatal.  Thus, a range of acceptable indicators, or even some variation within the indicators, can be acceptable.  So while these measures may point to a system issue, borderline areas may warrant additional investigation.  (A minimal sketch of the distinction between measures of compliance and measures of health follows this list.)
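Here is the promised sketch, in Python, of the distinction between measures of compliance and measures of health; the thresholds, margins, and return strings are illustrative assumptions, not established criteria.

```python
def check_compliance(condition_met: bool) -> str:
    """A measure of compliance: any failure is treated as fatal to system integrity."""
    return "PASS" if condition_met else "FATAL: system integrity compromised"

def check_health(value: float, low: float, high: float, margin: float = 0.05) -> str:
    """A measure of health: variation within or near an acceptable range is not necessarily fatal."""
    if low <= value <= high:
        return "healthy"
    # Borderline band just outside the range warrants investigation rather than a fatal flag.
    if (low * (1 - margin)) <= value <= (high * (1 + margin)):
        return "borderline: warrants additional investigation"
    return "unhealthy: probable system issue"

print(check_compliance(False))                   # FATAL: system integrity compromised
print(check_health(0.97, low=0.95, high=1.05))   # healthy
print(check_health(0.93, low=0.95, high=1.05))   # borderline: warrants additional investigation
```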

In any project management system, there are often correct and incorrect ways of constructing these measures.  The basis for determining whether they are correct, I think, is whether the end-result metric possesses materiality and traceability to a particular tangible state or criterion.  According to Dave and others, a test of a good metric is whether it is “actionable”.  This is certainly a desirable characteristic, but I would suggest that it is not a necessary one, and that it is contained within materiality and traceability.

For example, some metrics are simply indicators, which suggest further investigation; others suggest an action when viewed in combination with others.  There is no doubt that the universe of “qualitative” measures is shrinking as we have access to bigger and better data that provide us with quantification.  Furthermore as stochastic and other mathematical tools develop, we will have access to more sophisticated means of measurement.  But for the present there will continue to be some of these non-quantifiable measures only because, with experience, we learn that there are different dimensions in measuring the behavior of complex adaptive systems over time that are yet to be fully understood, much less measured.

I also do not mean for this to be an exhaustive list.  Others that have some overlap to what I’ve listed come to mind, such as measures of efficiency (different than effectiveness and performance in some subtle ways), measures of credibility or fidelity (which has some overlap with measures of compliance and health, but really points to a measurement of measures), and measures of learning or adaptation, among others.

Let’s Get Physical — Pondering the Physics of Big Data

I’ll have a longer and less wonky article on this and related topics next week at AITS.org’s Blogging Alliance, but Big Data has been a hot topic of late.  It also concerns the business line in which I engage and so it is time to sweep away a lot of the foolishness concerning it: what it can do, its value, and its limitations.

As a primer, a useful commentary on the ethical uses of Big Data was published today at Salon.com in an excerpt from Jacob Silverman’s book, Terms of Service: Social Media and the Price of Constant Connection.  Silverman takes a different approach from the one that I outline in my article, but he tackles the economics of new media that Brad DeLong and A. Michael Froomkin identified back in the late 1990s and the first decade of the 21st century.  This article on First Monday from 2000 regarding speculative microeconomics emerging from new media nicely summarizes their thesis.  Silverman rejects reforming the system in economic terms, entering the same ethical terrain on personal data collection that Rebecca Skloot explored regarding the medical profession’s collection and use of tissue taken during biopsies in the book, The Immortal Life of Henrietta Lacks.

What Silverman’s book does make clear–and which is essential in understanding the issue–is that not all big data is the same.  To our brute force machines data is data absent the means of software to distinguish it, since they are not yet conscious in the manner that would pass a Turing test.  Even with software such machines still cannot pass such a test, though I personally believe that strong AI is inevitable.

Thus, there is Big Data that is swept up–often without deliberate consent by the originator of the data–from the larger pool of society at large by commercial companies that have established themselves as surveillance “statelets” in gathering data from business transactions, social media preferences, and other electronic means.

And there is data that is deliberately stored and, oftentimes, shared among conscious actors for a specific purpose.  These actors are often government agencies, corporations, and related organizations that cooperatively share business information from their internal processes and systems in order to develop predictive systems toward useful public ends, oftentimes engaged in joint enterprises toward the development of public goods and services.  It is in this latter domain that I operate.  I like to call this “small” Big Data, since we operate in what can realistically be characterized as closed systems.

Data and computing have a physical and mathematical basis.  For anyone who has studied the history of computing (or has coded), this is a self-evident fact.  But for the larger community of users it appears, especially if one listens to the hype of our industry, that the sky is the limit.  But perhaps that is a good comparison after all, for anyone who has flown in a plane knows that the sky does indeed have limits.  To fly requires a knowledge of gravity, the atmosphere, lift, turbulence, aerodynamics, and propulsion, among other disciplines and sciences.  All of these have their underpinnings in physics and mathematics.

The equation that we use in computing is known as Landauer’s Principle.  It is as follows:

E ≥ kT ln 2,

where E is the minimum energy required to erase one bit of information, k is the Boltzmann constant, T is the temperature of the circuit in kelvins, and ln 2 is the natural logarithm of 2.

This equation follows those in thermodynamics established earlier in physics.  What this means is that the inherent entropy in a system (its onward, inevitable journey toward a state of disorder) cannot be reduced; it can only be expelled from the system.  For Landauer, who worked at IBM in physical computing, entropy is expelled in the form of heat and energy.  For the longest time, given the close correlation and applied proofs of the Principle, this was seen as a physical law, but modern computing seems to be undermining the manner in which entropy is expelled.
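For a sense of scale, here is a short calculation of Landauer’s lower bound at roughly room temperature; the 300 K figure and the terabyte example are illustrative choices, not values from the text.

```python
import math

# Landauer's lower bound on the energy needed to erase one bit: E >= k * T * ln(2).
k = 1.380649e-23          # Boltzmann constant, joules per kelvin
T = 300.0                 # circuit temperature in kelvins (roughly room temperature)

energy_per_bit = k * T * math.log(2)
print(f"{energy_per_bit:.3e} J per bit")        # ~2.87e-21 J

# Even for a terabyte of data (8e12 bits) the theoretical minimum is vanishingly small;
# real hardware dissipates many orders of magnitude more than this lower bound.
print(f"{energy_per_bit * 8e12:.3e} J per TB")  # ~2.3e-8 J
```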

Big Data runs up against the physics identified in Landauer’s Principle because heat and energy are not the only ways to expel entropy.  For really Big Data, entropy is expelled by the iron law of Boltzmann’s constant: the calculation of probable states of disorder in the system.  The larger the system, the larger the number of probable states of disorder, and the more our results in processing such information become a function of probability.  This may or may not matter, depending on the fidelity of the probabilistic methods and their application.

For “small” Big Data, the acceptability of variations from the likely outcome is much narrower.  We need to approach being 100% correct, 100% of the time, though small variations are acceptable depending on the type of system.  So, for example, in project management systems, we can be a percent or two off on rolling up data, since accountability is not an issue.  Financial systems compliance is a different matter.

In “small” Big Data, entropy can be expelled by pre-processing the data in the form of effort expended toward standardization, normalization, and rationalization.  Our equation, kT ln 2, is the lower bound; that is, it identifies the minimum entropy that must be expelled in order to process a bit.  In reality we will never reach this lower bound, but we can approach it until the difference between the lower bound of entropy and the “cost” of processing data is vanishingly small.  Once we have expelled entropy by limiting the states of instability in the data, expelling the cost of entropy through the data pipeline, we can then process the data to derive its significance with a high degree of confidence.

But this is only the start.  For once “small” Big Data undergoes a process to ensure its fidelity, the same pattern recognition algorithms used in Big Data can be applied, but to more powerful and credible effect.  Early warning “signatures” of project performance can be collected and applied to provide decision-makers with information early enough to affect the outcome of efforts before risk is fully manifested, with the calculated probabilities of cost, schedule, and technical impacts possessing a higher level of certainty.
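As a purely illustrative sketch of what an early warning “signature” might look like in code, the function below flags a sustained dip in a cost performance index; the threshold, window, and data are hypothetical and not calibrated to any real program or to any particular method described here.

```python
from typing import List, Optional

def early_warning(cpi_history: List[float], threshold: float = 0.95, consecutive: int = 3) -> Optional[int]:
    """
    Toy signature: return the index of the first period at which CPI has stayed
    below a threshold for a given number of consecutive periods, else None.
    Threshold and window are illustrative assumptions, not calibrated values.
    """
    run = 0
    for i, cpi in enumerate(cpi_history):
        run = run + 1 if cpi < threshold else 0
        if run >= consecutive:
            return i   # the period that completes the signature
    return None

cpi = [1.02, 0.99, 0.96, 0.94, 0.93, 0.92, 0.91]
print(early_warning(cpi))   # 5 -> flagged while there is still time to affect the outcome
```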

 

The Song Remains the Same (But the Paradigm Is Shifting) — Data Driven Assessment and Better Software in Project Management

Probably the biggest news out of the NDIA IPMD meeting this past week was the unofficial announcement by Frank Kendall, the Undersecretary of Defense for Acquisition, Technology, and Logistics (USD(AT&L)), that thresholds for mandatory detailed surveillance of programs would be raised to $100M from the present requirement of $20M.  While earned value management implementation and reporting will still be required on programs based on dollar value, risk, and other key factors, especially the $20M threshold for R&D-type projects, the raising of the threshold for mandatory surveillance reviews was seen as good news all around for reducing some regulatory burden.  The big proviso in this announcement, however, was that it is to go into effect later this summer and that, if the data in reporting submissions show inconsistencies and other anomalies that call into question the validity of performance management data, then all bets are off and the surveillance regime is once again imposed, though by exception.

The Department of Defense–especially under the leadership of SecDef Ashton Carter and Mr. Kendall–has been looking for ways of providing more flexibility in acquisition to allow for new technology to be more easily leveraged into long-term, complex projects.  This is known as the Better Buying Power 3.0 Initiative.  It is true that surveillance and oversight can be restrictive to the point of inhibiting industry from concentrating on the business of handling risk in project management, causing resources to be devoted to procedural and regulatory issues that do not directly impact whether the project will successfully achieve its goals within a reasonable range of cost and schedule targets.  Furthermore, the enforcement of surveillance has oftentimes been inconsistent and–in the worst cases–contrary to the government’s own guidance due to inconsistent expertise and training.  The change maintains a rigorous regulatory environment for the most expensive and highest risk projects, while reducing unnecessary overhead, and allowing for more process flexibility for those below the threshold, given that industry’s best practices are effective in exercising project control.

So the question that lay beneath the discussion of the new policy coming out of the meeting was: why now?  The answer is that technology has reached the point where the ability to effectively use the kind of Big Data required by DoD and other large organizations to detect patterns in data that suggest systems issues has changed both the regulatory and procedural landscape.

For many years as a techie I have heard the mantra that software is a nice reporting and analysis tool (usually looking in the rear view mirror), but that only good systems and procedures will ensure a credible and valid system.  This mantra has withstood the fact that projects have failed at the usual rate despite having the expected artifacts that define an acceptable project management system.  Project organizations’ systems descriptions have been found to be acceptable; work authorization, change control, control account plans, PMBs, and IMSs have all passed muster; and yet projects still fail, oftentimes with little advance warning of the fatal event or series of events.  More galling, the same consultants and EVM “experts” can be found across organizations without changing the arithmetic of project failure.

It is true that there are specific causes for this failure: the inability of project leadership to note changes in framing assumptions, the inability of our systems and procedures to incorporate technical performance into overall indicators of project performance, and the inability of organizations to implement and enforce their own policies.  But in the last case, it is not clear that the failure to follow controls in all cases had any direct impact on the final result; they were contributors to the failure but not the main cause.  It is also true that successful projects have experienced many of the same discrepancies in their systems and procedures.  This is a good indication that something else is afoot: that there are factors not being registered when we note project performance, and that we have an issue in defining “done”.

The time has come for systems and procedural assessment to step aside as the main focus of compliance and oversight.  It is not that systems and procedures are unimportant.  It is that data-driven assessment, and only data-driven assessment, is powerful enough to quickly and effectively identify issues within projects that otherwise go unreported.  For example, if we pull detailed data from the performance management systems that track project elements of cost, the roll up should, theoretically, match the summarized data at the reporting level.  But this is not always the case.

There are two responses to this condition.  The first is that, if the variations are small, that is, within 1% or 2% of the actuals, we must realize that earned value management is a project management system, not a financial management system, and need not be exact.  This is a strong and valid assertion.  The second is that the proprietary systems used for reporting have inherent deficiencies in summarizing reporting.  Should the differences once again not be significant, then this too is a valid assertion.  But there is a point at which these assertions fail.  If the variation from the rollups is more significant than (I would suggest) about 2%, then there is a systemic issue with the validity of data that undermines the credibility of the project management systems.
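A minimal sketch of the roll-up check described above, assuming hypothetical detail rows and using the roughly 2% tolerance suggested in the text:

```python
from typing import List

def rollup_check(detail_rows: List[float], reported_total: float, tolerance: float = 0.02) -> dict:
    """Compare the sum of detailed elements of cost against the summary-level report."""
    computed = sum(detail_rows)
    if reported_total == 0:
        variance = 0.0 if computed == 0 else float("inf")
    else:
        variance = abs(computed - reported_total) / abs(reported_total)
    return {
        "computed_total": computed,
        "reported_total": reported_total,
        "variance_pct": round(variance * 100, 2),
        "flag": variance > tolerance,   # True -> suggests a systemic data validity issue
    }

print(rollup_check([120.5, 310.0, 88.7, 45.3], reported_total=570.0))
# {'computed_total': 564.5, 'reported_total': 570.0, 'variance_pct': 0.96, 'flag': False}
```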

Checking off compliance with the EIA-748 criteria will not address such discrepancies, but a robust software solution that has the ability to handle such big data, the analytics to identify such discrepancies, and the flexibility to identify patterns and markers in the data that suggest an early indication of project risk manifestation will address the problem at hand.  The technology is now here to perform this operation, and to do so at the level of performance expected in desktop operations.  This type of solution goes far beyond EVM Tools or EVM engines.  The present generation of software possesses not only the ability to deliver hardcoded solutions out of the box, but also the ability to configure objects, conditional formatting, calculations, and reporting from the same data, introducing leading indicators across a wider array of project management dimensions beyond just cost and schedule.