Innervisions: The Connection Between Data and Organizational Vision

During my day job I provide a number of fairly large customers with support to determine their needs for software that meets the criteria from my last post. That is, I provide software that takes an open data systems approach to data transformation and integration. My team and I deliver this capability with an open user interface based on Windows and .NET components augmented by time-phased and data management functionality that puts SMEs back in the driver’s seat of what they need in terms of analysis and data visualization. In virtually all cases our technology obviates the need for the extensive, time consuming, and costly services of a data scientist or software developer.

Over the course of my career both as a consumer and a provider of technology solutions, I have seen an evolution in software from simple point solutions developed to automate particular manual processes to more sophisticated solutions designed to automate a complex function. In most of these cases, a customer has identified a gap or deficiency in their requirements that represents an inefficiency or sub-optimization of their processes and then seeks a software “tool” to acquire in order to address that specific purpose. The application of these “tools” combines to meet the overall vision of the organization or sub-system within the organization.

What Do You Do With A Problem Like “Tools”

The capabilities of software in terms of data handling and functionality double every 12-18 months in today’s environment. The term “tools” for software, which is really based on a pre-2000 concept, implies that in the mind’s eye software is analogous to any other tool. In the literature, particularly in that authored by consultants, this analogy is oftentimes extended to common household or construction tools: a wrench, a screwdriver, or a power drill. Under this concept each tool has a specific purpose and it is up to the SME to determine which tool is best for a specific job.

The problem with this concept is that not only is it obsolete, but it does great harm financially to the organization in terms of overhead costs, organizational efficiency and effectiveness.

First of all, most physical tools are fairly static in their specific use. A hammer is still a hammer, even if some sort of power source is added to it. Its purpose remains to use force to insert a connective fastener, like a nail, into a medium, like a piece of wood. A nail gun, for instance, is a type of hammer. It is more powerful and efficient but, still, it is a glorified hammer. It is a superior tool in construction because it is more efficient, provides a consistency in quality, and is faster. It also eliminates the factors of arm strength, physical coordination, and visual alignment skills of the user, as anyone who has experienced a sore thumb as a result of a misaligned strike can attest. But a nail gun is still restricted to its specific function–sinking nails for the purpose of fastening.

Software, as it has evolved, was similarly based on the concept of a tool. The physical functions of a specific vocation were the first to undergo digitization: accountants and business operations personnel had spreadsheet software applications, secretarial and clerical staffs (yes, they used to exist) had word processing software, marketing and middle management could relay their ideas with presentation software, and the list went on.

As the power of software improved it followed the functions of traditional line-and-staff organizations. Many of these applications were built to replace the physical calculation of formulae and concepts that required a slide rule and, later, a scientific calculator. Soon scheduling software replaced manual Gantt planning, earned value software automated the calculation of basic EVM analytics, and risk software allowed for the complex formulation involved in assessing risk for the branch of a plan using simulated Monte Carlo analysis.

Each of these software applications targeted a specific occupation, and incorporated specific knowledge (functionality) required of that occupation.

Organizational software for multiple functions usually consisted of a suite of tools under the rubric of an ERP or Business Intelligence System. Modules and “bolt-ons” consisted of tying together business processes and point software requirements augmented by large software consulting staffs to customize the solutions. In actual practice, however, these were software tools tied together through a common brand and operating environment. Oftentimes the individual bolt-ons and tools weren’t even authored by the same development team with a common vision in mind, but were a reaction to market forces that required a gap be filled through acquisition of a company or intellectual property.

Needless to say, these “enterprise” solutions aren’t that at all. Instead, they are a business-driven means to penetrate a vertical by providing scattershot functionality. Once inside a company or organization the other bolt-ons and modules are marketed in order to take over other business processes. Integration is achieved across domains through data transfer or other interpretive methods.

This approach has been successful, as it has been since the halcyon days when IBM dominated the computing market, especially among the larger software firms. It also meets many of the emotional and psychic needs of many senior managers. After all, the software firm–given its economic size–feels solid. The numbers of specialists introduced into the organization to augment staff provide a feeling of safety and accomplishment. C-level management and stockholders feel that risk is handled given that their software needs are being met at some level.

What this approach did not, and does not, meet is genuine data integration, especially given the realization that the data we have been using has been inadequate and artificially restricted based on what software providers were convincing their customers was the art of the possible. The term “Big Data” began to be introduced into the lexicon, and with it the economic realization that capturing and integrating datasets that were previously “impossible” to capture and integrate was (and presently is) an economic imperative.

But the approach of incumbents, whose priority is to remain “sticky” and to defend territory against new technologies, was to respond: “we have a tool for that.” Thus, the result has been the further introduction of inefficient individual applications with their inability to fully exploit data. Among these tools are largely “dumb”–that is, viewing data flat–data visualization tools that essentially paint pretty pictures from Excel or, when they need to be applied on a larger scale, default to the old business intelligence brute force approach of applying labor to derive the importance in data. Old habits are hard to change and what one person has done another can do. But this is the economic equivalent of what is called rent-seeking behavior. That is, it is inefficient and exploitative.

After all, if you buy what was advertised as a sports car you expect to see an engine under the hood and a transmission connected to a drivetrain and a pretty powerful one at that. What one does not expect is to buy the car but have to design and build these essential systems oneself while a team of individuals is paid by the hour to push the car to where one wants to go. Yet, organizations (and especially consultants) seem to be happy with this model when it comes to information management.

Thus, when a technology company like mine comes across a request for proposal, an informal invitation to participate in market research, or exploratory professional meetings (largely virtual as of this writing), the emphasis and terminology are on software “tools”, which limits the ability of consumers to exploit technology because it mentally paints a picture that limits the definition of what software should do and can do.

This mindset, however, is beginning to change and, no doubt, our current predicament under the Coronavirus crisis will accelerate that transition.

To take our analogy one step further, we are long past the time when we must buy each component of an automobile individually and then assemble it in our own garage. Point solutions, which are set and inelastic, are like individual parts of the car.

Enterprise solutions consisting of different modules and datasets, oftentimes constructed from incompatible foundations, exacerbate this situation and add the element of labor to a supposedly automated process, like buying OEM products to upgrade an automobile that we supposedly bought to do its job but that still needs the help of a mechanic to perform the normal functions of steering, stopping, and accelerating.

Open systems solutions provide more flexibility, but they can be both a blessing and a curse. The challenge is to provide the right balance of out-of-the-box point solution-type functionality while still providing enough flexibility for adaptability. Taking a common data approach is key to achieving this balance. This will require abandoning the concept of software “tools” and shifting the focus to data.

Data and Information Take Over: Two Models

The economic imperative for data integration and optimization arises from the need of the organization and its practitioners–whether they be managers, analysts, or auditors working in a company, a business unit, a governmental agency, or a program or project organization–to be positioned facing forward.

In order to face forward one must first establish a knowledge-based organization or, as oftentimes identified, a data-driven organization. What this means in real terms is that data is captured, processed, and contextualized so that its importance and meaning can be derived in a timely manner so that something can be done about what is happening. During our own present situation this is not just an economic imperative, but for public health an existential one for many of us.

Thus, we are faced with several key dimensions that must be addressed: size, manner of integration, contextualization, timeliness, and target. This applies to both known and unknown datasets.

Our known datasets are those that are already being used and populated in existing systems. We know, for example, that in program and project management that we require an estimate and plan, a schedule, a manner of organizing and tracking our progress, financial management and material management systems and others. These represent our pool of structured data, and understanding the lexicon of these systems is what is necessary to normalize and rationalize the data through a universal translator.
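To make the idea of a “universal translator” a little more concrete, here is a minimal sketch in Python. It assumes two hypothetical source systems (a scheduler and a cost system) whose field names differ but describe the same activity. The system names, field names, and sample values are my own illustrations, not the schema of any real product, and a production translator would need to handle far more than field renaming.

```python
# Hypothetical lexicons: each maps a source system's field names to a common vocabulary.
LEXICON = {
    "scheduler": {"task_id": "activity_id", "finish": "finish_date"},
    "cost_system": {"wbs": "activity_id", "actual_cost": "actual_cost"},
}


def translate(system, record):
    """Rename the fields of one source record to the common vocabulary."""
    mapping = LEXICON[system]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def integrate(feeds):
    """Merge translated records from all systems that share the same activity_id."""
    merged = {}
    for system, records in feeds.items():
        for record in records:
            common = translate(system, record)
            merged.setdefault(common["activity_id"], {}).update(common)
    return merged


# Illustrative sample data only.
feeds = {
    "scheduler": [{"task_id": "A100", "finish": "2020-06-30"}],
    "cost_system": [{"wbs": "A100", "actual_cost": 12500.0}],
}
integrated = integrate(feeds)
print(integrated)
```

The point of the sketch is that once the lexicon of each system is understood, the records line up on a shared key without anyone re-keying data by hand.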

Our unknown datasets are those that require collection but, when collected at all, are collected and processed in an ad hoc manner. Usually the need for this data collection is learned through the school of hard knocks. In other cases, the information is not collected at all or is collected accidentally, such as when management relies on outside experts and anecdotal information. This is the equivalent of an organizational Johari Window shown below.

Overview of Johari Window with quadrants showing the relationships of self-knowledge and understanding

The Johari Window explains our perceptions and our relationship to the outside world. Our universe is not a construction of our own making or imagination. We cannot make our own reality nor are there “alternative facts.” The most colorful example of refuting this specious philosophical mind game is relayed to us in Boswell’s Life of Samuel Johnson.

After we came out of the church, we stood talking for some time together of Bishop Berkeley’s ingenious sophistry to prove the nonexistence of matter, and that every thing in the universe is merely ideal. I observed, that though we are satisfied his doctrine is not true, it is impossible to refute it. I never shall forget the alacrity with which Johnson answered, striking his foot with mighty force against a large stone, till he rebounded from it — “I refute it thus.”

We can deny what we do not know, or construct magical thinking, but reality is unmoved. In the case of Johnson he kicked the stone and the stone, also unmoved, kicked back in the form of the pain that Johnson felt when he “rebounded from it”.

Nor are the quadrants equal in our perceptual windows. Some people and organizations are very well informed and others less so, but the tension and conflict of our lives–both internally and externally–relates to expanding the “open” and “facade” portions of the Johari window so that we are not only informed of how others register us, but also to uncover the unknown, and to attempt to control how others perceive us in our various roles and guises.

We see this playing out in tracking the current Coronavirus pandemic. The absence of reliable widespread tests and testing infrastructure has impeded an understanding of the virus and the most effective strategies to deploy in dealing with it. Absent data, health and governmental agencies have been left with no choice but to use the same social distancing and travel restrictions deployed during the 1918 Influenza Pandemic and then, if lifting some of these, hope for the best.

This is the situation despite the fact that national risk assessments and risk registers, such as the U.S. National Security Council Pandemic Playbook and the U.K. National Risk Register, outlined measures to be taken given certain particular indicators. No doubt there are lessons to be learned here, but the core lesson is that, absent reliable and timely data that is converted into information that can be used in a decisive and practical manner, an organization, a state, or a nation risks its survival when it fails to imagine what information it needs to collect beyond the prosaic information that comes from performing the day-to-day routine.

Admittedly, there is no great insight here regarding this need (or, at least, there shouldn’t be). This condition is the reason why intelligence systems and agencies were created in the first place. It is why military and health services imagine scenarios and war-game them, and why organizations deploy brain-storming. Individuals and organizations that go into the world uninformed or self-deluded do not last long, and history is replete with such examples. Blanche DuBois relied on the kindness of strangers and we are best served by her experience as an archetype.

And yet, we still find ourselves struggling to properly collect, integrate, and utilize information at the same time that we have come to the realization that we need to collect and process information from larger pools of data. The root cause of this condition, as asserted above, rests in the mental framing of how to approach data and the problem that needs to be solved. It requires us to change the conceptual framework that relies on the concept of “tools.”

We can make this adjustment by realigning the object of the challenge so that it conforms with what we imagine to be the desired end-state. But, still, how do we determine what we need to collect? This is first a question of perception as opposed to one regarding knowledge: what one views as not only necessary but within the realm of possibility.

Once again, this dilemma is best served by models and, in this case, it is not unlike the Overton Window. Those preferring to eschew Wikipedia entries can also find a more detailed and nuanced definition at the source through the Mackinac Center for Public Policy website.

Overton Windows showing degrees of acceptability as modified by Joshua Trevino

Joseph Overton described the window as one of defining acceptable political policies in the mind of the public. He used the terms “more free” and “less free” to describe policies that think tanks recommend to describe the amount of government intervention, avoiding the left-right comparisons used by polemicists. Various adjustments and variations to the basic window have been proposed since his original use of the model, but it has been expanded to describe public perceptions in general on a host of socioeconomic concerns.

As with the Johari Window, I would posit that there is an analogous Overton Window in relation to information that frames what is viewed as the art of the possible. These perceptions influence the actions of decision-makers in assessing the risk involved in buying software solutions. When it comes to the rapidly developing field of data capture, transformation, and effective utilization, the perception from the start suggests some degree of risk and the danger of moving too quickly. For those in the field of data optimization, given that new technology capacity increases exponentially in shorter periods of time, the barrier here is to shift the informational Overton Window so that the market is educated on the risk-reward equation.

A Unified Model for Aligning Our Data

We have discussed two models up to this point in our exploration: an Informational Johari Window and an Informational Overton Window. Each of these models, using a simplified method, isolates different dimensions of the problem of data, which when freed of the concept of “tools” unlocking it, provides us with a clearer picture of the essential nature of its capture and utilization, and to what purposes.

We are now ready to take the next step in defining how to approach data to serve the strategic interests of the enterprise or organization.

For those of us in the information field, especially in the early years when applying solutions to line-and-staff organizations, what we found is that the very introduction of the new technology changed both the structure and nature of the organization. Initially we noted a sophisticated and accelerated version of the Hawthorne Effect. But there was something more elemental and significant going on.

Digital technology is amazingly attuned, especially when properly designed and deployed, to extend the functions of human knowledge gathering and processing. In this way it can be interpreted as an extension of human evolution–of the nature of human society acting as a complex adaptive system. In fact, there are so many connections between early physical, methodological, and industrial societal developments and digitization, such as the connection between the development of the Jacquard Loom and the development of the computer punch card (and there are others), that it seems that human society would have found a way to get to this point regardless of the existence of the intervening human pioneers, though their actual contributions are clear. (For further information on the waves of development see the books Future Shock and The Third Wave by Alvin Toffler.)

When many of us first applied digitized technology to knowledge workers (in my case in the field of contract management) we found that the very introduction of the technology changed perceptions, work habits, and organizational structures in very essential ways. Like the effect of the idea of evolution as described by Daniel Dennett, the application of digital evolution is like a universal acid–it eats through and transforms everything it touches.

For example, a report that, in the past, would have taken a week or two to complete, mostly because of the research required, now took a day or so. Procurement Action Lead Times (PALT) realized significant improvements since information previously only available in paper form was now provided on-line. At the same time, systems were now able to handle greater volumes of demand. As a result, customers’ expectations changed so much that they no longer felt that they had to hold back requests for fear of overloading the system and depend on human intervention. Suppliers, seeing many commodities experiencing steady and stable growth, reverted to just-in-time manufacturing.

Over time, typing pools and secretarial staffs, the former being commonplace well into the 1980s and the latter into the 1990s, disappeared except as symbols of privilege or prestige. Middle management and many support staffs followed this trend in the early 2000s. Today, consulting services consisting of staffing personnel to apply non-value added manual solutions such as Excel spreadsheets and PowerPoint slides to display data that has already been captured and processed, still manage to hold on in isolated pockets. That this model is neither sustainable nor efficient should be obvious except for the continued support these models lend to the self-serving concept of “tools.”

Thus, the next step in the alignment of data capture and utilization to organizational vision is the interplay between our models. Practical experience suggests, though it is anecdotal, that as forward-facing organizations adopt more powerful digitized technologies designed to capture more and larger datasets, and to better utilize that data, they tend to move to expand their self-awareness–their Informational Johari Window.

This, in turn, allows them to distinguish between structured and unstructured data and the value–the qualitative information content–of these datasets. This knowledge is then applied to reduce the labor and custom code required for larger data capture and utilization. In the end, these developments then determine what is the art of the possible by moving and expanding the Informational Overton Window.

Combining these concepts from a data perspective results in a combined model as illustrated below from the perspective of the subject:

Data Window of Perception and Possibility (Subject)

Extending this concept to the external subject (object or others) results in the following:

Data Window of Perception and Possibility (Object or Others)

This simplistic model describes several ways of looking at the problem of data and how to align it with its use to serve our purposes. When we gather data from the world the result can be symmetrical or asymmetrical. That is, each of us does not have the capacity to collect the same data that may be relevant to our existence or the survival of our organizations or institutions.

This same concept of symmetry and asymmetry applies to our ability to process data into information and–further–to properly apply information to when it will contribute to a decisive outcome in terms of knowledge, understanding, insight, or action.

As with the psychological Johari Window, our model takes into account the unknown within the much larger data space. Think of our Big Blue Ball (which is not so big) within the context of space. All of space represents the data of the universe. We are finding that the secrets of vast space-time are found in quanta as well as in the observations of large and distant celestial events and objects. Data is everywhere. Yet, we can perceive only a small part of the universe. That is why our Data Window does not encompass the entire data space.

The quadrants, of course, are rarely co-equal, but for purposes of simplicity they are shown as such. As with the psychological Johari Window of self-awareness, the tension and conflict within the individual and its relationship with the external world is in the adjustment of the sizes of the quadrants that, hopefully, tend toward more self-awareness and openness. From the perspective of data, the equivalent is toward the physical expansion of the Data Window while the quadrants within the window expand to minimize asymmetry of external knowledge and the unknown.

The physical limitations of symmetry, asymmetry, and the unknown portions of the data space are further constrained by our perceptions. Our understanding of what is possible, acceptable, sensible, radical, unthinkable, and impossible is influenced by these perceptions. Those areas of information management that fall within some mean or midpoint of the limitations of our perceptions represent current practice, which, as with the original Overton Window, I label as “policy,” though a viable alternative label would be “practice.”

Note that these perceptions vary by the position of the subject. In the case of our own perceptions, as for those reading this post, the first variation of the model is aligned vertically. For the case of the perceptions of others, which are important in understanding their position when advocating a particular course of action, the perception model is aligned horizontally across the quadrants.

The interplay of the quadrants within the Data Window directly affects how we perceive the use of data and its potential. Thus, I have labeled the no-man’s-land portion that pushes into areas that are unknown to the subject and external object as “The Frontier.”

To an American a “frontier” is an unexplored country while, historically, in the Old World a “frontier” is a border. The former promises not only risk but also opportunity, and invites exploration. The latter is a limitation. No doubt, my use of the term is culturally biased to the first definition.

Intellectually and physically, as we enter the frontier and learn what secrets await us there, we learn. For data we may first see a Repository of Babel and deal with it as if it were flat. But, given enough exploration we will learn its lexicon and underlying structure and, eventually, learn how to process it into information and harness its content. This, in turn, will influence the size of the Data Window, the relative sizes of the quadrants, and our perceptions of the art of the possible.

Conception to Application

This model, I believe, is a useful antecedent concept in approaching and making comprehensible what is often called Big Data. The model also helps us be more precise in how we perceive and define the term as technology changes, given that exponential increases in hardware storage and processing capabilities expand our Data Window.

Furthermore, understanding the interplay of how we approach data, and the consequences of our perceptions of it, allows us to weigh the risk when looking at new technologies and the characteristics they need to possess in order to meet organizational goals and vision. The initial bias, as noted by Daniel Kahneman in his book Thinking, Fast and Slow, is for people to stick with the status quo or the familiar–the devil they know–in lieu of something new and innovative, even when the advantages of adopting the innovation are obvious. It requires a reorientation of thinking to allow the acceptance of the new.

Our familiar pattern when thinking about information is to look for solutions that are “tools.” The new, unfamiliar concept that we find challenging is the understanding that we do not know what we do not know when it comes to data and its potential–that we must push into the frontier in order to find out–and that doing so will require not only new technology that is oriented toward the optimization of data, its processing from information to knowledge, and its use, but also a new way of thinking about it and how it will align with our organizational strategy.

This can only be done by first starting with a benchmark–to practically take stock–of where we are, individually and as organizations, and where we need to be in terms of understanding our mission or purpose. For project controls and project management there is no area more at odds with this alignment.

Recently, Dave Gordon in his blog The Practicing IT Project Manager argued why project managers needed to align their projects with organizational strategy. He noted that in 2015, during the development of the “Talent Triangle,” the Project Management Institute found that a major deficiency noted by organizations was that project managers needed to take an active role in aligning their projects with organizational strategy.

As I previously noted, there are a number of project management tools on the market today and a number of data visualization tools. Yet, there are significant gaps not only in the capture, quality, and processing of data, but also in the articulation of a consistent data strategy that aligns with the project organization and the overarching organization’s business strategy, goals, and priorities.

For example, in government, program managers spend a large portion of the year defending their programs to show that they are effectively and efficiently overseeing the expenditure of resources: that they are “executing program.” Failure to execute program will result in a budget mark, or worse, result in a re-baseline, or possible restructuring or cancellation. Projected production may be scaled back in favor of more immediate priorities.

Yet, none of our so-called “tools” fully capture program execution as it is defined by agencies and Congress. We have performance management tools, earned value tools, and the list can go on. A typical program manager in government spends almost five months assessing and managing program execution, and defending program and only a few minutes each month reviewing performance. This fact alone should be indicative that our priorities are misaligned.

The intersection of organizational alignment and program management in this case is related to resource utilization and program execution. No doubt, project controls and performance management contribute to our understanding of program execution, but they are removed from informing both the program manager and the organization in a comprehensive manner about execution, risk, and opportunity–and whether those elements conflict with or align with the agency’s goals. They are even further removed from an understanding of the effect of decisions related to program execution on the interrelationships across the spectrum of the project and program portfolio.

The reason for this condition is that the data is currently not being captured and processed in a comprehensive manner to be positioned for its effective exploitation and utilization in meeting the needs of the various levels of the organization, nor does the perception of the specific data needed align with organizational needs.

Correspondingly, in construction and upstream oil and gas, project managers and stakeholders are most concerned with scope, timeliness, and the inevitable questions of claims–especially the avoidance or equitable settlement of the last.

As with government, our data strategy must align with our organizational goals and vision from the perspective of all stakeholders in the effort. At the heart of this alignment is data and those technologies “fitted” to exploit it and align it with our needs.

Potato, Potahto, Tomato, Tomahto: Data Normalization vs. Standardization, Why the Difference Matters

In my vocation I run a technology company devoted to program management solutions that is primarily concerned with taking data and converting it into information to establish a knowledge-based environment. Similarly, in my avocation I deal with the meaning of information and how to turn it into insight and knowledge. This latter activity concerns the subject areas of history, sociology, and science.

In my travels just prior to and since the New Year, I have come upon a number of experts and fellow enthusiasts in these respective fields. The overwhelming numbers of these encounters have been productive, educational, and cordial. We respectfully disagree in some cases about the significance of a particular approach, governance when it comes to project and program management policy, but generally there is a great deal of agreement, particularly on basic facts and terminology. But some areas of disagreement–particularly those that come from left field–tend to be the most interesting because they create an opportunity to clarify a larger issue.

In a recent venue I encountered this last example where the issue was the use of the phrase data normalization. The issue at hand was that the use of “data normalization” suggested some statistical methodology in reconciling data into a standard schema. Instead, it was suggested, the term “data standardization” was more appropriate.

These phrases do not describe the same thing, but they do describe processes that are symbiotic, not mutually exclusive. So what about data normalization? No doubt there is a statistical use of the term, but we are dealing with the definition as used in digital technology here, just as the use of “standardization” was suggested in the same context. There are many examples of technical terminology that do not have the same meaning when used in different contexts. Here is the definition of normalization applied to data science from Techopedia, which is the proper use of the term in this case:

Normalization is the process of reorganizing data in a database so that it meets two basic requirements: (1) There is no redundancy of data (all data is stored in only one place), and (2) data dependencies are logical (all related data items are stored together). Normalization is important for many reasons, but chiefly because it allows databases to take up as little disk space as possible, resulting in increased performance.

Normalization is also known as data normalization
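As a small illustration of the definition above, here is a minimal sketch in Python. The rows, field names, and the normalize function are all hypothetical, invented only to show redundant data being reorganized so that each fact is stored in one place.

```python
# Hypothetical flat rows: the contractor's city repeats on every task row.
flat_rows = [
    {"task": "Design review", "contractor": "Acme", "contractor_city": "Denver"},
    {"task": "Prototype build", "contractor": "Acme", "contractor_city": "Denver"},
    {"task": "Flight test", "contractor": "Zenith", "contractor_city": "Tucson"},
]


def normalize(rows):
    """Split flat rows into a contractor table and a task table."""
    contractors = {}  # contractor name -> {"id": ..., "city": ...}
    tasks = []
    for row in rows:
        name = row["contractor"]
        if name not in contractors:
            # Each contractor's details are stored once, under a new id.
            contractors[name] = {
                "id": len(contractors) + 1,
                "city": row["contractor_city"],
            }
        # The task row keeps only a reference to the contractor.
        tasks.append({"task": row["task"], "contractor_id": contractors[name]["id"]})
    return contractors, tasks


contractors, tasks = normalize(flat_rows)
```

After the split, a change to a contractor's city is made in one place, and every task that references that contractor picks it up.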

This is pretty basic (and necessary) stuff. I have written at length about data normalization, but I also pair it with two other terms. These are data rationalization and contextualization. Here is a short definition of rationalization:

What is the benefit of Data Rationalization? To be able to effectively exploit, manage, reuse, and govern enterprise data assets (including the models which describe them), it is necessary to be able to find them. In addition, there is (or should be) a wealth of semantics (e.g. business names, definitions, relationships) embedded within an organization’s models that can be exposed for improved analysis and knowledge transfer. By linking model objects (across or within models) it is possible to discover the higher order conceptual objects for any given object. Conversely, it is possible to identify what implementation artifacts implement a higher order model object. For example, using data rationalization, one can traverse from a conceptual model entity to a logical model entity to a physical model table to a database table, etc. Similarly, Data Rationalization enables understanding of a database table by traversing up through the different model levels.

Finally, we have contextualization. Here is a good definition using Wikipedia:

Context or contextual information is any information about any entity that can be used to effectively reduce the amount of reasoning required (via filtering, aggregation, and inference) for decision making within the scope of a specific application.[2] Contextualisation is then the process of identifying the data relevant to an entity based on the entity’s contextual information. Contextualisation excludes irrelevant data from consideration and has the potential to reduce data from several aspects including volume, velocity, and variety in large-scale data intensive applications

None of these terms, within the domain of data and computer science, implies any approximation of the accuracy of data. Nor are there statistical methods involved to approximate what needs to be accomplished precisely. The basic skill required to accomplish these tasks–knowing that the data is structured and pre-conditioned–is to reconcile the various lexicons from differing sources, much as I reconcile in my avocation the meaning of words and phrases across periods in history and across languages.

In this discussion we are dealing with the issue of different words used to describe a process or phenomenon. Similarly, we find this challenge in data.

So where does this leave data standardization? In terms of data and computer science, this describes a completely different method. Here is a definition from Wikipedia, which is the proper contextual use of the term under “Standard data model”:

A standard data model or industry standard data model (ISDM) is a data model that is widely applied in some industry, and shared amongst competitors to some degree. They are often defined by standards bodies, database vendors or operating system vendors.

In the context of project and program management, particularly as it relates to government data submission and international open standards across vendors in an industry, standardization means the use of a common schema. In this case there is a DoD version of a UN/CEFACT XML file currently set as the standard, but soon to be replaced by a new standard using the JSON file structure.

In any event, what is clear here is that, while standardization is a necessary part of a data policy to allow for sharing of information, the strength of the chosen schema and the instructions regarding it will vary–and this variation will have an effect on the quality of the information shared. But that is not all.

This is where data normalization, rationalization, and contextualization come into play. In order to create data for a standardized format, it is first necessary to convert a set of data that is otherwise opaque because of differences among its sources into a cohesive lexicon. In data, this is accomplished by reconciling data dictionaries to determine which items are describing the same thing, process, measure, or phenomenon. In a domain like program management, this is a finite set. But it is also specialized knowledge, and it is where the value is added to any end product that is produced. Then, once we know how to identify the data, we must be able to map those terms to the standard schema but, keeping an eye on the use of the data down the line, must be able to properly structure and ensure interrelationships of the data are established and/or maintained to ensure its effective use. This is no mean task and is why not all data transformation methods and companies are the same.
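The reconciliation of data dictionaries described above can be sketched roughly as follows. This is only an illustration: the two source tools, their field names, and the standard terms are hypothetical, and a real lexicon would be built from the publishers' actual data dictionaries and the governing schema.

```python
# Hypothetical lexicon: each source tool's field names mapped to standard terms.
LEXICON = {
    "tool_a": {
        "act_id": "activity_id",
        "bl_start": "baseline_start",
        "pct": "percent_complete",
    },
    "tool_b": {
        "TaskUID": "activity_id",
        "BaselineStart": "baseline_start",
        "PercentDone": "percent_complete",
    },
}


def to_standard(source, record):
    """Rename a source record's fields to the standard schema's terms."""
    mapping = LEXICON[source]
    unmapped = sorted(key for key in record if key not in mapping)
    if unmapped:
        # Refuse to guess: an unknown term needs a domain expert's decision.
        raise KeyError(f"{source}: no standard term for {unmapped}")
    return {mapping[key]: value for key, value in record.items()}


from_a = to_standard("tool_a", {"act_id": "A100", "bl_start": "2024-01-15", "pct": 40})
from_b = to_standard("tool_b", {"TaskUID": "A100", "BaselineStart": "2024-01-15", "PercentDone": 40})
```

Once both records are expressed in the same terms they can be compared, merged, or loaded into a common database, and the lexicon itself documents how each source field was interpreted.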

Furthermore, these functions can be accomplished efficiently or inefficiently. The inefficient method is to take the old-fashioned business intelligence method that has been around since the 1980s and before, where a team of data scientists and analysts deals with data as if it is flat and, essentially, reinvents the wheel in establishing the meaning and proper context of the data. Given enough time and money anything can be accomplished, but brute force labor will not defeat the Second Law of Thermodynamics.

In computing, which comes close to minimizing that physical law, we know that data has already been imbued with meaning upon its initial processing. In lieu of brute force labor we apply intelligence and knowledge to accomplish this requirement. This is called normalization, rationalization, and contextualization of data. It requires a small fraction of the time and effort of other methods, and is far more transparent.

Using these methods is also where innovation, efficiency, performance, accuracy, scalability, and anticipation of future requirements based on the latest technology trends come into play. Establishing a seamless flow of data integration allows, for example, more data to be captured and properly structured in a database, which lays the groundwork for the transition from 2D to 3D and 4D (that is, what is often called integrated) program management, as well as more effective analytics.

The term “standardization” also suffers from a weakness in data and computer science that requires that it be qualified. After all, data standardization in an enterprise or organization does not preclude the prescription of a proprietary dataset. In government, this is contrary to both statutory and policy mandates. Furthermore, even given an effective, open standard, there will be a large pool of legacy and other non-conforming data that will still require capture and transformation.

The Section 809 Panel study dealt directly with this issue:

Use existing defense business system open-data requirements to improve strategic decision making on acquisition and workforce issues…. DoD has spent billions of dollars building the necessary software and institutional infrastructure to collect enterprise wide acquisition and financial data. In many cases, however, DoD lacks the expertise to effectively use that data for strategic planning and to improve decision making. Recommendation 88 would mitigate this problem by implementing congressional open-data mandates and using existing hiring authorities to bolster DoD’s pool of data science professionals.

Section 809 Volume 3, Section 9, p.477

As operating environment companies expose more and more capability into the market through middleware and other open systems methods of visualizing data, the key to a system no longer resides in its ability to produce charts and graphs. The use of Excel as an ad hoc data repository, with its vulnerability to error and manipulation and its resistance to the establishment of an optimized data management and corporate knowledge environment, is a symptom of the larger issue.

Data and its proper structuring are at the core of organizational success and process improvement. Standardization alone will not address barriers to data optimization. According to RAND studies in 2015 and 2017* these are:

  • Data Quality and Discontinuities
  • Data Silos and Underutilized Repositories
  • Timeliness of Data for use by SMEs and Decision-makers
  • Lack of Access and Contextualization
  • Traceability and Auditability
  • Lack of the Ability to Apply Discovery in the Data
  • The issue of Contractual Technical Data and Proprietary Data

That these issues also exist in private industry demonstrates the universality of the issue. Thus, yes, standardize by all means. But also ensure that the standard is open and that transformation is traceable and auditable from the source system to the standard schema, and then into the target database. Only then will the enterprise, the organization, and the government agency have full ownership of the data it requires to efficiently and effectively carry out its purpose.

*RAND Corporation studies are “Issues with Access to Acquisition Data and Information in the DoD: Policy and Practice” (RR880, 2015), and “Issues with Access to Acquisition Data and Information in the DoD: Doing Data Right in Weapons System Acquisition” (RR1534, 2017). These can be found here.

Ring Out the Old, Ring in the New: Data Transformation Podcasting

Robin Williams at Innovate IPM interviewed me a few weeks ago and has a new podcast up to cap off the year. Our discussion began as a wide-ranging one but, as it turned out, settled on digital transformation and the changes and developments that I’ve seen in this area over the last three decades.

I met Rob at a recent Projects Controls conference. He is a professional, curious, and engaging individual who quickly puts one at ease. We both found a lot in common regarding our perspectives on project management and project controls and I agreed to the podcast interview. Our discussion was no different than many that I’ve had with other professionals in my areas of interest in my own living room, and the discussion comes off as a similarly engaging and informal conversation between like-minded individuals.

Before he posted the podcast, I managed to get a preview. Despite years of doing interviews, hosting symposiums, an occasional emcee or radio spot, home movies, and other recordings, I still cannot get over the strange feeling of hearing my own voice during a long conversation. I am constantly looking for faults, and cringed with the utterance of each “ah” or “um” while listening to myself–returning in my head to the admonitions of my supervisors when I was taught to be a Navy instructor–though, thankfully, they are few.

Still, thanks to the magic of editing, Rob managed to keep the focus on the main point of the conversation when I strayed into some side discussion. Rob caught me at a time when I was working on a paper to present to DoD professionals regarding digital transformation, and so the interview captured me in real time while I was developing in my mind two main concepts that I had picked up from reading the literature: establishing a Master Data Management (MDM) strategy, and establishing a knowledge management environment. While I do not mention these items in the interview, the discussion allowed me to subsequently sort out where these concepts apply.

In any event, the podcast can be found here: https://www.innovateipm.com/podcast/episode/206e7fbd/13-history-of-digital-transformation-with-nick-pisano. I hope you find it interesting and informative.

(Data) Transformation–Fear and Loathing over ETL in Project Management

ETL stands for data extract, transform, and load. This essential step is the basis for all of the new capabilities that we wish to acquire during the next wave of information technology: business analytics, big(ger) data, and interdisciplinary insight into processes that leads to improved productivity and efficiency.

I’ve been dealing with a good deal of fear and loathing regarding the introduction of this concept, even though in my day job my organization is a leading practitioner in the field in its vertical. Some of this is due to disinformation by competitors in playing upon the fears of the non-technically minded–the expected reaction of those who can’t do in the last throes of avoiding irrelevance. Better to baffle them with bullshit than with brilliance, I guess.

But, more importantly, part of this is due to the state of ETL and how it is communicated to the project management and business community at large. There is a great deal to be gained here by muddying the waters even by those who know better and have the technology. So let’s begin by clearing things up and making this entire field a bit more coherent.

Let’s start with the basics. Any organization that contains the interaction of people is a system. For purposes of a project management team, a business enterprise, or a governmental body we deal with a special class of systems known as Complex Adaptive Systems: CAS for short. A CAS is a non-linear learning system that reacts and evolves to its environment. It is complex because of the inter-relationships and interactions of more than two agents in any particular portion of the system.

I was first introduced to the concept of CAS through readings published out of the Santa Fe Institute in New Mexico. Most noteworthy is the work The Quark and the Jaguar by the physicist Murray Gell-Mann. Gell-Mann received the Nobel Prize in Physics in 1969 for his work on elementary particles, such as the quark, and is a co-founder of the Institute. He also was part of the team that first developed simulated Monte Carlo analysis during a period he spent at RAND Corporation. Anyone interested in the basic science of quanta and how the universe works that then leads to insights into subjects such as day-to-day probability and risk should read this book. It is a good popular scientific publication written by a brilliant mind, but very relevant to the subjects we deal with in project management and information science.

Understanding that our organizations are CAS allows us to apply all sorts of tools to better understand them and their relationship to the world at large. From a more practical perspective, what are the risks involved in the enterprise in which we are engaged, and what are the probabilities associated with any of the range of outcomes that we can label as success? For my purposes, the science of information theory is at the forefront of these tools. In this world an engineer by the name of Claude Shannon working at Bell Labs essentially invented the mathematical basis for everything that followed in the world of telecommunications, generating, interpreting, receiving, and understanding intelligence in communication, and the methods of processing information. Needless to say, computing is the main recipient of this theory.

Thus, all CAS process and react to information. The challenge for any entity that needs to survive and adapt in a continually changing universe is to ensure that the information that is being received is of high and relevant quality so that the appropriate adaptation can occur. There will be noise in the signals that we receive. What we are looking for from a practical perspective in information science are the regularities in the data so that we can make the transformation of receiving the message in a mathematical manner (where the message transmitted is received) into the definition of information quality that we find in the humanities. I believe that we will find that mathematical link eventually, but there is still a void there. A good discussion of this difference can be found here in the on-line publication Double Dialogues.
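To make the mathematical side of this concrete, here is a minimal sketch in Python of Shannon’s entropy measure, H = -sum(p * log2(p)), which gives the average information per symbol in bits. I am assuming that entropy is the relevant quantity here; the function name and the two sample strings are mine, chosen only to show that data with strong regularities carries less information per symbol than data without them.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Average information per symbol, in bits: H = -sum(p * log2(p))."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


regular = "AAAAAAAB"  # highly regular: one symbol dominates
varied = "ABCDEFGH"   # no regularity: eight equally likely symbols

h_regular = shannon_entropy(regular)
h_varied = shannon_entropy(varied)
```

The varied string comes out at 3 bits per symbol, while the regular one comes out well under 1 bit. Note that the measure says nothing about whether either message is useful to the recipient.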

Regardless of this gap, those of us who engage in the business of ETL must bring to the table the ability not only to ensure that the regularities in the information are identified and transmitted to the intended (or necessary) users, but also to distinguish the quality of the message in the terms of the purpose of the organization. Shannon’s equation is where we start, not where we end. Given this background, there are really two basic types of data that we begin with when we look at a set of data: structured and unstructured data.

Structured data are those where the qualitative information content is either predefined by its nature or by a tag of some sort. For example, schedule planning and performance data, regardless of the idiosyncratic/proprietary syntax used by a software publisher, describes the same phenomena regardless of the software application. There are only so many ways to identify snow–and, no, the Inuit people do not have 100 words to describe it. Qualifiers apply in the humanities, but usually our business processes more closely align with statistical and arithmetic measures. As a result, structured data is oftentimes defined by its position in a hierarchical, time-phased, or interrelated system that contains a series of markers, indexes, and tables that allow it to be interpreted easily through the identification of a Rosetta stone, even when the system, at first blush, appears to be opaque. When you go to a book, its title describes what it is. If its content has a table of contents and/or an index it is easy to find the information needed to perform the task at hand.

Unstructured data consists of the content of things like letters, e-mails, presentations, and other forms of data disconnected from its source systems and collected together in a flat repository. In this case the data must be mined to recreate what is not there: the title that describes the type of data, a table of contents, and an index.

All data requires initial scrubbing and pre-processing. The difference here is the means used to perform this operation. Let’s take the easy path first.

For project management–and most business systems–we most often encounter structured data. What this means is that, by understanding and interpreting standard industry terminology, schemas, and APIs, the simple process of aligning data to be transformed and stored in a database for consumption can be reduced to a systemic and repeatable process without the redundancy of rediscovery applied in every instance. Our business intelligence and business analytics systems can be further developed to anticipate a probable question from a user so that the query is pre-structured to allow for near immediate response. Further, structuring the user interface in such a way as to make the response to the query meaningful, especially when integrated with and juxtaposed against other types of data, requires subject matter expertise to be incorporated into the solution.

Structured ETL is the place that I most often inhabit as a provider of software solutions. These processes are both economical and relatively fast, particularly in those cases where they are applied to an otherwise inefficient system of best-of-breed applications that require data transfers and cross-validation prior to official reporting. Time, money, and effort are all saved by automating this process, improving not only processing time but also data accuracy and transparency.
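A structured ETL pass of the kind described above can be sketched in a few lines of Python using only the standard library. This is an illustration and not my company’s implementation: the CSV content, column names, and table layout are hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system; the last row has no actuals yet.
SOURCE_CSV = """activity,planned_hours,actual_hours
Design,40,44
Build,120,96
Test,60,
"""


def extract(text):
    """Extract: read raw rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert text to numbers and record missing actuals as None."""
    records = []
    for row in rows:
        actual = row["actual_hours"].strip()
        records.append(
            (row["activity"], float(row["planned_hours"]), float(actual) if actual else None)
        )
    return records


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity "
        "(name TEXT PRIMARY KEY, planned_hours REAL, actual_hours REAL)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
```

Because the structure of the source is known in advance, the same three functions run unchanged on every reporting cycle, which is where the savings in time and the gains in accuracy come from.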

In the case of unstructured data, however, the process can be a bit more complicated and there are many ways to skin this cat. The key here is that oftentimes what seems to be unstructured data is only so because of the lack of domain knowledge by the software publisher in its target vertical.

For example, I recently read a white paper published by a large BI/BA publisher regarding their approach to financial and accounting systems. My own experience as a business manager and Navy Supply Corps Officer provides me with the understanding that these systems are highly structured and regulated. Yet this business intelligence publisher treated the data as unstructured, and blatantly advertised, and apparently sold as state of the art, an unstructured approach to mining it.

This approach, which was first developed back in the 1980s when we first encountered the challenge of data that exceeded our expertise at the time, requires a team of data scientists and coders to go through the labor- and time-consuming process of pre-processing and building specialized processes. The most basic form of this approach involves techniques such as frequency analysis, summarization, correlation, and data scrubbing. This last portion also involves labor-intensive techniques at the microeconomic level such as binning and other forms of manipulation.
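For readers unfamiliar with these techniques, here is a minimal sketch in Python of three of them: data scrubbing, frequency analysis, and binning. The status values, the variance figures, and the bin boundaries are hypothetical, chosen only to show the kind of hand-built processing involved.

```python
from collections import Counter

# Hypothetical free-text status values pulled from a flat repository.
statuses = ["late", "on time", "Late", "on time ", "LATE", "early"]

# Data scrubbing: normalize case and stray whitespace before counting.
cleaned = [status.strip().lower() for status in statuses]

# Frequency analysis: how often each distinct value occurs.
frequencies = Counter(cleaned)

# Hypothetical schedule variances in days (negative means ahead of plan).
variances = [-12, -3, 0, 4, 9, 21]


def bin_label(days):
    """Binning: place a variance into one of three hand-chosen ranges."""
    if days < 0:
        return "ahead"
    if days <= 5:
        return "on track"
    return "behind"


bins = Counter(bin_label(days) for days in variances)
```

Every choice here, from what counts as the same status to where the bin boundaries fall, is a judgment that someone has to make and maintain by hand.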

This is where the fear and loathing comes into play. It is not as if all information systems do not perform these functions in some manner; it is that in structured data all of this work has been done and, oftentimes, is handled by the database system. But even here there is a better way.

My colleague, Dave Gordon, who has his own blog, will emphasize that the identification of probable questions and configuration of queries in advance combined with the application of standard APIs will garner good results in most cases. Yet, one must be prepared to receive a certain amount of irrelevant information. For example, the query on Google of “Fun Things To Do” that you may use if you are planning for a weekend will yield all sorts of results, such as “50 Fun Things to Do in an Elevator.”  This result includes making farting sounds. The link provides some others, some of which are pretty funny. In writing this blog post, a simple search on Google for “Google query fails” yields what can only be described as a large number of query fails. Furthermore, this approach relies on the data originator to have marked the data with pointers and tags.

Given these different approaches to unstructured data and the complexity involved, there is a decision process to apply:

1. Determine if the data is truly unstructured. If the data is derived from a structured database from an existing application or set of applications, then it is structured and will require domain expertise to inherit the values and information content without expending unnecessary resources and time. A structured, systemic, and repeatable process can then be applied. Oftentimes an industry schema or standard can be leveraged to ensure consistency and fidelity.

2. Determine whether only a portion of the unstructured data is relevant to your business processes and use it to append and enrich the existing structured data that has been used to integrate and expand your capabilities. In most cases the identification of a Rosetta Stone and standard APIs can be used to achieve this result.

3. For the remainder, determine the value of mining the targeted category of unstructured data and perform a business case analysis.
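The three steps above can be sketched as a simple triage function in Python. The DataSource fields and the comparison of expected value against mining cost are my own simplifying assumptions; a real business case analysis would weigh far more than two numbers.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """Hypothetical description of a candidate data source."""
    name: str
    from_structured_system: bool  # step 1: derived from an application database?
    relevant_to_processes: bool   # step 2: relevant to our business processes?
    expected_value: float         # step 3: estimated benefit of mining it
    mining_cost: float            # step 3: estimated cost of mining it


def triage(source):
    """Apply the three-step decision process to one data source."""
    if source.from_structured_system:
        return "structured"  # inherit its meaning with a repeatable process
    if source.relevant_to_processes:
        return "enrich"      # append it to the existing structured data
    if source.expected_value > source.mining_cost:
        return "mine"        # the business case supports mining
    return "skip"            # the business case does not support mining


schedule_export = DataSource("schedule export", True, True, 0.0, 0.0)
status_emails = DataSource("status e-mails", False, True, 0.0, 0.0)
old_archive = DataSource("old file share", False, False, 10_000.0, 50_000.0)
decisions = [triage(s) for s in (schedule_export, status_emails, old_archive)]
```

The ordering matters: the cheap determinations come first, so the expensive mining option is reached only for what remains.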

Given the rapidly expanding size of data that we can access using the advancing power of new technology, we must be able to distinguish between doing what is necessary and doing what is impressive. The definition of Big Data has evolved over time because our hardware, storage, and database systems allow us to access increasingly larger datasets that ten years ago would have been unimaginable. What this means is that–initially–as we work through this process of discovery, we will be bombarded with a plethora of irrelevant statistical measures and so-called predictive analytics that will eventually prove out to not pass the “so-what” test. This process places the users in a state of information overload, and we often see this condition today. It also means that what took an army of data scientists and developers to do ten years ago takes a technologist with a laptop and some domain knowledge to perform today. This last can be taught.

The next necessary step, aside from applying the decision process above, is to force our information systems to advance their processing to provide more relevant intelligence that is visualized and configured to the domain expertise required. In this way we will eventually discover the paradox that effectively accessing larger sets of data will yield less, but more relevant, intelligence that can be translated into action.

At the end of the day the manager and user must understand the data. There is no magic in data transformation or data processing. Even with AI and machine learning it is still incumbent upon the people within the organization to be able to apply expertise, perspective, knowledge, and wisdom in the use of information and intelligence.

Money for Nothing — Project Performance Data and Efficiencies in Timeliness

I operate in a well regulated industry focused on project management. What this means practically is that there are data streams that flow from the R&D activities, recording planning and progress, via control and analytical systems to both management and customer. The contract type in most cases is Cost Plus, with cost and schedule risk often flowing to the customer in the form of cost overruns and schedule slippages.

Among the methodologies used to determine progress and project eventual outcomes is earned value management (EVM). Of course, this is not the only type of data that flows in performance management streams, but oftentimes EVM is used as shorthand to describe all of the data captured and submitted to customers in performance management. Other planning and performance management data includes time-phased scheduling of tasks and activities, cost and schedule risk assessments, and technical performance.

Previously in my critique regarding the differences between project monitoring and project management (before Hurricane Irma created some minor rearranging of my priorities), I pointed out that “looking in the rear view mirror” was often used as an excuse for by-passing unwelcome business intelligence. I followed this up with an intro to the synergistic economics of properly integrated data. In the first case I answered the critique demonstrating that it is based on an old concept that no longer applies. In the second case I surveyed the economics of data that drives efficiencies. In both cases, new technology is key to understanding the art of the possible.

As I have visited sites in both government and private industry, I find that old ways of doing things still persist. The reason for this is multivariate. First, technology is developing so quickly that there is fear that one’s job will be eliminated with the introduction of technology. Second, the methodology of change agents in introducing new technology often lacks proper socialization across the various centers of power that inevitably exist in any organization. Third, the proper foundation to clearly articulate the need for change is not made. This last is particularly important when stakeholders perform a non-rational assessment in their minds of cost-benefit. They see many downsides and cannot accept the benefits, even when they are obvious. For more on this and insight into other socioeconomic phenomena I strongly recommend Daniel Kahneman’s Thinking, Fast and Slow. There are other reasons as well, but these are the ones that are most obvious when I speak with individuals in the field.

The Past is Prologue

For now I will restrict myself to the one benefit of new technology that addresses the “looking in the rear view mirror” critique. It is important to do so because the critique is correct in application (for purposes that I will outline) if incorrect in its cause-and-effect. It is also important to focus on it because the critique is so ubiquitous.

As I indicated above, there are many sources of data in project management. They derive from the following systems (in brief):

a. The planning and scheduling applications, which measure performance through time in the form of discrete activities and events. In the most sophisticated implementations, these applications will include the assignment of resources, which requires the integration of these systems with resource management. Sometimes simple costs are also assigned and tracked through time as well.

b. The cost performance (earned value) applications, which ideally are aligned with the planning and scheduling applications, providing cross-integration with WBS and OBS structures, but focused on work accomplishment defined by the value of work completed against a baseline plan. These performance figures are tied to work accomplishment through expended effort collected by and, ideally, integrated with the financial management system. It involves the proper application of labor rates and resource expenditures in the accomplishment of the work to not only provide a statistical assessment of performance to date, but a projection of likely cost performance outcomes at completion of the effort.

c. Risk assessment applications which, depending on their sophistication and ease of use, provide analysis of possible cost and schedule outcomes, identify the sensitivity of particular activities and tasks, provide an assessment of alternative driving and critical paths, and apply different models of baseline performance to predict future outcomes.

d. Systems engineering applications that provide an assessment of technical performance to date and the likely achievement of technical parameters within the scope of the effort.

e. The financial management applications that provide an accounting of funds allocation, cash-flow, and expenditure, including planning information regarding expenditures under contract and planned expenditures in the future.

These are the core systems of record from which performance information is derived. There are others as well, depending on the maturity of the project, such as ERP systems and MRP systems. But for purposes of this post, we will bound the discussion to these standard sources of data.

In the near past, our ability to understand the significance of the data derived from these systems required manual processing. I am not referring to the sophistication of human computers of the 1960s and before, dramatized to great effect in the uplifting movie Hidden Figures. Since we are dealing with business systems, these methodologies were based on simple business metrics and other statistical methods, including those that extended the concept of earned value management.

With the introduction of PCs in the workplace in the 1980s, desktop spreadsheet applications allowed this data to be entered, usually from printed reports. Each analyst not only used standard methods common in the discipline, but also developed their own methods to process and derive importance from the data, transforming it into information and useful intelligence.

Shortly after this development simple analytical applications were introduced to the market that allowed for paring back the amount of data deriving from some of these systems and performing basic standard calculations, rendering redundant calculations unnecessary. Thus, for example, instead of a person having to calculate multiple estimates to complete, the application could perform those calculations as part of its functionality and deliver them to the analyst for use in, hopefully, their own more extensive assessments.
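As an illustration of the kind of standard calculation that such an application takes off the analyst’s desk, here is a minimal sketch in Python using the common earned value formulas. The function name and the sample figures are hypothetical; the three estimate-to-complete variants shown are the widely used ones that assume remaining work is performed at the budgeted rate, at the cumulative CPI, or at the product of CPI and SPI.

```python
def evm_summary(bac, pv, ev, ac):
    """Return standard indices plus three estimates to complete (ETC).

    bac: budget at completion, pv: planned value,
    ev: earned value, ac: actual cost (all in the same currency units).
    """
    cpi = ev / ac         # cost performance index
    spi = ev / pv         # schedule performance index
    remaining = bac - ev  # budgeted value of the work not yet performed
    etc = {
        "at_budgeted_rate": remaining,
        "at_current_cpi": remaining / cpi,
        "at_cpi_times_spi": remaining / (cpi * spi),
    }
    # Each estimate at completion (EAC) is actual cost to date plus its ETC.
    eac = {name: ac + value for name, value in etc.items()}
    return {"cpi": cpi, "spi": spi, "etc": etc, "eac": eac}


# Hypothetical status: 400 of value earned for 500 spent, against 500 planned.
summary = evm_summary(bac=1000.0, pv=500.0, ev=400.0, ac=500.0)
```

With these figures the three estimates to complete come out at roughly 600, 750, and 937.5, which gives the analyst a range to assess instead of a single number.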

But even in this case, the data flow was limited to the EVM silo. The data streams relating to schedule, risk, SE, and FM were left to their own devices, oftentimes requiring manual methods or, in the best of cases, cut-and-paste, to incorporate data from reports derived from these systems. In the most extreme cases, for project oversight organizations, this caused analysts to acquire a multiplicity of individual applications (with the concomitant overhead and complexity of understanding differing lexicons and software application idiosyncrasies) in order to read proprietary data types from the various sources just to perform simple assessments of the data before even considering integrating it properly into the context of all of the other project performance data that was being collected.

The bottom line of outlining these processes is to note that, given a combination of manual and basic automated tools, putting together and reporting on this data takes time, and time, as Mr. Benjamin Franklin noted, is money.

By itself, the critique that “looking in the rear view mirror” has no value, attributed to one particular type of information (EVM), is specious. After all, one must know where one has been and presently is before one can figure out where one needs to go and how to get there, and EVM is just one dimension of a multidimensional space.

But there is a utility value associated with the timing and locality of intelligence and that is the issue.

Contributors to time

Time when expended to produce something is a form of entropy. For purposes of this discussion at this level of existence, I am defining entropy as availability of the energy in a system to do work. The work in this case is the processing and transformation of data into information, and the further transformation of information into usable intelligence.

There are different levels and sub-levels when evaluating the data stream related to project management. These are:

a. Within the supplier/developer/manufacturer

(1) First tier personnel such as Control Account Managers, Schedulers (if separate), Systems Engineers, Financial Managers, and Procurement personnel, among others, who actually record and verify the work accomplishment;

(2) Second tier personnel that includes various levels of management, either across teams or in typical line-and-staff organizations.

b. Within customer and oversight organizations

(1) Reporting and oversight personnel tasked with evaluating the fidelity of specific business systems;

(2) Counterpart project or program officer personnel tasked with evaluating progress, risk, and any factors related to scope execution;

(3) Staff organizations designed to supplement and organize the individual project teams, providing a portfolio perspective to project management issues that may be affected by other factors outside of the individual project ecosystem;

(4) Senior management at various levels of the organization.

Given the multiplicity of data streams it appears that the issue of economies is vast until it is understood that the data that underlies the consumers of the information is highly structured and specific to each of the domains and sub-domains. Thus there are several opportunities for economies.

For example, cost performance and scheduling data have a direct correlation and are closely tied. Thus, these separate streams in the A&D industry were combined under a common schema, first using the UN/CEFACT XML, and now transitioning to a more streamlined JSON schema. Financial management has gone through a similar transition. Risk and SE data are partially incorporated into project performance schemas, but the data is also highly structured and possesses commonalities to be directly accessed using technologies that effectively leverage APIs.
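To make the idea of a common schema concrete, here is a minimal sketch of combining two separately sourced streams on a shared key. The field names, the WBS key, and the sample values are hypothetical and are not drawn from the actual industry schema.

```python
import json

# Hypothetical extracts from two separate systems of record.
cost_records = [
    {"wbs": "1.1", "budgeted_cost": 120.0, "actual_cost": 135.0},
    {"wbs": "1.2", "budgeted_cost": 80.0, "actual_cost": 70.0},
]
schedule_records = [
    {"wbs": "1.1", "baseline_finish": "2017-06-30", "forecast_finish": "2017-07-21"},
    {"wbs": "1.2", "baseline_finish": "2017-07-14", "forecast_finish": "2017-07-14"},
]


def combine_by_wbs(cost, schedule):
    """Merge cost and schedule records that share a WBS element."""
    combined = {}
    for domain, records in (("cost", cost), ("schedule", schedule)):
        for record in records:
            entry = combined.setdefault(record["wbs"], {"wbs": record["wbs"]})
            # Keep each domain's fields together under its own heading.
            entry[domain] = {k: v for k, v in record.items() if k != "wbs"}
    return [combined[wbs] for wbs in sorted(combined)]


submission = {
    "period_end": "2017-07-31",
    "elements": combine_by_wbs(cost_records, schedule_records),
}
print(json.dumps(submission, indent=2))
```

One submission built this way can serve both the cost analyst and the scheduler, which is where the economy lies.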

Back to the Future

The current state, despite advances in the data formats that allow for easy rationalization and normalization of data that breaks through proprietary barriers, is still largely based on a slightly modified model of using a combination of manual processing augmented by domain-specific analytical tools. (Actually, these are sub-domain analytical tools that support sub-optimization of data and are a barrier to the cross-domain integration necessary to create credible project intelligence.)

Thus, it is not unusual at the customer level to see project teams still accepting a combination of proprietary files, hard copy reports, and standard schema reports. Usually the data in these sources is manually entered into Excel spreadsheets or a combination of Excel and some domain-specific analytical tool (and oftentimes several sub-specialty analytical tools). After processing, the data is oftentimes exported or built in PowerPoint in the form of graphs or standard reporting formats. This is information management by Excel and PowerPoint.

In sum, in all too many cases the project management domain, in terms of data and business intelligence, continues to party like it is 1995. This condition also fosters and reinforces insular organizational domains, as if the project team is disconnected from and can possess goals antithetical and/or in opposition to the efficient operation of the larger organization.

A typical timeline goes like this:

a. Supplier provides project performance data 15-30 days after the close of a period. (Some contract clauses give more time). Let’s say the period closed at the end of July. We are now effectively in late August or early September.

b. Analysts incorporate stove-piped domain data into their Excel spreadsheets and other systems another week or so after submittal.

c. Analysts complete processing and analyzing data and submit in standard reporting formats (Excel and PowerPoint) for program review four to six weeks after incorporation of the data.

Items a through c now put a typical project office at project review for July information at the end of September or beginning of October. Furthermore, this information is focused on individual domains, and given the lack of cross-domain knowledge, can be contradictory.
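The arithmetic behind this timeline can be sketched in a few lines. The function name, its parameters, and the choice of year are assumptions for illustration; the durations come from items a through c above.

```python
from datetime import date, timedelta


def review_window(period_close, submit_days=(15, 30), incorporate_days=7,
                  analysis_weeks=(4, 6)):
    """Return the earliest and latest likely program review dates."""
    earliest = (period_close
                + timedelta(days=submit_days[0] + incorporate_days)
                + timedelta(weeks=analysis_weeks[0]))
    latest = (period_close
              + timedelta(days=submit_days[1] + incorporate_days)
              + timedelta(weeks=analysis_weeks[1]))
    return earliest, latest


period_close = date(2017, 7, 31)  # hypothetical end-of-July close
earliest, latest = review_window(period_close)
print(f"Review between {earliest} and {latest}")
print(f"Data age at review: {(earliest - period_close).days} "
      f"to {(latest - period_close).days} days")
```

With these inputs the window runs from mid-September to mid-October, which brackets the rough estimate above: the information is somewhere between seven and eleven weeks old by the time anyone can act on it.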

This system is broken.

Even suppliers who have direct access to systems of record all too often rely on domain-specific solutions to be able to derive significance from the processing of project management data. The larger suppliers seem to have recognized this problem and have been moving to address it, requiring greater integration across solutions. But the existence of a 15-30 day reconciliation period after the end of a period, formalized in contract clauses, is indicative of an opportunity for greater efficiency in that process as well.

The Way Forward

But there is another way.

The opportunities for economy in the form of improvements in time and effort are in the following areas, given the application of the right technology:

  1. In the submission of data, especially by finding data commonalities and combining previously separate domain data streams to satisfy multiple customers;
  2. In retrieving all data so that it is easily accessible to the organization at the level of detail required by the task at hand;
  3. In processing this data so that it can be converted by the analyst into usable intelligence;
  4. In properly accessing, displaying, and reporting properly integrated data across domains, as appropriate, to each level of the organization regardless of originating data stream.

Furthermore, there are opportunities to realize business value by improving these processes:

  1. By extending expertise beyond a limited number of people who tend to monopolize innovations;
  2. By improving organizational knowledge by incorporating innovation into the common system;
  3. By gaining greater insight into more reliable predictors of project performance across domains instead of the “traditional” domain-specific indices that have marginal utility;
  4. By developing a project focused organization that breaks down domain-centric thinking;
  5. By developing a culture that ties cross-domain project knowledge to larger picture metrics that will determine the health of the overarching organization.

It is interesting how often, when I visit the field, it is asserted that “the technology doesn’t matter, it’s process that matters”.

Wrong. Technology defines the art of the possible. There is no doubt that in an ideal world we would optimize our systems prior to the introduction of new technology. But that assumes that the most effective organization (MEO) is achievable without technological improvements to drive the change. If one cannot integrate all submitted cross-domain information effectively and efficiently using Excel in any scenario (after all, it’s a lot of data), then the key is the introduction of new technology that can do that very thing.

So what technologies will achieve efficiency in the use of this data? Let’s go through the usual suspects:

a. Will more effective use of PowerPoint reduce these timelines? No.

b. Will a more robust set of Excel workbooks reduce these timelines? No.

c. Will an updated form of a domain-specific analytical tool reduce these timelines? No.

d. Will a NoSQL solution reduce these timelines? Yes, given that we can afford the customization.

e. Will a COTS BI application that accepts a combination of common schemas and APIs reduce these timelines? Yes.

The technological solution must be fitted to its purpose and time. Technology matters because we cannot avoid the expenditure of time or energy (entropy) in the processing of information. We can perform these operations using a large amount of energy in the form of time and effort, or we can conserve time and effort by substituting the power of computing and information processing. While we will never get to the point where we completely eliminate entropy, our application of appropriate technology makes it seem as if effort in the form of time is significantly reduced. It’s not quite money for nothing, but it’s as close as we can come and is an obvious area of improvement that can be made for a relatively small investment.

Rear View Mirror — Correcting a Project Management Fallacy

“The past is never dead. It’s not even past.” —  William Faulkner, Requiem for a Nun

Over the years I and others have briefed project managers on project performance using KPPs, earned value management, schedule analysis, business analytics, and what we now call predictive analytics. Oftentimes, some set of figures will be critiqued as being ineffective or unhelpful; that the analytics “only look in the rear view mirror” and that they “tell me what I already know.”

In approaching this critique, it is useful to understand Faulkner’s oft-cited quote above.  When we walk down a street, let us say it is a busy city street in any community of good size, we are walking in the past.  The moment we experience something it is in the past.  If we note the present condition of our city street we will see that for every building, park, sidewalk, and individual that we pass on that sidewalk, each has a history.  These structures and the people are as much driven by their pasts as their expectations for the future.

Now let us take a snapshot of our street.  In doing so we can determine population density, ethnic demographics, property values, crime rate, and numerous other indices and parameters regarding what is there.  No doubt, if we stop here we are just “looking in the rear view mirror” and noting what we may or may not know, however certain our anecdotal filter.

Now, let us say that we have an affinity for this street and may want to live there.  We will take the present indices and parameters that we noted above, which describe our geographical environment, and trend them.  We may find that housing prices are rising or falling, that crime is rising or falling, etc.  If we delve into the street’s ownership history we may find that one individual or family possesses more than one structure, or that there is a great deal of diversity.  We may find that a Superfund site is not too far away.  We may find that economic demographics are pointing to stagnation of the local economy, or that the neighborhood is becoming gentrified.  Just time-phasing and delving into history–mapping out the trends and noting the significant historical background–provides us with enough information to tell us whether our affinity is grounded in reality or practicality.

But let us say that, despite negatives, we feel that this is the next up-and-coming neighborhood.  We would need signs to make that determination.  For example, what kinds of businesses have moved into the neighborhood and what is their number?  What demographic do they target?  There are many other questions that can be asked to see if our economic analysis is valid–and that analysis would need to be informed by risk.

The fact of the matter is that we are always living with the past: the cumulative effect of the past actions of numerous individuals, including our own, and organizations, groups of individuals, and institutions; not to mention larger economic forces well beyond our control.  Any desired change in the trajectory of the system being evaluated must identify those elements that can be impacted or influenced; an analysis of the effort that must be expended to bring about the change is also essential.

This is a scientific fact, proven countless times by physics, biology, and other disciplines.  A deterministic universe, which provides for some uncertainty at any given point at our level of existence, drives the possible within very small limits of possibility and even smaller limits of probability.  What this means in plain language is that the future is usually a function of the past.

Any one number or index, no doubt, does not necessarily tell us something important.  But it could if it is relevant, material, and prompts further inquiry essential to project performance.

For example, let us look at an integrated master schedule that underlies a typical medium-sized project.

 

We will select a couple of metrics that indicate project schedule performance.  In the case below we are looking at task hits and misses and Baseline Execution Index, a popular index that determines efficiency in meeting baseline schedule planning.

Note that the chart above plots the performance over time.  What will it take to improve our efficiency?  So as a quick logic check on realism, let’s take a look at the work to date with all of the late starts and finishes.

Our bow waves track the cumulative effort to date.  As we work to clear missed starts or missed finishes in a project we also must devote resources to the accomplishment of current work that is still in line with the baseline.  What this means is that additional resources may need to be devoted to particular areas of work accomplishment or risk handling.
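For readers who want to see how the schedule metrics discussed above are derived, here is a minimal sketch. It assumes one commonly used formulation: Baseline Execution Index as tasks completed to date divided by tasks baselined to be complete by the status date, and a hit as a task finished on or before its baseline finish. The class, the function names, and the task data are hypothetical and, like the charts, notional.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Task:
    name: str
    baseline_finish: date
    actual_finish: Optional[date] = None  # None means not yet finished


def baseline_execution_index(tasks, status_date):
    """Tasks completed to date / tasks baselined to be complete by now."""
    due = [t for t in tasks if t.baseline_finish <= status_date]
    done = [t for t in tasks
            if t.actual_finish is not None and t.actual_finish <= status_date]
    return len(done) / len(due) if due else None


def hit_rate(tasks, status_date):
    """Share of due tasks that finished on or before their baseline finish."""
    due = [t for t in tasks if t.baseline_finish <= status_date]
    hits = [t for t in due
            if t.actual_finish is not None
            and t.actual_finish <= t.baseline_finish]
    return len(hits) / len(due) if due else None


def missed_finish_backlog(tasks, status_date):
    """The bow wave: tasks that were due but are still open at status date."""
    return [t for t in tasks
            if t.baseline_finish <= status_date
            and (t.actual_finish is None or t.actual_finish > status_date)]


status_date = date(2017, 7, 31)
tasks = [
    Task("A", date(2017, 6, 15), date(2017, 6, 15)),  # hit
    Task("B", date(2017, 6, 30), date(2017, 7, 10)),  # finished late
    Task("C", date(2017, 7, 15)),                     # missed, still open
    Task("D", date(2017, 7, 31), date(2017, 7, 28)),  # hit
    Task("E", date(2017, 8, 31)),                     # not yet due
]
print("BEI:", baseline_execution_index(tasks, status_date))
print("Hit rate:", hit_rate(tasks, status_date))
print("Open backlog:", [t.name for t in missed_finish_backlog(tasks, status_date)])
```

Computing these at each status date and plotting them over time gives the trend, and the trend is what tells us how much effort clearing the backlog will take.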

This is not, of course, the limit of the analysis that should be undertaken.  The point here is that at every point in history in every system we stand at a point of the cumulative efforts, risk, failure, success, and actions of everyone who came before us.  At the microeconomic level this is also true within our project management systems.  There are also external constraints and influences that will define the framing assumptions and range of possibilities and probabilities involved in project outcomes.

The sheer magnitude of the bow waves that we face in all endeavors will often be too great to fully overcome.  As an analogy, a bow wave in complex systems is more akin to a tsunami as opposed to the tidal waves that crash along our shores.  All of the force of all of the collective actions that have preceded present time will drive our trajectory.

This is known as inertia.

Identifying and understanding the contributors to the inertia that is driving our performance is important to knowing what to do.  Thus, looking in the rear view mirror is important and not a valid argument for ignoring an inconvenient metric that may only require additional context.  Furthermore, knowing where we sit is important and not insignificant.  Knowing the factors that put us where we are–and the effort that it will take to influence our destiny–will guide what is possible and not possible in our future actions.

Note:  All charted data is notional and is not from an actual project.

Post-Blogging NDIA Blues — The Latest News (Project Management Wonkish)

The National Defense Industrial Association’s Integrated Program Management Division (NDIA IPMD) just had its quarterly meeting here in sunny Orlando where we braved the depths of sub-60 degrees F temperatures to start out each day.

For those not in the know, these meetings are an essential coming together of policy makers, subject matter experts, and private industry practitioners regarding the practical and mundane state-of-the-practice in complex project management, particularly focused on the concerns of the federal government and the Department of Defense.  The end result of these meetings is to publish white papers and recommendations regarding practice to support continuous process improvement and the practical application of project management practices–allowing for a cross-pollination of commercial and government lessons learned.  This is also the intersection where innovations among the large and small are given an equal vetting and an opportunity to introduce new concepts and solutions.  This is an idealized description, of course, and most of the petty personality conflicts, competition, and self-interest that plague any group of individuals coming together under a common set of interests also play out here.  But the days are long and the workshops generally produce good products that become the de facto standard of practice in the industry. Furthermore, the control that keeps the more ruthless personalities in check is the fact that, while it is a large market, the complex project management community tends to be a relatively small one, which reinforces professionalism.

The “blues” in this case is not so much born of frustration or disappointment but, instead, from the long and intense days that the sessions offer.  The biggest news from an IT project management and application perspective was twofold. The data stream used by the industry in sharing data in an open systems manner will be simplified.  The other was the announcement that the technology used to communicate will move from XML to JSON.

Human readable formatting to Data-focused formatting.  Under Kendall’s Better Buying Power 3.0 the goal of the Department of Defense (DoD) has been to incorporate better practices from private industry where they can be applied.  I don’t see initiatives for greater efficiency and reduction of duplication going away in the new Administration, regardless of what a new initiative is called.

In case this is news to you, the federal government buys a lot of materials and end items–billions of dollars worth.  Accountability must be put in place to ensure that the money is properly spent to acquire the things being purchased.  Where technology is pushed and where there are no commercial equivalents that can be bought off the shelf, as in the systems purchased by the Department of Defense, there are measures of progress and performance (given that the contract is under a specification) that are submitted to the oversight agency in DoD.  This is a lot of data and to be brutally frank the method and format of delivery has been somewhat chaotic, inefficient, and duplicative.  The Department moved to address this by a somewhat modest requirement of open systems submission of an application-neutral XML file under the standards established by the UN/CEFACT XML organization.  This was called the Integrated Program Management Report (IPMR).  This move garnered some improvement where it has been applied, but contracts are long-term, so incorporating improvements through new contractual requirements tends to take time.  Plus, there is always resistance to change.  The Department is moving to accelerate addressing these inefficiencies in their data streams by eliminating the unnecessary overhead associated with specifications of formatting data for paper forms and dealing with data as, well, data.  Great idea and bravo!  The rub here is that in making the change, the Department has proposed dropping XML as the technology used to transfer data and moving to JSON.

XML to JSON. Before I spark another techie argument about the relative merits of each, there are some basics to understand here.  First, XML is a language, while JSON is simply a data exchange format.  This means that XML is specifically designed to deal with hierarchical and structured data that can be queried and where validation and fidelity checks within the data are inherent in the technology. Furthermore, XML is known to scale while maintaining the integrity of the data, which is intended for use in relational databases.  Finally, XML is hard to break.  It is meant for editing and will maintain its structure and integrity afterward.

The counter argument encountered is that JSON is new! and uses fewer characters! (which usually turns out to be inconsequential), and people are talking about it for Big Data and NoSQL! (but this happened after the fact and the reason for shoehorning it this way is discussed below).

So does it matter?  Yes and no.  As a supplier specializing in delivering solutions that normalize and rationalize data across proprietary file structures and leverage database capabilities, I don’t care.  I can adapt quickly and will have a proof-of-concept solution out within 30 days of receiving the schema.

The risk here, which applies to DoD and the industry, is that the decision to go to JSON is made only because it is the shiny new thing used by gamers and social networking developers.  There has also been a move to adapt to other uses because of the history of significant security risks that had been found in Java, so much so that an entire Wikipedia page is devoted to them.  Oracle just killed off Java applets, though Java hangs on.  JSON, of course, isn’t Java, but it was designed from birth as JavaScript Object Notation (hence the acronym JSON), with the purpose of handling relatively small bits of data across web servers in a number of proprietary settings.

To address JSON deficiencies relative to XML, a number of tools have been and are being developed to replicate the fidelity and reliability found in XML.  Whether this is sufficient to be effective against a structured LANGUAGE is to be seen.  Much of the overhead that techies complain about in XML is due to the native functionality related to the power it brings to the table.  No doubt, a bicycle is simpler than a Formula One racer–and this is an apt comparison.  Claiming “simpler” doesn’t pass the “So What?” test knowing the business processes involved.  The technology needs to be fit to the solution.  The purpose of data transmission using APIs is not only to make it easy to produce but for it to–you know–achieve the goals of normalization and rationalization so that it can be used on the receiving end which is where the consumer (which we usually consider to be the customer) sits.

At the end of the day the ability to scale and handle hierarchical, structured data will rely on the quality and strength of the schema and the tools that are published to enforce its fidelity and compliance.  Otherwise consuming organizations will be receiving a dozen different proprietary JSON files, and that does not address the present chaos but simply adds to it.  These issues were aired out during the meeting and it seems that everyone is aware of the risks and that they can be addressed.  Furthermore, as the schema is socialized across solutions providers, it will be apparent early if the technology will be able to handle the project performance data resulting from the development of a high performance aircraft or a U.S. Navy destroyer.
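To show what enforcing fidelity means in practice, here is a deliberately minimal sketch of checking a JSON submission against an agreed set of fields before it is accepted. The schema, field names, and function are hypothetical and hand-rolled using only the Python standard library; a real implementation would presumably rely on the published schema and dedicated validation tooling.

```python
import json

# Hypothetical schema: field name -> required Python type(s).
SCHEMA = {
    "wbs": str,
    "period_end": str,
    "budgeted_cost": (int, float),
    "actual_cost": (int, float),
}


def validation_errors(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[field], bool) or not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}")
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


good = json.loads('{"wbs": "1.1", "period_end": "2017-07-31", '
                  '"budgeted_cost": 120, "actual_cost": 135.5}')
bad = json.loads('{"wbs": 11, "period_end": "2017-07-31", '
                 '"actual_cost": "135.5", "vendor_note": "x"}')

print(validation_errors(good))
print(validation_errors(bad))
```

Both strings are perfectly valid JSON. Only the check against the schema distinguishes the usable record from the proprietary variant, which is exactly why the strength of the schema and its tooling matters more than the choice of syntax.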

Something New (Again)– Top Project Management Trends 2017

Atif Qureshi at Tasque, whom I learned of via Dave Gordon’s blog, went out to LinkedIn’s Project Management Community to ask for the latest trends in project management.  You can find the raw responses to his inquiry at his blog here.  What is interesting is that some of these latest trends are much like the old trends which, given continuity, makes sense.  But it is instructive to summarize the ones that came up most often.  Note that while Mr. Qureshi was looking for ten trends, and taken together he definitely lists more than ten, there is a lot of overlap.  In total the major issues seem to be the five areas listed below.

a.  Agile, its hybrids, and its practical application.

It should not surprise anyone that the latest buzzword is Agile.  But what exactly is it in its present incarnation?  There is a great deal of rising criticism, much of it valid, that it is a way for developers and software PMs to avoid accountability. Anyone reading Glen Alleman’s Herding Cats blog is aware of the issues regarding #NoEstimates advocates.  As a result, there are a number of hybrid implementations of Agile that have Agile purists howling and non-purists adapting as they always do.  From my observations, however, there is an Ur-Agile out there that is common to all good implementations, and I wrote about it previously in this blog back in 2015.  Given the time, I think it useful to repeat it here.

The best articulation of Agile that I have read recently comes from Neil Killick, with whom I have expressed some disagreement on the #NoEstimates debate and the more cultish aspects of Agile in past posts, but who published an excellent post back in July (2015) entitled “12 questions to find out: Are you doing Agile Software Development?”

Here are Neil’s questions:

  1. Do you want to do Agile Software Development? Yes – go to 2. No – GOODBYE.
  2. Is your team regularly reflecting on how to improve? Yes – go to 3. No – regularly meet with your team to reflect on how to improve, go to 2.
  3. Can you deliver shippable software frequently, at least every 2 weeks? Yes – go to 4. No – remove impediments to delivering a shippable increment every 2 weeks, go to 3.
  4. Do you work daily with your customer? Yes – go to 5. No – start working daily with your customer, go to 4.
  5. Do you consistently satisfy your customer? Yes – go to 6. No – find out why your customer isn’t happy, fix it, go to 5.
  6. Do you feel motivated? Yes – go to 7. No – work for someone who trusts and supports you, go to 2.
  7. Do you talk with your team and stakeholders every day? Yes – go to 8. No – start talking with your team and stakeholders every day, go to 7.
  8. Do you primarily measure progress with working software? Yes – go to 9. No – start measuring progress with working software, go to 8.
  9. Can you maintain pace of development indefinitely? Yes – go to 10. No – take on fewer things in next iteration, go to 9.
  10. Are you paying continuous attention to technical excellence and good design? Yes – go to 11. No – start paying continuous attention to technical excellence and good design, go to 10.
  11. Are you keeping things simple and maximising the amount of work not done? Yes – go to 12. No – start keeping things simple and writing as little code as possible to satisfy the customer, go to 11.
  12. Is your team self-organising? Yes – YOU’RE DOING AGILE SOFTWARE DEVELOPMENT!! No – don’t assign tasks to people and let the team figure out together how best to satisfy the customer, go to 12.

Note that even in software development based on Agile you are still “provid(ing) value by independently developing IP based on customer requirements.”  Only you are doing it faster and more effectively.

With the possible exception of the “self-organizing” meme, I find that items 1 through 11 are valid ways of identifying Agile.  Note, however, that the list says nothing about establishing closed-loop analysis of progress, about estimates, or about the need to monitor progress, especially on complex projects.  As a matter of fact one of the biggest impediments noted elsewhere in industry is the inability of Agile to scale.  This limitation exists in its most simplistic form because Agile is fine in the development of well-defined limited COTS applications and smartphone applications.  It doesn’t work so well when one is pushing technology while developing software, especially for a complex project involving hundreds of stakeholders.  One other note–the unmentioned emphasis in Agile is technical performance measurement, since progress is based on satisfying customer requirements.  TPM, when placed in the context of a world of limited resources, is the best measure of all.

b.  The integration of new technology into PM and how to upload the existing PM corporate knowledge into that technology.

This is two sides of the same coin.  There is always  debate about the introduction of new technologies within an organization and this debate places in stark contrast the differences between risk aversion and risk management.

Project managers, especially in the complex project management environment of aerospace & defense, tend, in general, to be a hardy lot.  Consisting mostly of engineers, they love to push the envelope on technology development.  But there is also a stripe of engineers among them that do not apply this same approach of measured risk to their project management and business analysis system.  When it comes to tracking progress, resource management, programmatic risk, and accountability they frequently enter the risk aversion mode–believing that the fewer eyes on what they do the more leeway they have in achieving the technical milestones.  No doubt this is true in a world of unlimited time and resources, but that is not the world in which we live.

Aside from sub-optimized self-interest, the seeds of risk aversion come from the fact that many of the disciplines developed around performance management originated in the financial management community, and many organizations still come at project management efforts from the perspective of the CFO organization.  Such rice bowl mentality, however, works against both the project and the organization.

Much has been made of the wall of honor for those CIA officers that have given their lives for their country, which lies to the right of the Langley headquarters entrance.  What has not gotten as much publicity is the verse inscribed on the wall to the left:

“And ye shall know the truth and the truth shall make you free.”

      John VIII-XXXII

In many ways those of us in the project management community apply this creed to the best of our ability to our day-to-day jobs, and it lies at the basis of all of the management improvement from Deming’s concept of continuous process improvement, through the application of Six Sigma and other management improvement methods.  What is not part of this concept is that one will apply improvement only when a customer demands it, though they have asked politely for some time.  The more information we have about what is happening in our systems, the better the project manager and the project team are armed to apply the expertise which qualified the individuals for their jobs to begin with.

When it comes to continual process improvement one does not need to wait to apply those technologies that will improve project management systems.  As a senior manager (and well-respected engineer) told me when I worked in the Navy: “if my program managers are doing their job virtually every element should be in the yellow, for only then do I know that they are managing risk and pushing the technology.”

But there are some practical issues that all managers must consider when managing the risks in introducing new technology and determining how to bring that technology into existing business systems without completely disrupting the organization.  This takes good project management practices that, for information systems, include good initial systems analysis, identification of those small portions of the organization ripe for initial entry in piloting, and a plan of data normalization and rationalization so that corporate knowledge is not lost.  Adopting systems that support more open systems that militate against proprietary barriers also helps.

c.  The intersection of project management and business analysis and its effects.

As data becomes more transparent through methods of normalization and rationalization, and the focus shifts from “tools” to the knowledge that can be derived from data, the clear separation that delineated project management from business analysis in line-and-staff organizations becomes further blurred.  Even within the project management discipline, the separation of schedule analysts from cost analysts from financial analysts is becoming an impediment to fully exploiting the advantages of looking at all of the data that is captured and that affects project performance.

d.  The manner of handling Big Data, business intelligence, and analytics that result.

Software technologies are rapidly developing that break the barriers of self-contained applications: hard-coded applications, written in high-level languages, that perform one or two focused operations, or a highly restricted group of operations, in support of a single business process or a limited set of them.  These new technologies, as stated in the previous section, allow users to focus on access to data, making the interface between the user and the application highly adaptable and customizable.  As these technologies are deployed against larger datasets that allow for integration of data across traditional line-and-staff organizations, they will provide insight that will garner businesses competitive advantages and productivity gains over their contemporaries.  Because of these technologies, highly labor-intensive data mining and data engineering projects that were thought to be necessary to access Big Data will find themselves displaced as their cost and lack of agility are exposed.  Internal or contracted-out custom software development devoted along these same lines will also be displaced, just as COTS has displaced the high overhead associated with these efforts in other areas.  This is due to the fact that hardware and process developments are constantly shifting the definition of “Big Data” to larger and larger datasets, to the point where the term will soon have no practical meaning.

e.  The role of the SME given all of the above.

The result of these trends in technology will be to put the subject matter expert back into the driver’s seat.  Given adaptive technology and data, and a redefinition of the analyst’s role to a more expansive one, we will find that the ability to meet the needs of functionality and the user experience is almost immediate.  Thus, when it comes to business and project management systems, these developments make real the characteristics of Agile that I outlined above, but they also reveal the weakness of its applicability to more complex and technical projects.  It is technology that will reduce the risk associated with contract negotiation, processes, documentation, and planning.  Walking away from these necessary components of project management obfuscates and avoids the hard facts that oftentimes must be addressed.

One final item that Mr. Qureshi mentions in a follow-up post, and which I have seen elsewhere in similar forums, concerns operational security.  In deploying new technologies a gatekeeper must be aware of whether that technology will open the organization’s corporate knowledge to compromise.  Given the greater and more integrated information and knowledge garnered by new technology, it is incumbent on us as good managers to ensure that these improvements do not translate into undermining the organization.

Takin’ Care of Business — Information Economics in Project Management

Neoclassical economics abhors inefficiency, and yet inefficiencies exist.  Among the core issues that create inefficiencies is the asymmetrical nature of information.  Asymmetry is an accepted cornerstone of economics that leads to inefficiency.  We can see in our daily lives and employment the effects of one party in a transaction having more information than the other:  knowing whether the used car you are buying is a lemon, measuring risk in the purchase of an investment and, apropos to this post, identifying how our information systems allow us to manage complex projects.

Regarding this last proposition we can peel this onion down through its various levels: the asymmetry in the information between the customer and the supplier, the asymmetry in information between the board and stockholders, the asymmetry in information between management and labor, the asymmetry in information between individual SMEs and the project team, etc.–it’s elephants all the way down.

This asymmetry, which drives inefficiency, is exacerbated in markets that are dominated by monopoly, monopsony, and oligopoly power.  When informed by the work of Hart and Holmström regarding contract theory, which recently garnered the Nobel in economics, we have a basis for understanding the internal dynamics of projects in seeking efficiency and productivity.  What is interesting about contract theory is that it incorporates the concept of asymmetrical information (labeled as adverse selection), but expands this concept in human transactions at the microeconomic level to include considerations of moral hazard and the utility of signalling.

The state of asymmetry and inefficiency is exacerbated by the patchwork quilt of “tools”–software applications that are designed to address only a very restricted portion of the total contract and project management system–that are currently deployed as the state of the art.  These tend to require the insertion of a new class of SME to manage data by essentially reversing the efficiencies in automation, involving direct effort to reconcile differences in data from differing tools. This is a sub-optimized system.  It discourages optimization of information across the project, reinforces asymmetry, and is economically and practically unsustainable.

The key in all of this is ensuring that sub-optimal behavior is discouraged, and that those activities and behaviors that support more transparent sharing of information and, therefore, contribute to greater efficiency and productivity are rewarded.  It should be noted that more transparent organizations tend to be more sustainable and healthier, and to have a higher degree of employee commitment.

The path forward where there is monopsony power, that is, where there is a dominant buyer, is to impose the conditions for normative behavior that would otherwise be leveraged through practice in a more open market.  For open markets not dominated by one player as either buyer or seller, the key is instituting practices that reward behavior that reduces the effects of asymmetrical information, and contracting disincentives in business transactions on the open market.

In the information management market as a whole the trends that are working against asymmetry and inefficiency involve the reduction of data streams, the construction of cross-domain data repositories (or reservoirs) that allow for the satisfaction of multiple business stakeholders, and the introduction of systems that are more open and adaptable to the needs of the project system in lieu of a limited portion of the project team.  These solutions exist, yet their adoption is hindered by the long-term infrastructure that is put in place in complex project management.  This infrastructure is supported by incumbents that reinforce the status quo.  Because of this, the interval between the time a market innovation is introduced and the time it is adopted in project-focused organizations usually spans several years.

This argues for establishing an environment that is more nimble.  This involves the adoption of a series of approaches to achieve the goals of broader information symmetry and efficiency in the project organization.  These are:

a.  Institute contractual relationships, both internally and externally, that encourage project personnel to identify risk.  This would include incentives to kill efforts that have breached their framing assumptions, or to consolidate progress that the project has achieved to date (sending it as it is to production) while killing further effort that would breach framing assumptions.

b. Institute policy and incentives on the data supply end to reduce the number of data streams.  Toward this end both acquisition and contracting practices should move to discourage proprietary data dead ends by encouraging normalized and rationalized data schemas that describe the environment using a common or, at least, compatible lexicon.  This reduces the inefficiency derived from opaqueness as it relates to software and data.

c.  Institute policy and incentives on the data consumer end to leverage the economies derived from the increased computing power from Moore’s Law by scaling data to construct interrelated datasets across multiple domains that will provide a more cohesive and expansive view of project performance.  This involves the warehousing of data into a common repository or reduced set of repositories.  The goal is to satisfy multiple project stakeholders from multiple domains using as few streams as necessary and encourage KDD (Knowledge Discovery in Databases).  This reduces the inefficiency derived from data opaqueness, but also from the traditional line-and-staff organization that has tended to stovepipe expertise and information.

d.  Institute acquisition and market incentives that encourage software manufacturers to engage in positive signalling behavior that reduces the opaqueness of the solutions being offered to the marketplace.
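Items b and c above can be illustrated with a minimal sketch, assuming two hypothetical tool exports whose field names (`TaskWBS`, `wbs_code`, and so on) are invented for this example.  It maps each proprietary schema onto a common lexicon and merges the streams into one repository keyed by WBS element.  It is an illustration of the idea and not a description of any particular product.

```python
# Hypothetical exports from two separate tools; all field names are invented.
schedule_export = [
    {"TaskWBS": "1.1.1", "FinishSlipDays": 12},
    {"TaskWBS": "1.1.2", "FinishSlipDays": 0},
]
cost_export = [
    {"wbs_code": "1.1.1", "budget": 100.0, "actual": 130.0},
    {"wbs_code": "1.1.2", "budget": 80.0, "actual": 78.0},
]

# Assumed mapping from each tool's field names to a common lexicon.
FIELD_MAPS = {
    "schedule": {"TaskWBS": "wbs_id", "FinishSlipDays": "slip_days"},
    "cost": {"wbs_code": "wbs_id", "budget": "budget", "actual": "actual_cost"},
}


def normalize(record, source):
    """Rename a record's fields to the common lexicon, dropping unmapped ones."""
    mapping = FIELD_MAPS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def build_repository(streams):
    """Merge normalized records from every stream into one row per WBS element."""
    repository = {}
    for source, records in streams.items():
        for record in records:
            row = normalize(record, source)
            repository.setdefault(row["wbs_id"], {}).update(row)
    return repository


repository = build_repository({"schedule": schedule_export, "cost": cost_export})
for wbs_id, row in sorted(repository.items()):
    print(wbs_id, row)
```

Once the streams share a key and a vocabulary, schedule and cost consumers read from the same row instead of reconciling two exports by hand.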

In summary, the current state of project data is one that is characterized by “best-of-breed” patchwork quilt solutions that tend to increase direct labor, reduce and limit productivity, and drive up cost.  At the end of the day the ability of the project to handle risk and adapt to technical challenges rests on the reliability and efficiency of its information systems.  A patchwork system fails to meet the needs of the organization as a whole and is not “takin’ care of business.”

River Deep, Mountain High — A Matrix of Project Data

Been attending conferences and meetings of late and came upon a discussion of the means of reducing data streams while leveraging Moore’s Law to provide more, better data.  During a discussion over lunch, colleagues asked whether more detailed data would provide greater insight.  This led to a discussion of the qualitative differences in data depending on what information is being sought.  My response to the request for more detailed data was: “well there has to be a pony in there somewhere.”  This was greeted by laughter, but then I finished the point: more detailed data doesn’t necessarily yield greater insight (though it could, and only actually looking at it will tell you that, particularly in applying the principle of KDD).  But more detailed data that is based on a hierarchical structure will, at the least, provide greater reliability and pinpoint areas of intersection to detect areas of risk manifestation that are otherwise averaged out, and therefore hidden, at the summary levels.
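The averaging-out effect can be shown with a small sketch.  The figures and identifiers below are hypothetical, and the cost performance index (earned value divided by actual cost) is used only as a convenient stand-in for whatever measure is being rolled up.

```python
from collections import defaultdict

# Hypothetical rows: (control_account, work_package, earned_value, actual_cost).
work_packages = [
    ("CA-1", "WP-1.1", 100.0, 80.0),
    ("CA-1", "WP-1.2", 100.0, 90.0),
    ("CA-1", "WP-1.3", 100.0, 130.0),
]


def cpi(earned_value, actual_cost):
    """Cost performance index: earned value divided by actual cost."""
    return earned_value / actual_cost


def roll_up(packages):
    """Sum earned value and actual cost per control account, then compute CPI."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for account, _, earned_value, actual_cost in packages:
        totals[account][0] += earned_value
        totals[account][1] += actual_cost
    return {account: cpi(ev, ac) for account, (ev, ac) in totals.items()}


def flag_packages(packages, threshold=0.9):
    """Return the work packages whose own CPI falls below the assumed threshold."""
    return [wp for _, wp, ev, ac in packages if cpi(ev, ac) < threshold]


summary = roll_up(work_packages)
flagged = flag_packages(work_packages)
print(summary)  # the control account as a whole looks healthy
print(flagged)  # one work package is well over cost
```

With these numbers the control account sums to 300 earned against 300 spent, so the summary shows nothing wrong, while the detail shows one work package running 30 percent over.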

Not to steal the thunder of new studies that are due out in the area of data later this spring but, for example, I am aware, after having actually achieved lowest-level integration for extremely complex projects through my day job, that there is little (though not zero) gain in predictive power between, say, the control account level of a WBS and the work package level.  Going further down to element of cost may bring to mind the words of the character in the movie Still Alice: “You may say that this falls into the great academic tradition of knowing more and more about less and less until we know everything about nothing.”  But while that may be true for project management, that isn’t necessarily so when collecting parametrics and auditing the validity of financial information.

Rolling up data from individually detailed elements of a hierarchy is the proper way to ensure credibility.  Since we are at the point where a TB of data has virtually the same marginal cost as a GB of data (which is vanishingly small to begin with), then the more the merrier in eliminating the abuse associated with human-readable summary reporting.  Furthermore, I have long proposed, through this blog and elsewhere, that the emphasis should shift from people, process, and tools to people, process, and data.  This rightly establishes the feedback loop necessary for proper development and project management.  More importantly, the same data available through project management processes satisfy the different purposes of domains within the organization and of multiple external stakeholders.

This then leads us to the concept of integrated project management (IPM), which has become little more than a buzz-phrase, and receives a lot of hand waves, mostly by technology companies that want to push their tools–which are quickly becoming obsolete–while appearing forward leaning.  This tool-centric approach is nothing more than marketing–focusing on what the software manufacturer would have us believe is important based on the functionality baked into their applications.  One can see where this could be a successful approach, given the emphasis on tools in the PM triad.  But, of course, it is self-limiting in a self-interested sort of way.  The emphasis needs to be on the qualitative and informative attributes of available data–not of tool functionality–that meet the requirements of different data consumers while minimizing, to the extent possible, the number of data streams.

Thus, there are at least two main aspects of data that are important in understanding the utility of project management: early warning/predictiveness and credibility/traceability/fidelity.  The chart attached below gives a rough back-of-the-envelope outline of this point, with some proposed elements, though this list is not intended to be exhaustive.

PM Data Matrix

In order to capture data across the essential elements of project management, our data must demonstrate both a breadth and depth that allows for the discovery of intersections of the different elements.  The weakness in the two-dimensional model above is that it treats each indicator by itself.  But, when we combine, for example, IMS consecutive slips with other elements listed, the informational power of the data becomes many times greater.  This tells us that the weakness in our present systems is that we treat the data as a continuity between autonomous elements.  But we know that the project consists of discontinuities where the next level of achievement/progress is a function of risk.  Thus, when we talk about IPM, the secret is in focusing on data that informs us what our systems are doing.  This will require more sophisticated types of modeling.
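As a rough sketch of what looking for intersections might mean in practice, the example below checks several indicators per WBS element and reports only the elements where two or more trip at once.  Apart from IMS consecutive slips, the indicator names, thresholds, and readings are all hypothetical and are not taken from the matrix.

```python
# Hypothetical indicator readings per WBS element.
readings = {
    "1.1": {"ims_consecutive_slips": 3, "cpi": 0.82, "staffing_gap": 0.05},
    "1.2": {"ims_consecutive_slips": 3, "cpi": 1.02, "staffing_gap": 0.00},
    "1.3": {"ims_consecutive_slips": 0, "cpi": 0.97, "staffing_gap": 0.02},
}

# Assumed trip conditions; the thresholds are illustrative only.
TESTS = {
    "ims_consecutive_slips": lambda value: value >= 2,
    "cpi": lambda value: value < 0.9,
    "staffing_gap": lambda value: value > 0.10,
}


def tripped(element):
    """Return the sorted names of the indicators that trip for one element."""
    return sorted(name for name, test in TESTS.items() if test(element[name]))


def intersections(all_readings, minimum=2):
    """Keep only the elements where at least `minimum` indicators trip together."""
    result = {}
    for wbs_id, element in all_readings.items():
        hits = tripped(element)
        if len(hits) >= minimum:
            result[wbs_id] = hits
    return result


print(intersections(readings))
```

Taken one at a time, the slip indicator flags elements 1.1 and 1.2 equally; only the combination with cost performance singles out 1.1.  A simple count of coinciding indicators is the crudest possible model, and the more sophisticated modeling mentioned above would go well beyond it.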