Shake it Out – Embracing the Future of Program Management – Part Two: Private Industry Program and Project Management in Aerospace, Space, and Defense

In my previous post, I focused on Program and Project Management in the Public Interest, and the characteristics of its environment, especially from the perspective of the government program and acquisition disciplines. The purpose of this exploration is to lay the groundwork for understanding the future of program management—and the resulting technological and organizational challenges that are required to support that change.

The next part of this exploration is to define the motivations, characteristics, and disciplines of private industry equivalencies. Here there are commonalities, but also significant differences, that relate to the relationship and interplay between public investment, policy and acquisition, and private business interests.

Consistent with our initial focus on public interest project and program management (PPM), the vertical with the greatest relationship to it is found in the very specialized fields of aerospace, space, and defense. I will therefore first begin with this industry vertical.

Private Industry Program and Project Management

Aerospace, Space & Defense (ASD). It is here that we find commercial practice that comes closest to the types of structure, rules, and disciplines found in public interest PPM. As a result, it is also here where we find the most interesting areas of conflict and conciliation between private motivations and public needs and duties. Particularly since most of the business activity in this vertical is generated by and dependent on federal government acquisition strategy and policy.

On the defense side, the antecedent policy documents guiding acquisition and other measures are the National Security Strategy (NSS), which is produced by the President’s staff, the National Defense Strategy (NDS), which further translates and refines the NSS, and the National Military Strategy (NMS), which is delivered to the Secretary of Defense by the Joint Chiefs of Staff of the various military services, which is designed to provide unfettered military advise to the Secretary of Defense.

Note that the U.S. Department of Defense (DoD) and the related agencies, including the intelligence agencies, operate under a strict chain of command that ensures civilian control under the National Military Establishment. Aside from these structures, the documents and resulting legislation from DoD actions also impact such civilian agencies as the Department of Energy (DOE), Department of Homeland Security (DHS), the National Aeronautics and Space Administration (NASA), and the Federal Aviation Administration (FAA), among others.

The countervailing power and checks-and-balances on this Executive Branch power lies with the appropriation and oversight powers of the Congress. Until the various policies are funded and authorized by Congress, the general tenor of military, intelligence, and other operations have tangential, though not insignificant effects, on the private economy. Still, in terms of affecting how programs and projects are monitored, it is within the appropriation and authorization bills that we find the locus of power. As one of my program managers reminded me during my first round through the budget hearing process, “everyone talks, but money walks.”

On the Aerospace side, there are two main markets. One is related to commercial aircraft, parts, and engines sold to the various world airlines. The other is related to government’s role in non-defense research and development, as well as activities related to private-public partnerships, such as those related to space exploration. The individual civilian departments of government also publish their own strategic plans based on their roles, from which acquisition strategy follows. These long terms strategic plans, usually revised at least every five years, are then further refined into strategic implementation plans by various labs and directorates.

The suppliers and developers of the products and services for government, which represents the bulk of ASD, face many of the same challenges delineated in surveying their government counterparts. The difference, of course, is that these are private entities where the obligations and resulting mores are derived from business practice and contractual obligations and specifications.

This is not to imply a lack of commitment or dedication on the part of private entities. But it is an important distinction, particularly since financial incentives and self-interest are paramount considerations. A contract negotiator, for example, in order to be effective, must understand the underlying pressures and relative position of each of the competitors in the market being addressed. This individual should also be familiar with the particular core technical competencies of the competitors as well as their own strategic plans, the financial positions and goals that they share with their shareholders in the case of publicly traded corporations, and whether actual competition exists.

The Structure of the Market. Given the mergers and acquisitions of the last 30 years, along with the consolidation promoted by the Department of Defense as unofficial policy after the fall of the Berlin Wall and the lapse of antitrust enforcement, the portion of ASD and Space that rely on direct government funding, even those that participate in public-private ventures where risk sharing is involved, operate in a monopsony—the condition in which a single buyer—the U.S. government—substantially controls the market as the main purchaser of supplies and services. This monopsony market is then served by a supplier market that is largely an oligopoly—where there are few suppliers and limited competition—and where, in some technical domains, some suppliers exert monopoly power.

Acknowledging this condition informs us regarding the operational motivators of this market segment in relation to culture, practice, and the disciplines and professions employed.

In the first case, given the position of the U.S. government, the normal pressures of market competition and market incentives do not apply to the few competitors participating in the market. As a result, only the main buyer has the power to recreate, in an artificial manner, an environment which replicate the market incentives and penalties normally employed in a normative, highly diverse and competitive market.

Along these lines, for market incentives, the government can, and often does, act as the angel investor, given the rigorous need for R&D in such efforts. It can also lower the barriers to participation in order to encourage more competition and innovation. This can be deployed across the entire range of limited competitors, or it can be expansive in its approach to invite new participants.

Market penalties that are recreated in this environment usually target what economists call “rent-seeking behavior.” This is a situation where there may be incumbents that seek to increase their own wealth without creating new benefits, innovation, or providing additional wealth to society. Lobbying, glad-handing, cronyism, and other methods are employed and, oftentimes, rampant under monosponistic systems. Revolving-door practices, in which the former government official responsible for oversight obtains employment in the same industry and, oftentimes, with the same company, is too often seen in these cases.

Where there are few competitors, market participants will often play follow-the-leader and align themselves to dominate particular segments of the market in appealing to the government or elected representatives for business. This may mean that, in many cases, they team with their ostensible competitors to provide a diverse set of expertise from the various areas of specialty. As with any business, profitability is of paramount importance, for without profit there can be no business operations. It is here: the maximization of profit and shareholder value, that is the locus of power in understanding the motivation of these and most businesses.

This is not a value judgment. As faulty and risky as this system may be, no better business structure has been found to provide value to the public through incentives for productive work, innovation, the satisfaction of demand, and efficiency. The challenge, apart from what political leadership decides to do regarding the rules of the market, is to make those rules that do exist work in the public interest through fair, ethical, and open contracting practices.

To do this successfully requires contracting and negotiating expertise. To many executives and non-contracting personnel, negotiations appear to be a zero-sum game. No doubt, popular culture, mass media and movies, and self-promoting business people help mold this perception. Those from the legal profession, in particular, deal with a negotiation as an extension of the adversarial processes through which they usually operate. This is understandable given their education, and usually disastrous.

As an attorney friend of mine once observed: “My job, if I have done it right, is to ensure that everyone walking out of the room is in some way unhappy. Your job, in contrast, is to ensure that everyone walking out of it is happy.” While a generalization—and told tongue-in-cheek—it highlights the core difference in approach between these competing perspectives.

A good negotiator has learned that, given two motivated sides coming together to form a contract, that there is an area of intersection where both parties will view the deal being struck as meeting their goals, and as such, fair and reasonable. It is the job of the negotiator to find that area of mutual fairness, while also ensuring that the contract is clear and free of ambiguity, and that the structure of the instrument—price and/or cost, delivery, technical specification, statement of work or performance specification, key performance parameters, measures of performance, measures of effectiveness, management, sufficiency of capability (responsibility), and expertise—sets up the parties involved for success. A bad contract can no more be made good than the poorly prepared and compacted soil and foundation of a house be made good after the building goes up.

The purpose of a good contract is to avoid litigation, not to increase the likelihood of it happening. Furthermore, it serves the interests of neither side to obtain a product or service at a price, or under such onerous conditions, where the enterprise fails to survive. Alternatively, it does a supplier little good to obtain a contract that provides the customer with little financial flexibility, that fails to fully deliver on its commitments, that adversely affects its reputation, or that is perceived in a negative light by the public.

Effective negotiators on both sides of the table are aware of these risks and hazards, and so each is responsible for the final result, though often the power dynamic between the parties may be asymmetrical, depending on the specific situation. It is one of the few cases in which parties having both mutual and competing interests are brought together where each side is responsible for ensuring that the other does not hazard their organization. It is in this way that a contract—specifically one that consists of a long-term R&D cost-plus contract—is much like a partnership. Both parties must act in good faith to ensure the success of the project—all other considerations aside—once the contract is signed.

In this way, the manner of negotiating and executing contracts is very much a microcosm of civil society as a whole, for good or for bad, depending on the practices employed.

Given that the structure of aerospace, space, and defense consists of one dominant buyer with few major suppliers, the disciplines required relate to the details of the contract and its resulting requirements that establish the rules of governance.

As I outlined in my previous post, the characteristics of program and project management in the public interest, which are the products of contract management, are focused on successfully developing and obtaining a product to meet particular goals of the public under law, practice, and other delineated specific characteristics.

As a result, the skill-sets that are of paramount importance to business in this market prior to contract award are cost estimating, applied engineering expertise including systems engineering, financial management, contract negotiation, and law. The remainder of disciplines regarding project and program management expertise follow based on what has been established in the contract and the amount of leeway the contracting instrument provides in terms of risk management, cost recovery, and profit maximization, but the main difference is that this approach to the project leans more toward contract management.

Another consideration in which domains are brought to bear relates to position of the business in terms of market share and level of dominance in a particular segment of the market. For example, a company may decide to allow a lower than desired target profit. In the most extreme cases, the company may allow the contract to become a loss leader in order to continue to dominate a core competency or to prevent new entries into that portion of the market.

On the other side of the table, government negotiators are prohibited by the Federal Acquisition Regulation (the FAR) from allowing companies to “buy-in” by proposing an obviously lowball offer, but some do in any event, whether it is due to lack of expertise or bowing to the exigencies of price or cost. This last condition, combined with rent-seeking behavior mentioned earlier, where they occur, will distort and undermine the practices and indicators needed for effective project and program management. In these cases, the dysfunctional result is to create incentives to maximize revenue and scope through change orders, contracting language ambiguity, and price inelasticity. This also creates an environment that is resistant to innovation and rewards inefficiency.

But apart from these exceptions, the contract and its provisions, requirements, and type are what determine the structure of the eventual project or program management team. Unlike the commercial markets in which there are many competitors, the government through negotiation will determine the manner of burdening rate structures and allowable profit or margin. This last figure is determined by the contract type and the perceived risk of the contract goals to the contractor. The higher the risk, the higher the allowed margin or profit. The reverse applies as well.

Given this basis, the interplay between private entities and the public acquisition organizations, including the policy-setting staffs, are also of primary concern. Decision-makers, influences, and subject-matter experts from these entities participate together in what are ostensibly professional organizations, such as the National Defense Industrial Association (NDIA), the Project Management Institute (PMI), the College of Scheduling (CoS), the College of Performance Management (CPM), the International Council on Systems Engineering (INCOSE), the National Contract Management Association (NCMA), and the International Cost Estimating and Analysis Association (ICEAA), among the most frequently attended by these groups. Corresponding and associated private and professional groups are the Project Control Academy and the Association for Computing Machinery (ACM).

This list is by no means exhaustive, but from the perspective of suppliers to public agencies, NDIA, PMI, CoS, and CPM are of particular interest because much of the business of influencing policy and the details of its application are accomplished here. In this manner, the interests of the participants from the corporate side of the equation relate to those areas always of concern: business certainty, minimization of oversight, market and government influence. The market for several years now has been reactive, not proactive.

There is no doubt that business organizations from local Chambers of Commerce to specialized trade groups that bring with them the advantages of finding mutual interests and synergy. All also come with the ills and dysfunction, to varying degrees, borne from self-promotion, glad-handing, back-scratching, and ossification.

In groups where there is little appetite to upend the status quo, innovation and change, is viewed with suspicion and as being risky. In such cases the standard reaction is cognitive dissonance. At least until measures can be taken to subsume or control the pace and nature of the change. This is particularly true in the area of project and program management in general and integrated project, program and portfolio management (IPPM), in particular.

Absent the appetite on the part of DoD to replicate market forces that drive the acceptance of innovative IPPM approaches, one large event and various evolutionary aviation and space technology trends have upended the ecosystem of rent-seeking, reaction, and incumbents bent on maintaining the status quo.

The one large event, of course, came about from the changes wrought by the Covid pandemic. The other, evolutionary changes, are a result of the acceleration of software technology in capturing and transforming big(ger) dataset combined with open business intelligence systems that can be flexibly delivered locally and via the Cloud.

I also predict that these changes will make hard-coded, purpose-driven niche applications obsolete within the next five years, as well as those companies that have built their businesses around delivering custom, niche applications, and MS Excel spreadsheets, and those core companies that are comfortable suboptimizing and reacting to delivering the letter, if not the spirit, of good business practice expected under their contracts.

Walking hand-in-hand with these technological and business developments, the business of the aerospace, space and defense market, in general, is facing a window opening for new entries and greater competition borne of emergent engineering and technological exigencies that demand innovation and new approaches to old, persistent problems.

The coronavirus pandemic and new challenges from the realities of global competition, global warming, geopolitical rivalries; aviation, space and atmospheric science; and the revolution in data capture, transformation, and optimization are upending a period of quiescence and retrenchment in the market. These factors are moving the urgency of innovation and change to the left both rapidly and in a disruptive manner that will only accelerate after the immediate pandemic crisis passes.

In my studies of Toynbee and other historians (outside of my day job, I am also credentialed in political science and history, among other disciplines, through both undergraduate and graduate education), I have observed that societies and cultures that do not embrace the future and confront their challenges effectively, and that do not do so in a constructive manner, find themselves overrun by it and them. History is the chronicle of human frailty, tragedy, and failure interspersed by amazing periods of resilience, human flourishing, advancement, and hope.

As it relates to our more prosaic concerns, Deloitte has published an insightful paper on the 2021 industry outlook. Among the identified short-term developments are:

  1. A slow recovery in passenger travel may impact aircraft deliveries and industry revenues in commercial aviation,
  2. The defense sector will remain stable as countries plan to sustain their military capabilities,
  3. Satellite broadband, space exploration and militarization will drive growth,
  4. Industry will shift to transforming supply chains into more resilient and dynamic networks,
  5. Merger and acquisitions are likely to recover in 2021 as a hedge toward ensuring long-term growth and market share.

More importantly, the longer-term changes to the industry are being driven by the following technological and market changes:

  • Advanced aerial mobility (AAM). Both FAA and NASA are making investments in this area, and so the opening exists for new entries into the market, including new entries in the supply chain, that will disrupt the giants (absent a permissive M&A stance under the new Administration in Washington). AAM is the new paradigm to introduce safe, short-distance, daily-commute flying technologies using vertical lift.
  • Hypersonics. Given the touted investment of Russia and China into this technology as a means of leveraging against the power projection of U.S. forces, particularly its Navy and carrier battle groups (aside from the apparent fact that Vladimir Putin, the president of Upper Volta with Missiles and Hackers, really hates Disney World), the DoD is projected to fast-track hypersonic capabilities and countermeasures.
  • Electric propulsion. NASA is investing in cost-sharing capabilities to leverage electric propulsion technologies, looking to benefit from the start-up growth in this sector. This is an exciting development which has the potential to transform the entire industry over the next decade and after.
  • Hydrogen-powered aircraft. OEMs are continuing to pour private investment money into start-ups looking to introduce more fuel-efficient and clean energy alternatives. As with electric propulsion, there are prototypes of these aircraft being produced and as public investments into cost-sharing and market-investment strategies take hold, the U.S., Europe, and Asia are looking at a more diverse and innovative aerospace, space, and defense market.

Given the present condition of the industry, and the emerging technological developments and resulting transformation of flight, propulsion, and fuel sources, the concept and definitions used in project and program management require a revision to meet the exigencies of the new market.

For both industry and government, in order to address these new developments, I believe that a new language is necessary, as well as a complete revision to what is considered to be the acceptable baseline of best business practice and the art of the possible. Only then will organizations and companies be positioned to address the challenges these new forms of investment and partnering systems will raise.

The New Language of Integrated Program, Project, and Portfolio Management (IPPM).

First a digression to the past: while I was on active duty in the Navy, near the end of my career, I was assigned to the staff of the Office of the Undersecretary of Defense for Acquisition and Technology (OUSD(A&T)). Ostensibly, my assignment was to give me a place to transition from the Service. Thus, I followed the senior executive, who was PEO(A) at NAVAIR, to the Pentagon, simultaneously with the transition of NAVAIR to Patuxent River, Maryland. In reality, I had been tasked by the senior executive, Mr. Dan Czelusniak, to explore and achieve three goals:

  1. To develop a common schema by supporting an existing contract for the collection of data from DoD suppliers from cost-plus R&D contracts with the goal in mind of creating a master historical database of contract performance and technological development risk. This schema would first be directed to cost performance, or EVM;
  2. To continue to develop a language, methodology, and standard, first started and funded by NAVAIR, for the integration of systems engineering and technical performance management into the program management business rhythm;
  3. To create and define a definition of Integrated Program Management.

I largely achieved the first two during my relatively brief period there.

The first became known and the Integrated Digital Environment (IDE), which was refined and fully implemented after my departure from the Service. Much of this work is the basis for data capture, transformation, and load (ETL) today. There had already been a good deal of work by private individuals, organizations, and other governments in establishing common schemas, which were first applied to the transportation and shipping industries. But the team of individuals I worked with were able to set the bar for what followed across datasets.

The second was completed and turned over to the Services and federal agencies, many of whom adopted the initial approach, and refined it as well to inform, through the identification of technical risk, cost performance and technical achievement. Much of this knowledge already existed in the Systems Engineering community, but working with INCOSE, a group of like-minded individuals were able to take the work from the proof-of-concept, which was awarded the Acker in Skill in Communication award at the DAU Acquisition Research Symposium, and turn it into the TPM and KPP standard used by organizations today.

The third began with establishing my position, which hadn’t existed until my arrival: Lead Action Officer, Integrated Program Management. Gary Christle, who was the senior executive in charge of the staff, asked me “What is Integrated Program Management?” I responded: “I don’t know, sir, but I intend to find out.” Unfortunately, this is the initiative that has still eluded both industry and government, but not without some advancement.

Note that this position with its charter to define IPM was created over 24 years ago—about the same time it takes, apparently, to produce an operational fighter jet. I note this with no flippancy, for I believe that the connection is more than just coincidental.

When spoken of, IPM and IPPM are oftentimes restricted to the concept of cost (read cost performance or EVM) and schedule integration, with aggregated portfolio organization across a selected number of projects thrown in, in the latter case. That was considered advancement in 1997. But today, we seem to be stuck in time. In light of present technology and capabilities, this is a self-limiting concept.

This concept is technologically supported by a neutral schema that is authored and managed by DoD. While essential to data capture and transformation—and because of this fact—it is currently the target by incumbents as a means of further limiting even this self-limited definition in practice. It is ironic that a technological advance that supports data-driven in lieu of report-driven information integration is being influenced to support the old paradigm.

The motivations are varied: industry suppliers who aim to restrict access to performance data under project and program management, incumbent technology providers who wish to keep the changes in data capture and transformation restricted to their limited capabilities, consulting companies aligned with technology incumbents, and staff augmentation firms dependent on keeping their customers dependent on custom application development and Excel workbooks. All of these forces work through the various professional organizations which work to influence government policy, hoping to establish themselves as the arbiters of the possible and the acceptable.

Note that oftentimes the requirements under project management are often critiqued under the rubric of government regulation. But that is a misnomer: it is an extension of government contract management. Another critique is made from the perspective of overhead costs. But management costs money, and one would not (or at least should not) drive a car or own a house without insurance and a budget for maintenance, much less a multi-year high-cost project involving the public’s money. In addition, as I have written previously which is supported by the literature, data-driven systems actually reduce costs and overhead.

All of these factors contribute to ossification, and impose artificial blinders that, absent reform, will undermine meeting the new paradigms of 21st Century project management, given that the limited concept of IPM was obviously insufficient to address the challenges of the transitional decade that broached the last century.

Embracing the Future in Aerospace, Space, and Defense

As indicated, the aerospace and space science and technology verticals are entering a new and exciting phase of technological innovation resulting from investments in start-ups and R&D, including public-private cost-sharing arrangements.

  1. IPM to Project Life-Cycle Management. Given the baggage that attends the acronym IPM, and the worldwide trend to data-driven decision-making, it is time to adjust the language of project and program management to align to it. In lieu of IPM, I suggest Project Life-Cycle Management to define the approach to project and program data and information management.
  2. Functionality-Driven to Data-Driven Applications. Our software, systems and procedures must be able to support that infrastructure and be similarly in alignment with that manner of thinking. This evolution includes the following attributes:
    • Data Agnosticism. As our decision-making methods expand to include a wider, deeper, and more comprehensive interdisciplinary approach, our underlying systems must be able to access data in this same manner. As such, these systems must be data agnostic.
    • Data neutrality. In order to optimize access to data, the overhead and effort needed to access data must be greatly reduced. Using data science and analysis to restructure pre-conditioned data in order to overcome proprietary lexicons—an approach used for business intelligence systems since the 1980s—provides no added value to either the data or the organization. If data access is ad hoc and customized in every implementation, the value of the effort cannot either persist, nor is the return on investment fully realized. It backs the customer into a corner in terms of flexibility and innovation. Thus, pre-configured data capture, extract, transformation, and load (ETL) into a non-proprietary and objective format, which applies to all data types used in project and program management systems, is essential to providing the basis for a knowledge-based environment that encourages discovery from data. This approach in ETL is enhanced by the utilization of neutral data schemas.
    • Data in Lieu of Reporting and Visualization. No doubt that data must be visualized at some point—preferably after its transformation and load into the database with other, interrelated data elements that illuminate information to enhance the knowledge of the decisionmaker. This implies that systems that rely on physical report formats, charts, and graphs as the goal are not in alignment with the new paradigm. Where Excel spreadsheets and PowerPoint are used as a management system, it is the preparer is providing the interpretation, in a manner that predisposes the possible alternatives of interpretation. The goal, instead, is to have data speak for itself. It is the data, transformed into information, interrelated and contextualized to create intelligence that is the goal.
    • All of the Data, All of the Time. The cost of 1TB of data compared to 1MB of data is the marginal cost of the additional electrons to produce it. Our systems must be able to capture all of the data essential to effective decision-making in the periodicity determined by the nature of the data. Thus, our software systems must be able to relate data at all levels and to scale from simplistic datasets to extremely large ones. It should do so in such a way that the option for determining what, among the full menu of data options available, is relevant rests in the consumer of that data.
    • Open Systems. Software solution providers beginning with the introduction of widespread CPU capability have manufactured software to perform particular functions based on particular disciplines and very specific capabilities. As noted earlier, these software applications are functionality-focused and proprietary in structure, method, and data. For data-driven project and program requirements, software systems must be flexible enough to accommodate a wide range of analytical and visualization demands in allowing the data to determine the rules of engagement. This implies systems that are open in two ways: data agnosticism, as already noted, but also open in terms of the user environment.
    • Flexible Application Configuration. Our systems must be able to address the needs of the various disciplines in their details, while also allowing for integration and contextualization of interrelated data across domains. As with Open Systems to data and the user environment, openness through the ability to roll out multiple specialized applications from a common platform places the subject matter expert and program manager in the driver’s seat in terms of data analysis and visualization. An effective open platform also reduces the overhead associated with limited purpose-driven, disconnected and proprietary niche applications.
    • No-Code/Low-Code. Given that data and the consumer will determine both the source and method of delivery, our open systems should provide an environment that supports Agile development and deployment of customization and new requirements.
    • Knowledge-Based Content. Given the extensive amount of experience and education recorded and documented in the literature, our systems must, at the very least, provide a baseline of predictive analytics and visualization methods usually found in the more limited, purpose-built hardcoded applications, if not more expansive. This knowledge-based content, however, must be easily expandable and refinable, given the other attributes of openness, flexibility, and application configuration. In this manner, our 21st century project and program management systems must possess the attributes of a hybrid system: providing the functionality of the traditional niche systems with the flexibility and power of a business intelligence system enhanced by COTS data capture and transformation.
    • Ease of Use. The flexibility and power of these systems must be such that implementation and deployment are rapid, and that new user environment applications can be quickly deployed. Furthermore, the end user should be able to determine the level of complexity or simplicity of the environment to support ease of use.
  1. Focus on the Earliest Indicator. A good deal of effort since the late 1990s has been expended on defining the highest level of summary data that is sufficient to inform earned value, with schedule integration derived from the WBS, oftentimes summarized on a one-to-many basis as well. This perspective is biased toward believing that cost performance is the basis for determining project control and performance. But even when related to cost, the focus is backwards. The project lifecycle in its optimized form exists of the following progression:

    Project Goals and Contract (framing assumptions) –> Systems Engineering, CDRLs, KPPs, MoEs, MoPs, TPMs –> Project Estimate –> Project Plan –> IMS –> Risk and Uncertainty Analysis –> Financial Planning and Execution –> PMB –> EVM

    As I’ve documented in this blog over the years, DoD studies have shown that, while greater detail within the EVM data may not garner greater early warning, proper integration with the schedule at the work package level does. Program variances first appear in the IMS. A good IMS, thus, is key to collecting and acting as the main execution document. This is why many program managers who are largely absent in the last decade or so from the professional organizations listed, tend to assert that EVM is like “looking in the rearview mirror.” It isn’t that it is not essential, but it is true that it is not the earliest indicator of variances from expected baseline project performance.

    Thus, the emphasis going forward under this new paradigm is not to continue the emphasis and a central role for EVM, but a shift to the earliest indicator for each aspect of the program that defines its framing assumptions.
  1. Systems Engineering: It’s not Space Science, it’s Space Engineering, which is harder.
    The focus on start-up financing and developmental cost-sharing shifts the focus to systems engineering configuration control and technical performance indicators. The emphasis on meeting expectations, program goals, and achieving milestones within the cost share make it essential to be able to identify fatal variances, long before conventional cost performance indicators show variances. The concern of the program manager in these cases isn’t so much on the estimate at complete, but whether the industry partner will be able to deploy the technology within the acceptable range of the MoEs, MoPs, TPPs, and KPPs, and not exceed the government’s portion of the cost share. Thus, the incentive is to not only identify variances and unacceptable risk at the earliest indicator, but to do so in terms of whether the end-item technology will be successfully deployed, or whether the government should cut its losses.
  1. Risk and Uncertainty is more than SRA. The late 20th century approach to risk management is to run a simulated Monte Carlo analysis against the schedule, and to identify alternative critical paths and any unacceptable risks within the critical path. This is known as the schedule risk analysis, or SRA. While valuable, the ratio of personnel engaged in risk management is much smaller than the staffs devoted to schedule and cost analysis.

    This is no doubt due to the specialized language and techniques devoted to risk and uncertainty. This segregation of risk from mainstream project and program analysis has severely restricted both the utility and the real-world impact of risk analysis on program management decision-making.

    But risk and uncertainty extend beyond the schedule risk analysis, and their utility in an environment of aggressive investment in new technology, innovation, and new entries to the market will place these assessments at center stage. In reality, our ability to apply risk analysis techniques extends to the project plan, to technical performance indicators, to estimating, to the integrated master schedule (IMS), and to cost, both financial and from an earned value perspective. Combined with the need to identify risk and major variances using the earliest indicator, risk analysis becomes pivotal to mainstream program analysis and decision-making.

Conclusions from Part Two

The ASD industry is most closely aligned with PPM in the public interest. Two overarching trends that are transforming this market that are overcoming the inertia and ossification of PPM thought are the communications and information systems employed in response to the coronavirus pandemic, which opened pathways to new ways of thinking about the status quo, and the start-ups and new entries into the ASD market, borne from the investments in new technologies arising from external market, geo-political, space science, global warming, and propulsion trends, as well as new technologies and methods being employed in data and information technology that drive greater efficiency and productivity. These changes have forced a new language and new expectations as to the art of the necessary, as well as the art of the possible, for PPM. This new language includes a transition to the concept of the optimal capture and use of all data across the program management life cycle with greater emphasis on systems engineering, technical performance, and risk.

Having summarized the new program paradigm in Aerospace, Space, and Defense, my next post will assess the characteristics of program management in various commercial industries, the rising trends in these verticals, and what that means for the project and program management discipline.

Innervisions: The Connection Between Data and Organizational Vision

During my day job I provide a number of fairly large customers with support to determine their needs for software that meets the criteria from my last post. That is, I provide software that takes an open data systems approach to data transformation and integration. My team and I deliver this capability with an open user interface based on Windows and .NET components augmented by time-phased and data management functionality that puts SMEs back in the driver’s seat of what they need in terms of analysis and data visualization. In virtually all cases our technology obviates the need for the extensive, time consuming, and costly services of a data scientist or software developer.

Over the course of my career both as a consumer and a provider of technology solutions, I have seen an evolution in software that began with simple point solutions being developed to automate particular manual processes, to more sophisticated solutions that are designed to automate a complex function. In most of these cases, a customer has identified a gap or deficiency in their requirements that represents an inefficiency or sub-optimization of their processes and then seek a software “tool” to acquire in order to address that specific purpose. The application of these “tools” combine to meet the overall vision of the organization or sub-system within the organization.

What Do You Do With A Problem Like “Tools”

The capabilities of software in terms of data handling capabilities and functionality double every 12-18 months in today’s environment. The use of the term “tools” for software, which is really based on a pre-2000 concept, is that in the mind’s eye software is analogous to any other tool. In the literature, particularly in that authored by consultants, this analogy is oftentimes extended to common household or construction tools: a wrench, a screwdriver, or a power drill. Under this concept each tool has a specific purpose and it is up to the SME to determine which tool is best for a specific job.

The problem with this concept is that not only is it obsolete, but it does great harm financially to the organization in terms of overhead costs, organizational efficiency and effectiveness.

First of all, most physical tools are fairly static in their specific use. A hammer is still a hammer, even if some sort of power is extended to give it power. It’s purpose remains to use force to insert a connective fastener, like a nail, into a medium, like a piece of wood. A nail gun, for instance, is a type of hammer. It is more powerful and efficient but, still, it is a glorified hammer. It is a superior tool in construction because it is more efficient, provides a consistency in quality, and is faster. It also eliminates the factors of arm strength, physical coordination, and visual alignment skills of the user; as anyone who has experienced a sore thumb as a result of a misaligned strike can attest. But a nail gun is still restricted to its specific function–sinking nails for the purpose of fastening.

Software, as it has evolved, was similarly based on the concept of a tool. The physical functions of a specific vocation were the first to undergo digitization: accountants and business operations personnel had spreadsheet software applications, secretarial and clerical staffs (yes, they used to exist) had word processing software, marketing and middle management could relay their ideas with presentation software, and the list went on.

As the power of software improved it followed the functions of traditional line-and-staff organizations. Many of these were built to replace the physical calculation of formulae and concepts that required a slide rule and, later, a scientific calculator. Soon scheduling software replaced manual GANTT planning, earned value software automated the calculation of basic EVM analytics, and risk software allowed for the complex formulation involved in assessing risk for the branch of a plan using simulated Monte Carlo analysis.

Each of these software applications targeted a specific occupation, and incorporated specific knowledge (functionality) required of that occupation.

Organizational software for multiple functions usually consisted of a suite of tools under the rubric of an ERP or Business Intelligence System. Modules and “bolt-ons” consisted of tying together business processes and point software requirements augmented by large software consulting staffs to customize the solutions. In actual practice, however, these were software tools tied together though a common brand and operating environment. Oftentimes the individual bolt-ons and tools weren’t even authored by the same development team with a common vision in mind, but a reaction to market forces that required a gap be filled through acquisition of a company or intellectual property.

Needless to say, these “enterprise” solutions aren’t that at all. Instead, they are a business-driven means to penetrate a vertical by providing scattershot functionality. Once inside a company or organization the other bolt-ons and modules are marketed in order to take over other business processes. Integration is achieved across domains through data transfer or other interpretive methods.

This approach has been successful, as it has been since the halcyon days when IBM dominated the computing market, especially among the larger software firms. It also meets many of the emotional and psychic needs of many senior managers. After all, the software firm–given its economic size–feels solid. The numbers of specialists introduced into the organization to augment staff provide a feeling of safety and accomplishment. C-level management and stockholders feel that risk is handled given that their software needs are being met at some level.

What this approach did not, and does not, meet is genuine data integration, especially given the realization that the data we have been using has been inadequate and artificially restricted based on what software providers were convincing their customers was the art of the possible. The term “Big Data” began to be introduced into the lexicon, and with it the economic realization that capturing and integrating datasets that were previously “impossible” to capture and integrate was (and presently is) an economic imperative.

But the approach of incumbents, whose priority is to remain “sticky” and to defend territory against new technologies, was to respond: “we have a tool for that.” Thus, the result has been the further introduction of inefficient individual applications with their inability to fully exploit data. Among these tools are largely “dumb”–that is, viewing data flat–data visualization tools that essentially paint pretty pictures from Excel or, when they need to be applied on a larger scale, default to the old business intelligence brute force approach of applying labor to derive the importance in data. Old habits are hard to change and what one person has done another can do. But this is the economic equivalent of what is called rent-seeking behavior. That is, it is inefficient and exploitative.

After all, if you buy what was advertised as a sports car you expect to see an engine under the hood and a transmission connected to a drivetrain and a pretty powerful one at that. What one does not expect is to buy the car but have to design and build the features of these essential systems while a team of individuals are paid by the hour to push us to where we want to go. Yet, organizations (and especially consultants) seem to be happy with this model when it comes to information management.

Thus, when a technology company like mine comes across a request for proposal, an informal invitation to participate in market research, or in exploratory professional meetings (largely virtual as of this writing), the emphasis and terminology is on software “tools”, which limits the ability of consumers to exploit technology because it mentally paints a picture that limits the definition of what software should do and can do.

This mindset, however, is beginning to change and, no doubt, our current predicament under the Coronavirus crisis will accelerate that transition.

To take our analogy one step further, we are long past the time when we must buy each component of an automobile individually and then assemble it in our own garage. Point solutions, which are set and inelastic, are like individual parts of the car.

Enterprise solutions consisting of different modules and datasets, oftentimes constructed from incompatible foundations, exacerbate this situation and add the element of labor to a supposedly automated process, like buying OEM products and having to upgrade the automobile we supposed bought to do its job, but still needed (with the help of a mechanic) to perform the normal functions of steering, stopping, and accelerating.

Open systems solutions provide more flexibility, but they can be both a blessing and a curse. The challenge is to provide the right balance of out-of-the-box point solution-type functionality while still providing enough flexibility for adaptability. Taking a common data approach is key to achieving this balance. This will require the abandonment of the concept of software “tools” and shifting the focus on data.

Data and Information Take Over: Two Models

The economic imperative for data integration and optimization developed from the needs of the organization and its practitioners–whether it be managers, analysts, or auditors working in a company, a business unit, a governmental agency, or a program or project organization–is to be positioned facing forward.

In order to face forward one must first establish a knowledge-based organization or, as oftentimes identified, a data-driven organization. What this means in real terms is that data is captured, processed, and contextualized so that its importance and meaning can be derived in a timely manner so that something can be done about what is happening. During our own present situation this is not just an economic imperative, but for public health an existential one for many of us.

Thus, we are faced with several key dimensions that must be addressed: size, manner of integration, contextualization, timeliness, and target. This applies to both known and unknown datasets.

Our known datasets are those that are already being used and populated in existing systems. We know, for example, that in program and project management that we require an estimate and plan, a schedule, a manner of organizing and tracking our progress, financial management and material management systems and others. These represent our pool of structured data, and understanding the lexicon of these systems is what is necessary to normalize and rationalize the data through a universal translator.

Our unknown datasets are those that require collection but, when done, is collected and processed in an ad hoc manner. Usually the need for this data collection is learned through the school of hard knocks. In other cases, the information is not collected at all or accidentally, such as when management relies on outside experts and anecdotal information. This is the equivalent of an organizational JOHARI window shown below.

Overview of Johari Window with quadrants
showing the relationships of self-knowledge and understanding

The Johari Window explains our perceptions and our relationship to the outside world. Our universe is not a construction of our own making or imagination. We cannot make our own reality nor are there “alternative facts.” The most colorful example of refuting this specious philosophical mind game is relayed to us in Boswell’s Life of Samuel Johnson.

After we came out of the church, we stood talking for some time together of Bishop Berkeley’s ingenious sophistry to prove the nonexistence of matter, and that every thing in the universe is merely ideal. I observed, that though we are satisfied his doctrine is not true, it is impossible to refute it. I never shall forget the alacrity with which Johnson answered, striking his foot with mighty force against a large stone, till he rebounded from it — “I refute it thus.”

We can deny what we do not know, or construct magical thinking. but reality is unmoved. In the case of Johnson he kicked the stone and the stone, also unmoved, kicked back in the form of the pain that Johnson felt when he “rebounded from it”.

Nor are the quadrants equal in our perceptual windows. Some people and organizations are very well informed and others less so, but the tension and conflict of our lives–both internally and externally–relates to expanding the “open” and “facade” portions of the Johari window so that we are not only informed of how others register us, but also to uncover the unknown, and to attempt to control how others perceive us in our various roles and guises.

We see this playing out in tracking the current Coronavirus pandemic. The absence of reliable widespread tests and testing infrastructure has impeded an understanding of the virus and the most effective strategies to deploy in dealing with it. Absent data, health and governmental agencies have been left with no choice but to use the same social distancing and travel restrictions deployed during the 1918 Influenza Pandemic and then, if lifting some of these, hope for the best.

This is the situation despite the fact that national risk assessments and risk registers, such as the U.S. National Security Council Pandemic Playbook and the U.K. National Risk Register, outlined measures to be taken given certain particular indicators. No doubt there are lessons to be learned here, but at the core lesson is the fact that, absent reliable and timely data that is converted into information that can be used in a decisive and practical manner, an organization, a state, or a nation risks its survival when it fails to imagine what information it needs to collect, absent the prosaic information that comes from performing the day-to-day routine.

Admittedly, there is no great insight here regarding this need (or, at least, there shouldn’t be). This condition is the reason why intelligence systems and agencies were created in the first place. It is why military and health services imagine scenarios and war-game them, and why organizations deploy brain-storming. Individuals and organizations that go into the world uninformed or self-deluded do not last long, and history is replete with such examples. Blanche DuBois relied on the kindness of strangers and we are best served by her experience as an archetype.

And yet, we still find ourselves struggling to properly collect, integrate, and utilize information at the same time that we have come to the realization that we need to collect and process information from larger pools of data. The root cause of this condition, as asserted above, rests in the mental framing of how to approach data and the problem that needs to be solved. It requires us to change the conceptual framework that relies on the concept of “tools.”

We can make this adjustment by realigning the object of the challenge so that it conforms with what we imagine to be the desired end-state. But, still, how do we determine what we need to collect? This is first a question of perception as opposed to one regarding knowledge: what one views as not only necessary but within the realm of possibility.

Once again, this dilemma is best served by models and, in this case, it is not unlike the Overton Window. Those preferring to eschew Wikipedia entries can also find a more detailed and nuanced definition at the source through the Mackinac Center for Public Policy website.

Overton Windows showing degrees of acceptability as modified by Joshua Trevino

Joseph Overton described the window as one of defining acceptable political policies in the mind of the public. He used the terms “more free” and “less free” to describe policies that think tanks recommend to describe the amount of government intervention, avoiding the left-right comparisons used by polemicists. Various adjustments and variations to the basic window have been proposed since his original use of the model, but it has been expanded to describe public perceptions in general on a host of socioeconomic concerns.

As with the Johari Window, I would posit that there is an analogous Overton Window in relation to information that frames what is viewed as the art of the possible. These perceptions influence the actions of decision-makers in assessing the risk involved in buying software solutions. When it comes to the rapidly developing field of data capture, transformation, and effective utilization, the perception from the start suggests some degree of risk and the danger of moving too quickly. For those in the field of data optimization, given that new technology capacity increases exponentially in shorter periods of time, the barrier here is to shift the informational Overton Window so that the market is educated on the risk-reward equation.

A Unified Model for Aligning Our Data

We have discussed two models up to this point in our exploration: an Informational Johari Window and an Informational Overton Window. Each of these models, using a simplified method, isolates different dimensions of the problem of data, which when freed of the concept of “tools” unlocking it, provides us with a clearer picture of the essential nature of its capture and utilization, and to what purposes.

We are now ready to take the next step in defining how to approach data to serve the strategic interests of the enterprise or organization.

For those of us in the information field, especially in the early years when applying solutions to line-and-staff organizations, what we found is that the very introduction of the new technology changed both the structure and nature of the organization. Initially we noted a sophisticated and accelerated version of the Hawthorne Effect. But there was something more elemental and significant going on.

Digital technology is amazingly attuned, especially when properly designed and deployed, to extend the functions of human knowledge gathering and processing. In this way it can be interpreted as an extension of human evolution–of the nature of human society acting as a complex adaptive system. In fact, there are so many connections between early physical, methodological, and industrial societal developments to digitization, such as the connection between the development of the Jacquard Loom to the development of the computer punch card (and there are others) that it seems that human society would have found a way to get to this point regardless of the existence of the intervening human pioneers, though their actual contributions are clear. (For further information on the waves of development see the books Future Shock and The Third Wave by Alvin Toffler.)

When many of us first applied digitized technology to knowledge workers (in my case in the field of contract management) we found that the very introduction of the technology changed perceptions, work habits, and organizational structures in very essential ways. Like the effect of the idea of evolution as described by Daniel Dennett, the application of digital evolution is like a universal acid–it eats through and transforms everything it touches.

For example, a report that, in the past, would have taken a week or two to complete, mostly because of the research required, now took a day or so. Procurement Action Lead Times (PALT) realized significant improvements since information previously only available in paper form was now provided on-line. At the same time, systems were now able to handle greater volumes of demand. As a result, customers’ expectations changed so much that they no longer felt that they had to hold back requests for fear of overloading the system and depend on human intervention. Suppliers, seeing many commodities experiencing steady and stable growth, reverted to just-in-time manufacturing.

Over time, typing pools and secretarial staffs, the former being commonplace well into the 1980s and the latter into the 1990s, except as symbols of privilege or prestige, disappeared. Middle management and many support staffs followed this trend in the early 2000s. Today, consulting services consisting of staffing personnel to apply non-value added manual solutions such as Excel spreadsheets and PowerPoint slides to display data that has already been captured and processed, still manage to hold on in isolated pockets. That this model is not sustainable nor efficient should be obvious except for the continued support these models lend to the self-serving concept of “tools.”

Thus, the next step in the alignment of data capture and utilization to organizational vision is the interplay between our models. Practical experience suggests, though anecdotal, that as forward-facing organizations adopt more powerful digitized technologies designed to capture more and larger datasets, and to better utilize that data, that they tend to move to expand their self-awareness–their Informational Johari Window.

This, in turn, allows them to distinguish between structured and unstructured data and the value–the qualitative information content–of these datasets. This knowledge is then applied to reduce the labor and custom code required for larger data capture and utilization. In the end, these developments then determine what is the art of the possible by moving and expanding the Informational Overton Window.

Combining these concepts from a data perspective results in a combined model as illustrated below from the perspective of the subject:

Data Window of Perception and Possibility (Subject)

Extending this concept to the external subject (object or others) results in the following:

Data Window of Perception and Possibility (Object or Others)

This simplistic model describes several ways of looking at the problem of data and how to align it with its use to serve our purposes. When we gather data from the world the result can be symmetrical or asymmetrical. That is, each of us does not have the capacity to collect the same data that may be relevant to our existence or the survival of our organizations or institutions.

This same concept of symmetry and asymmetry applies to our ability to process data into information and–further–to properly apply information to when it will contribute to a decisive outcome in terms of knowledge, understanding, insight, or action.

As with the psychological Johari Window, our model takes it account the unknown within the much larger data space. Think of our Big Blue Ball (which is not so big) within the context of space. All of space represents the data of the universe. We are finding that the secrets of vast space-time are found in quanta as well in the observations of large and distant celestial events and objects. Data is everywhere. Yet, we can perceive only a small part of the universe. That is why our Data Window does not encompass the entire data space.

The quadrants, of course, are rarely co-equal, but for purposes of simplicity they are shown as such. As with the psychological Johari Window of self-awareness, the tension and conflict within the individual and its relationship with the external world is in the adjustment of the sizes of the quadrants that, hopefully, tend toward more self-awareness and openness. From the perspective of data, the equivalent is toward the expansion of the physical expansion of the Data Window while the quadrants within the window expand to minimize asymmetry of external knowledge and the unknown.

The physical limitations of symmetry, asymmetry, and the unknown portions of the data space is further limited by our perceptions. Our understanding of what is possible, acceptable, sensible, radical, unthinkable, and impossible is influenced by these perceptions. Those areas of information management that fall within some mean or midpoint of the limitations of our perceptions represent current practice and which, as with the original Johari Window, I label as “policy,” though a viable alternative label would be “practice.”

Note that there perceptions vary by the position of the subject. In the case of our own perceptions, as for those reading this post, the first variation of the model is aligned vertically. For the case of the perceptions of others, which are important in understanding their position when advocating a particular course of action, the perception model is aligned horizontally across the quadrants.

The interplay of the quadrants within the Data Window directly affect how we perceive the use of data and its potential. Thus, I have labeled the no-man’s-land portion that pushes into areas that are unknown to the subject and external object is labeled as “The Frontier.”

To an American a “frontier” is an unexplored country while, historically, in the Old World a “frontier” is a border. The former promises not only risk, but, also opportunity and invites exploration. The latter is a limitation. No doubt, my use of the term is culturally biased to the first definition.

Intellectually and physically, as we enter the frontier and learn what secrets await us there, we learn. For data we may first see a Repository of Babel and deal with it as if it were flat. But, given enough exploration we will learn its lexicon and underlying structure and, eventually, learn how to process it into information and harness its content. This, in turn, will influence the size of the Data Window, the relative sizes of the quadrants, and our perceptions of the art of the possible.

Conception to Application

This model, I believe, is a useful antecedent concept in approaching and making comprehensible what is often called Big Data. The model also helps us be more precise in how we perceive and define the term as technology changes, given that exponential increases in hardware storage and processing capabilities expand our Data Window.

Furthermore, understanding the interplay of how wee approach data, and the consequences of our perceptions of it, allow us to weigh the risk when looking at new technologies and the characteristics they need to possess in order to meet organizational goals and vision. The initial bias, as noted by Paul Kahneman in his book Thinking, Fast and Slow, is for people to stick with the status quo or the familiar–the devil they know–in lieu of something new and innovative, even when the advantages of adoption of the new innovation are clearly obvious. It requires a reorientation of thinking to allow the acceptance of the new.

Our familiar patterns when thinking about information is to look for solutions that are “tools.” The new, unfamiliar concept that we find challenging is the understanding that we do not know what we do not know when it come to data and its potential–that we must push into the frontier in order to do so–and doing so will require not only new technology that is oriented toward the optimization of data, its processing from information to knowledge, and its use, but also a new way of thinking about it and how it will align with our organizational strategy.

This can only be done by first starting with a benchmark–to practically take stock–of where we individually as organizations and where we need to be in terms of understanding our mission or purpose. For project controls and project management there is no area more at odds with this alignment.

Recently, Dave Gordon in his blog The Practicing IT Project Manager argued why project managers needed to align their projects with organizational strategy. He noted that in 2015, during the development of the “Talent Triangle” that the Project Management Institute found that a major deficiency noted by organizations was that project managers needed to take an active role in aligning their projects with organizational strategy.

As I previously noted, there are a number of project management tools on the market today and a number of data visualization tools. Yet, there are significant gaps not only in the capture, quality, and processing of data, but also in the articulation of a consistent data strategy that aligns with the project organization and the overarching organization’s business strategy, goals, and priorities.

For example, in government, program managers spend a large portion of the year defending their programs to show that they are effectively and efficiently overseeing the expenditure of resources: that they are “executing program.” Failure to execute program will result in a budget mark, or worse, result in a re-baseline, or possible restructuring or cancellation. Projected production may be scaled back in favor of more immediate priorities.

Yet, none of our so-called “tools” fully capture program execution as it is defined by agencies and Congress. We have performance management tools, earned value tools, and the list can go on. A typical program manager in government spends almost five months assessing and managing program execution, and defending program and only a few minutes each month reviewing performance. This fact alone should be indicative that our priorities are misaligned.

The intersection of organizational alignment and program management in this case is related to resource utilization and program execution. No doubt, project controls and performance management contribute to our understanding of program execution, but they are removed from informing both the program manager and the organization in a comprehensive manner about execution, risk, and opportunity–and whether those elements conflict with or align with the agency’s goals. They are even further removed from an understanding of decisions related to program execution on the interrelationships across spectrum of the project and program portfolio.

The reason for this condition is that the data is currently not being captured and processed in a comprehensive manner to be positioned for its effective exploitation and utilization in meeting the needs of the various levels of the organization, nor does the perception of the specific data needed align with organizational needs.

Correspondingly, in construction and upstream oil and gas, project managers and stakeholders are most concerned with scope, timeliness, and the inevitable questions of claims–especially the avoidance or equitable settlement of the last.

As with government, our data strategy must align with our organizational goals and vision from the perspective of all stakeholders in the effort. At the heart of this alignment is data and those technologies “fitted” to exploit it and align it with our needs.

Potato, Potahto, Tomato, Tomahto: Data Normalization vs. Standardization, Why the Difference Matters

In my vocation I run a technology company devoted to program management solutions that is primarily concerned with taking data and converting it into information to establish a knowledge-based environment. Similarly, in my avocation I deal with the meaning of information and how to turn it into insight and knowledge. This latter activity concerns the subject areas of history, sociology, and science.

In my travels just prior to and since the New Year, I have come upon a number of experts and fellow enthusiasts in these respective fields. The overwhelming numbers of these encounters have been productive, educational, and cordial. We respectfully disagree in some cases about the significance of a particular approach, governance when it comes to project and program management policy, but generally there is a great deal of agreement, particularly on basic facts and terminology. But some areas of disagreement–particularly those that come from left field–tend to be the most interesting because they create an opportunity to clarify a larger issue.

In a recent venue I encountered this last example where the issue was the use of the phrase data normalization. The issue at hand was that the use of “data normalization” suggested some statistical methodology in reconciling data into a standard schema. Instead, it was suggested, the term “data standardization” was more appropriate.

These phrases do not describe the same thing, but they do describe processes that are symbiotic, not mutually exclusive. So what about data normalization? No doubt there is a statistical use of the term, but we are dealing with the definition as used in digital technology here, just as the use of “standardization” was suggested in the same context. There are many examples of technical terminology that do not have the same meaning when used in different contexts. Here is the definition of normalization applied to data science from Technopedia, which is the proper use of the term in this case:

Normalization is the process of reorganizing data in a database so that it meets two basic requirements: (1) There is no redundancy of data (all data is stored in only one place), and (2) data dependencies are logical (all related data items are stored together). Normalization is important for many reasons, but chiefly because it allows databases to take up as little disk space as possible, resulting in increased performance.

Normalization is also known as data normalization

This is pretty basic (and necessary) stuff. I have written at length about data normalization, but also pair it with two other terms. This is data rationalization and contextualization. Here is a short definition of rationalization:

What is the benefit of Data Rationalization? To be able to effectively exploit, manage, reuse, and govern enterprise data assets (including the models which describe them), it is necessary to be able to find them. In addition, there is (or should be) a wealth of semantics (e.g. business names, definitions, relationships) embedded within an organization’s models that can be exposed for improved analysis and knowledge transfer. By linking model objects (across or within models) it is possible to discover the higher order conceptual objects for any given object. Conversely, it is possible to identify what implementation artifacts implement a higher order model object. For example, using data rationalization, one can traverse from a conceptual model entity to a logical model entity to a physical model table to a database table, etc. Similarly, Data Rationalization enables understanding of a database table by traversing up through the different model levels.

Finally, we have contextualization. Here is a good definition using Wikipedia:

Context or contextual information is any information about any entity that can be used to effectively reduce the amount of reasoning required (via filtering, aggregation, and inference) for decision making within the scope of a specific application.[2] Contextualisation is then the process of identifying the data relevant to an entity based on the entity’s contextual information. Contextualisation excludes irrelevant data from consideration and has the potential to reduce data from several aspects including volume, velocity, and variety in large-scale data intensive applications

There is no approximation of reflecting the accuracy of data in any of these terms wihin the domain of data and computer science. Nor are there statistical methods involved to approximate what needs to be accomplished precisely. The basic skill required to accomplish these tasks–knowing that the data is structured and pre-conditioned–is to reconcile the various lexicons from differing sources, much as I reconcile in my avocation the meaning of words and phrases across periods in history and across languages.

In this discussion we are dealing with the issue of different words used to describe a process or phenomenon. Similarly, we find this challenge in data.

So where does this leave data standardization? In terms of data and computer science, this describes a completely different method. Here is a definition from Wikipedia, which is the proper contextual use of the term under “Standard data model”:

A standard data model or industry standard data model (ISDM) is a data model that is widely applied in some industry, and shared amongst competitors to some degree. They are often defined by standards bodies, database vendors or operating system vendors.

In the context of project and program management, particularly as it relates to government data submission and international open standards across vendors in an industry, is the use of a common schema. In this case there is a DoD version of a UN/CEFACT XML file currently set as the standard, but soon to be replaced by a new standard using the JSON file structure.

In any event, what is clear here is that, while standardization is a necessary part of a data policy to allow for sharing of information, the strength of the chosen schema and the instructions regarding it will vary–and this variation will have an effect on the quality of the information shared. But that is not all.

This is where data normalization, rationalization, and contextualization come into play. In order to create data for the a standardized format, it is first necessary to convert what is an otherwise opaque set of data due to differences into a cohesive lexicon. In data, this is accomplished by reconciling data dictionaries to determine which items are describing the same thing, process, measure, or phenomenon. In a domain like program management, this is a finite set. But it is also specialized knowledge and where the value is added to any end product that is produced. Then, once we know how to identify the data, we must be able to map those terms to the standard schema but, keeping on eye on the use of the data down the line, must be able to properly structure and ensure interrelationships of the data are established and/or maintained to ensure its effective use. This is no mean task and why all data transformation methods and companies are not the same.

Furthermore, these functions can be accomplished efficiently or inefficiently. The inefficient method is to take the old-fashioned business intelligence method that has been around since the 1980s and before, where a team of data scientists and analysts deal with data as if it is flat and, essentially, reinvents the wheel in establishing the meaning and proper context of the data. Given enough time and money anything can be accomplished, but brute force labor will not defeat the Second Law of Thermodynamics.

In computing, which comes close to minimizing that physical law, we know that data has already been imbued with meaning upon its initial processing. In lieu of brute force labor we apply intelligence and knowledge to accomplish this requirement. This is called normalization, rationalization, and contextualization of data. It requires a small fraction of other methods in terms of time and effort, and is infinitely more transparent.

Using these methods is also where innovation, efficiency, performance, accuracy, scalability, and anticipating future requirements based on the latest technology trends comes into play. Establishing a seamless flow of data integration allows, for example, the capture of more data being able to be properly structured in a database, which lays the ground for the transition from 2D to 3D and 4D (that is, what is often called integrated) program management, as well as more effective analytics.

The term “standardization” also suffers from a weakness in data and computer science that requires that it be qualified. After all, data standardization in an enterprise or organization does not preclude the prescription of a propriety dataset. In government, this is contrary to both statutory and policy mandates. Furthermore, even given an effective, open standard, there will be a large pool of legacy and other non-conforming data that will still require capture and transformation.

The Section 809 Panel study dealt directly with this issue:

Use existing defense business system open-data requirements to improve strategic decision making on acquisition and workforce issues…. DoD has spent billions of dollars building the necessary software and institutional infrastructure to collect enterprise wide acquisition and financial data. In many cases, however, DoD lacks the expertise to effectively use that data for strategic planning and to improve decision making. Recommendation 88 would mitigate this problem by implementing congressional open-data mandates and using existing hiring authorities to bolster DoD’s pool of data science professionals.

Section 809 Volume 3, Section 9, p.477

As operating environment companies expose more and more capability into the market through middleware and other open systems methods of visualizing data, the key to a system no longer resides in its ability to produce charts and graphs. The use of Excel as an ad hoc data repository with its vulnerability to error, to manipulation, and for its resistance to the establishment of an optimized data management and corporate knowledge environment is a symptom of the larger issue.

Data and its proper structuring is at the core of organizational success and process improvement. Standardization alone will not address barriers to data optimization. According to RAND studies in 2015 and 2017* these are:

  • Data Quality and Discontinuities
  • Data Silos and Underutilized Repositories
  • Timeliness of Data for use by SMEs and Decision-makers
  • Lack of Access and Contextualization
  • Traceability and Auditability
  • Lack of the Ability to Apply Discovery in the Data
  • The issue of Contractual Technical Data and Proprietary Data

That these issues also exist in private industry demonstrates the universality of the issue. Thus, yes, standardize by all means. But also ensure that the standard is open and that transformation is traceable and auditable from the the source system to the standard schema, and then into the target database. Only then will the enterprise, the organization, and the government agency have full ownership of the data it requires to efficiently and effectively carry out its purpose.

*RAND Corporation studies are “Issues with Access to Acquisition Data and Information in the DoD: Doing Data Right in Weapons System Acquisition” (RR880, 2017), and “Issues with Access to Acquisition Data and Information in the DoD: Policy and Practice (RR1534, 2015). These can be found here.

(Data) Transformation–Fear and Loathing over ETL in Project Management

ETL stands for data extract, transform, and load. This essential step is the basis for all of the new capabilities that we wish to acquire during the next wave of information technology: business analytics, big(ger) data, interdisciplinary insight into processes that provide insights into improving productivity and efficiency.

I’ve been dealing with a good deal of fear and loading regarding the introduction of this concept, even though in my day job my organization is a leading practitioner in the field in its vertical. Some of this is due to disinformation by competitors in playing upon the fears of the non-technically minded–the expected reaction of those who can’t do in the last throws of avoiding irrelevance. Better to baffle them with bullshit than with brilliance, I guess.

But, more importantly, part of this is due to the state of ETL and how it is communicated to the project management and business community at large. There is a great deal to be gained here by muddying the waters even by those who know better and have the technology. So let’s begin by clearing things up and making this entire field a bit more coherent.

Let’s start with the basics. Any organization that contains the interaction of people is a system. For purposes of a project management team, a business enterprise, or a governmental body we deal with a special class of systems known as Complex Adaptive Systems: CAS for short. A CAS is a non-linear learning system that reacts and evolves to its environment. It is complex because of the inter-relationships and interactions of more than two agents in any particular portion of the system.

I was first introduced to the concept of CAS through readings published out of the Santa Fe Institute in New Mexico. Most noteworthy is the work The Quark and the Jaguar by the physicist Murray Gell-Mann. Gell-Mann is received the Nobel in physics in 1969 for his work on elementary particles, such as the quark, and is co-founder of the Institute. He also was part of the team that first developed simulated Monte Carlo analysis during a period he spent at RAND Corporation. Anyone interested in the basic science of quanta and how the universe works that then leads to insights into subjects such as day-to-day probability and risk should read this book. It is a good popular scientific publication written by a brilliant mind, but very relevant to the subjects we deal with in project management and information science.

Understanding that our organizations are CAS allows us to apply all sorts of tools to better understand them and their relationship to the world at large. From a more practical perspective, what are the risks involved in the enterprise in which we are engaged and what are the probabilities associated with any of the range of outcomes that we can label as success. For my purposes, the science of information theory is at the forefront of these tools. In this world an engineer by the name of Claude Shannon working at Bell Labs essentially invented the mathematical basis for everything that followed in the world of telecommunications, generating, interpreting, receiving, and understanding intelligence in communication, and the methods of processing information. Needless to say, computing is the main recipient of this theory.

Thus, all CAS process and react to information. The challenge for any entity that needs to survive and adapt in a continually changing universe is to ensure that the information that is being received is of high and relevant quality so that the appropriate adaptation can occur. There will be noise in the signals that we receive. What we are looking for from a practical perspective in information science are the regularities in the data so that we can make the transformation of receiving the message in a mathematical manner (where the message transmitted is received) into the definition of information quality that we find in the humanities. I believe that we will find that mathematical link eventually, but there is still a void there. A good discussion of this difference can be found here in the on-line publication Double Dialogues.

Regardless of this gap, the challenge of those of us who engage in the business of ETL must bring to the table the ability not only to ensure that the regularities in the information are identified and transmitted to the intended (or necessary) users, but also to distinguish the quality of the message in the terms of the purpose of the organization. Shannon’s equation is where we start, not where we end. Given this background, there are really two basic types of data that we begin with when we look at a set of data: structured and unstructured data.

Structured data are those where the qualitative information content is either predefined by its nature or by a tag of some sort. For example, schedule planning and performance data, regardless of the idiosyncratic/proprietary syntax used by a software publisher, describes the same phenomena regardless of the software application. There are only so many ways to identify snow–and, no, the Inuit people do not have 100 words to describe it. Qualifiers apply in the humanities, but usually our business processes more closely align with statistical and arithmetic measures. As a result, structured data is oftentimes defined by its position in a hierarchical, time-phased, or interrelated system that contains a series of markers, indexes, and tables that allow it to be interpreted easily through the identification of a Rosetta stone, even when the system, at first blush, appears to be opaque. When you go to a book, its title describes what it is. If its content has a table of contents and/or an index it is easy to find the information needed to perform the task at hand.

Unstructured data consists of the content of things like letters, e-mails, presentations, and other forms of data disconnected from its source systems and collected together in a flat repository. In this case the data must be mined to recreate what is not there: the title that describes the type of data, a table of contents, and an index.

All data requires initial scrubbing and pre-processing. The difference here is the means used to perform this operation. Let’s take the easy path first.

For project management–and most business systems–we most often encounter structured data. What this means is that by understanding and interpreting standard industry terminology, schemas, and APIs that the simple process of aligning data to be transformed and stored in a database for consumption can be reduced to a systemic and repeatable process without the redundancy of rediscovery applied in every instance. Our business intelligence and business analytics systems can be further developed to anticipate a probable question from a user so that the query is pre-structured to allow for near immediate response. Further, structuring the user interface in such as way as to make the response to the query meaningful, especially integrated with and juxtaposed other types of data requires subject matter expertise to be incorporated into the solution.

Structured ETL is the place that I most often inhabit as a provider of software solutions. These processes are both economical and relatively fast, particularly in those cases where they are applied to an otherwise inefficient system of best-of-breed applications that require data transfers and cross-validation prior to official reporting. Time, money, and effort are all saved by automating this process, improving not only processing time but also data accuracy and transparency.

In the case of unstructured data, however, the process can be a bit more complicated and there are many ways to skin this cat. The key here is that oftentimes what seems to be unstructured data is only so because of the lack of domain knowledge by the software publisher in its target vertical.

For example, I recently read a white paper published by a large BI/BA publisher regarding their approach to financial and accounting systems. My own experience as a business manager and Navy Supply Corps Officer provide me with the understanding that these systems are highly structured and regulated. Yet, business intelligence publishers treated this data–and blatantly advertised and apparently sold as state of the art–an unstructured approach to mining this data.

This approach, which was first developed back in the 1980s when we first encountered the challenge of data that exceeded our expertise at the time, requires a team of data scientists and coders to go through the labor- and time-consuming process of pre-processing and building specialized processes. The most basic form of this approach involves techniques such as frequency analysis, summarization, correlation, and data scrubbing. This last portion also involves labor-intensive techniques at the microeconomic level such as binning and other forms of manipulation.

This is where the fear and loathing comes into play. It is not as if all information systems do not perform these functions in some manner, it is that in structured data all of this work has been done and, oftentimes, is handled by the database system. But even here there is a better way.

My colleague, Dave Gordon, who has his own blog, will emphasize that the identification of probable questions and configuration of queries in advance combined with the application of standard APIs will garner good results in most cases. Yet, one must be prepared to receive a certain amount of irrelevant information. For example, the query on Google of “Fun Things To Do” that you may use if you are planning for a weekend will yield all sorts of results, such as “50 Fun Things to Do in an Elevator.”  This result includes making farting sounds. The link provides some others, some of which are pretty funny. In writing this blog post, a simple search on Google for “Google query fails” yields what can only be described as a large number of query fails. Furthermore, this approach relies on the data originator to have marked the data with pointers and tags.

Given these different approaches to unstructured data and the complexity involved, there is a decision process to apply:

1. Determine if the data is truly unstructured. If the data is derived from a structured database from an existing application or set of applications, then it is structured and will require domain expertise to inherit the values and information content without expending unnecessary resources and time. A structured, systemic, and repeatable process can then be applied. Oftentimes an industry schema or standard can be leveraged to ensure consistency and fidelity.

2. Determine whether only a portion of the unstructured data is relative to your business processes and use it to append and enrich the existing structured data that has been used to integrate and expand your capabilities. In most cases the identification of a Rosetta Stone and standard APIs can be used to achieve this result.

3. For the remainder, determine the value of mining the targeted category of unstructured data and perform a business case analysis.

Given the rapidly expanding size of data that we can access using the advancing power of new technology, we must be able to distinguish between doing what is necessary from doing what is impressive. The definition of Big Data has evolved over time because our hardware, storage, and database systems allow us to access increasingly larger datasets that ten years ago would have been unimaginable. What this means is that–initially–as we work through this process of discovery, we will be bombarded with a plethora of irrelevant statistical measures and so-called predictive analytics that will eventually prove out to not pass the “so-what” test. This process places the users in a state of information overload, and we often see this condition today. It also means that what took an army of data scientists and developers to do ten years ago takes a technologist with a laptop and some domain knowledge to perform today. This last can be taught.

The next necessary step, aside from applying the decision process above, is to force our information systems to advance their processing to provide more relevant intelligence that is visualized and configured to the domain expertise required. In this way we will eventually discover the paradox that effectively accessing larger sets of data will yield fewer, more relevant intelligence that can be translated into action.

At the end of the day the manager and user must understand the data. There is no magic in data transformation or data processing. Even with AI and machine learning it is still incumbent upon the people within the organization to be able to apply expertise, perspective, knowledge, and wisdom in the use of information and intelligence.

Learning the (Data) — Data-Driven Management, HBR Edition

The months of December and January are usually full of reviews of significant events and achievements during the previous twelve months. Harvard Business Review makes the search for some of the best writing on the subject of data-driven transformation by occasionally publishing in one volume the best writing on a critical subject of interest to professional through the magazine OnPoint. It is worth making part of your permanent data management library.

The volume begins with a very concise article by Thomas C. Redman with the provocative title “Does Your Company Know What to Do with All Its Data?” He then goes on to list seven takeaways of optimizing the use of existing data that includes many of the themes that I have written about in this blog: better decision-making, innovation, what he calls “informationalize products”, and other significant effects. Most importantly, he refers to the situation of information asymmetry and how this provides companies and organizations with a strategic advantage that directly affects the bottom line–whether that be in negotiations with peers, contractual relationships, or market advantages. Aside from the OnPoint article, he also has some important things to say about corporate data quality. Highly recommended and a good reason to implement systems that assure internal information systems fidelity.

Edd Wilder-James also covers a theme that I have hammered home in a number of blog posts in the article “Breaking Down Data Silos.” The issue here is access to data and the manner in which it is captured and transformed into usable analytics. His recommended approach to a task that is often daunting is to find the path of least resistance in finding opportunities to break down silos and maximize data to apply advanced analytics. The article provides a necessary balm that counteracts the hype that often accompanies this topic.

Both of these articles are good entrees to the subject and perfectly positioned to prompt both thought and reflection of similar experiences. In my own day job I provide products that specifically address these business needs. Yet executives and management in all too many cases continue to be unaware of the economic advantages of data optimization or the manner in which continuing to support data silos is limiting their ability to effectively manage their organizations. There is no doubt that things are changing and each day offers a new set of clients who are feeling their way in this new data-driven world, knowing that the promises of almost effort-free goodness and light by highly publicized data gurus are not the reality of practitioners, who apply the detail work of data normalization and rationalization. At the end it looks like magic, but there is effort that needs to be expended up-front to get to that state. In this physical universe under the Second Law of Thermodynamics there are no free lunches–energy must be borrowed from elsewhere in order to perform work. We can minimize these efforts through learning and the application of new technology, but managers cannot pretend not to have to understand the data that they intend to use to make business decisions.

All of the longer form articles are excellent, but I am particularly impressed with the Leandro DalleMule and Thomas H. Davenport article entitled “What’s Your Data Strategy?” from the May-June 2017 issue of HBR. Oftentimes when addressing big data at professional conferences and in visiting businesses the topic often runs to the manner of handling the bulk of non-structured data. But as the article notes, less than half of an organization’s relevant structured data is actually used in decision-making. The most useful artifact that I have permanently plastered at my workplace is the graphic “The Elements of Data Strategy”, and I strongly recommend that any manager concerned with leveraging new technology to optimize data do the same. The graphic illuminates the defensive and offensive positions inherent in a cohesive data strategy leading an organization to the state: “In our experience, a more flexible and realistic approach to data and information architectures involves both a single source of truth (SSOT) and multiple versions of the truth (MVOTs). The SSOT works at the data level; MVOTs support the management of information.” Elimination of proprietary data silos, elimination of redundant data streams, and warehousing of data that is accessed using a number of analytical methods achieve the necessary states of SSOT that provides the basis for an environment supporting MVOTs.

The article “Why IT Fumbles Analytics” by Donald A. Marchand and Joe Peppard from 2013, still rings true today. As with the article cited above by Wilder-James, the emphasis here is with the work necessary to ensure that new data and analytical capabilities succeed, but the emphasis shifts to “figuring out how to use the information (the new system) generates to make better decisions or gain deeper…insights into key aspects of the business.” The heart of managing the effort in providing this capability is to put into place a project organization, as well as systems and procedures, that will support the organizational transformation that will occur as a result of the explosion of new analytical capability.

The days of simply buying an off-the-shelf silo-ed “tool” and automating a specific manual function are over, especially for organizations that wish to be effective and competitive–and more profitable–in today’s data and analytical environment. A more comprehensive and collaborative approach is necessary. As with the DalleMule and Davenport article, there is a very useful graphic that contrasts traditional IT project approaches against Analytics and Big Data (or perhaps “Bigger” Data) Projects. Though the prescriptions in the article assume an earlier concept of Big Data optimization focused on non-structured data, thereby making some of these overkill, an implementation plan is essential in supporting the kind of transformation that will occur, and managers act at their own risk if they fail to take this effect into account.

All of the other articles in this OnPoint issue are of value. The bottom line, as I have written in the past, is to keep a focus on solving business challenges, rather than buying the new bright shiny object. Alternatively, in today’s business environment the day that business decision-makers can afford to stay within their silo-ed comfort zone are phasing out very quickly, so they need to shift their attention to those solutions that address these new realities.

So why do this apart from the fancy term “data optimization”? Well, because there is a direct return-on-investment in transforming organizations and systems to data-driven ones. At the end of the day the economics win out. Thus, our organizations must be prepared to support and have a plan in place to address the core effects of new data-analytics and Big Data technology:

a. The management and organizational transformation that takes place when deploying the new technology, requiring proactive socialization of the changing environment, the teaching of new skill sets, new ways of working, and of doing business.

b. Supporting transformation from a sub-optimized silo-ed “tell me what I need to know” work environment to a learning environment, driven by what the data indicates, supporting the skills cited above that include intellectual curiosity, engaging domain expertise, and building cross-domain competencies.

c. A practical plan that teaches the organization how best to use the new capability through a practical, hands-on approach that focuses on addressing specific business challenges.

Post-Blogging NDIA Blues — The Latest News (Project Management Wonkish)

The National Defense Industrial Association’s Integrated Program Management Division (NDIA IPMD) just had its quarterly meeting here in sunny Orlando where we braved the depths of sub-60 degrees F temperatures to start out each day.

For those not in the know, these meetings are an essential coming together of policy makers, subject matter experts, and private industry practitioners regarding the practical and mundane state-of-the-practice in complex project management, particularly focused on the concerns of the the federal government and the Department of Defense.  The end result of these meetings is to publish white papers and recommendations regarding practice to support continuous process improvement and the practical application of project management practices–allowing for a cross-pollination of commercial and government lessons learned.  This is also the intersection where innovation among the large and small are given an equal vetting and an opportunity to introduce new concepts and solutions.  This is an idealized description, of course, and most of the petty personality conflicts, competition, and self-interest that plagues any group of individuals coming together under a common set of interests also plays out here.  But generally the days are long and the workshops generally produce good products that become the de facto standard of practice in the industry. Furthermore the control that keeps the more ruthless personalities in check is the fact that, while it is a large market, the complex project management community tends to be a relatively small one, which reinforces professionalism.

The “blues” in this case is not so much borne of frustration or disappointment but, instead, from the long and intense days that the sessions offer.  The biggest news from an IT project management and application perspective was twofold. The data stream used by the industry in sharing data in an open systems manner will be simplified.  The other was the announcement that the technology used to communicate will move from XML to JSON.

Human readable formatting to Data-focused formatting.  Under Kendall’s Better Buying Power 3.0 the goal of the Department of Defense (DoD) has been to incorporate better practices from private industry where they can be applied.  I don’t see initiatives for greater efficiency and reduction of duplication going away in the new Administration, regardless of what a new initiative is called.

In case this is news to you, the federal government buys a lot of materials and end items–billions of dollars worth.  Accountability must be put in place to ensure that the money is properly spent to acquire the things being purchased.  Where technology is pushed and where there are no commercial equivalents that can be bought off the shelf, as in the systems purchased by the Department of Defense, there are measures of progress and performance (given that the contract is under a specification) that are submitted to the oversight agency in DoD.  This is a lot of data and to be brutally frank the method and format of delivery has been somewhat chaotic, inefficient, and duplicative.  The Department moved to address this by a somewhat modest requirement of open systems submission of an application-neutral XML file under the standards established by the UN/CEFACT XML organization.  This was called the Integrated Program Management Report (IMPR).  This move garnered some improvement where it has been applied, but contracts are long-term, so incorporating improvements though new contractual requirements tends to take time.  Plus, there is always resistance to change.  The Department is moving to accelerate addressing these inefficiencies in their data streams by eliminating the unnecessary overhead associated with specifications of formatting data for paper forms and dealing with data as, well, data.  Great idea and bravo!  The rub here is that in making the change, the Department has proposed dropping XML as the technology used to transfer data and move to JSON.

XML to JSON. Before I spark another techie argument about the relative merits of each, there are some basics to understand here.  First, XML is a language, JSON is simply data exchange format.  This means that XML is specifically designed to deal with hierarchical and structured data that can be queried and where validation and fidelity checks within the data are inherent in the technology. Furthermore, XML is known to scale while maintaining the integrity of the data, which is intended for use in relational databases.  Furthermore, XML is hard to break.  It is meant for editing and will maintain its structure and integrity afterward.

The counter argument encountered is that JSON is new! and uses fewer characters! (which usually turns out to be inconsequential), and people are talking about it for Big Data and NoSQL! (but this happened after the fact and the reason for shoehorning it this way is discussed below).

So does it matter?  Yes and no.  As a supplier specializing in delivering solutions that normalize and rationalize data across proprietary file structures and leverage database capabilities, I don’t care.  I can adapt quickly and will have a proof-of-concept solution out within 30 days of receiving the schema.

The risk here, which applies to DoD and the industry, is that the decision to go to JSON is made only because it is the shiny new thing used by gamers and social networking developers.  There has also been a move to adapt to other uses because of the history of significant security risks that had been found in Java, so much so that an entire Wikipedia page is devoted to them.  Oracle just killed off Java applets, though Java hangs on.  JSON, of course, isn’t Java, but it was designed from birth as JavaScript Object Notation (hence the acronym JSON), with the purpose of handling relatively small bits of data across web servers in a number of proprietary settings.

To address JSON deficiencies relative to XML, a number of tools have been and are being developed to replicate the fidelity and reliability found in XML.  Whether this is sufficient to be effective against a structured LANGUAGE is to be seen.  Much of the overhead that technies complain about in XML is due to the native functionality related to the power it brings to the table.  No doubt, a bicycle is simpler than a Formula One racer–and this is an apt comparison.  Claiming “simpler” doesn’t pass the “So What?” test knowing the business processes involved.  The technology needs to be fit to the solution.  The purpose of data transmission using APIs is not only to make it easy to produce but for it to–you know–achieve the goals of normalization and rationalization so that it can be used on the receiving end which is where the consumer (which we usually consider to be the customer) sits.

At the end of the day the ability to scale and handle hierarchical, structured data will rely on the quality and strength of the schema and the tools that are published to enforce its fidelity and compliance.  Otherwise consuming organizations will be receiving a dozen different proprietary JSON files, and that does not address the present chaos but simply adds to it.  These issues were aired out during the meeting and it seems that everyone is aware of the risks and that they can be addressed.  Furthermore, as the schema is socialized across solutions providers, it will be apparent early if the technology will be able handle the project performance data resulting from the development of a high performance aircraft or a U.S. Navy destroyer.

Takin’ Care of Business — Information Economics in Project Management

Neoclassical economics abhors inefficiency, and yet inefficiencies exist.  Among the core issues that create inefficiencies is the asymmetrical nature of information.  Asymmetry is an accepted cornerstone of economics that leads to inefficiency.  We can see in our daily lives and employment the effects of one party in a transaction having more information than the other:  knowing whether the used car you are buying is a lemon, measuring risk in the purchase of an investment and, apropos to this post, identifying how our information systems allow us to manage complex projects.

Regarding this last proposition we can peel this onion down through its various levels: the asymmetry in the information between the customer and the supplier, the asymmetry in information between the board and stockholders, the asymmetry in information between management and labor, the asymmetry in information between individual SMEs and the project team, etc.–it’s elephants all the way down.

This asymmetry, which drives inefficiency, is exacerbated in markets that are dominated by monopoly, monopsony, and oligopoly power.  When informed by the work of Hart and Holmström regarding contract theory, which recently garnered the Nobel in economics, we have a basis for understanding the internal dynamics of projects in seeking efficiency and productivity.  What is interesting about contract theory is that it incorporates the concept of asymmetrical information (labeled as adverse selection), but expands this concept in human transactions at the microeconomic level to include considerations of moral hazard and the utility of signalling.

The state of asymmetry and inefficiency is exacerbated by the patchwork quilt of “tools”–software applications that are designed to address only a very restricted portion of the total contract and project management system–that are currently deployed as the state of the art.  These tend to require the insertion of a new class of SME to manage data by essentially reversing the efficiencies in automation, involving direct effort to reconcile differences in data from differing tools. This is a sub-optimized system.  It discourages optimization of information across the project, reinforces asymmetry, and is economically and practically unsustainable.

The key in all of this is ensuring that sub-optimal behavior is discouraged, and that those activities and behaviors that are supportive of more transparent sharing of information and, therefore, contribute to greater efficiency and productivity are rewarded.  It should be noted that more transparent organizations tend to be more sustainable, healthier, and with a higher degree of employee commitment.

The path forward where there is monopsony power, where there is a dominant buyer, is to impose the conditions for normative behavior that would otherwise be leveraged through practice in a more open market.  For open markets not dominated by one player as either supplier or seller, instituting practices that reward behavior that reduces the effects of asymmetrical information, and contracting disincentives in business transactions on the open market is the key.

In the information management market as a whole the trends that are working against asymmetry and inefficiency involve the reduction of data streams, the construction of cross-domain data repositories (or reservoirs) that allow for the satisfaction of multiple business stakeholders, and the introduction of systems that are more open and adaptable to the needs of the project system in lieu of a limited portion of the project team.  These solutions exist, yet their adoption is hindered because of the long-term infrastructure that is put in place in complex project management.  This infrastructure is supported by incumbents that are reinforcing to the status quo.  Because of this, from the time a market innovation is introduced to the time that it is adopted in project-focused organizations usually involves the expenditure of several years.

This argues for establishing an environment that is more nimble.  This involves the adoption of a series of approaches to achieve the goals of broader information symmetry and efficiency in the project organization.  These are:

a. Instituting contractual relationships, both internally and externally, that encourage project personnel to identify risk.  This would include incentives to kill efforts that have breached their framing assumptions, or to consolidate progress that the project has achieved to date–sending it as it is to production–while killing further effort that would breach framing assumptions.

b. Institute policy and incentives on the data supply end to reduce the number of data streams.  Toward this end both acquisition and contracting practices should move to discourage proprietary data dead ends by encouraging normalized and rationalized data schemas that describe the environment using a common or, at least, compatible lexicon.  This reduces the inefficiency derived from opaqueness as it relates to software and data.

c.  Institute policy and incentives on the data consumer end to leverage the economies derived from the increased computing power from Moore’s Law by scaling data to construct interrelated datasets across multiple domains that will provide a more cohesive and expansive view of project performance.  This involves the warehousing of data into a common repository or reduced set of repositories.  The goal is to satisfy multiple project stakeholders from multiple domains using as few streams as necessary and encourage KDD (Knowledge Discovery in Databases).  This reduces the inefficiency derived from data opaqueness, but also from the traditional line-and-staff organization that has tended to stovepipe expertise and information.

d.  Institute acquisition and market incentives that encourage software manufacturers to engage in positive signalling behavior that reduces the opaqueness of the solutions being offered to the marketplace.

In summary, the current state of project data is one that is characterized by “best-of-breed” patchwork quilt solutions that tend to increase direct labor, reduces and limits productivity, and drives up cost.  At the end of the day the ability of the project to handle risk and adapt to technical challenges rests on the reliability and efficiency of its information systems.  A patchwork system fails to meet the needs of the organization as a whole and at the end of the day is not “takin’ care of business.”

Big Time — Elements of Data Size in Scaling

I’ve run into additional questions about scalability.  It is significant to understand the concept in terms of assessing software against data size, since there are actually various aspect of approaching the issue.

Unlike situations where data is already sorted and structured as part of the core functionality of the software service being provided, this is in dealing in an environment where there are many third-party software “tools” that put data into proprietary silos.  These act as barriers to optimizing data use and gaining corporate intelligence.  The goal here is to apply in real terms the concept that the customers generating the data (or stakeholders who pay for the data) own the data and should have full use of it across domains.  In project management and corporate governance this is an essential capability.

For run-of-the-mill software “tools” that are focused on solving one problem, this often is interpreted as just selling a lot more licenses to a big organization.  “Sure we can scale!” is code for “Sure, I’ll sell you more licenses!”  They can reasonably make this assertion, particularly in client-server or web environments, where they can point to the ability of the database system on which they store data to scale.  This also comes with, usually unstated, the additional constraint that their solution rests on a proprietary database structure.  Such responses, through, are a form of sidestepping the question, nor is it the question being asked.  Thus, it is important for those acquiring the right software to understand the subtleties.

A review of what makes data big in the first place is in order.  The basic definition, which I outlined previously, came from NASA in describing data that could not be held in local memory or local storage.  Hardware capability, however, continues to grow exponentially, so that what is big data today is not big data tomorrow.  But in handling big data, it then becomes incumbent on software publishers to drive performance to allow their customers to take advantage of the knowledge contained in these larger data sets.

The elements that determine the size of data are:

a.  Table size

b.  Row or Record size

c.  Field size

d.  Rows per table

e.  Columns per table

f.  Indexes per table

Note the interrelationships of these elements in determining size.  Thus, recently I was asked how many records are being used on the largest tables being accessed by a piece of software.  That is fine as shorthand, but the other elements add to the size of the data that is being accessed.  Thus, a set of data of say 800K records may be just as “big” as one containing 2M records because of the overall table size of fields, and the numbers of columns and indices, as well as records.  Furthermore, this question didn’t take into account the entire breadth of data across all tables.

Understanding the definition of data size then leads us to understanding the nature of software scaling.  There are two aspects to this.

The first is the software’s ability to presort the data against the database in such as way as to ensure that latency–the delay in performance when the data is loaded–is minimized.  The principles applied here go back to database management practices back in the day when organizations used to hire teams of data scientists to rationalize data that was processed in machine language–especially when it used to be stored in ASCII or, for those who want to really date themselves, EBCDIC, which were incoherent by today’s generation of more human-readable friendly formats.

Quite simply, the basic steps applied has been to identify the syntax, translate it, find its equivalents, and then sort that data into logical categories that leverage database pointers.  What you don’t want the software to do is what used to be done during the earliest days of dealing with data, which was smaller by today’s standards, of serially querying ever data element in order to fetch only what the user is calling.  Furthermore, it also doesn’t make much sense to deal with all data as a Repository of Babel to apply labor-intensive data mining in non-relational databases, especially in cases where the data is well understood and fairly well structured, even if in a proprietary structure.  If we do business in a vertical where industry standards in data apply, as in the use of the UN/CEFACT XML convention, then much of the presorting has been done for us.  In addition, more powerful industry APIs (like OLE DB and ODBC) that utilize middleware (web services, XML, SOAP, MapReduce, etc.) multiply the presorting capabilities of software, providing significant performance improvements in accessing big data.

The other aspect is in the software’s ability to understand limitations in data communications hardware systems.  This is a real problem, because the backbone in corporate communication systems, especially to ensure security, is still largely done over a wire.  The investments in these backbones is usually categorized as a capital investment, and so upgrades to the system are slow.  Furthermore, oftentimes backbone systems are embedded in physical plant building structures.  So any software performance is limited by the resistance and bandwidth of wiring.  Thus, we live in a world where hardware storage and processing is doubling every 12-18 months, and software is designed to better leverage such expansion, but the wires over which data communication depends remains stuck in the past–constrained by the basic physics of CAT 6 or Fiber Optic cabling.

Needless to say, software manufacturers who rely on constant communications with the database will see significantly degraded performance.  Some software publishers who still rely on this model use the “check out” system, treating data like a lending library, where only one user or limited users can access the same data.  This, of course, reduces customer flexibility.  Strategies that are more discrete in handling data are the needed response here until day-to-day software communications can reap the benefits of physical advancements in this category of hardware.  Furthermore, organizations must understand that the big Cloud in the sky is not the answer, since it is constrained by the same physics as the rest of the universe–with greater security risks.

All of this leads me to a discussion I had with a colleague recently.  He opened his iPhone and did a query in iTunes for an album.  In about a second or so his query selected the artist and gave a list of albums–all done without a wire connection.  “Why can’t we have this in our industry?” he asked.  Why indeed?  Well, first off, Apple iTunes has sorted its data to optimize performance with its app, and it is performed within a standard data stream and optimized for the Apple iOS and hardware.  Secondly, the variables of possible queries in iTunes are predefined and tied to a limited and internally well-defined set of data.  Thus, the data and application challenges are not equivalent as found in my friend’s industry vertical.  For example, aside from the challenges of third party data normalization and rationalization, iTunes is not dealing with dynamic time-phased or trending data that requires multiple user updates to reflect changes using predictive analytics which is then served to different classes of users in a highly secure environment.  Finally, and most significantly, Apple spent more on that system than the annual budget of my friend’s organization.  In the end his question was a good one, but in discussing this it was apparent that just saying “give me this” is a form of magical thinking and hand waving.  The devil is in the details, though I am confident that we will eventually get to an equivalent capability.

At the end of the day IT project management strategy must take into account the specific needs of classes of users in making determinations of scaling.  What this means is a segmented approach: thick-client users, compartmentalized local installs with subsets of data, thin-client, and web/mobile or terminal services equivalents.  The practical solution is still an engineered one that breaks the elephant into digestible pieces that lean forward to leveraging advances in hardware, database, and software operating environments.  These are the essential building blocks to data optimization and scaling.

I Can See Clearly Now — Knowledge Discovery in Databases, Data Scalability, and Data Relevance

I recently returned from a travel and much of the discussion revolved around the issues of scalability and the use of data.  What is clear is that the conversation at the project manager level is shifting from a long-running focus on reports and metrics to one focused on data and what can be learned from it.  As with any technology, information technology exploits what is presented before it.  Most recently, accelerated improvements in hardware and communications technology has allowed us to begin to collect and use ever larger sets of data.

The phrase “actionable” has been thrown around quite a bit in marketing materials, but what does this term really mean?  Can data be actionable?  No.  Can intelligence derived from that data be actionable?  Yes.  But is all data that is transformed into intelligence actionable?  No.  Does it need to be?  No.

There are also kinds and levels of intelligence, particularly as it relates to organizations and business enterprises.  Here is a short list:

a. Competitive intelligence.  This is intelligence derived from data that informs decision makers about how their organization fits into the external environment, further informing the development of strategic direction.

b. Business intelligence.  This is intelligence derived from data that informs decision makers about the internal effectiveness of their organization both in the past and into the future.

c. Business analytics.  The transformation of historical and trending enterprise data used to provide insight into future performance.  This includes identifying any underlying drivers of performance, and any emerging trends that will manifest into risk.  The purpose is to provide sufficient early warning to allow risk to be handled before it fully manifests, therefore keeping the effort being measured consistent with the goals of the organization.

Note, especially among those of you who may have a military background, that what I’ve outlined is a hierarchy of information and intelligence that addresses each level of an organization’s operations:  strategic, operational, and tactical.  For many decision makers, translating tactical level intelligence into strategic positioning through the operational layer presents the greatest challenge.  The reason for this is that, historically, there often has been a break in the continuity between data collected at the tactical level and that being used at the strategic level.

The culprit is the operational layer, which has always been problematic for organizations and those individuals who find themselves there.  We see this difficulty reflected in the attrition rate at this level.  Some individuals cannot successfully make this transition in thinking. For example, in the U.S. Army command structure when advancing from the battalion to the brigade level, in the U.S. Navy command structure when advancing from Department Head/Staff/sea command to organizational or fleet command (depending on line or staff corps), and in business for those just below the C level.

Another way to look at this is through the traditional hierarchical pyramid, in which data represents the wider floor upon which each subsequent, and slightly reduced, level is built.  In the past (and to a certain extent this condition still exists in many places today) each level has constructed its own data stream, with the break most often coming at the operational level.  This discontinuity is then reflected in the inconsistency between bottom-up and top-down decision making.

Information technology is influencing and changing this dynamic by addressing the main reason for the discontinuity existing–limitations in data and intelligence capabilities.  These limitations also established a mindset that relied on limited, summarized, and human-readable reporting that often was “scrubbed” (especially at the operational level) as it made its way to the senior decision maker.  Since data streams were discontinuous, there were different versions of reality.  When aspects of the human equation are added, such as selection bias, the intelligence will not match what the data would otherwise indicate.

As I’ve written about previously in this blog, the application of Moore’s Law in physical computing performance and storage has pushed software to greater needs in scaling in dealing with ever increasing datasets.  What is defined as big data today will not be big data tomorrow.

Organizations, in reaction to this condition, have in many cases tended to simply look at all of the data they collect and throw it together into one giant pool.  Not fully understanding what the data may say, a number of ad hoc approaches have been taken.  In some cases this has caused old labor-intensive data mining and rationalization efforts to once again rise from the ashes to which they were rightly consigned in the past.  On the opposite end, this has caused a reliance on pre-defined data queries or hard-coded software solutions, oftentimes based on what had been provided using human-readable reporting.  Both approaches are self-limiting and, to a large extent, self-defeating.  In the first case because the effort and time to construct the system will outlive the needs of the organization for intelligence, and in the second case, because no value (or additional insight) is added to the process.

When dealing with large, disparate sources of data, value is derived through that additional knowledge discovered through the proper use of the data.  This is the basis of the concept of what is known as KDD.  Given that organizations know the source and type of data that is being collected, it is not necessary to reinvent the wheel in approaching data as if it is a repository of Babel.  No doubt the euphemisms, semantics, and lexicon used by software publishers differs, but quite often, especially where data underlies a profession or a business discipline, these elements can be rationalized and/or normalized given that the appropriate business cross-domain knowledge is possessed by those doing the rationalization or normalization.

This leads to identifying the characteristics* of data that is necessary to achieve a continuity from the tactical to the strategic level that will achieve some additional necessary qualitative traits such as fidelity, credibility, consistency, and accuracy.  These are:

  1. Tangible.  Data must exist and the elements of data should record something that correspondingly exists.
  2. Measurable.  What exists in data must be something that is in a form that can be recorded and is measurable.
  3. Sufficient.  Data must be sufficient to derive significance.  This includes not only depth in data but also, especially in the case of marking trends, across time-phasing.
  4. Significant.  Data must be able, once processed, to contribute tangible information to the user.  This goes beyond statistical significance noted in the prior characteristic, in that the intelligence must actually contribute to some understanding of the system.
  5. Timely.  Data must be timely so that it is being delivered within its useful life.  The source of the data must also be consistently provided over consistent periodicity.
  6. Relevant.  Data must be relevant to the needs of the organization at each level.  This not only is a measure to test what is being measured, but also will identify what should be but is not being measured.
  7. Reliable.  The sources of the data be reliable, contributing to adherence to the traits already listed.

This is the shorthand that I currently use in assessing a data requirements and the list is not intended to be exhaustive.  But it points to two further considerations when delivering a solution.

First, at what point does the person cease to be the computer?  Business analytics–the tactical level of enterprise data optimization–oftentimes are stuck in providing users with a choice of chart or graph to use in representing such data.  And as noted by many writers, such as this one, no doubt the proper manner of representing data will influence its interpretation.  But in this case the person is still the computer after the brute force computing is completed digitally.  There is a need for more effective significance-testing and modeling of data, with built-in controls for selection bias.

Second, how should data be summarized to the operational and strategic levels so that “signatures” can be identified that inform information?  Furthermore, it is important to understand what kind of data must supplement the tactical level data at those other levels.  Thus, data streams are not only minimized to eliminate redundancy, but also properly aligned to the level of data intelligence.

*Note that there are other aspects of data characteristics noted by other sources here, here, and here.  Most of these concern themselves with data quality and what I would consider to be baseline data traits, which need to be separately assessed and tested, as opposed to antecedent characteristics.

 

Living in the Material World — A Call for Open Data from Materials Science

Since advocating for transparent and open data schemas and data repositories, I’ve run into other high tech professionals who have opined that such data requirements are only applicable to my core competency in the U.S. Aerospace & Defense vertical.  Now, in the 26 November on-line edition of the journal Science, comes an opinion piece from a number of Chinese corrosion scientists under the title “Materials science: Share corrosion data” have advocated the development of open data infrastructures to make the age and life of existing metallurgic infrastructure easily accessible to technology in a non-proprietary format.

As the article states, “corrosion costs six cents for every dollar of gross domestic product in the United States. Globally, that amounts to more than US$4 trillion a year — equivalent to damages from 40 Hurricane Katrinas. Half of that cost is in corrosion prevention and control, the other half in damages and lost productivity.”

In the software technology industry, the recent issues regarding the Trans-Pacific Partnership (TPP) has revealed strains in U.S.-Chinese relations on intellectual property and proprietary systems.  So the danger is always that the source of the advocacy will be ignored, despite the clear economic argument in favor of what the editorial proposes.

But the concern is transnational in nature.  As the piece points out, the U.S. Department of Energy has initiated its own program in alignment with the U.S. National Institute of Standards and Technology (NIST) Materials Genome Initiative (MGI) to establish open repositories of data in support of the alternative energy industry.  Given the aging U.S. infrastructure and the dangers from sudden failure from these engineering systems, it seems wise to track and share data for materials corrosion and lifespans.  Think of the sudden bridge failures that have hit headlines over the last several years.

Since the cost of not doing so can be measured in lives as well as economic cost, such data normalization and rationalization in this area takes on the role of being an essential element of governance.