(Data) Transformation–Fear and Loathing over ETL in Project Management

ETL stands for extract, transform, and load. This essential step is the basis for all of the new capabilities that we wish to acquire during the next wave of information technology: business analytics, big(ger) data, and interdisciplinary insight into processes that shows us how to improve productivity and efficiency.

I’ve been dealing with a good deal of fear and loading regarding the introduction of this concept, even though in my day job my organization is a leading practitioner in the field in its vertical. Some of this is due to disinformation by competitors playing upon the fears of the non-technically minded–the expected reaction of those who can’t do, in the last throes of avoiding irrelevance. Better to baffle them with bullshit than with brilliance, I guess.

But, more importantly, part of this is due to the state of ETL and how it is communicated to the project management and business community at large. There is a great deal to be gained here by muddying the waters even by those who know better and have the technology. So let’s begin by clearing things up and making this entire field a bit more coherent.

Let’s start with the basics. Any organization that contains the interaction of people is a system. For purposes of a project management team, a business enterprise, or a governmental body we deal with a special class of systems known as Complex Adaptive Systems: CAS for short. A CAS is a non-linear learning system that reacts and evolves to its environment. It is complex because of the inter-relationships and interactions of more than two agents in any particular portion of the system.

I was first introduced to the concept of CAS through readings published out of the Santa Fe Institute in New Mexico. Most noteworthy is the work The Quark and the Jaguar by the physicist Murray Gell-Mann. Gell-Mann received the Nobel Prize in Physics in 1969 for his work on elementary particles, such as the quark, and is a co-founder of the Institute. He also was part of the team that first developed simulated Monte Carlo analysis during a period he spent at RAND Corporation. Anyone interested in the basic science of quanta and how the universe works that then leads to insights into subjects such as day-to-day probability and risk should read this book. It is a good popular scientific publication written by a brilliant mind, but very relevant to the subjects we deal with in project management and information science.

Understanding that our organizations are CAS allows us to apply all sorts of tools to better understand them and their relationship to the world at large. From a more practical perspective, what are the risks involved in the enterprise in which we are engaged, and what are the probabilities associated with any of the range of outcomes that we can label as success? For my purposes, the science of information theory is at the forefront of these tools. In this world an engineer by the name of Claude Shannon working at Bell Labs essentially invented the mathematical basis for everything that followed in the world of telecommunications, generating, interpreting, receiving, and understanding intelligence in communication, and the methods of processing information. Needless to say, computing is the main recipient of this theory.

Thus, all CAS process and react to information. The challenge for any entity that needs to survive and adapt in a continually changing universe is to ensure that the information that is being received is of high and relevant quality so that the appropriate adaptation can occur. There will be noise in the signals that we receive. What we are looking for from a practical perspective in information science are the regularities in the data so that we can make the transformation of receiving the message in a mathematical manner (where the message transmitted is received) into the definition of information quality that we find in the humanities. I believe that we will find that mathematical link eventually, but there is still a void there. A good discussion of this difference can be found here in the on-line publication Double Dialogues.

Regardless of this gap, those of us who engage in the business of ETL must bring to the table the ability not only to ensure that the regularities in the information are identified and transmitted to the intended (or necessary) users, but also to distinguish the quality of the message in terms of the purpose of the organization. Shannon’s equation is where we start, not where we end. Given this background, there are really two basic types of data that we begin with when we look at a set of data: structured and unstructured data.

Structured data are those where the qualitative information content is either predefined by its nature or by a tag of some sort. For example, schedule planning and performance data, regardless of the idiosyncratic/proprietary syntax used by a software publisher, describes the same phenomena regardless of the software application. There are only so many ways to identify snow–and, no, the Inuit people do not have 100 words to describe it. Qualifiers apply in the humanities, but usually our business processes more closely align with statistical and arithmetic measures. As a result, structured data is oftentimes defined by its position in a hierarchical, time-phased, or interrelated system that contains a series of markers, indexes, and tables that allow it to be interpreted easily through the identification of a Rosetta stone, even when the system, at first blush, appears to be opaque. When you go to a book, its title describes what it is. If its content has a table of contents and/or an index it is easy to find the information needed to perform the task at hand.

Unstructured data consists of the content of things like letters, e-mails, presentations, and other forms of data disconnected from its source systems and collected together in a flat repository. In this case the data must be mined to recreate what is not there: the title that describes the type of data, a table of contents, and an index.

All data requires initial scrubbing and pre-processing. The difference here is the means used to perform this operation. Let’s take the easy path first.

For project management–and most business systems–we most often encounter structured data. What this means is that by understanding and interpreting standard industry terminology, schemas, and APIs, the simple process of aligning data to be transformed and stored in a database for consumption can be reduced to a systemic and repeatable process without the redundancy of rediscovery applied in every instance. Our business intelligence and business analytics systems can be further developed to anticipate a probable question from a user so that the query is pre-structured to allow for near immediate response. Further, structuring the user interface in such a way as to make the response to the query meaningful, especially when it is integrated and juxtaposed with other types of data, requires subject matter expertise to be incorporated into the solution.

Structured ETL is the place that I most often inhabit as a provider of software solutions. These processes are both economical and relatively fast, particularly in those cases where they are applied to an otherwise inefficient system of best-of-breed applications that require data transfers and cross-validation prior to official reporting. Time, money, and effort are all saved by automating this process, improving not only processing time but also data accuracy and transparency.
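To make the idea of a systemic and repeatable process concrete, here is a minimal sketch. It assumes two scheduling tools that export the same task data under different proprietary field names; the tool names, field names, and common schema are all hypothetical and stand in for whatever industry schema applies in a given vertical.

```python
from datetime import date

# Hypothetical field maps: each source tool's proprietary names -> one common schema.
FIELD_MAPS = {
    "tool_a": {"TaskID": "task_id", "BL_Finish": "baseline_finish", "Act_Finish": "actual_finish"},
    "tool_b": {"id": "task_id", "baselineEnd": "baseline_finish", "actualEnd": "actual_finish"},
}

DATE_FIELDS = {"baseline_finish", "actual_finish"}


def transform(record, source):
    """Rename fields to the common schema and parse ISO dates (empty -> None)."""
    out = {}
    for src_name, common_name in FIELD_MAPS[source].items():
        value = record.get(src_name)
        if common_name in DATE_FIELDS:
            value = date.fromisoformat(value) if value else None
        out[common_name] = value
    return out


def etl(extracts):
    """extracts: iterable of (source, records). Returns one list in the common schema."""
    loaded = []
    for source, records in extracts:
        for record in records:
            loaded.append(transform(record, source))
    return loaded


rows = etl([
    ("tool_a", [{"TaskID": "A-1", "BL_Finish": "2024-03-01", "Act_Finish": "2024-03-04"}]),
    ("tool_b", [{"id": "B-7", "baselineEnd": "2024-03-15", "actualEnd": ""}]),
])
```

The point of the sketch is that the domain knowledge lives in the field maps. Once a map exists for a source, every later extract from that source runs through the same code with no rediscovery.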

In the case of unstructured data, however, the process can be a bit more complicated and there are many ways to skin this cat. The key here is that oftentimes what seems to be unstructured data is only so because of the lack of domain knowledge by the software publisher in its target vertical.

For example, I recently read a white paper published by a large BI/BA publisher regarding their approach to financial and accounting systems. My own experience as a business manager and Navy Supply Corps Officer provides me with the understanding that these systems are highly structured and regulated. Yet this publisher treated the data as unstructured, and blatantly advertised, and apparently sold as state of the art, an unstructured approach to mining it.

This approach, which was first developed back in the 1980s when we first encountered the challenge of data that exceeded our expertise at the time, requires a team of data scientists and coders to go through the labor- and time-consuming process of pre-processing and building specialized processes. The most basic form of this approach involves techniques such as frequency analysis, summarization, correlation, and data scrubbing. This last portion also involves labor-intensive techniques at the microeconomic level such as binning and other forms of manipulation.
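For readers who have not seen these techniques, the following is a small sketch of what frequency analysis, summarization, correlation, and binning look like in practice. The records, category names, and bin edges are notional and invented for illustration; a real effort would run against far larger and messier data.

```python
from collections import Counter
from statistics import mean, median

# Notional records: (cost category, hours worked, cost in dollars).
records = [
    ("labor", 10, 1000.0),
    ("labor", 20, 2100.0),
    ("material", 5, 400.0),
    ("labor", 30, 2900.0),
    ("travel", 8, 900.0),
]

# Frequency analysis: how often each category appears.
frequency = Counter(category for category, _, _ in records)

# Summarization: simple descriptive statistics of cost.
costs = [cost for _, _, cost in records]
summary = {"count": len(costs), "mean": mean(costs), "median": median(costs)}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Correlation: do hours and cost move together?
hours = [h for _, h, _ in records]
hours_cost_r = pearson(hours, costs)


def bin_label(value, edges):
    """Binning: return the index of the first edge the value falls below."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


# Bins: 0 = under 500, 1 = 500 to under 1500, 2 = 1500 and above.
cost_bins = Counter(bin_label(c, [500, 1500]) for c in costs)
```

None of this is difficult on five rows. The labor and time come from choosing the categories, the bin edges, and the pairs of variables worth correlating when nobody has told you what the data means.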

This is where the fear and loathing comes into play. It is not as if all information systems do not perform these functions in some manner; it is that in structured data all of this work has been done and, oftentimes, is handled by the database system. But even here there is a better way.

My colleague, Dave Gordon, who has his own blog, will emphasize that the identification of probable questions and configuration of queries in advance combined with the application of standard APIs will garner good results in most cases. Yet, one must be prepared to receive a certain amount of irrelevant information. For example, the query on Google of “Fun Things To Do” that you may use if you are planning for a weekend will yield all sorts of results, such as “50 Fun Things to Do in an Elevator.”  This result includes making farting sounds. The link provides some others, some of which are pretty funny. In writing this blog post, a simple search on Google for “Google query fails” yields what can only be described as a large number of query fails. Furthermore, this approach relies on the data originator to have marked the data with pointers and tags.

Given these different approaches to unstructured data and the complexity involved, there is a decision process to apply:

1. Determine if the data is truly unstructured. If the data is derived from a structured database from an existing application or set of applications, then it is structured and will require domain expertise to inherit the values and information content without expending unnecessary resources and time. A structured, systemic, and repeatable process can then be applied. Oftentimes an industry schema or standard can be leveraged to ensure consistency and fidelity.

2. Determine whether only a portion of the unstructured data is relevant to your business processes and use it to append and enrich the existing structured data that has been used to integrate and expand your capabilities. In most cases the identification of a Rosetta Stone and standard APIs can be used to achieve this result.

3. For the remainder, determine the value of mining the targeted category of unstructured data and perform a business case analysis.
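The three steps above can be sketched as a simple triage function. This is a simplification under stated assumptions: the class, its field names, and the rule that a business case holds when estimated value exceeds estimated cost are all hypothetical, and in practice step 2 usually applies to a portion of a data source rather than the whole.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    from_structured_system: bool  # step 1: exported from an existing application database?
    relevant_to_processes: bool   # step 2: relevant to current business processes?
    expected_value: float         # step 3: estimated benefit of mining (assumed input)
    mining_cost: float            # step 3: estimated cost of mining (assumed input)


def triage(source):
    """Apply the three steps in order and return the recommended handling."""
    if source.from_structured_system:
        return "structured ETL"
    if source.relevant_to_processes:
        return "enrich structured data"
    if source.expected_value > source.mining_cost:
        return "mine"
    return "defer"


decisions = {
    s.name: triage(s)
    for s in [
        DataSource("general ledger export", True, True, 0.0, 0.0),
        DataSource("contract correspondence", False, True, 0.0, 0.0),
        DataSource("archived email", False, False, 10_000.0, 50_000.0),
    ]
}
```

The ordering matters: the cheap, repeatable path is tested first, and the expensive mining path is reached only when the earlier questions have been answered "no".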

Given the rapidly expanding size of data that we can access using the advancing power of new technology, we must be able to distinguish doing what is necessary from doing what is impressive. The definition of Big Data has evolved over time because our hardware, storage, and database systems allow us to access increasingly larger datasets that ten years ago would have been unimaginable. What this means is that–initially–as we work through this process of discovery, we will be bombarded with a plethora of irrelevant statistical measures and so-called predictive analytics that will eventually fail to pass the “so-what” test. This process places the users in a state of information overload, and we often see this condition today. It also means that what took an army of data scientists and developers to do ten years ago takes a technologist with a laptop and some domain knowledge to perform today. This last can be taught.

The next necessary step, aside from applying the decision process above, is to force our information systems to advance their processing to provide more relevant intelligence that is visualized and configured to the domain expertise required. In this way we will eventually discover the paradox that effectively accessing larger sets of data will yield less, but more relevant, intelligence that can be translated into action.

At the end of the day the manager and user must understand the data. There is no magic in data transformation or data processing. Even with AI and machine learning it is still incumbent upon the people within the organization to be able to apply expertise, perspective, knowledge, and wisdom in the use of information and intelligence.

Learning the (Data) — Data-Driven Management, HBR Edition

The months of December and January are usually full of reviews of significant events and achievements during the previous twelve months. Harvard Business Review makes it easier to find some of the best writing on the subject of data-driven transformation by occasionally publishing in one volume the best writing on a critical subject of interest to professionals through the magazine OnPoint. It is worth making part of your permanent data management library.

The volume begins with a very concise article by Thomas C. Redman with the provocative title “Does Your Company Know What to Do with All Its Data?” He then goes on to list seven takeaways of optimizing the use of existing data that include many of the themes that I have written about in this blog: better decision-making, innovation, what he calls “informationalize products”, and other significant effects. Most importantly, he refers to the situation of information asymmetry and how this provides companies and organizations with a strategic advantage that directly affects the bottom line–whether that be in negotiations with peers, contractual relationships, or market advantages. Aside from the OnPoint article, he also has some important things to say about corporate data quality. Highly recommended and a good reason to implement systems that assure internal information systems fidelity.

Edd Wilder-James also covers a theme that I have hammered home in a number of blog posts in the article “Breaking Down Data Silos.” The issue here is access to data and the manner in which it is captured and transformed into usable analytics. His recommended approach to a task that is often daunting is to find the path of least resistance in finding opportunities to break down silos and maximize data to apply advanced analytics. The article provides a necessary balm that counteracts the hype that often accompanies this topic.

Both of these articles are good entrees to the subject and perfectly positioned to prompt both thought and reflection on similar experiences. In my own day job I provide products that specifically address these business needs. Yet executives and management in all too many cases continue to be unaware of the economic advantages of data optimization or the manner in which continuing to support data silos is limiting their ability to effectively manage their organizations. There is no doubt that things are changing and each day offers a new set of clients who are feeling their way in this new data-driven world, knowing that the promises of almost effort-free goodness and light by highly publicized data gurus are not the reality of practitioners, who apply the detail work of data normalization and rationalization. At the end it looks like magic, but there is effort that needs to be expended up-front to get to that state. In this physical universe under the Second Law of Thermodynamics there are no free lunches–energy must be borrowed from elsewhere in order to perform work. We can minimize these efforts through learning and the application of new technology, but managers cannot pretend not to have to understand the data that they intend to use to make business decisions.

All of the longer form articles are excellent, but I am particularly impressed with the Leandro DalleMule and Thomas H. Davenport article entitled “What’s Your Data Strategy?” from the May-June 2017 issue of HBR. When addressing big data at professional conferences and in visiting businesses, the topic often runs to the manner of handling the bulk of non-structured data. But as the article notes, less than half of an organization’s relevant structured data is actually used in decision-making. The most useful artifact that I have permanently plastered at my workplace is the graphic “The Elements of Data Strategy”, and I strongly recommend that any manager concerned with leveraging new technology to optimize data do the same. The graphic illuminates the defensive and offensive positions inherent in a cohesive data strategy leading an organization to the state: “In our experience, a more flexible and realistic approach to data and information architectures involves both a single source of truth (SSOT) and multiple versions of the truth (MVOTs). The SSOT works at the data level; MVOTs support the management of information.” Elimination of proprietary data silos, elimination of redundant data streams, and warehousing of data that is accessed using a number of analytical methods achieve the necessary state of SSOT that provides the basis for an environment supporting MVOTs.
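As a rough illustration of the SSOT and MVOT relationship quoted above, consider one validated set of records and two views derived from it. This is my own sketch, not something taken from the article: the records are notional, and the function names and the choice of views are assumptions made for the example.

```python
# Single source of truth: one validated set of records (notional data).
SSOT = (
    {"project": "P1", "month": "2024-01", "cost": 120.0, "revenue": 150.0},
    {"project": "P1", "month": "2024-02", "cost": 130.0, "revenue": 140.0},
    {"project": "P2", "month": "2024-01", "cost": 80.0, "revenue": 70.0},
)


def finance_view(records):
    """One version of the truth: margin per project."""
    margins = {}
    for r in records:
        margins[r["project"]] = margins.get(r["project"], 0.0) + r["revenue"] - r["cost"]
    return margins


def operations_view(records):
    """Another version of the truth: total cost per month."""
    totals = {}
    for r in records:
        totals[r["month"]] = totals.get(r["month"], 0.0) + r["cost"]
    return totals


finance = finance_view(SSOT)
operations = operations_view(SSOT)
```

Both views answer different management questions, yet neither keeps its own copy of the data, so they cannot drift apart the way two departmental spreadsheets do.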

The article “Why IT Fumbles Analytics” by Donald A. Marchand and Joe Peppard, from 2013, still rings true today. As with the article cited above by Wilder-James, the emphasis here is on the work necessary to ensure that new data and analytical capabilities succeed, but it shifts to “figuring out how to use the information (the new system) generates to make better decisions or gain deeper…insights into key aspects of the business.” The heart of managing the effort in providing this capability is to put into place a project organization, as well as systems and procedures, that will support the organizational transformation that will occur as a result of the explosion of new analytical capability.

The days of simply buying an off-the-shelf silo-ed “tool” and automating a specific manual function are over, especially for organizations that wish to be effective and competitive–and more profitable–in today’s data and analytical environment. A more comprehensive and collaborative approach is necessary. As with the DalleMule and Davenport article, there is a very useful graphic that contrasts traditional IT project approaches against Analytics and Big Data (or perhaps “Bigger” Data) Projects. Though the prescriptions in the article assume an earlier concept of Big Data optimization focused on non-structured data, thereby making some of these overkill, an implementation plan is essential in supporting the kind of transformation that will occur, and managers act at their own risk if they fail to take this effect into account.

All of the other articles in this OnPoint issue are of value. The bottom line, as I have written in the past, is to keep a focus on solving business challenges, rather than buying the new bright shiny object. At the same time, in today’s business environment the days when business decision-makers can afford to stay within their silo-ed comfort zone are ending very quickly, so they need to shift their attention to those solutions that address these new realities.

So why do this apart from the fancy term “data optimization”? Well, because there is a direct return-on-investment in transforming organizations and systems to data-driven ones. At the end of the day the economics win out. Thus, our organizations must be prepared to support and have a plan in place to address the core effects of new data-analytics and Big Data technology:

a. The management and organizational transformation that takes place when deploying the new technology, requiring proactive socialization of the changing environment, the teaching of new skill sets, new ways of working, and of doing business.

b. Supporting transformation from a sub-optimized silo-ed “tell me what I need to know” work environment to a learning environment, driven by what the data indicates, supporting the skills cited above that include intellectual curiosity, engaging domain expertise, and building cross-domain competencies.

c. A practical plan that teaches the organization how best to use the new capability through a practical, hands-on approach that focuses on addressing specific business challenges.

Rear View Mirror — Correcting a Project Management Fallacy

“The past is never dead. It’s not even past.” —  William Faulkner, Requiem for a Nun

Over the years I and others have briefed project managers on project performance using KPPs, earned value management, schedule analysis, business analytics, and what we now call predictive analytics. Oftentimes, some set of figures will be critiqued as being ineffective or unhelpful; that the analytics “only look in the rear view mirror” and that they “tell me what I already know.”

In approaching this critique, it is useful to understand Faulkner’s oft-cited quote above.  When we walk down a street, let us say it is a busy city street in any community of good size, we are walking in the past.  The moment we experience something it is in the past.  If we note the present condition of our city street we will see that for every building, park, sidewalk, and individual that we pass on that sidewalk, each has a history.  These structures and the people are as much driven by their pasts as their expectations for the future.

Now let us take a snapshot of our street.  In doing so we can determine population density, ethnic demographics, property values, crime rate, and numerous other indices and parameters regarding what is there.  No doubt, if we stop here we are just “looking in the rear view mirror” and noting what we may or may not know, however certain our anecdotal filter.

Now, let us say that we have an affinity for this street and may want to live there.  We will take the present indices and parameters that we noted above, which describe our geographical environment, and trend them.  We may find that housing prices are rising or falling, that crime is rising or falling, etc.  If we delve into the street’s ownership history we may find that one individual or family possesses more than one structure, or that there is a great deal of diversity.  We may find that a Superfund site is not too far away.  We may find that economic demographics are pointing to stagnation of the local economy, or that the neighborhood is becoming gentrified.  Just time-phasing and delving into history–mapping out the trends and noting the significant historical background–provides us with enough information to tell us whether our affinity is grounded in reality or practicality.

But let us say that, despite negatives, we feel that this is the next up-and-coming neighborhood.  We would need signs to make that determination.  For example, what kinds of businesses have moved into the neighborhood and what is their number?  What demographic do they target?  There are many other questions that can be asked to see if our economic analysis is valid–and that analysis would need to be informed by risk.

The fact of the matter is that we are always living with the past: the cumulative effect of the past actions of numerous individuals, including our own, and organizations, groups of individuals, and institutions; not to mention larger economic forces well beyond our control.  Any desired change in the trajectory of the system being evaluated must begin by identifying those elements that can be impacted or influenced; an analysis of the effort that must be expended to bring about the change is also essential.

This is a scientific fact, proven countless times by physics, biology, and other disciplines.  A deterministic universe, which provides for some uncertainty at any given point at our level of existence, drives the possible within very small limits of possibility and even smaller limits of probability.  What this means in plain language is that the future is usually a function of the past.

Any one number or index, no doubt, does not necessarily tell us something important.  But it could if it is relevant, material, and prompts further inquiry essential to project performance.

For example, let us look at an integrated master schedule that underlies a typical medium-sized project.

 

We will select a couple of metrics that indicate project schedule performance.  In the case below we are looking at task hits and misses and Baseline Execution Index, a popular index that determines efficiency in meeting baseline schedule planning.

Note that the chart above plots the performance over time.  What will it take to improve our efficiency?  So as a quick logic check on realism, let’s take a look at the work to date with all of the late starts and finishes.

Our bow waves track the cumulative effort to date.  As we work to clear missed starts or missed finishes in a project we also must devote resources to the accomplishment of current work that is still in line with the baseline.  What this means is that additional resources may need to be devoted to particular areas of work accomplishment or risk handling.
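For readers who want to see how such figures might be derived, here is a minimal sketch using notional tasks. It assumes one common definition of Baseline Execution Index (tasks completed as of the status date divided by tasks baselined to finish by that date), a hit rate counted as tasks finished on or before their baseline finish, and a bow wave measured as the count of overdue tasks still open. Other definitions exist, and the function and field names are hypothetical.

```python
from datetime import date

# Notional tasks: (baseline finish, actual finish or None if still open).
tasks = [
    (date(2024, 1, 10), date(2024, 1, 9)),
    (date(2024, 1, 20), date(2024, 2, 2)),
    (date(2024, 2, 5), None),
    (date(2024, 2, 15), date(2024, 2, 15)),
    (date(2024, 3, 1), date(2024, 2, 20)),
    (date(2024, 3, 20), None),
]


def schedule_metrics(tasks, status_date):
    """Compute BEI, hit rate, and outstanding backlog as of status_date."""
    # Tasks the baseline says should be finished by now.
    due = [(b, a) for b, a in tasks if b <= status_date]
    # Tasks actually finished by now, whether or not they were due yet.
    completed = [a for _, a in tasks if a is not None and a <= status_date]
    # Due tasks that finished on or before their baseline finish.
    hits = [1 for b, a in due if a is not None and a <= b]
    # Due tasks not finished as of the status date: the bow wave.
    backlog = [1 for b, a in due if a is None or a > status_date]
    return {
        "bei": len(completed) / len(due) if due else None,
        "hit_rate": len(hits) / len(due) if due else None,
        "bow_wave": len(backlog),
    }


january = schedule_metrics(tasks, date(2024, 1, 31))
february = schedule_metrics(tasks, date(2024, 2, 29))
```

In the February figures the index reads 1.0 while only half of the due tasks hit their baseline dates, because a task completed ahead of plan offsets one that is still open. That is the kind of context a single index cannot supply on its own.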

This is not, of course, the limit of the analysis that should be undertaken.  The point here is that at every point in history in every system we stand at a point of the cumulative efforts, risk, failure, success, and actions of everyone who came before us.  At the microeconomic level this is also true within our project management systems.  There are also external constraints and influences that will define the framing assumptions and range of possibilities and probabilities involved in project outcomes.

The sheer magnitude of the bow waves that we face in all endeavors will often be too great to fully overcome.  As an analogy, a bow wave in complex systems is more akin to a tsunami as opposed to the tidal waves that crash along our shores.  All of the force of all of the collective actions that have preceded present time will drive our trajectory.

This is known as inertia.

Identifying and understanding the contributors to the inertia that is driving our performance is important to knowing what to do.  Thus, looking in the rear view mirror is important, and the “rear view mirror” critique is not a valid argument for ignoring an inconvenient metric that may only require additional context.  Furthermore, knowing where we sit is important and not insignificant.  Knowing the factors that put us where we are–and the effort that it will take to influence our destiny–will guide what is possible and not possible in our future actions.

Note:  All charted data is notional and is not from an actual project.

Like Tinker to Evers to Chance: BI to BA to KDD

It’s spring training time in sunny Florida, as well as other areas of the country with mild weather and baseball.  For those of you new to the allusion, it comes from a poem by Franklin Pierce Adams and is also known as “Baseball’s Sad Lexicon”.  Tinker, Evers, and Chance were the double play combination of the 1910 Chicago Cubs (shortstop, second base, and first base).  Because of their effectiveness on the field these Cubs players were worthy opponents of the old New York Giants, for whom Adams was a fan, and who were the kings of baseball during most of the first fifth of a century of the modern era (1901-1922).  That is, until they were suddenly overtaken by their crosstown rivals, the Yankees, who came to dominate baseball for the next 40 years, beginning with the arrival of Babe Ruth.

The analogy here is that the Cubs infielders, while individuals, didn’t think of their roles as completely separate.  They had common goals and, in order to win on the field, needed to act as a unit.  In the case of executing the double play, they were a very effective unit.  So why do we have these dichotomies in information management when the goals are the same?

Much has been written both academically and commercially about Business Intelligence, Business Analytics, and Knowledge Discovery in Databases.  I’ve surveyed the literature, for good and bad, and what I find is that these terms are thrown around, mostly by commercial firms in either information technology or consulting, all with the purpose of attempting to provide a discriminator for their technology or service.  Many times the concepts are used interchangeably, or one is set up as a strawman to push an agenda or product.  Thus, it seems some hard definitions are in order.

According to Techopedia:

Business Intelligence (BI) is the use of computing technologies for the identification, discovery and analysis of business data – like sales revenue, products, costs and incomes.

Business analytics (BA) refers to all the methods and techniques that are used by an organization to measure performance. Business analytics are made up of statistical methods that can be applied to a specific project, process or product. Business analytics can also be used to evaluate an entire company.

Knowledge Discovery in Databases (KDD) is the process of discovering useful knowledge from a collection of data. This widely used data mining technique is a process that includes data preparation and selection, data cleansing, incorporating prior knowledge on data sets and interpreting accurate solutions from the observed results.

As with much of computing in its first phases, these functions were seen to be separate.

BI, based largely on the manner in which it has been implemented in its first incarnations, is perceived as a means of gathering data into relational data warehouses or data marts and then building out decision support systems.  These methods have usually involved a great deal of overhead in both computing and personnel, since practical elements of gathering, sorting, and delivering data involved additional coding and highly structured user interfaces.  The advantage of BI is its emphasis on integration.  The disadvantage, from the enterprise perspective, is that the method and mode of implementation are phlegmatic at best.

BA is BI’s younger cousin.  Applications were developed and sold as “analytical tools” focused on a niche of data within the enterprise’s requirements.  In this manner decision makers could avoid having to wait for the overarching and ponderous BI system to get to their needs, if ever.  This led many companies to knit together specialized tools in so-called “best-of-breed” configurations to achieve some measure of integration across domains.  Of course, given the plethora of innovative tools, much data import and reconciliation has had to be inserted into the process.  Thus, the advantages of BA in the market have been to reward innovation and focus on the needs of the domain subject matter expert (SME).  The disadvantages are the insertion of manual intervention in an automated process due to lack of integration, which is further exacerbated by so-called SMEs in data reconciliation–a form of rent seeking behavior that only rewards body shop consulting, unnecessarily driving up overhead.  The panacea applied to this last disadvantage has been the adoption of non-proprietary XML schemas across entire industries that reduce both the overhead and data silos found in the BA market.
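To show why a shared schema removes much of the manual reconciliation, here is a small sketch. The XML layout below is hypothetical and is not any actual industry schema; the assumption is simply that every compliant tool exports tasks in the same agreed shape, so one loader serves them all.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared schema: any compliant tool exports tasks in this shape.
EXPORT = """
<schedule>
  <task id="T1"><name>Design</name><percentComplete>100</percentComplete></task>
  <task id="T2"><name>Build</name><percentComplete>40</percentComplete></task>
</schedule>
"""


def load_tasks(xml_text):
    """Parse an export in the shared schema into plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "id": task.get("id"),
            "name": task.findtext("name"),
            "percent_complete": float(task.findtext("percentComplete")),
        }
        for task in root.iter("task")
    ]


tasks = load_tasks(EXPORT)
```

With a proprietary format, each new tool needs its own import routine and its own round of reconciliation. With an agreed schema, the loader is written once and the SME's time goes to analysis rather than data matching.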

KDD is both our oldster and youngster–grandpa and the grandson hanging out.  It is a term that describes a necessary function of insight–allowing one to determine what the data tells us is needed for analytics rather than relying on a “canned” solution to determine how to approach a particular set of data.  But it does so, oftentimes, using an older approach that predates BI, known as data mining.  You will often find KDD linked to arguments in favor of flat file schemas, NoSQL (meaning flat non-relational databases), and free use of the term Big Data, which is becoming more meaningless each year that it is used, given Moore’s Law.  The advantage of KDD is that it allows for surveying across datasets to pick up patterns and interrelationships within our systems that are otherwise unknown, particularly given the way in which the human mind can fool itself into reifying an invalid assumption.  The disadvantage, of course, is that KDD will have us go backward in terms of identifying and categorizing data by employing Data Mining, which is an older concept from early in computing in which a team of data scientists and data managers develop solutions to identify, categorize, and use that data–manually doing what automation was designed to do.  Understanding these limitations, companies focused on KDD have developed heuristics (cognitive computing) that identify patterns and possible linkages, removing a portion of the overhead associated with Data Mining.

Keep in mind that you never get anything for nothing–the Second Law of Thermodynamics ensures that energy must be borrowed from somewhere in order to produce something–and its corollaries place limits on expected efficiencies.  While computing itself comes as close to providing us with Maxwell’s Demon as any technology, even in this case entropy is being realized elsewhere (in the software developer and the hardware manufacturing process), even though it is not fully apparent in the observed data processing.

Thus, manual effort must be expended somewhere along the way.  In any event, all of these methods are addressing the same problem–the conversion of data into information.  It is information that people can consume, understand, place into context, and act upon.

My colleague Dave Gordon has pointed out to me several times that additional methods have also been developed across all of these approaches to make our use of data more effective.  These include more powerful APIs, the aforementioned cognitive computing, and searching based on the anticipated questions of the user, as is done by search engines.

Technology, however, is moving very rapidly and so the lines between BI, BA and KDD are becoming blurred.  Fourth generation technology that leverages API libraries to be agnostic to underlying data, together with flexible and adaptive UI technology, can provide a comprehensive systemic solution that brings together the goals of these approaches to data.  With the ability to leverage internal relational database tools and flat schemas for non-relational databases, the application layer, which is oftentimes a barrier to delivery of information, becomes open as well, putting the SME back in the driver’s seat.  Being able to integrate data across domain silos provides insight into systems behavior and performance not previously available with “canned” applications written to handle and display data a particular way, opening up knowledge discovery in the data.
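To make the idea of integrating data across domain silos a little more concrete, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the two silos (schedule and cost), the shared task key, and the field names are hypothetical and do not describe any particular product.

```python
from collections import defaultdict

# Hypothetical records from two separate domain silos, keyed by a shared task id.
schedule_silo = [
    {"task": "T1", "planned_days": 10, "actual_days": 12},
    {"task": "T2", "planned_days": 5, "actual_days": 5},
]
cost_silo = [
    {"task": "T1", "budget": 1000.0, "actual_cost": 1300.0},
    {"task": "T2", "budget": 500.0, "actual_cost": 450.0},
]


def integrate(*silos):
    """Merge records from any number of silos into one combined view per task."""
    merged = defaultdict(dict)
    for silo in silos:
        for record in silo:
            merged[record["task"]].update(record)
    return dict(merged)


def late_and_over_budget(view):
    """Return task ids that are both behind schedule and over budget."""
    return sorted(
        task
        for task, r in view.items()
        if r.get("actual_days", 0) > r.get("planned_days", 0)
        and r.get("actual_cost", 0) > r.get("budget", 0)
    )


view = integrate(schedule_silo, cost_silo)
flagged = late_and_over_budget(view)
print(flagged)
```

Neither silo can answer the combined question on its own; it only becomes a one-line query once the records share a key and sit in one view.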

What this means practically is that those organizations that are sensitive to these changes will understand the practical application of sunk cost when it comes to aging systems being provided by ponderous behemoths that lack agility in their ability to introduce more flexible, less costly, and lower overhead software technologies.  It means that information management can be democratized within the organization among the essential consumers and decision makers.

Productivity and effectiveness are the goals.

Something New (Again)– Top Project Management Trends 2017

Atif Qureshi at Tasque, whose work I learned of via Dave Gordon’s blog, went out to LinkedIn’s Project Management Community to ask for the latest trends in project management.  You can find the raw responses to his inquiry at his blog here.  What is interesting is that some of these latest trends are much like the old trends, which, given continuity, makes sense.  But it is instructive to summarize the ones that came up most often.  Note that while Mr. Qureshi was looking for ten trends, and taken together he definitely lists more than ten, there is a lot of overlap.  In total the major issues seem to be the five areas listed below.

a.  Agile, its hybrids, and its practical application.

It should not surprise anyone that the latest buzzword is Agile.  But what exactly is it in its present incarnation?  There is a great deal of rising criticism, much of it valid, that it is a way for developers and software PMs to avoid accountability.  Anyone reading Glen Alleman’s Herding Cats blog is aware of the issues regarding #NoEstimates advocates.  As a result, there are a number of hybrid implementations of Agile that have Agile purists howling and non-purists adapting as they always do.  From my observations, however, there is an Ur-Agile out there that is common to all good implementations, and I wrote about it previously in this blog back in 2015.  Given the time that has passed, I think it useful to repeat it here.

The best articulation of Agile that I have read recently comes from Neil Killick, with whom I have expressed some disagreement on the #NoEstimates debate and the more cultish aspects of Agile in past posts, but who published an excellent post back in July (2015) entitled “12 questions to find out: Are you doing Agile Software Development?”

Here are Neil’s questions:

  1. Do you want to do Agile Software Development? Yes – go to 2. No – GOODBYE.
  2. Is your team regularly reflecting on how to improve? Yes – go to 3. No – regularly meet with your team to reflect on how to improve, go to 2.
  3. Can you deliver shippable software frequently, at least every 2 weeks? Yes – go to 4. No – remove impediments to delivering a shippable increment every 2 weeks, go to 3.
  4. Do you work daily with your customer? Yes – go to 5. No – start working daily with your customer, go to 4.
  5. Do you consistently satisfy your customer? Yes – go to 6. No – find out why your customer isn’t happy, fix it, go to 5.
  6. Do you feel motivated? Yes – go to 7. No – work for someone who trusts and supports you, go to 2.
  7. Do you talk with your team and stakeholders every day? Yes – go to 8. No – start talking with your team and stakeholders every day, go to 7.
  8. Do you primarily measure progress with working software? Yes – go to 9. No – start measuring progress with working software, go to 8.
  9. Can you maintain pace of development indefinitely? Yes – go to 10. No – take on fewer things in next iteration, go to 9.
  10. Are you paying continuous attention to technical excellence and good design? Yes – go to 11. No – start paying continuous attention to technical excellence and good design, go to 10.
  11. Are you keeping things simple and maximising the amount of work not done? Yes – go to 12. No – start keeping things simple and writing as little code as possible to satisfy the customer, go to 11.
  12. Is your team self-organising? Yes – YOU’RE DOING AGILE SOFTWARE DEVELOPMENT!! No – don’t assign tasks to people and let the team figure out together how best to satisfy the customer, go to 12.
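
The checklist above reads as a small flowchart, and it can be sketched as one.  What follows is only my own illustration in Python, not anything Neil published: the jump targets are transcribed from the list as reproduced here, while the exit labels, the `walk` function, and the `max_steps` cutoff are assumptions added so that the example terminates.

```python
# Question number -> (target on Yes, target on No), transcribed from the list above.
# "DONE" and "GOODBYE" are hypothetical labels for the two exits.
FLOW = {
    1: (2, "GOODBYE"),
    2: (3, 2),
    3: (4, 3),
    4: (5, 4),
    5: (6, 5),
    6: (7, 2),
    7: (8, 7),
    8: (9, 8),
    9: (10, 9),
    10: (11, 10),
    11: (12, 11),
    12: ("DONE", 12),
}


def walk(answer, max_steps=100):
    """Follow the flow from question 1.

    `answer(question, visit_count)` returns True for Yes and False for No.
    Returns the outcome and the list of questions visited, in order.
    """
    visits = {}
    path = []
    q = 1
    for _ in range(max_steps):
        path.append(q)
        visits[q] = visits.get(q, 0) + 1
        yes_target, no_target = FLOW[q]
        q = yes_target if answer(q, visits[q]) else no_target
        if q in ("DONE", "GOODBYE"):
            return q, path
    # The team keeps answering No somewhere and has not exited yet.
    return "STILL IMPROVING", path


# A team that answers No to question 6 the first time, then Yes to everything.
outcome, path = walk(lambda q, n: not (q == 6 and n == 1))
print(outcome, path)
```

Laid out this way, the only exits are at the first and last questions; every other No sends the team back to repeat a question until the answer changes.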

Note that even in software development based on Agile you are still “provid(ing) value by independently developing IP based on customer requirements.”  Only you are doing it faster and more effectively.

With the possible exception of the “self-organizing” meme, I find that items 1 through 11 are valid ways of identifying Agile.  Note, however, that the list says nothing about establishing closed-loop analysis of progress, and nothing about estimates or the need to monitor progress, especially on complex projects.  As a matter of fact, one of the biggest impediments noted elsewhere in industry is the inability of Agile to scale.  This limitation exists in its most simplistic form because Agile is fine in the development of well-defined, limited COTS applications and smartphone applications.  It doesn’t work so well when one is pushing technology while developing software, especially for a complex project involving hundreds of stakeholders.  One other note–the unmentioned emphasis in Agile is technical performance measurement, since progress is based on satisfying customer requirements.  TPM, when placed in the context of a world of limited resources, is the best measure of all.

b.  The integration of new technology into PM and how to upload the existing PM corporate knowledge into that technology.

These are two sides of the same coin.  There is always debate about the introduction of new technologies within an organization, and this debate places in stark contrast the differences between risk aversion and risk management.

Project managers, especially in the complex project management environment of aerospace & defense, tend in general to be a hardy lot.  Consisting mostly of engineers, they love to push the envelope on technology development.  But there is also a stripe of engineers among them who do not apply this same approach of measured risk to their project management and business analysis systems.  When it comes to tracking progress, resource management, programmatic risk, and accountability they frequently enter the risk aversion mode–believing that the fewer eyes on what they do the more leeway they have in achieving the technical milestones.  No doubt this is true in a world of unlimited time and resources, but that is not the world in which we live.

Aside from sub-optimized self-interest, the seeds of risk aversion come from the fact that many of the disciplines developed around performance management originated in the financial management community, and many organizations still come at project management efforts from the perspective of the CFO organization.  Such rice bowl mentality, however, works against both the project and the organization.

Much has been made of the wall of honor for those CIA officers that have given their lives for their country, which lies to the right of the Langley headquarters entrance.  What has not gotten as much publicity is the verse inscribed on the wall to the left:

“And ye shall know the truth and the truth shall make you free.”

      John VIII-XXXII

In many ways those of us in the project management community apply this creed to the best of our ability to our day-to-day jobs, and it lies at the base of all management improvement, from Deming’s concept of continuous process improvement through the application of Six Sigma and other management improvement methods.  What is not part of this concept is that one will apply improvement only when a customer demands it, though they have asked politely for some time.  The more information we have about what is happening in our systems, the better the project manager and the project team are armed to apply the expertise which qualified the individuals for their jobs to begin with.

When it comes to continual process improvement one does not need to wait to apply those technologies that will improve project management systems.  As a senior manager (and well-respected engineer) told me when I worked in the Navy: “if my program managers are doing their job virtually every element should be in the yellow, for only then do I know that they are managing risk and pushing the technology.”

But there are some practical issues that all managers must consider when managing the risks in introducing new technology and determining how to bring that technology into existing business systems without completely disrupting the organization.  This takes good project management practices that, for information systems, include good initial systems analysis, identification of those small portions of the organization ripe for initial entry in piloting, and a plan of data normalization and rationalization so that corporate knowledge is not lost.  Adopting more open systems that militate against proprietary barriers also helps.

c.  The intersection of project management and business analysis and its effects.

As data becomes more transparent through methods of normalization and rationalization–and the focus shifts from “tools” to the knowledge that can be derived from data–the clear separation that delineated project management from business analysis in line-and-staff organization becomes further blurred.  Even within the project management discipline, the separation of schedule analysts from cost analysts from financial analysts is becoming an impediment to fully exploiting the advantages of looking at all of the data that is captured and that affects project performance.

d.  The manner of handling Big Data, business intelligence, and analytics that result.

Software technologies are rapidly developing that break the barriers of self-contained, hard-coded applications, which perform one or two focused operations, or a highly restricted group of operations, in support of a single business process or a limited set of them.  These new technologies, as stated in the previous section, allow users to focus on access to data, making the interface between the user and the application highly adaptable and customizable.  As these technologies are deployed against larger datasets that allow for integration of data across traditional line-and-staff organizations, they will provide insight that will garner businesses competitive advantages and productivity gains against their contemporaries.  Because of these technologies, highly labor-intensive data mining and data engineering projects that were thought to be necessary to access Big Data will find themselves displaced as their cost and lack of agility are exposed.  Internal or contracted-out custom software development devoted along these same lines will also be displaced, just as COTS has displaced the high overhead associated with these efforts in other areas.  This is due to the fact that hardware and process developments are constantly shifting the definition of “Big Data” to larger and larger datasets, to the point where the term will soon have no practical meaning.

e.  The role of the SME given all of the above.

The result of the trends regarding technology will be to put the subject matter expert back into the driver’s seat.  Given adaptive technology and data–and a redefinition of the analyst’s role to a more expansive one–we will find that the ability to meet the needs of functionality and the user experience is almost immediate.  Thus, when it comes to business and project management systems, while these developments reinforce the role of Agile and make real the characteristics that I outlined above, the weakness of its applicability to more complex and technical projects is also revealed.  It is technology that will reduce the risk associated with contract negotiation, processes, documentation, and planning.  Walking away from these necessary components of project management obfuscates and avoids the hard facts that oftentimes must be addressed.

One final item that Mr. Qureshi mentions in a follow-up post–and which I have seen elsewhere in similar forums–concerns operational security.  In deployment of new technologies a gatekeeper must be aware of whether that technology will open the organization’s corporate knowledge to compromise.  Given the greater and more integrated information and knowledge garnered by new technology, it is incumbent on us as good managers to ensure that these improvements do not translate into undermining the organization.

The Future — Data Focus vs. “Tools” Focus

The title in this case is from the Leonard Cohen song.

Over the last few months I’ve come across this issue quite a bit and it goes to the heart of where software technology is leading us.  The basic question that underlies this issue can be boiled down into the issue of whether software should be thought of as a set of “tools” or an overarching solution that can handle data in a way that the organization requires.  It is a fundamental question because what we call Big Data–despite all of the hoopla–is really a relative term that changes with hardware, storage, and software scalability.  What was Big Data in 1997 is not Big Data in 2016.

As Moore’s Law expands scalability at lower cost, organizations and SMEs are finding that the dedicated software tools at hand are insufficient to leverage the additional information that can be derived from that data.  The reason for this is simple.  A COTS tools publisher will determine the functionality required based on a structured set of data that is to be used and code to that requirement.  The timeframe is usually extended and the approach highly structured.  There are very good reasons for this approach in particular industries where structure is necessary and the environment is fairly stable.  The list of industries that fall into this category is rapidly becoming smaller.  Thus, there is a large gap that must be filled by workarounds, custom code, and suboptimized use of Excel.  Organizations and people cannot wait until the self-styled software SMEs get around to providing that upgrade two years from now so that people can do their jobs.

Thus, the focus must be shifted to data and the software technologies that maximize its immediate exploitation for business purposes to meet organizational needs.  The key here is the rise of Fourth Generation applications that leverage object-oriented programming languages that most closely replicate the flexibility of open source.  What this means is that in lieu of buying a set of “tools”–each focused on solving a specific problem, stitched together by a common platform or through data transfer–software that deals with both data and UI in an agnostic fashion is now available.

The availability of flexible Fourth Generation software is of great concern, as one would imagine, to incumbents who have built their business model on defending territory based on a set of artifacts provided in the software.  Oftentimes these artifacts are nothing more than automatically filled in forms that previously were filled in manually.  That model was fine during the first and second waves of automation from the 1980s and 1990s, but such capabilities are trivial in 2016 given software focused on data that can be quickly adapted to provide functionality as needed.  What this development also does is eliminate and make trivial those old checklists that IT shops used to send out in a lazy way of assessing relative capabilities of software to simplify the competitive range.

Tools restrict themselves to a subset of data by definition to provide a specific set of capabilities.  Software that expands to include any set of data and allows that data to be displayed and processed as necessary through user configuration adapts itself more quickly and effectively to organizational needs.  It also tends to eliminate the need for multiple “best-of-breed” toolset approaches that are not the best of any breed but, more importantly, it goes beyond the limited functionality and ways of deriving importance from data found in structured tools.  The reason for this is that the data drives what is possible and important, rather than tools imposing a well-trod interpretation of importance based on a limited set of data stored in a proprietary format.

An important effect of Fourth Generation software that provides flexibility in UI and functionality driven by the user is that it puts the domain SME back in the driver’s seat.  This is an important development.  For too long SMEs have had to content themselves with recommending and advocating for functionality in software while waiting for the market (software publishers) to respond.  Essential business functionality with limited market commonality often required that organizations either wait until the remainder of the market drove software publishers to meet their needs, finance expensive custom development (either organic or contracted), or fill gaps with suboptimized and ad hoc internal solutions.  With software that adapts its UI and functionality based on any data that can be accessed, using simple configuration capabilities, SMEs can fill these gaps with a consistent solution that maintains data fidelity and aids in the capture and sustainability of corporate knowledge.

Furthermore, for all of the talk about Agile software techniques, one cannot implement Agile using software languages and approaches that were designed in an earlier age and that resist optimization of the method.  Fourth Generation software lends itself most effectively to Agile since configuration using simple object-oriented language gets us to the ideal–without a reliance on single points of failure–of releasable solutions at the end of a two-week sprint.  No doubt there are developers out there making good money who may challenge this assertion, but they are the exceptions to the rule that prove the point.  An organization should be able to optimize the pool of contributors to solution development and rollout in supporting essential business processes.  Otherwise Agile is just a pretext to overcome suboptimized developmental approaches, software languages, and the self-interest of developers who can’t plan or produce a releasable product in a timely manner within budgetary constraints.

In the end the change in mindset from tools to data goes to the issue of who owns the data: the organization that creates and utilizes the data (the customer), or the proprietary software tool publishers?  Clearly the economics will win out in favor of the customer.  It is time to displace “tools” thinking.

Note:  I’ve revised the title of the blog for clarity.

Walk This Way — DoD IG Reviews DCMA Contracting Officer Business Systems Deficiencies

The sufficiency and effectiveness of business systems is an essential element in the project management ecosystem.  Far beyond performance measurement of the actual effort, the sufficiency of the business systems to support the effort is essential to its success.  If the systems in place do not properly track and record the transactions behind the work being performed, the credibility of the data is called into question.  Furthermore, support and logistical systems, such as procurement, supply, and material management, contribute in a very real way to work accomplishment.  If that spare part isn’t in-house on time, the work stops.

In catching up on reading this month, I found that the DoD Inspector General issued a report on October 1 showing that, of 21 audits demonstrating business system deficiencies, contracting officers failed to meet DFARS deadlines at various milestones in every case.  For example, in 17 of those cases Contracting Officers did not issue final determination letters within 30 days of the report as required by the DFARS.  In eight cases required withholds were not assessed.
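
The timeliness rule itself is simple to express.  As a rough sketch only, here is the 30-day check in Python; the 30-day window comes from the report as described above, while the record layout, the sample dates, and the choice to count calendar days are my own assumptions and not figures from the IG.  It covers nothing more than the deadline arithmetic.

```python
from datetime import date, timedelta

# The 30-day window is from the text; counting calendar days is an assumption.
FINAL_DETERMINATION_DAYS = 30


def overdue_determinations(audits, today):
    """Return ids of audits whose final determination was issued late,
    or is still missing after the deadline has passed."""
    late = []
    for audit in audits:
        deadline = audit["report_date"] + timedelta(days=FINAL_DETERMINATION_DAYS)
        issued = audit.get("determination_date")
        if issued is None:
            if today > deadline:
                late.append(audit["id"])
        elif issued > deadline:
            late.append(audit["id"])
    return late


# Hypothetical sample data, invented for illustration.
audits = [
    {"id": "A-1", "report_date": date(2015, 1, 5), "determination_date": date(2015, 1, 30)},
    {"id": "A-2", "report_date": date(2015, 1, 5), "determination_date": date(2015, 3, 1)},
    {"id": "A-3", "report_date": date(2015, 2, 1), "determination_date": None},
]
late = overdue_determinations(audits, today=date(2015, 4, 1))
print(late)
```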

For those of you who are unfamiliar with the six business systems assessed under DoD contractor project management, they consist of accounting, estimating, material management, purchasing, earned value management, and government property.  The greater the credibility and fidelity of these systems, the greater the level of confidence that the government can have in the data received in reporting on execution of public funds under these contracts.

To a certain extent the deadlines under the DFARS are so tightly scheduled that they fail to take into account normal delays in operations.  Heaven forbid that the Contracting Officer may be on leave when the audit is received or is engaged in other detailed negotiations.  In recent years the contracting specialty within the government, like government in general, has been seriously understaffed, underfunded, and unsupported.  Given that oftentimes the best and the brightest soon leave government service for greener pastures in the private sector, what is often left are inexperienced and overworked (though mostly dedicated) personnel who do not have the skills or the time to engage in systems thinking in approaching noted deficiencies in these systems.

This pressure for staff reduction, even in areas that have been decimated by austerity politics, is significant.  In the report I could not help but shake my head when an Excel spreadsheet was identified as the “Contractor Business System Determination Timeline Tracking Tool.”  This reminds me of my initial assignment as a young Navy officer and my first assignment as a contract negotiator where I also performed collateral duties in building simple automated tools.  (This led to me being assigned later as the program manager of the first Navy contract and purchase order management system.) That very first system that I built, however, was tracking contract milestone deadlines.  It was done in VisiCalc and the year was 1984.

That a major procurement agency of the U.S. Department of Defense is still using a simple and ineffective spreadsheet tracking “tool” more than 30 years after my own experience is both depressing and alarming.  There is a long and winding history on why they would find themselves in this condition, but some additional training, which was the agency’s response to the IG, is not going to solve the problem.  In fact, such an approach is so ineffective it’s not even a Band-Aid.  It’s a bureaucratic function of answering the mail.

The reason why it won’t solve the problem is that there is no magic wand to get those additional contract negotiators and contracting officers in place.  The large intern program of recruiting young people from colleges to grow talent and provide people with a promising career track is long gone.  The interdisciplinary and cross-domain expertise required in today’s world to reflect the new realities when procuring products and services is not in the works.  In places where it is being attempted, outmoded personnel classification systems based on older concepts of division of labor stand in the way.

The list of systemic causes could go on, but in the end it’s not in the DCMA response because no one cares, and if they do care, they can’t do anything about it.  It’s not as if “BEST TALENT LEAVES DUE TO PUBLIC HOSTILITY TO PUBLIC SERVICE”  was a headline of any significance.  The Post under Bezos is not going to run that one anytime soon, though we’ve been living under it since 1981.  The old “thank you for your service” line for veterans has become a joke.  Those who use this line might as well say what that really means, which is: “I’m glad it was you and not me.”

The only realistic way to augment an organization in this state in order to break the cycle is to automate the system–and to do it in such a way as to tie together the entire system.  When I run into my consulting friends and colleagues and they repeat the mantra: “software doesn’t matter, it’s all based on systems” I can only shake my head.  I have learned to be more tactful.

In today’s world software matters.  Try doing today what we used to do with slide rules, scientific calculators, and process charts absent software.  Compare organizations that use the old division-of-labor, “best of breed” tool concept against those who have integrated their systems and use data across domains effectively.  Now tell me again why “software doesn’t matter.”  Not only does it matter but “software” isn’t all the same.  Some “software” consists of individual apps that do one thing.  Some “software” is designed to address enterprise challenges.  Some “software” is designed not only to address enterprise challenges, but also to maximize the value in enterprise data.

In the case of procurement and business systems assessment, the only path forward for the agency will be to apply data-driven measures to the underlying systems and tie those assessments into a systemic solution that includes the contracting officers, negotiators, administrators, contracting officer representatives, the auditors, analysts, and management.  One can see, just in writing one line, how much more complex are the requirements for the automated panacea to replace “Contractor Business System Determination Timeline Tracking Tool.”  Is there any question why the “tool” is ineffective?

If this were the 1990s, though the practice still persists, we would sit down, perform systems analysis, outline the systems and subsystem solutions, and then through various stages of project management, design the software system to reflect the actual system in place as if organizational change did not exist.  This is the process that has a 90% failure rate across government and industry.  The level of denial of this figure is so great that I run into IT managers and CIOs every day who either do not know it or, if they do, believe that it will not apply to them–and these are brilliant people.  It is selection bias and optimism, with a little (or a lot) of narcissism, run amok.  The physics and math on this are so well documented that you might as well take your organization’s money and go to Vegas with it.  Your local bookie could give you better odds.

The key is risk handling (not the weasel word “management,” not “mitigation” since some risks must simply be accepted, and certainly not the unrealistic term “avoidance”), and the deployment of technology that provides at least a partial solution to the entire problem, augmented by incremental changes to incorporate each system into the overall solution.  For example, DeLong and Froomkin’s seminal paper on what they called “The Next Economy” holds true today.  The lack of transparency in software technologies requires a process whereby the market is surveyed, vendors go through a series of assessments and demonstration tests, and the selected technology then goes through stage gates: proof-of-concept, pilot, and, eventually, deployment.  Success at each level gets rewarded with proceeding to the next step.

Thus, ideally the process includes introducing into the underlying functionality the specific functionality required by the organization through Agile processes where releasable versions of the solution are delivered at the end of each sprint.  One need not be an Agile Cultist to do this.  In my previous post I referred to Neil Killick’s simple checklist for whether you are engaged in Agile.  It is the best and most succinct distillation of both the process and value inherent in Agile that I have found to date, with all of the “woo-woo” taken out.  For an agency as Byzantine as DCMA, this is really the only realistic and effective approach.

DCMA is an essential agency in DoD acquisition management, but it cannot do what it once did under a more favorable funding environment.  To be frank, it didn’t even do its job all that well when a more favorable condition was in place, though things were better.  But this is also a factor in why it finds itself in its current state.  It was punished for its transgressions, perhaps too much.  Several waves of personnel cuts, staff reductions, and domain and corporate knowledge loss on top of the general trend have created an agency in a condition of siege.  As with any organization under siege, backbiting and careerism among those few remaining are rewarded.  Iconoclasts and thought leaders stay for a while before being driven away.  They are seen as being too risky.

This does not create a condition for an agency ready to accept or quickly execute change through new technology.  What it does do is allow portions of the agency to engage in cargo cult change management.  That is, it has the appearance of change but keeps self-interest comfortable and change in its place.  Over time–several years–with the few remaining resources committed to this process, they will work the “change.”  Eventually, they may even get something tangible, though suboptimized to conform to rice bowls; preferably after management has their retirement plans secured.

Still, the reality is that DCMA must be made to do its job because it is in the best interests of the U.S. Department of Defense.  The panacea will not be found through “collaboration” with industry, which consists of the companies which DCMA is tasked with overseeing and regulating.  We all know how well deregulation and collaboration have worked in the financial derivatives, banking, mortgage, and stock markets.  Nor will it come from organic efforts within an understaffed and under-resourced agency that will be unable to leverage the best and latest technology solutions under the unforgiving math of organic IT failure rates.  Nor will it come from the long outmoded approach of deploying suboptimized “tools” to address a particular problem.  The proper solution is to leverage effective COTS solutions that facilitate the challenge of systems integration and thinking.