Both Sides Now — The Value of Data Exploration

Over the last several months I have authored a number of stillborn articles that just did not live up to the standards that I set for this blog site. After all, sometimes we just have nothing important to add to the conversation. In a world dominated by narcissism, it is not necessary to constantly have something to say. Some reflection and consideration are necessary, especially if one is to be as succinct as possible.

A quote ascribed to Woodrow Wilson, which may be apocryphal, though it does appear in two of his biographies, was in response to being lauded by someone for making a number of short, succinct, and informative speeches. When asked how he was able to do this, President Wilson is supposed to have replied:

“It depends. If I am to speak ten minutes, I need a week for preparation; if fifteen minutes, three days; if half an hour, two days; if an hour, I am ready now.”

An undisciplined mind has a lot to say about nothing in particular, with varying degrees of fidelity to fact or truth. In normal conversation we most often free ourselves from the discipline expected of more rigorous thinking. This is not necessarily a bad thing if we are saying nothing of consequence, and there are gradations, of course. Even the most disciplined mind gets things wrong. We all need editors and fact checkers.

While I am pulling forth possibly apocryphal quotes, the one most applicable that comes to mind is the comment by Hemingway as told by his deckhand in Key West and Cuba, Arnold Samuelson. Hemingway was supposed to have given this advice to the aspiring writer:

“Don’t get discouraged because there’s a lot of mechanical work to writing. There is, and you can’t get out of it. I rewrote the first part of A Farewell to Arms at least fifty times. You’ve got to work it over. The first draft of anything is shit. When you first start to write you get all the kick and the reader gets none, but after you learn to work it’s your object to convey everything to the reader so that he remembers it not as a story he had read but something that happened to himself.”

Though it deals with fiction, Hemingway’s advice applies to any sort of writing and rhetoric. Dr. Roger Spiller, who more than anyone mentored me as a writer and historian, once told me, “Writing is one of those skills that, with greater knowledge, becomes harder rather than easier.”

As a result of some reflection over the last few months, I had to revisit the reason for the blog. Its purpose remains the same: it is a way to validate ideas and hypotheses with other professionals and interested amateurs in my areas of interest. I try to keep uninformed opinion in check, as all too many blogs turn out to be rants. Thus, a great deal of research goes into each of these posts, most of it from primary sources and from interactions with practitioners in the field. Opinions and conclusions are my own, my reasoning, for good or ill, is exposed for all the world to see, and I take responsibility for both.

This being said, part of my recent silence has also been due to my workload: the effort involved in my day job of running a technology company, and my recent role, since late last summer, as Managing Editor of the College of Performance Management’s publication, the Measurable News. Our emphasis in the latter case has been to find new contributions to the literature regarding business analytics and to define the concept of integrated project, program, and portfolio management. Stepping slightly over the line to make a pitch, I recommend that anyone interested in contributing to the publication submit an article. The submission guidelines can be found here.

Both Sides Now: New Perspectives

That out of the way, I recently saw, again on the small screen, the largely underrated movie about Neil Armstrong and the Apollo 11 moon landing, “First Man”, and was struck by this scene:

Unfortunately, the first part of the interview has been edited out of this clip and I cannot find the full scene. When asked “why space,” he prefaces his comments by stating that the atmosphere of the earth seems so large from the perspective of looking at it from the ground but that, having touched the edge of space in his experience as a test pilot of the X-15, he learned that it is actually very thin. He then goes on to posit that looking at the earth from space will give us a new perspective. His conclusion to this observation is provided in the clip.

Armstrong’s words were prophetic in that the space program provided a new perspective and a new way of looking at things that were in front of us the whole time. Our spaceship Earth is a blue dot in a sea of space and, at least for a time, the people of our planet came to understand both our loneliness in space and our interdependence.

Earth from Apollo 8. Photo courtesy of NASA.


The impact of the Apollo program resulted in great strides being made in environmental and planetary sciences, geology, cosmology, biology, meteorology, and in day-to-day technology. The immediate effect was to inspire the environmental and human rights movements, among others. All of these advances taken together represent a new revolution in thought equal to that during the initial Enlightenment, one that is not yet finished despite the headwinds of reaction and recidivism.

It’s Life’s Illusions I Recall: Epistemology–Looking at and Engaging with the World

In his book Darwin’s Dangerous Idea, Daniel Dennett posited that what was “dangerous” about Darwinism is that it acts as a “universal acid” that, when touching other concepts and traditions, transforms them in ways that change our world-view. I have accepted this position by Dennett through the convincing argument he makes and the evidence in front of us, and it is true that Darwinism–the insight in the evolution of species over time through natural selection–has transformed our perspective of the world and left the old ways of looking at things both reconstructed and unrecognizable.

In his work Time’s Arrow, Time’s Cycle, Stephen Jay Gould noted that Darwinism is part of one of the three great reconstructions of human thought in which, quoting Sigmund Freud, “Humanity…has had to endure from the hand of science…outrages upon its naive self-love.” These outrages include the Copernican revolution, which removed the Earth from the center of the universe; Darwinism and the origin of species, including the descent of humanity; and what John McPhee coined as the concept of “deep time.”

But–and there is a “but”–I would propose that Darwinism and the other great reconstructions noted are but different ingredients of a larger and broader, though compatible, innovation in the way the world is viewed and approached–a more powerful universal acid. That innovation in thought is empiricism.

It is this approach to understanding that eats through the many ills of human existence that lead to self-delusion and folly. Though you may not know it, if you are in the field of information technology or any of the sciences, you are part of this way of viewing and interacting with the world. Married with rational thinking, this epistemology–beginning with Charles Sanders Peirce, whose perspective was informed by astronomical observations of planets and other heavenly bodies, and refined by William James, John Dewey, and others–has come down to us as what is known as Pragmatism. (Note that the word pragmatism in this context is not the same as the more colloquial form of the word; for this reason Peirce preferred the term “pragmaticism.”) For an interesting and popular account of the development of modern thought and of Pragmatism, written for the general reader, I highly recommend the Pulitzer Prize-winning The Metaphysical Club by Louis Menand.

At the core of this form of empiricism is the idea that the collection of data–recording, observing, and documenting the universe and nature as they are–will lead us to an understanding of things that we otherwise would not see. In our more mundane systems, such as business systems and organized efforts applying disciplined project and program management techniques and methods, we can likewise learn more about these complex adaptive systems through the enhanced collection and translation of data.

I Really Don’t Know Clouds At All: Data, Information, Intelligence, and Knowledge

The term “knowledge discovery in data,” or KDD for short, is an aspirational goal and so, in terms of understanding that goal, a point of departure for the practice of information management and science. I take this stance because the technology industry uses terminology that, as with most language, was originally designed to accurately describe a specific phenomenon or set of methods in order to advance knowledge, only to find that the terminology has been watered down to the point where it obfuscates the issues at hand.

As I traveled to locations across the U.S. over the last three months, I found general agreement on this state of affairs among IT professionals who are dealing with the issues of “Big Data”, data integration, and the aforementioned KDD. In almost every case there is hesitation to use this terminology because it has been appropriated and abused by mainstream literature, much as physicists rail against the misuse of the concept of relativity in non-scientific domains.

The impact of this confusion in terminology has caused organizations to make decisions where this terminology is employed to describe a nebulous end-state, without the initiators having an idea of the effort or scope. The danger here, of course, is that for every small innovative company out there, there is also a potential Theranos (probably several). For an in-depth understanding of the psychology and double-speak that has infiltrated our industry I highly recommend the HBO documentary, “The Inventor: Out for Blood in Silicon Valley.”

The reason why semantics are important (as they always have been despite the fact that you may have had an associate complain about “only semantics”) is that they describe the world in front of us. If we cloud the meanings of words and the use of language, it undermines the basis of common understanding and reveals the (poor) quality of our thinking. As Dr. Spiller noted, the paradox of writing and in gathering knowledge is that the more you know, the more you realize you do not know, and the harder writing and communicating knowledge becomes, though we must make the effort nonetheless.

Thus KDD is oftentimes not quite the discovery of knowledge in the sense the term was intended to convey. It is, instead, a discovery of associations that may lead us to knowledge. The distinction is important because the corollary processes of data mining, machine learning, and the early applications of AI in which we find ourselves are really processes of finding associations, correlations, trends, patterns, and probabilities in data, approached as if all information were flat, thereby obliterating its context. This is not knowledge.

We can measure the information content of any set of data, but the real unlocked potential in that information content will come with the processing of it that leads to knowledge. To do that requires an underlying model of domain knowledge, an understanding of the different lexicons in any given set of domains, and a Rosetta Stone that provides a roadmap that identifies those elements of the lexicon that are describing the same things across them. It also requires capturing and preserving context.
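As a crude illustration of such a Rosetta Stone, consider a minimal crosswalk table mapping terms across domain lexicons. The domains and terms below are invented for illustration and not drawn from any actual standard:

```python
# A minimal sketch of a "Rosetta Stone" crosswalk between domain lexicons.
# Each canonical concept carries the different terms that domains use for it.
# All domain names and terms here are hypothetical illustrations.
crosswalk = {
    "total_planned_cost": {
        "finance": "budget_at_completion",
        "scheduling": "baseline_cost",
        "contracts": "contract_target_cost",
    },
    "work_accomplished": {
        "finance": "earned_value",
        "scheduling": "physical_percent_complete",
    },
}

def translate(term, source, target):
    """Map a domain-specific term to its equivalent in another domain."""
    for canonical, aliases in crosswalk.items():
        if aliases.get(source) == term:
            return aliases.get(target)  # None if the target domain lacks a term
    return None

print(translate("budget_at_completion", "finance", "scheduling"))  # baseline_cost
```

In a real system the crosswalk itself would be the hard-won artifact: it encodes the domain knowledge that flat data processing throws away.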

For example, when I use the chat on my iPhone it attempts to anticipate what I want to write, presenting three candidate words as a shortcut. In most cases the iPhone guesses wrong, despite having at its disposal (at least presumptively) a larger vocabulary than the writer. Oftentimes it seems to take control, assuming that I have misspelled or misidentified a word and choosing the wrong one for me, turning my message into nonsense.
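The behavior is easy to reproduce in miniature. Here is a sketch of a purely frequency-based next-word predictor over a toy corpus; it finds associations in the data but has no grasp of meaning, which is why such suggestions are so often wrong in context:

```python
from collections import Counter, defaultdict

# A toy next-word predictor built purely from bigram frequencies.
# It surfaces associations, not knowledge: it cannot know what I meant.
corpus = "the cost of the project exceeded the cost baseline".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def suggest(word, k=3):
    """Return up to k most frequent followers of `word` in the corpus."""
    return [w for w, _ in bigrams[word].most_common(k)]

print(suggest("the"))  # ['cost', 'project'] -- frequency, regardless of intent
```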

If one were to believe the hype surrounding AI, one would think that there is magic there but, as Arthur C. Clarke noted (known as Clarke’s Third Law): “Any sufficiently advanced technology is indistinguishable from magic.” Familiar with the new technologies as we are, we know that there is no magic there, and also that it is consistently wrong a good deal of the time. But many individuals come to rely upon the technology nonetheless.

Despite the gloss of something new, the long-established methods of epistemology, code-breaking, statistics, and calculus apply–as do standards of establishing fact and truth. Despite a large set of data, the iPhone is wrong because it does not understand–does not possess the knowledge–to know why it is wrong. As an aside, its dictionary is also missing a good many words.

A Segue and a Conclusion–I Still Haven’t Found What I’m Looking For: Why Data Integration?…and a Proposed Definition of the Bigness of Data

As with the question to Neil Armstrong, so with the question on data; and the answer is the same. When we look at any set of data under the structure of a particular domain, the information we derive provides us with one way of looking at the world. In economic systems, businesses, and projects, that data provides us with a basis for interpretation, but oftentimes falls short of allowing us to effectively describe and understand what is happening.

Capturing interrelated data across domains allows us to look at the phenomena of these human systems from a different perspective, providing us with the opportunity to derive new knowledge. But in order to do this, we have to be open to this possibility. It also calls for us to, as I have hammered home in this blog, reset our definitions of what is being described.

For example, there are guides in project and program management that refer to statistical measures as “predictive analytics.” This further waters down the intent of the phrase. Measures of earned value are not predictive. They note trends and a single-point outcome. Absent further analysis and processing, the statistical fallacy of extrapolation can be baked into our analysis. The same applies to any index of performance.
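To make the point concrete, here is a sketch, with hypothetical figures, of the common practice of extrapolating a cumulative cost performance index (CPI) into an estimate at completion. The formula simply assumes past performance will persist, which is precisely the extrapolation fallacy when used without further analysis:

```python
# A sketch of why a single earned-value index is a trend, not a prediction.
# All figures are hypothetical.
bac = 1_000_000.0   # budget at completion
ev  = 400_000.0     # earned value (budgeted cost of work performed) to date
ac  = 500_000.0     # actual cost to date

cpi = ev / ac       # cost performance index: a single-point trend measure
eac = bac / cpi     # common extrapolation: assumes the cumulative CPI persists

print(f"CPI={cpi:.2f}, EAC=${eac:,.0f}")  # CPI=0.80, EAC=$1,250,000
```

The EAC here is arithmetic, not prophecy: it says what completion would cost if, and only if, the observed trend continues unchanged.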

Furthermore, these indices and indicators–for that is all they are–do not provide knowledge, which requires a means of not only distinguishing between correlation and causation but also applying contextualization. All systems operate in a vector space. When we measure an economic or social system we are really measuring its behavior in the vector space that it inhabits. This vector space includes the way it is manifested in space-time: the equivalent of length, width, depth (that is, its relative position, significance, and size within information space), and time.

This then provides us with a hint of a definition of what often goes by the name “big data.” Originally, as noted in previous blogs, the term was first used at NASA in 1997 by Cox and Ellsworth (not, as credited to John Mashey on Wikipedia with the dishonest qualifier “popularized”) and was simply a statement meaning “datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.”

This is a relative term, given Moore’s Law. But we can begin to peel back a real definition of the “bigness” of data. It is important to do so because too many approaches to big data assume the data is flat and then apply probabilities and pattern recognition in ways that undermine both contextualization and knowledge. Thus…

The Bigness of Data (B) is a function (f ) of the entropy expended (S) to transform data into information, or to extract its information content.
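One computable handle on information content–related to, though not identical with, the entropy expended in transformation–is Shannon entropy. A minimal sketch:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Highly structured data carries little information per byte;
# maximally varied data carries more.
print(shannon_entropy(b"aaaaaaaa"))  # 0.0
print(shannon_entropy(b"abcdefgh"))  # 3.0
```

Note that this measures only the statistical content of the symbols themselves; the contextual knowledge discussed above lies outside any such formula, which is rather the point.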

Information evolves. It evolves toward greater complexity just as life evolves toward greater complexity. The universe is built on coded bits of information that, taken together and combined in almost unimaginable ways, provide different forms of life and matter. Our limited ability to decode and understand this information–and our interactions in it–is important to us both individually and collectively.

Much entropy is already expended in the creation of the data that describes the activity being performed; its context is part of its information content. Obliterating the context inherent in that information content renders all of that previously expended entropy valueless. Thus, in approaching any set of data, the inherent information content must be taken into account in order to avoid unnecessary (and erroneous) data interpretation.

More to follow in future posts.

I Can’t Drive 55 — The New York Times and Moore’s Law

Yesterday the New York Times published an article about Moore’s Law.  It is interesting in that John Markoff, the Times science writer, speculates that in about five years the computing industry will be “manipulating material as small as atoms” and therefore may hit a wall in what has become a back-of-the-envelope calculation of the multiplicative nature of computing complexity and power in the silicon age.

This article prompted a follow-on from Brian Feldman at NY Mag, noting that the Institute of Electrical and Electronics Engineers (IEEE) has anticipated a broader definition of the phenomenon of the accelerating rate of computing power to take into account quantum computing.  Note here that the definition used in this context is the literal one: the doubling, over time, of the number of transistors that can be placed on a microchip.  That is a correct summation of what Gordon Moore said, but it is not how Moore’s Law is viewed or applied within the tech industry.

Moore’s Law (which is really a rule of thumb or guideline rather than an ironclad law) has been used, instead, as an analogue to describe the geometric acceleration seen in computer power over the last 50 years.  As Moore originally described the phenomenon, the doubling of transistors occurred every two years.  The period was later revised to about every 18 months, and is now down to 12 months or less.  Furthermore, aside from increasing transistor counts, there are many other parallel strategies that engineers have applied to increase speed and performance.  When we combine the observation of Moore’s Law with other principles tied to the physical world, such as Landauer’s Principle and information theory, we begin to find a coherence in our observations that is truly tied to physics.  Thus, rather than being a break from Moore’s Law (and from the other principles and theory noted above), the quantum computing to which the articles refer sits on a continuum with these concepts rather than representing a break with them.
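The compounding involved is easy to sketch. Assuming a simple geometric model, the choice of doubling period makes an enormous difference over a decade:

```python
# A back-of-the-envelope sketch of the compounding in Moore's Law.
# The doubling period is the disputed variable; try the commonly cited values.
def transistor_multiple(years, doubling_period_years):
    """Growth multiple after `years`, given one doubling per period."""
    return 2 ** (years / doubling_period_years)

for period in (2.0, 1.5, 1.0):
    gain = transistor_multiple(10, period)
    print(f"doubling every {period} yr -> {gain:,.0f}x in a decade")
```

Under a two-year doubling the decade multiple is 32x; at one year it is 1,024x, which is why the disputed period matters so much to any projection.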

Bottom line: computing, memory, and storage systems are becoming more powerful, faster, and expandable.

Thus, Moore’s Law in terms of computing power looks like this over time:

Moore's Law Chart

Furthermore, when we calculate the cost associated with erasing a bit of memory we begin to approach identifying the Demon* in defying the Second Law of Thermodynamics.

Moore's Law Cost Chart

Note, however, that the Second Law is not really being defied; it is just that we are constantly approaching zero without ever actually achieving it.  But the principle here is that the marginal cost associated with each additional bit of information becomes vanishingly small, to the point of not passing the “so what” test, at least in everyday life.  Though, of course, when we get to neural networks and strong AI such differences are very large indeed–akin to mathematics being somewhat accurate when we want to travel from, say, San Francisco to London, but requiring more rigor and fidelity when traveling from Kennedy Space Center to Gale Crater on Mars.
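For the curious, the Landauer bound itself is a one-line calculation (the constants are standard; room temperature is an assumed value):

```python
import math

# Landauer's Principle: the minimum energy to erase one bit is k*T*ln(2).
# At room temperature this is vanishingly small, which is the point above:
# the marginal cost per bit approaches, but never reaches, zero.
K_B = 1.380649e-23   # Boltzmann constant, J/K
T   = 300.0          # assumed room temperature, K

energy_per_bit = K_B * T * math.log(2)   # ~2.9e-21 joules per bit erased

print(f"{energy_per_bit:.3e} J per bit erased")
print(f"{energy_per_bit * 8 * 1_000_000:.3e} J to erase a megabyte")
```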

The challenge, then, in computing is to be able to effectively harness such power.  Our current programming languages and operating environments are only scratching the surface of how to do this, and the joke in the industry is that the speed of software is inversely proportional to the advance in computing power provided by Moore’s Law.  The issue is that our brains, and thus the languages we harness to utilize computational power, are based in an analog understanding of the universe, while the machines we are harnessing are digital.  For now this knowledge can only build bad software and robots, but given our drive into the brave new world of heuristics, may lead us to Skynet and the AI apocalypse if we are not careful–making science fiction, once again, science fact.

Back in the present, however, what this means is that for at least the next decade we will see an acceleration of the ability to use more and larger sets of data.  The risks, which we seem to have to relearn as each new generation of techies enters the market without a well-rounded liberal arts education, are that the basic statistical and scientific rules in the conversion, interpretation, and application of intelligence and information can still be roundly abused and violated.  Bad management, bad decision making, bad leadership, bad mathematics, bad statisticians, specious logic, and plain old common human failings are just made worse, with greater impact on more people, given the misuse of that intelligence and information.

The watchman against these abuses, then, must be incorporated into the solutions that use this intelligence and information.  This is especially critical given the accelerated pace of computing power, and the greater interdependence of human and complex systems that this acceleration creates.

*Maxwell’s Demon

Note:  I’ve defaulted to the Wikipedia definitions of both Landauer’s Principle and Information Theory for the sake of simplicity.  I’ve referenced more detailed work on these concepts in previous posts and invite readers to seek those out in the archives of this blog.

Super Doodle Dandy (Software) — Decorator Crabs and Wirth’s Law


The song (absent the “software” part) in the title is borrowed from the soundtrack of the movie The Incredible Mr. Limpet.  Made in the day before Pixar and other recent animation technologies, it remains a largely unappreciated classic, combining photography and animation in a time of more limited tools, with Don Knotts creating another unforgettable character beyond Barney Fife.  Somewhat related to what I am about to write, Mr. Limpet taught the creatures of the sea new ways of doing things, helping them overcome their mistaken assumptions about the world.

The photo that opens this post is courtesy of the Monterey Aquarium and looks to be the crab Oregonia gracilis, commonly referred to as the Graceful Decorator Crab.  There are all kinds of Decorator Crabs, most of which belong to the superfamily Majoidea.  The one I most often came across and raised in aquaria was Libinia dubia, an east coast cousin.  You see, back in a previous lifetime I had aspirations to be a marine biologist.  My early schooling was based in the sciences and mathematics.  Only later did I gradually gravitate to history, political science, and the liberal arts–finally landing in acquisition and high tech project management, which tends to borrow something from all of these disciplines.  I believe that my former concentration of studies has kept me grounded in reality–in viewing life as it is and the mysteries yet to be solved in the universe, without resort to metaphysics or irrationality–while the latter concentrations have connected me to the human perspective in experiencing and recording existence.

But there is more to my analogy than self-explanation.  You see, software development exhibits much of the same behavior as the Decorator Crab.

In my previous post I talked about Moore’s Law and the compounding (doubling) of processor power in computing every 12 to 24 months.  (It seems to be less a physical law than an observation, and we can only guess how long the trend will continue.)  We also see a corresponding reduction in cost vis-à-vis this greater capability.  Yet, despite these improvements, we find that software often lags behind and fails to leverage it.

The observation that records this phenomenon is Wirth’s Law, which posits that software is getting slower at a faster rate than computer hardware is getting faster.  There are two variants of this law, one ironic and the other only slightly less so: May’s and Gates’s variants.  Basically these posit that software speed halves every 18 months, thereby negating Moore’s Law.  But why is this?
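The arithmetic behind these ironic variants can be sketched directly. If we assume hardware doubles in speed every 18 months while software halves in speed over the same period, the two effects cancel exactly:

```python
# A sketch of the May/Gates arithmetic: hardware gain times software loss.
# Both period values are the commonly quoted 18 months, assumed for illustration.
def perceived_speed(months, hw_doubling_months=18.0, sw_halving_months=18.0):
    """Net perceived speed multiple after `months`, relative to the start."""
    hardware_gain = 2 ** (months / hw_doubling_months)
    software_loss = 0.5 ** (months / sw_halving_months)
    return hardware_gain * software_loss

print(perceived_speed(36))  # 1.0 -- three years on, no faster at all
```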

For first causes one need only look to the Decorator Crab.  You see, the crab, all by itself, is a typical crab: an arthropod invertebrate with a hard carapace, spikes on its exoskeleton, a segmented body with jointed limbs, and five pairs of legs, the first pair usually bearing chelae (the familiar pincers and claws).  There are all kinds of crabs in salt, fresh, and brackish water.  They tend to be well adapted to their environment.  But they are also tasty and high in protein value, thus having a number of predators.  So the Decorator Crab has determined that what evolution has provided is not enough–it borrows features and items from its environment to enhance its capabilities as a defense mechanism.  There is a price to being a Decorator Crab.  Encrustations also become encumbrances.  Where crabs have learned to enhance their protections, for example by attaching toxic sponges and anemones, these enhancements may also have made them complacent because, unlike most crabs, Decorator Crabs don’t tend to scurry from crevice to crevice, but walk awkwardly and more slowly than many of their cousins in the typical sideways crab gait.  This behavior makes them interesting, popular, and comical subjects in both public and private aquaria.

In a way, we see an analogy in the case of software.  In earlier generations of software design, applications were generally built to solve a particular challenge that mimicked the line and staff structure of the organizations involved–designed to fit their environmental niche.  But over time, of course, people decide that they want enhancements and additional features.  The user interface, when hardcoded, must be adjusted every time a new function or feature is added.

Rather than rewriting the core code from scratch–which would take time and resource-consuming reengineering and redesign of the overall application–modules, subroutines, scripts, etc. are added to the software to adapt to the new environment.  Over time, software takes on the characteristics of the Decorator Crab.  The new functions are not organic to the core structure of the software, just as the attached anemones, sponges, and algae are not organic features of the crab.  While they may provide the features desired, they are not optimized, tending to use brute-force computing power to make up for a lack of elegance.  Thus, the more power each generation of hardware tends to provide, the less effective each enhancement release of software tends to be.

Furthermore, just as it requires more effort and intelligence to identify a crab when the crab looks less like a crab, so too with software.  The greater the encrustation of features that attach themselves to an application, the greater the effort required to use those new features.  Learning the idiosyncrasies of the software becomes an unnecessary barrier to its core purposes–to increase efficiency, improve productivity, and improve speed.  It serves only one purpose: to increase the “stickiness” of the application within the organization so that it is harder for competitors to displace.

It is apparent that this condition is not sustainable–or acceptable–especially where the business environment is changing.  New software generations, especially Fourth Generation software, provide opportunities to overcome this condition.

Thus, as project management and acquisition professionals, the primary considerations that must be taken into account are optimization of computing power and the related consideration of sustainability.  This approach militates against complacency because it influences the environment of software toward optimization.  Such an approach will also allow organizations to more fully realize the benefits of Moore’s Law.

A Little Bit Moore — Moore’s Law and Public Sector Economics

Back in the saddle, and I just have to find the time to put some thoughts down.  In dealing with high tech and data issues, one of the most frequent misconceptions I have been running into lately regards the “cost” associated with data, especially data submissions.  If one approaches this issue using standard, pre-high-tech economic theory, then the positive correlation between data volume and cost applies.

But we live in a different world now folks.

I find myself increasingly having to explain that, given that most business data exists in some form other than human-readable form, the classical economic correlation no longer applies.  This assertion is usually met with a blank stare.

Thankfully there is a concept, proven correct over time, that supports this assertion.  It started out as an observation but now has a great deal of credibility attached to it: Moore’s Law.  For a pretty complete overview, the Wikipedia entry works fine.  When we inform Moore’s Law with Landauer’s Principle–that is, that the energy expended in each additional bit of computation becomes vanishingly small–it becomes clear that the difference in cost between transferring a MB of data and a KB of data is virtually TSTM (“too small to measure”).
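A back-of-the-envelope sketch makes the point; the per-bit cost figure below is purely hypothetical, chosen only to show that the KB-versus-MB difference is negligible at any plausible rate:

```python
# A sketch of the "too small to measure" argument. The per-bit cost
# constant is entirely hypothetical; only the orders of magnitude matter.
COST_PER_BIT = 1e-12   # assumed dollars per bit transferred

def transfer_cost(num_bytes):
    """Cost of moving num_bytes at the assumed per-bit rate."""
    return num_bytes * 8 * COST_PER_BIT

kb, mb = transfer_cost(1_000), transfer_cost(1_000_000)
print(f"1 KB: ${kb:.2e}, 1 MB: ${mb:.2e}, difference: ${mb - kb:.2e}")
```

At any per-bit rate low enough to make everyday computing affordable, the thousandfold difference in volume still yields a cost difference far below any threshold of decision-making relevance.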

The challenge that we have had to face in leveraging data is not the cost associated with data, but two other factors: human intervention and software application limitations.  Both of these can be ameliorated, and participants incentivized toward maximizing the greater computational and storage capabilities that appear every 12 to 24 months, through the application of public sector economics.

This is an area that is ripe for contributions to the literature, if only because most of freshwater classical economics has been blind to it for ideological reasons.  Our emphasis on free-market fundamentalism has caused otherwise rational people to artificially draw a solid line between the effects of public policy and expenditure and private market activity.  But it is not that neat and easy.  One affects and influences the other.

The proof of this assertion can be found in practical form on K Street in Washington, D.C., for anyone interested.  On the macroeconomic front, practical experience since the 2007–08 crash has been a live course in the effectiveness of neo-Keynesian modeling.  Furthermore, in this blog I have documented technologies that were developed in the public sector and transitioned to the public domain, creating significant private industry market verticals, including the IT sector as it currently exists.

The area in which I operate most often is the aerospace & defense market vertical, focused on project management, where the connection is most strongly seen and felt.  For this reason there have been a number of initiatives, going back to the 1960s, to incentivize this sector to strengthen the interaction between public and private resources in order to leverage technological, process, and other developments and optimize performance.

Full disclosure on my part: I have been an active participant in some of these successive initiatives over the years: first, during the round of Acquisition Reform under Defense Secretary Caspar Weinberger in the 1980s, and again in the late 1990s under Vice President Al Gore’s “reinventing government”, so I come to this subject with a bit of history and practical experience.  I am also currently engaged on some issues related to Undersecretary of Defense for Acquisition Frank Kendall’s Better Buying Power initiative.

The danger in highly structured public sector industries is that “good enough” often rules the day.  Incentives exist to play it safe and to avoid risk whenever possible.  This creates an atmosphere of complacency.  From the IT perspective, such concepts as Moore’s Law, which has been around for quite a while, tend to be seen as suspect.  Thus, economic incentives must be established through policymakers to replicate the pressure for performance and excellence found in more competitive markets.

The way to do this is to provide economic incentives that punish complacency and force participants to encompass the latest developments in technology.  Keep in mind that we’re talking about an 18-to-24-month cycle.  The complaint, of course, is that this is too rapid a change.  But how many Boeing 707s do you still find flying for commercial airlines these days?  The ground rules must be established to encourage a race to the top instead of a race to the status quo.