Like Tinker to Evers to Chance: BI to BA to KDD

It’s spring training time in sunny Florida, as well as other areas of the country with mild weather and baseball.  For those of you new to the allusion, it comes from a poem by Franklin Pierce Adams that is also known as “Baseball’s Sad Lexicon”.  Tinker, Evers, and Chance were the double play combination of the 1910 Chicago Cubs (shortstop, second base, and first base).  Because of their effectiveness on the field these Cubs players were worthy opponents of the old New York Giants, of whom Adams was a fan, and who were the kings of baseball during most of the first fifth of a century of the modern era (1901-1922).  That is, until they were suddenly overtaken by their crosstown rivals, the Yankees, who came to dominate baseball for the next 40 years, beginning with the arrival of Babe Ruth.

The analogy here is that the Cubs infielders, while individuals, didn’t think of their roles as completely separate.  They had common goals and, in order to win on the field, needed to act as a unit.  In the case of executing the double play, they were a very effective unit.  So why do we have these dichotomies in information management when the goals are the same?

Much has been written both academically and commercially about Business Intelligence, Business Analytics, and Knowledge Discovery in Databases.  I’ve surveyed the literature, for good and bad, and what I find is that these terms are thrown around, mostly by commercial firms in either information technology or consulting, all with the purpose of attempting to provide a discriminator for their technology or service.  Many times the concepts are used interchangeably, or one is set up as a strawman to push an agenda or product.  Thus, it seems some hard definitions are in order.

According to Techopedia:

Business Intelligence (BI) is the use of computing technologies for the identification, discovery and analysis of business data – like sales revenue, products, costs and incomes.

Business analytics (BA) refers to all the methods and techniques that are used by an organization to measure performance. Business analytics are made up of statistical methods that can be applied to a specific project, process or product. Business analytics can also be used to evaluate an entire company.

Knowledge Discovery in Databases (KDD) is the process of discovering useful knowledge from a collection of data. This widely used data mining technique is a process that includes data preparation and selection, data cleansing, incorporating prior knowledge on data sets and interpreting accurate solutions from the observed results.
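To make the KDD definition a little more concrete, here is a minimal sketch of its four stages (selection, cleansing, incorporating prior knowledge, interpretation) as plain Python functions.  The records, the field names, and the list of "known" regions are all hypothetical and exist only for illustration; a real KDD effort would involve far more data and far more judgment.

```python
from statistics import mean

# Hypothetical raw records; the field names are illustrative only.
RAW = [
    {"region": "east", "revenue": "1200", "note": "q1"},
    {"region": "east", "revenue": "1300", "note": "q2"},
    {"region": "west", "revenue": "", "note": "q1"},      # missing value
    {"region": "west", "revenue": "900", "note": "q2"},
    {"region": "west", "revenue": "n/a", "note": "q3"},   # malformed value
]


def select(records, fields):
    """Selection: keep only the fields relevant to the question."""
    return [{f: r.get(f) for f in fields} for r in records]


def cleanse(records):
    """Cleansing: drop rows whose revenue is not a valid number."""
    cleaned = []
    for r in records:
        try:
            cleaned.append({"region": r["region"], "revenue": float(r["revenue"])})
        except (TypeError, ValueError):
            continue
    return cleaned


def apply_prior_knowledge(records, known_regions):
    """Prior knowledge: keep only regions the business recognizes (assumed list)."""
    return [r for r in records if r["region"] in known_regions]


def interpret(records):
    """Interpretation: summarize average revenue per region."""
    by_region = {}
    for r in records:
        by_region.setdefault(r["region"], []).append(r["revenue"])
    return {region: mean(values) for region, values in by_region.items()}


def kdd_pipeline(raw):
    step = select(raw, ["region", "revenue"])
    step = cleanse(step)
    step = apply_prior_knowledge(step, {"east", "west"})
    return interpret(step)


summary = kdd_pipeline(RAW)
```

Each stage is a separate function so that the manual judgment calls (what counts as clean, which regions are valid) stay visible rather than buried in one routine.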

As with much of computing in its first phases, these functions were seen to be separate.

BI, based largely on the manner in which it was implemented in its first incarnations, is perceived as a means of gathering data into relational data warehouses or data marts and then building out decision support systems.  These methods have usually involved a great deal of overhead in both computing and personnel, since the practical elements of gathering, sorting, and delivering data involved additional coding and highly structured user interfaces.  The advantage of BI is its emphasis on integration.  The disadvantage, from the enterprise perspective, is that the method and mode of implementation are phlegmatic at best.

BA is BI’s younger cousin.  Applications were developed and sold as “analytical tools” focused on a niche of data within the enterprise’s requirements.  In this manner decision makers could avoid having to wait for the overarching and ponderous BI system to get to their needs, if ever.  This led many companies to knit together specialized tools in so-called “best-of-breed” configurations to achieve some measure of integration across domains.  Of course, given the plethora of innovative tools, much data import and reconciliation has had to be inserted into the process.  Thus, the advantages of BA in the market have been to reward innovation and focus on the needs of the domain subject matter expert (SME).  The disadvantages are the insertion of manual intervention in an automated process due to lack of integration, which is further exacerbated by so-called SMEs in data reconciliation–a form of rent seeking behavior that only rewards body shop consulting, unnecessarily driving up overhead.  The panacea applied to this last disadvantage has been the adoption of non-proprietary XML schemas across entire industries that reduce both the overhead and data silos found in the BA market.

KDD is both our oldster and youngster–grandpa and the grandson hanging out.  It is a term that describes a necessary function of insight–allowing one to determine what the data tells us is needed for analytics rather than relying on a “canned” solution to determine how to approach a particular set of data.  But it does so, oftentimes, using an older approach that predates BI, known as data mining.  You will often find KDD linked to arguments in favor of flat file schemas, NoSQL (meaning flat non-relational databases), and free use of the term Big Data, which is becoming more meaningless each year that it is used, given Moore’s Law.  The advantage of KDD is that it allows for surveying across datasets to pick up patterns and interrelationships within our systems that are otherwise unknown, particularly given the way in which the human mind can fool itself into reifying an invalid assumption.  The disadvantage, of course, is that KDD will have us go backward in terms of identifying and categorizing data by employing Data Mining, which is an older concept from early in computing in which a team of data scientists and data managers develop solutions to identify, categorize, and use that data–manually doing what automation was designed to do.  Understanding these limitations, companies focused on KDD have developed heuristics (cognitive computing) that identify patterns and possible linkages, removing a portion of the overhead associated with Data Mining.

Keep in mind that you never get anything for nothing–the Second Law of Thermodynamics ensures that energy must be borrowed from somewhere in order to produce something–and its corollaries place limits on expected efficiencies.  While computing itself comes as close to providing us with Maxwell’s Demon as any technology, even in this case entropy is being realized elsewhere (in the software developer and the hardware manufacturing process), even though it is not fully apparent in the observed data processing.

Thus, manual effort must be expended somewhere along the way.  In any sense, all of these methods are addressing the same problem–the conversion of data into information.  It is information that people can consume, understand, place into context, and act upon.

As my colleague Dave Gordon has pointed out to me several times, there are also additional methods that have been developed across all of these approaches to make our use of data more effective.  These include more powerful APIs, the aforementioned cognitive computing, and searching based on the anticipated questions of the user, as is done by search engines.

Technology, however, is moving very rapidly and so the lines between BI, BA and KDD are becoming blurred.  Fourth generation technology that leverages API libraries to be agnostic to underlying data, together with flexible and adaptive UI technology, can provide a comprehensive systemic solution to bring together the goals of these approaches to data.  With the ability to leverage internal relational database tools and flat schemas for non-relational databases, the application layer, which is oftentimes a barrier to delivery of information, becomes open as well, putting the SME back in the driver’s seat.  Being able to integrate data across domain silos provides insight into systems behavior and performance not previously available with “canned” applications written to handle and display data a particular way, opening up knowledge discovery in the data.

What this means practically is that those organizations that are sensitive to these changes will understand the practical application of sunk cost when it comes to aging systems being provided by ponderous behemoths that lack agility in their ability to introduce more flexible, less costly, and lower overhead software technologies.  It means that information management can be democratized within the organization among the essential consumers and decision makers.

Productivity and effectiveness are the goals.

Post-Blogging NDIA Blues — The Latest News (Project Management Wonkish)

The National Defense Industrial Association’s Integrated Program Management Division (NDIA IPMD) just had its quarterly meeting here in sunny Orlando where we braved the depths of sub-60 degrees F temperatures to start out each day.

For those not in the know, these meetings are an essential coming together of policy makers, subject matter experts, and private industry practitioners regarding the practical and mundane state-of-the-practice in complex project management, particularly focused on the concerns of the federal government and the Department of Defense.  The end result of these meetings is to publish white papers and recommendations regarding practice to support continuous process improvement and the practical application of project management practices–allowing for a cross-pollination of commercial and government lessons learned.  This is also the intersection where innovation among the large and small is given an equal vetting and an opportunity to introduce new concepts and solutions.  This is an idealized description, of course, and most of the petty personality conflicts, competition, and self-interest that plague any group of individuals coming together under a common set of interests also play out here.  But the days are long and the workshops generally produce good products that become the de facto standard of practice in the industry.  Furthermore, the control that keeps the more ruthless personalities in check is the fact that, while it is a large market, the complex project management community tends to be a relatively small one, which reinforces professionalism.

The “blues” in this case is not so much born of frustration or disappointment but, instead, of the long and intense days that the sessions offer.  The biggest news from an IT project management and application perspective was twofold.  The first was that the data stream used by the industry in sharing data in an open systems manner will be simplified.  The other was the announcement that the technology used to communicate will move from XML to JSON.

Human readable formatting to Data-focused formatting.  Under Kendall’s Better Buying Power 3.0 the goal of the Department of Defense (DoD) has been to incorporate better practices from private industry where they can be applied.  I don’t see initiatives for greater efficiency and reduction of duplication going away in the new Administration, regardless of what a new initiative is called.

In case this is news to you, the federal government buys a lot of materials and end items–billions of dollars worth.  Accountability must be put in place to ensure that the money is properly spent to acquire the things being purchased.  Where technology is pushed and where there are no commercial equivalents that can be bought off the shelf, as in the systems purchased by the Department of Defense, there are measures of progress and performance (given that the contract is under a specification) that are submitted to the oversight agency in DoD.  This is a lot of data and, to be brutally frank, the method and format of delivery have been somewhat chaotic, inefficient, and duplicative.  The Department moved to address this by a somewhat modest requirement of open systems submission of an application-neutral XML file under the standards established by the UN/CEFACT XML organization.  This was called the Integrated Program Management Report (IPMR).  This move garnered some improvement where it has been applied, but contracts are long-term, so incorporating improvements through new contractual requirements tends to take time.  Plus, there is always resistance to change.  The Department is moving to accelerate addressing these inefficiencies in their data streams by eliminating the unnecessary overhead associated with specifications of formatting data for paper forms and dealing with data as, well, data.  Great idea and bravo!  The rub here is that in making the change, the Department has proposed dropping XML as the technology used to transfer data and moving to JSON.

XML to JSON. Before I spark another techie argument about the relative merits of each, there are some basics to understand here.  First, XML is a language; JSON is simply a data exchange format.  This means that XML is specifically designed to deal with hierarchical and structured data that can be queried and where validation and fidelity checks within the data are inherent in the technology.  Furthermore, XML is known to scale while maintaining the integrity of the data, which is intended for use in relational databases.  Finally, XML is hard to break.  It is meant for editing and will maintain its structure and integrity afterward.
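For readers who have not handled both formats, here is a small sketch that carries the same record in XML and in JSON and reads each with Python's standard library.  The element and field names (`report`, `task`, `budgeted`, `earned`) and the contract identifier are hypothetical; they are not taken from the IPMR or UN/CEFACT schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical performance record expressed as XML.
XML_DOC = """
<report contract="HYPOTHETICAL-001">
  <task id="1.1">
    <budgeted>100.0</budgeted>
    <earned>80.0</earned>
  </task>
  <task id="1.2">
    <budgeted>50.0</budgeted>
    <earned>50.0</earned>
  </task>
</report>
"""

# The same hypothetical record expressed as JSON.
JSON_DOC = """
{
  "contract": "HYPOTHETICAL-001",
  "tasks": [
    {"id": "1.1", "budgeted": 100.0, "earned": 80.0},
    {"id": "1.2", "budgeted": 50.0, "earned": 50.0}
  ]
}
"""


def earned_from_xml(text):
    """Sum the <earned> values of every <task> under the root element."""
    root = ET.fromstring(text.strip())
    return sum(float(task.findtext("earned")) for task in root.findall("task"))


def earned_from_json(text):
    """Sum the "earned" values of every entry in the "tasks" array."""
    data = json.loads(text)
    return sum(task["earned"] for task in data["tasks"])
```

Both payloads carry the same hierarchy and yield the same total.  The practical difference lies less in the payload than in the validation and query tooling that surrounds each format.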

The counter argument encountered is that JSON is new! and uses fewer characters! (which usually turns out to be inconsequential), and people are talking about it for Big Data and NoSQL! (but this happened after the fact and the reason for shoehorning it this way is discussed below).

So does it matter?  Yes and no.  As a supplier specializing in delivering solutions that normalize and rationalize data across proprietary file structures and leverage database capabilities, I don’t care.  I can adapt quickly and will have a proof-of-concept solution out within 30 days of receiving the schema.

The risk here, which applies to DoD and the industry, is that the decision to go to JSON is made only because it is the shiny new thing used by gamers and social networking developers.  There has also been a move to adapt to other uses because of the history of significant security risks that had been found in Java, so much so that an entire Wikipedia page is devoted to them.  Oracle just killed off Java applets, though Java hangs on.  JSON, of course, isn’t Java, but it was designed from birth as JavaScript Object Notation (hence the acronym JSON), with the purpose of handling relatively small bits of data across web servers in a number of proprietary settings.

To address JSON deficiencies relative to XML, a number of tools have been and are being developed to replicate the fidelity and reliability found in XML.  Whether this is sufficient to be effective against a structured LANGUAGE remains to be seen.  Much of the overhead that techies complain about in XML is due to the native functionality related to the power it brings to the table.  No doubt, a bicycle is simpler than a Formula One racer–and this is an apt comparison.  Claiming “simpler” doesn’t pass the “So What?” test knowing the business processes involved.  The technology needs to be fit to the solution.  The purpose of data transmission using APIs is not only to make it easy to produce but for it to–you know–achieve the goals of normalization and rationalization so that it can be used on the receiving end, which is where the consumer (which we usually consider to be the customer) sits.

At the end of the day the ability to scale and handle hierarchical, structured data will rely on the quality and strength of the schema and the tools that are published to enforce its fidelity and compliance.  Otherwise consuming organizations will be receiving a dozen different proprietary JSON files, and that does not address the present chaos but simply adds to it.  These issues were aired out during the meeting and it seems that everyone is aware of the risks and that they can be addressed.  Furthermore, as the schema is socialized across solutions providers, it will be apparent early if the technology will be able to handle the project performance data resulting from the development of a high performance aircraft or a U.S. Navy destroyer.
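To illustrate what "tools that enforce fidelity and compliance" might look like at the receiving end, here is a deliberately small, hand-rolled sketch of a check against a schema.  The schema, the field names, and the two sample submissions are hypothetical, and a real implementation would more likely rely on a published schema and an established validation library than on code like this.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
TASK_SCHEMA = {"id": str, "budgeted": float, "earned": float}


def validate_task(task):
    """Return a list of problems found in one task record."""
    problems = []
    for field, expected in TASK_SCHEMA.items():
        if field not in task:
            problems.append(f"missing field: {field}")
        elif not isinstance(task[field], expected):
            problems.append(f"wrong type for {field}")
    for field in task:
        if field not in TASK_SCHEMA:
            problems.append(f"unexpected field: {field}")
    return problems


def validate_submission(text):
    """Map the index of each non-compliant task to its list of problems."""
    data = json.loads(text)
    report = {}
    for index, task in enumerate(data.get("tasks", [])):
        problems = validate_task(task)
        if problems:
            report[index] = problems
    return report


# Two hypothetical submissions: one compliant, one "proprietary" variant.
good = '{"tasks": [{"id": "1.1", "budgeted": 100.0, "earned": 80.0}]}'
bad = '{"tasks": [{"id": 11, "budgeted": 100.0, "EarnedValue": 80.0}]}'
```

The second submission parses as valid JSON yet fails the schema on three counts, which is exactly the situation a consuming organization faces when each supplier interprets the format its own way.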

Something New (Again)– Top Project Management Trends 2017

Atif Qureshi at Tasque, whom I learned of via Dave Gordon’s blog, went out to LinkedIn’s Project Management Community to ask for the latest trends in project management.  You can find the raw responses to his inquiry at his blog here.  What is interesting is that some of these latest trends are much like the old trends which, given continuity, makes sense.  But it is instructive to summarize the ones that came up most often.  Note that while Mr. Qureshi was looking for ten trends, and taken together he definitely lists more than ten, there is a lot of overlap.  In total the major issues seem to be the five areas listed below.

a.  Agile, its hybrids, and its practical application.

It should not surprise anyone that the latest buzzword is Agile.  But what exactly is it in its present incarnation?  There is a great deal of rising criticism, much of it valid, that it is a way for developers and software PMs to avoid accountability.  Anyone reading Glen Alleman’s Herding Cats blog is aware of the issues regarding #NoEstimates advocates.  As a result, there are a number of hybrid implementations of Agile that have Agile purists howling and non-purists adapting as they always do.  From my observations, however, there is an Ur-Agile out there that is common to all good implementations, and I wrote about it previously in this blog back in 2015.  Given the time, I think it useful to repeat it here.

The best articulation of Agile that I have read recently comes from Neil Killick, with whom I have expressed some disagreement on the #NoEstimates debate and the more cultish aspects of Agile in past posts, but who published an excellent post back in July (2015) entitled “12 questions to find out: Are you doing Agile Software Development?”

Here are Neil’s questions:

  1. Do you want to do Agile Software Development? Yes – go to 2. No – GOODBYE.
  2. Is your team regularly reflecting on how to improve? Yes – go to 3. No – regularly meet with your team to reflect on how to improve, go to 2.
  3. Can you deliver shippable software frequently, at least every 2 weeks? Yes – go to 4. No – remove impediments to delivering a shippable increment every 2 weeks, go to 3.
  4. Do you work daily with your customer? Yes – go to 5. No – start working daily with your customer, go to 4.
  5. Do you consistently satisfy your customer? Yes – go to 6. No – find out why your customer isn’t happy, fix it, go to 5.
  6. Do you feel motivated? Yes – go to 7. No – work for someone who trusts and supports you, go to 2.
  7. Do you talk with your team and stakeholders every day? Yes – go to 8. No – start talking with your team and stakeholders every day, go to 7.
  8. Do you primarily measure progress with working software? Yes – go to 9. No – start measuring progress with working software, go to 8.
  9. Can you maintain pace of development indefinitely? Yes – go to 10. No – take on fewer things in next iteration, go to 9.
  10. Are you paying continuous attention to technical excellence and good design? Yes – go to 11. No – start paying continuous attention to technical excellence and good design, go to 10.
  11. Are you keeping things simple and maximising the amount of work not done? Yes – go to 12. No – start keeping things simple and writing as little code as possible to satisfy the customer, go to 11.
  12. Is your team self-organising? Yes – YOU’RE DOING AGILE SOFTWARE DEVELOPMENT!! No – don’t assign tasks to people and let the team figure out together how best to satisfy the customer, go to 12.
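Because the twelve questions form a small flowchart, they can be sketched as a lookup table of "No" branches.  The function names, the return strings, and the choice to treat an unanswered question as "no" are my own assumptions for illustration, not part of Neil's post.

```python
# "No" branch target for each question, as written in the list above.
# Question 1 ends the walk ("GOODBYE"), represented here as None.
# Note that question 6 sends you back to question 2, not to itself.
NO_TARGET = {
    1: None, 2: 2, 3: 3, 4: 4, 5: 5, 6: 2,
    7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 12: 12,
}


def first_blocker(answers):
    """Return the first question answered "no", or None if all 12 are "yes".

    `answers` maps question number -> bool; a missing answer counts as "no"
    (an assumption made for this sketch).
    """
    for question in range(1, 13):
        if not answers.get(question, False):
            return question
    return None


def next_step(answers):
    """Describe where the flowchart sends a team with the given answers."""
    blocker = first_blocker(answers)
    if blocker is None:
        return "doing agile"
    target = NO_TARGET[blocker]
    if target is None:
        return "goodbye"
    return f"fix and return to question {target}"


all_yes = {q: True for q in range(1, 13)}
unmotivated = {**all_yes, 6: False}
```

With these inputs, `next_step(all_yes)` reports that the team is doing Agile, while `next_step(unmotivated)` sends the team back to question 2, mirroring the one branch in the list that does not loop on itself.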

Note that even in software development based on Agile you are still “provid(ing) value by independently developing IP based on customer requirements.”  Only you are doing it faster and more effectively.

With the possible exception of the “self-organizing” meme, I find that items 1 through 11 are valid ways of identifying Agile.  Note, though, that the list says nothing about establishing closed-loop analysis of progress, and nothing about estimates or the need to monitor progress, especially on complex projects.  As a matter of fact one of the biggest impediments noted elsewhere in industry is the inability of Agile to scale.  This limitation exists in its most simplistic form because Agile is fine in the development of well-defined limited COTS applications and smartphone applications.  It doesn’t work so well when one is pushing technology while developing software, especially for a complex project involving hundreds of stakeholders.  One other note–the unmentioned emphasis in Agile is technical performance measurement, since progress is based on satisfying customer requirements.  TPM, when placed in the context of a world of limited resources, is the best measure of all.

b.  The integration of new technology into PM and how to upload the existing PM corporate knowledge into that technology.

These are two sides of the same coin.  There is always debate about the introduction of new technologies within an organization, and this debate places in stark contrast the differences between risk aversion and risk management.

Project managers, especially in the complex project management environment of aerospace & defense, tend, in general, to be a hardy lot.  Consisting mostly of engineers, they love to push the envelope on technology development.  But there is also a stripe of engineers among them that do not apply this same approach of measured risk to their project management and business analysis system.  When it comes to tracking progress, resource management, programmatic risk, and accountability they frequently enter the risk aversion mode–believing that the fewer eyes on what they do the more leeway they have in achieving the technical milestones.  No doubt this is true in a world of unlimited time and resources, but that is not the world in which we live.

Aside from sub-optimized self-interest, the seeds of risk aversion come from the fact that many of the disciplines developed around performance management originated in the financial management community, and many organizations still come at project management efforts from the perspective of the CFO organization.  Such rice bowl mentality, however, works against both the project and the organization.

Much has been made of the wall of honor for those CIA officers that have given their lives for their country, which lies to the right of the Langley headquarters entrance.  What has not gotten as much publicity is the verse inscribed on the wall to the left:

“And ye shall know the truth and the truth shall make you free.”

      John VIII-XXXII

In many ways those of us in the project management community apply this creed to the best of our ability to our day-to-day jobs, and it lies at the basis of all management improvement, from Deming’s concept of continuous process improvement through the application of Six Sigma and other management improvement methods.  What is not part of this concept is that one will apply improvement only when a customer demands it, though they have asked politely for some time.  The more information we have about what is happening in our systems, the better the project manager and the project team are armed to apply the expertise which qualified the individuals for their jobs to begin with.

When it comes to continual process improvement one does not need to wait to apply those technologies that will improve project management systems.  As a senior manager (and well-respected engineer) told me when I worked in the Navy: “if my program managers are doing their job virtually every element should be in the yellow, for only then do I know that they are managing risk and pushing the technology.”

But there are some practical issues that all managers must consider when managing the risks in introducing new technology and determining how to bring that technology into existing business systems without completely disrupting the organization.  This takes good project management practices that, for information systems, include good initial systems analysis, identification of those small portions of the organization ripe for initial entry in piloting, and a plan of data normalization and rationalization so that corporate knowledge is not lost.  Adopting systems that support more open systems that militate against proprietary barriers also helps.

c.  The intersection of project management and business analysis and its effects.

As data becomes more transparent through methods of normalization and rationalization–and the focus shifts from “tools” to the knowledge that can be derived from data–the clear separation that delineated project management from business analysis in line-and-staff organization becomes further blurred.  Even within the project management discipline, the separation of schedule analysts from cost analysts from financial analysts is becoming an impediment to fully exploiting the advantages of looking at all data that is captured and which affects project performance.

d.  The manner of handling Big Data, business intelligence, and analytics that result.

Software technologies are rapidly developing that break the barriers of self-contained applications that perform one or two focused operations or a highly restricted group of operations that provide functionality focused on a single or limited set of business processes through high level languages that are hard-coded.  These new technologies, as stated in the previous section, allow users to focus on access to data, making the interface between the user and the application highly adaptable and customizable.  As these technologies are deployed against larger datasets that allow for integration of data across traditional line-and-staff organizations, they will provide insight that will garner businesses competitive advantages and productivity gains against their contemporaries.  Because of these technologies, highly labor-intensive data mining and data engineering projects that were thought to be necessary to access Big Data will find themselves displaced as their cost and lack of agility are exposed.  Internal or contracted-out custom software development devoted along these same lines will also be displaced just as COTS has displaced the high overhead associated with these efforts in other areas.  This is due to the fact that hardware and process developments are constantly shifting the definition of “Big Data” to larger and larger datasets to the point where the term will soon have no practical meaning.

e.  The role of the SME given all of the above.

The result of the trends regarding technology will be to put the subject matter expert back into the driver’s seat.  Given adaptive technology and data–and a redefinition of the analyst’s role to a more expansive one–we will find that the ability to meet the needs of functionality and the user experience is almost immediate.  Thus, when it comes to business and project management systems, while these developments reinforce and make real the characteristics of Agile that I outlined above, the weakness of its applicability to more complex and technical projects is also revealed.  It is technology that will reduce the risk associated with contract negotiation, processes, documentation, and planning.  Walking away from these necessary components of project management obfuscates and avoids the hard facts that oftentimes must be addressed.

One final item that Mr. Qureshi mentions in a follow-up post–and which I have seen elsewhere in similar forums–concerns operational security.  In deployment of new technologies a gatekeeper must be aware of whether that technology will open the organization’s corporate knowledge to compromise.  Given the greater and more integrated information and knowledge garnered by new technology, it is incumbent on us as good managers to ensure these improvements do not translate into undermining the organization.

Do You Know Where You’re Going To? — SecDef Ash Carter talks to Neil DeGrasse Tyson…and some thoughts on the international technology business

It’s time to kick off my 2017 blogging activity and my readers have asked about my absence on this blog.  Well because of the depth and research required by some of the issues that I consider essential, most of my blogging energy has been going to contributions to AITS.org.  I strongly recommend that you check out the site if you haven’t already.  A great deal of useful PM information and content can be found there–and they have a strong editorial staff so that what does get to publication is pretty well sourced.  My next post on the site is scheduled for 25 January.  I will link to it once it becomes available.

For those of us just getting back into the swing of things after the holidays, there were a number of interesting events that occurred during that time that I didn’t get a chance to note.  Among these is that SecDef Ash Carter appeared (unfortunately behind a subscription wall) on an episode of Neil DeGrasse Tyson’s excellent show “StarTalk”, which appears on the National Geographic Channel.

Secretary Carter had some interesting things to say, among them are:

a. His mentors in science, many of whom were veterans of the Second World War, instilled in him the concept of public service and giving back to the country.

b.  His experience under former SecDef Perry, when he was Assistant Secretary of Defense for International Security Policy, taught him that the DoD needed to be the “petri dish” for R&D in new technologies.

c.  That the approach of the DoD has been to leverage R&D into new technologies from the international technology industry, given that there are many good ideas and developments that occur outside of the United States.

d.  He encouraged more scientists to serve in the federal government and the Department of Defense, if even for a short while to get a perspective on how things work at that level.

e.  He doesn’t see that the biggest source of instability will necessarily be nation states; rather, small groups of individuals, given that destructive power is becoming portable, will be the emerging threat that his successor will face.

f. The imperative that the U.S. maintain its technological edge is essential in guaranteeing international stability and peace.

Secretary Carter’s comments, in particular his recognition that the technology industry is an international one, strike a particular personal chord with me since my present vocation has caused me to introduce new capabilities in the U.S. market built from technologies that were developed by a close European ally.  The synergy that this meeting of the minds has created has begun to have a positive impact on the small portion of the market that my firm inhabits, changing the way people do business and shifting the focus from “tools” as the source of information to data, and what the data suggests.

This is not to say that cooperation in the international technology market is not fraught with the same rocks and shoals found in any business area.  But it is becoming increasingly apparent that new information technologies can be used as a means of evening the playing field because of the asymmetrical nature of information itself, which then lends itself to leverage given relatively small amounts of effort.

This also points to the importance of keeping an open mind and encouraging international trade, especially among our allies that are among the liberal democracies.  Recently my firm was the target of a protest for a government contract where this connection to international trade was used as a means of questioning whether the firm was, indeed, a bona fide U.S. business.  The answer under U.S. law is a resounding “yes”–and that first decision was upheld on appeal.  For what we have done is–under U.S. management–leveraged technology first developed elsewhere, extended its capabilities, designed, developed, and localized it for the U.S. market, and in the process created U.S. jobs and improved U.S. processes.  This is a good deal all around.

Back in the day when I wore a U.S. Navy uniform during the Cold War, many of us in the technology and acquisition specialties looked to reform our systems and introduce innovative methods from wherever we could find them, whether they came from private industry or other government agencies.  When we came upon resistance because something was “the way it always was done”, our characterization of that attitude was “NIH”.  That is, “Not Invented Here.”  NIH was a term that, in shorthand, described an invalid counterargument against process improvement that did not rely on the merits or evidence.

And so it is today.  The world is always changing, but given new technologies the rate of change is constantly accelerating.  Adapting and adopting the best technologies available will continue to give us the advantage as a nation.  It simply requires openness and the ability to identify innovation when we see it.

Back in the Saddle Again — Putting the SME into the UI Which Equals UX

“Any customer can have a car painted any colour that he wants so long as it is black.”  — Statement by Henry Ford in “My Life and Work”, by Henry Ford, in collaboration with Samuel Crowther, 1922, page 72

The Henry Ford quote, which he made half-jokingly to his sales staff in 1909, is relevant to this discussion because the information sector has developed along the lines of the auto and many other industries.  The statement was only half-joking because Ford’s cars could be had in three colors.  But in 1909 Henry Ford had found a massive market niche that would allow him to sell inexpensive cars to the masses.  His competition wasn’t so much other auto manufacturers, many of whom catered to the whims of the rich and more affluent members of society, as the main means of individualized transportation at the time–the horse and buggy.  Color was not as important to this market as the need for simplicity and utility.

Since the widespread adoption of the automobile and the expansion of the market with multiple competitors, high speed roadways, a more affluent society anchored by a middle class, and the impact of industrial and information systems development in shaping societal norms, the automobile consumer has, over time, become more varied and sophisticated.  Today automobiles have introduced a number of new features (and color choices)–from backup cameras, to blind spot and back-up warning signals, to lane control, auto headlight adjustment, and many other innovations.  Enhancements to industrial production that began with the introduction of robotics into the assembly line back in the late 1970s and early 1980s, through to the adoption of Just-in-Time (JiT) and Lean principles in overall manufacturing, provide consumers a multitude of choices.

We are seeing a similar evolution in information systems, which leads me to the title of this post.  During the first waves of information systems development and introduction into our governing and business systems, the process has been one in which software is developed first to address an activity that is completed manually.  There would be a number of entries into a niche market (or for more robustly capitalized enterprises into an entire vertical).  The software would be fairly simplistic and the features limited, the objects (the way the information is presented and organized on the screen, the user selections, and the charts, graphs, and analytics allowed to enhance information visibility) well defined, and the UI (user interface) structured along the lines of familiar formats and views.

To include the input of the SME into this process, without specific soliciting of advice, was considered both intrusive and disruptive.  After all, software development largely was an activity confined to a select and highly trained specialty involving sophisticated coding languages that required a good deal of talent to be considered “elegant”.  I won’t go into a definition of elegance here, which I’ve addressed in previous posts, but for a short definition it is this:  the fewest bits of code possible that both maximize computing power and provide the greatest flexibility for any given operation or set of operations.

This is no mean feat and a great number of software applications are produced in this way.  Since the elegance of any line of code varies widely by developer and organization, the process of update and enhancement can involve a great deal of human capital and time.  Thus, the general rule has been that the more sophisticated any software application is, the more effort it requires to change and thus the less flexibility it possesses.  Need a new chart?  We’ll update you next year.  Need a new set of views or user settings?  We’ll put your request on the road-map and maybe you’ll see something down the road.

It is not as if the needs and requests of users have always been ignored.  Most software companies try to satisfy the needs of their customer, balancing the demands of the market against available internal resources.  Software websites, such as at UXmatters in this article, have advocated the ways that the SME (subject-matter expert) needs to be at the center of the design process.

The introduction of fourth-generation adaptive software environments–that is, those systems that leverage underlying operating environments and objects such as .NET and WinForms, that are open to any data through OLE DB and ODBC, and that leave the UI open to simple configuration languages that leverage these underlying capabilities and place them at the feet of the user–puts the idea of the SME at the center of the design process into practice.

This is a development in software as significant as the introduction of JiT and Lean in manufacturing, since it removes both the labor and time-intensiveness involved in rolling out software solutions and enhancements.  Furthermore, it goes one step beyond these processes by allowing the SME to roll out multiple software solutions from one common platform that is only limited by access to data.  It is as if each organization and SME has a digital printer for software applications.

Under this new model, software application manufacturers have a flexible environment to pre-configure the 90% solution to target any niche or market, allowing their customers to fill in any gaps or adapt the software as they see fit.  There is still IP involved in the design and construction of the “canned” portion of the solution, but the SME can be placed into the middle of the design process for how the software interacts with the user–and to do so at the localized and granular level.

This is where we transform UI into UX, that is, the total user experience.  So what is the difference?  In the words of Dain Miller in a Web Designer Depot article from 2011:

UI is the saddle, the stirrups, and the reins.

UX is the feeling you get being able to ride the horse, and rope your cattle.

As we adapt software applications to meet the needs of the users, the SME can answer many of the questions that have vexed software implementations for years, such as user perceptions of and reactions to the software, real and perceived barriers to acceptance, and variations in levels of training among users, among others.  Flexible adaptation of the UI will allow software applications to be more successfully localized, not only to meet the business needs of the organization and the user, but to socialize the solution in ways that are still being discovered.

In closing this post a bit of full disclosure is in order.  I am directly involved in such efforts through my day job and the effects that I am noting are not simply notional or aspirational.  This is happening today and, as it expands throughout industry, will disrupt the way in which software is designed, developed, sold and implemented.

You Know I’m No Good: 2016 Election Polls and Predictive Analytics

While the excitement and emotions of this past election work themselves out in the populace at large, as a writer and contributor to the use of predictive analytics, I find the discussion about “where the polls went wrong” to be of most interest.  This is an important discussion, because the most reliable polling organizations–those that have proven themselves out by being right consistently on a whole host of issues since most of the world moved to digitization and the Internet of Things in their daily lives–seemed to be dead wrong in certain of their predictions.  I say certain because the polls were not completely wrong.

For partisans who point to Brexit and polling in the U.K., I hasten to add that this is comparing apples to oranges.  The major U.S. polling organizations that use aggregation and Bayesian modeling did not poll Brexit.  In fact, there was one reliable U.K. polling organization that did note two factors:  one was that the trend in the final days was toward Brexit, and the other is that the final result was based on turnout, where greater turnout favored the “stay” vote.

But aside from these general details, this issue is of interest in project management because, unlike national and state polling, where there are sufficient numbers to support significance, at the micro-microeconomic level of project management we deal with very small datasets that expand the range of probable results.  This is not an insignificant point that has been made time-and-time again over the years, particularly given single-point estimates using limited time-phased data absent a general model that provides insight into what are the likeliest results.  This last point is important.
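To make the small-dataset point concrete, here is a minimal sketch in Python.  It assumes a normal approximation and a made-up standard deviation of 0.10 for some performance index; neither comes from a real project, and the function name is only illustrative.  All it is meant to show is how the plausible range around an average widens as the number of observations shrinks.

```python
import math

def half_width(stdev, n, z=1.96):
    """Approximate 95% half-width of the interval around a sample mean."""
    return z * stdev / math.sqrt(n)

# Hypothetical spread of a monthly performance index (an assumption).
ASSUMED_STDEV = 0.10

# Compare a handful of monthly data points with poll-sized samples.
for n in (6, 12, 100, 1000):
    print(f"n={n:>4}: +/-{half_width(ASSUMED_STDEV, n):.3f}")
```

With only six data points the interval is roughly thirteen times wider than with a thousand, and for samples that small a t-distribution would widen it further.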

So let’s look at the national polls on the eve of the election according to RealClear.  IBD/TIPP Tracking had it Trump +2 at +/-3.1% in a four way race.  LA Times/USC had it Trump +3 at the 95% confidence interval, which essentially means tied.  Bloomberg had Clinton +3, CBS had Clinton +4, Fox had Clinton +4, Reuters/Ipsos had Clinton +3, ABC/WashPost, Monmouth, Economist/YouGov, Rasmussen, and NBC/SM had Clinton +2 to +6.  The margin of error for almost all of these polls ranges from +/-3% to +/-4%.
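For readers wondering where margins like +/-3.1% come from, the following is a rough sketch, assuming a simple random sample, a 95% confidence level, and the worst case of an evenly split electorate.  Real pollsters weight their samples, so their published margins will not match this textbook formula exactly, and the formula applies to a single candidate’s share rather than to the lead.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Textbook 95% margin of error for a proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, p=0.5, z=1.96):
    """Smallest simple random sample that achieves the given margin."""
    return math.ceil((z ** 2) * p * (1 - p) / margin ** 2)

for margin in (0.031, 0.03, 0.04):
    n = sample_size_for(margin)
    print(f"+/-{margin:.1%} needs about {n} respondents "
          f"(check: {margin_of_error(n):.2%})")
```

Under these assumptions, margins of 3% to 4% correspond to roughly 600 to 1,070 respondents, which is the scale of a typical national poll.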

As of this writing Clinton sits at about +1.8% nationally.  The votes are still coming in and continue to confirm her popular vote lead, which currently stands at about 300,000 votes.  Of the polls cited, Rasmussen was the closest to the final result.  Virtually every other poll, however, except IBD/TIPP, was within the margin of error.

The polls that were off in predicting the election were those that aggregated national polls along with state polls, adjusted polling based on non-direct polling indicators, and/or then projected the chances of winning based on the probable electoral vote totals.  This is where things were off.

Among the most popular of these sites is Nate Silver’s FiveThirtyEight blog.  Silver established his bona fides in 2008 by picking winners with incredible accuracy, particularly at the state level, and subsequently in his work at the New York Times, which continued to prove the efficacy of data in predictive analytics in everything from elections to sports.  Since that time his significant reputation has only grown.

What Silver does is determine the probability of an electoral outcome by using poll results that are transparent in their methodologies and that have a high level of confidence.  Silver’s was the most conservative of these types of polling organizations.  On the eve of the election Silver gave Clinton a 71% chance of winning the presidency. The other organizations that use poll aggregation, poll normalization, or other adjusting indicators (such as betting odds, financial market indicators, and political science indicators) include the New York Times Upshot (Clinton 85%), HuffPost (Clinton 98%), PredictWise (Clinton 89%), Princeton (Clinton >99%), DailyKos (Clinton 92%), Cook (Lean Clinton), Roth (Lean Clinton), and Sabato (Lean Clinton).

In order to understand what probability means in this context, the polls were using both bottom-up state polling to track the electoral college combined with national popular vote polling.  But keep in mind, as Nate Silver wrote over the course of the election, that just a 17% chance of winning “is the same as your chances of losing a ‘game’ of Russian roulette”.  Few of us would take that bet, particularly since the result of losing the game is finality.
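The Russian roulette comparison is easy to check with a few lines of Python.  This sketch assumes the usual six-chamber revolver with one round loaded and uses the eve-of-election win probabilities cited in this post (Princeton is left out because it was given only as “>99%”).  The function name is just for illustration.

```python
# Chance of losing one pull of a six-chamber revolver with one round loaded.
ROULETTE = 1 / 6

# Favorite's win probability in percent, as cited in this post.
forecasts = {
    "FiveThirtyEight": 71,
    "NYT Upshot": 85,
    "PredictWise": 89,
    "DailyKos": 92,
    "HuffPost": 98,
}

def underdog_odds(favorite_pct):
    """Return the underdog's chance as (probability, N in 'one in N')."""
    underdog = (100 - favorite_pct) / 100
    return underdog, 1 / underdog

print(f"Russian roulette: {ROULETTE:.1%}")
for name, pct in forecasts.items():
    chance, one_in = underdog_odds(pct)
    print(f"{name}: underdog {chance:.0%}, about 1 in {one_in:.1f}")
```

Read this way, a 71% forecast leaves the underdog a better chance than a single pull of the trigger, while a 98% forecast treats the upset as a one-in-fifty event.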

Still, except for FiveThirtyEight, none of the other methods using probability got it right.  None, except FiveThirtyEight, left enough room for drawing the wrong chamber.  In fairness, the Cook, Rothenberg, and Sabato projections also left enough room to see a Trump win if the state dominoes fell right.

The places where the models failed were the states of Florida, North Carolina, Pennsylvania, Michigan, and Wisconsin.  In particular, even with Florida (result Trump +1.3%) and North Carolina (result Trump +3.8%), Trump would not win if Pennsylvania (result Trump +1.2%), Michigan (result Trump +0.3%), and Wisconsin (result Trump +1.0%)–supposed Clinton firewall states–were not breached.  So what happened?

Among the possible factors are the effect of FBI Director Comey’s public intervention, which came too close to the election to register in the polling; ineffective polling methods in rural areas (garbage in-garbage out); bad state polling quality; voter suppression, purging, and restrictions (of the battleground states this includes Florida, North Carolina, Wisconsin, Ohio, and Iowa); voter turnout and enthusiasm (aside from the factors of voter suppression); and the inability to peg the way the high level of undecided voters would go at the last minute.

In hindsight, the national polls were good predictors.  The sufficiency of the data in drawing significance, and the high level of confidence in their predictive power is borne out by the final national vote totals.

I think that the polling failed in the projections of the electoral college because of an inability to take into account non-statistical factors and selection bias, and because the state poll models probably did not accurately reflect the electorate in the states given the lessons from the primaries.  Along these lines, I believe that if pollsters look at the demographics in the respective primaries they will find that both voter enthusiasm and composition provide the corrective to their projections.  Given these factors, the aggregators and probabilistic models should all have called the race too close to call.  I think both Monte Carlo and Bayesian methods in simulations will bear this out.
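As a rough illustration of the kind of Monte Carlo check I have in mind, the sketch below simulates three “firewall” states under two assumptions about polling error.  Every number in it is hypothetical: the three-point leads, the four-point total error, and the split between an error shared across states and an error local to each state are placeholders, not actual 2016 polling data.

```python
import math
import random

def upset_probability(leads, shared_sd, local_sd, trials=20000, seed=1):
    """Estimate the chance that the trailing candidate wins every state."""
    rng = random.Random(seed)
    upsets = 0
    for _ in range(trials):
        # One error common to all states, such as a misjudged electorate.
        shared = rng.gauss(0.0, shared_sd)
        if all(lead + shared + rng.gauss(0.0, local_sd) < 0 for lead in leads):
            upsets += 1
    return upsets / trials

# Hypothetical poll leads (in points) for the favorite in three states.
LEADS = [3.0, 3.0, 3.0]
TOTAL_SD = 4.0   # assumed total polling error per state
SHARED_SD = 3.0  # assumed portion of that error common to all states
LOCAL_SD = math.sqrt(TOTAL_SD ** 2 - SHARED_SD ** 2)

independent = upset_probability(LEADS, 0.0, TOTAL_SD)
correlated = upset_probability(LEADS, SHARED_SD, LOCAL_SD)
print(f"independent errors: {independent:.1%}")
print(f"correlated errors:  {correlated:.1%}")
```

Each state carries the same total uncertainty in both runs.  The only difference is whether the states can all miss in the same direction at once, and under these made-up inputs that alone makes a sweep by the underdog several times more likely.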

For example, I also hold a political science degree and so will put on that hat.  It is a basic tenet that negative campaigns depress voter participation.  This causes voters to select the lesser of two evils (or lesser of two weevils).  Voter participation was down significantly due to an unprecedentedly negative campaign.  When this occurs, the most motivated base will usually determine the winner in an election.  This is why midterm elections are so volatile, particularly after a presidential win that causes a rebound of the opposition party.  Whether this trend continues with the reintroduction of gerrymandering is yet to be seen.

What all this points to from a data analytics perspective is that one must have a model to explain what is happening.  Statistics by themselves, while correct a good bit of the time, will cause one to be overconfident of a result based solely on the numbers and simulations that give the false impression of solidity, particularly when one is in a volatile environment.  This is known as reification.  It is a fallacious way of thinking.  Combined with selection bias and the absence of a reasonable narrative model–one that introduces the social interactions necessary to understand the behavior of complex adaptive systems–it will often produce invalid results.

New York Times Says Research and Development Is Hard…but maybe not

At least that is what a reader is led to believe by reading this article that appeared over the weekend.  For those of you who didn’t catch it, Alphabet, which formerly had an R&D shop under the old Google moniker known as Google X, does pure R&D.  According to the reporter, one Conor Dougherty, the problem, you see, is that R&D doesn’t always translate into a direct short-term profit.  He then makes this absurd statement:  “Building a research division is an old and often unsuccessful concept.”  He knows this because some professor at Arizona State University–that world-leading hotbed of innovation and high tech–told him so.  (Yes, there is sarcasm in that sentence).

Had Mr. Dougherty understood new technology, he would know that all technology companies are, at core, research organizations that sometimes make money in the form of net profits, just as someone once accurately described to me that Tesla is a battery company that also makes cars (and lately it’s showing).  But let’s return to the howler of a statement about research divisions being unsuccessful, apply some, you know, facts and empiricist thought, and go from there.

The most obvious example of a research division is Bell Labs.  From the article one would think that Bell Labs is a dinosaur of the past, but no, it still exists as Nokia Bell Labs.  Bell Labs was created in 1925 and has its antecedents in both Western Electric and AT&T, but its true roots go back to 1880 when Alexander Graham Bell, after being awarded the Volta Prize for the invention of the telephone, opened Volta Labs in Washington, D.C.  But it was in the 1920s that Bell Labs, “the Idea Factory,” really hit its stride.  Its researchers improved telephone switching and sound transmission, and invented radio astronomy, the transistor, the laser, information theory (about which I’ve written extensively and which directly impacts computing and software), Unix, and the languages C and C++.  Bell established the precedent that researchers kept and were compensated for use of their inventions and IP.  This goes well beyond the assertion in the article that Bell Labs largely made “contributions to basic, university-style research.”  I guess New York Times reporters, fact checkers, and editors don’t have access to the Google search engine or Wikipedia.

Between 1937 and 2014 seventeen of their researchers were awarded the Nobel Prize or Turing Award.  Even those who never garnered such an award, like Claude Shannon of the aforementioned information theory, are among a Who’s Who of researchers into high tech.  What they didn’t invent directly they augmented and facilitated to practical use, with a good deal of their input going into public R&D through consulting and other contracts with the Department of Defense and federal government.

The reason why Bell Labs didn’t continue as a research division of AT&T wasn’t due to some dictate of the market or investor dissatisfaction.  On the contrary, AT&T (Ma Bell) dominated its market, and Bell Labs ensured that it stayed far ahead of any possible entry.  This is why in 1984 the U.S. Justice Department reached a divestiture agreement for AT&T under antitrust laws to split off Bell Labs from its local carriers in order to promote competition.  Whether the divestiture agreement was a good deal for the American people and had positive economic effects is still a cause for debate, but it is likely that the plethora of choices in cell phone and other technologies that have emerged since that time would not have gone to market without that antitrust action.

Since 1984, Bell Labs continued its significant contributions to the high tech industry through AT&T Technologies, which was spun off in 1996 as Lucent Technologies, which is probably why Mr. Dougherty didn’t recognize it.  A merger with Alcatel and then acquisition by Nokia have provided it with its current moniker.  Bell Labs over that period continued to innovate and has contributed significantly to pushing the boundaries of broadband speed and the use of imaging technology in the medical field.

So what this shows is that, while not every bit of R&D leads directly to profit, especially in the short term, a mix of types of R&D does yield practical results.  Anyone who has worked in project management understands that R&D, by definition, represents the handling of risk.  Furthermore, the lessons learned and spin-offs are hard to estimate in advance, though they may result in practical technologies in the short and medium term.

When one reads past the lede and the “research division is an old and often unsuccessful concept” gaffe, among others, what one finds is that Google specifically wants this portion of the research division to come up with a series of what it calls “moon shots”.  In techie lingo this is often called a unicorn, and from personal experience I am part of a company that recently was characterized as delivering a unicorn.  This is simply a shorthand term for producing a solution that is practical, groundbreaking, and shifts the dialogue of what is possible.  (Note that I’m avoiding the tech hipster term “disruption”).

Another significant fact that we find out about Google X is the following:

X employees avoid talking about money, but it is not a subject they can ignore. They face financial barriers that can shut down a project if it does not pan out as quickly as planned. And they have to meet various milestones before they can hire more people for their teams.

This sounds a lot like project and risk management.  But Google X goes a bit further.

Failure bonuses are also an example of how X, which was set up independent of Google from the outset, is a leading indicator of sorts for how the autonomous Alphabet could work. In Alphabet, employees who do not work for Mother Google are supposed to have their financial futures tied to their own company instead of Google’s search ads. At X, that means killing things before they become too expensive.

Note that the incentive here, given in terms of a real financial incentive to the team members, is to manage risk.  No doubt, there are no #NoEstimates cultists at Google.  Psychologically, providing an incentive to find failure no doubt defeats Groupthink and optimism selection bias.  Much of this sounds, particularly in the expectation of non-existential failure, amazingly along the lines of an article recently published on AITS.org by yours truly.

The delayed profitability of software and technology companies is commonplace.  The reason for this is that, at least to my thinking, any technology type worth their salt will continue to push the technology once they have their first version marked to market.  If you’re resting on your laurels then you’re no longer in the software technology business, you’re in the retail business and might as well be selling candy bars or any other consumer product.  What you’re not doing is being engaged in providing a solution that is essential to the target domain.  Practically what this means is that, in garnering value, net profitability is not necessarily the measure of success, especially in the first years.

For example, market leaders such as Box, Workday, and Salesforce have gone years without a net profit, though revenues and market share are significant.  Facebook did not turn a profit for five years.  Amazon took six years, and even those figures were questionable.  The competing need for any executive running a company is between value (the intrinsic value of IP, existing customer base, and potential customer base), and profit.  The job of the CEO is not just to stockholders, yet the article in its lede clearly is biased in that way.  The fiduciary and legal responsibility of the CEO is to the customers, the employees, the entity, and the stockholders–and not necessarily in that order.  There is thus a natural conflict in balancing these competing interests.

Overall, if one ignores the contributions of the reporter, the case of Google X is a fascinating one for its expectations and handling of risk in R&D-focused project management.  It takes value where it can and cuts its losses through incentives to find risk that can’t be handled.  An investor that lives in the real world should find this reassuring.  Perhaps these lessons on incentives can be applied elsewhere.