A Domain-Specific Architecture for Deep Neural Networks | September 2018 | Communications of the ACM

Moore’s Law is dead. Let’s not mince words.

DRAM chips introduced in 2014 contained eight billion transistors. Chips with 16 billion will not be mass-produced until 2019. At the very least, the growth exponent has flattened.

Meanwhile, the company associated with Moore, Intel, is struggling to stay relevant with its CISC architecture. The RISC-based ARM architecture used by Apple in the iPad line is approaching the performance of Intel, while using significantly less power.

The lesser-known Dennard scaling law is also dead. Power consumption no longer shrinks along with transistor size; it now scales more linearly with the number of circuits.

What this means, in practice, is that new computing nodes based on traditional architectures will require more power to deliver smaller advances in performance. Consumers have been aware of this trend for some time, but it will only get worse until new architectures achieve mainstream status. And then advances will depend a lot more on what you are planning to do with your computer.

We are already at the early edge of this. Or not. Gamers have long been aware of the benefits of graphical processing units (GPUs) for improving gaming performance and would pay dearly for those benefits. GPUs exploit parallelism in matrix calculations (see linear algebra) to significantly improve performance in specific domains (i.e. gaming) while leaving other forms of computation untouched. Few other desktop computing applications exploit a GPU’s capabilities.
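To make that parallelism concrete, here is a toy matrix multiply in Python. Every output entry is an independent dot product, which is exactly the kind of work a GPU can spread across thousands of cores (the matrices here are invented for illustration):

```python
def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p)."""
    n, m, p = len(a), len(b), len(b[0])
    # Every c[i][j] below is an independent dot product -- on a GPU, all
    # n*p of them can be computed at the same time.
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```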

There are a couple of popular exceptions:

  • Bitcoin mining

  • Neural network learning

The rise in these applications has been a boon for Nvidia, while the recent decline in the former has knocked the bottom out of its growth projections.

What the death of Moore’s Law and Dennard scaling portends is:

  • Flattening performance of general-purpose CPUs.

  • A shift from CISC to RISC CPU architectures. Apple and Microsoft have already shown signs they want to shift their PCs towards ARM-based processors that use less power — this will be pronounced by 2020.

  • Increasing reliance on alternative, domain-specific computation architectures that excel in specific areas.

Other buzzwords include:

  • Field Programmable Gate Arrays (FPGAs)

  • Custom ASICs

  • Tensor Processing Unit (TPU)

  • Neuromorphic computing

Most of these concepts are not new, but all are becoming newly relevant.

I have to admit I find the literature on computing architecture to be tedious and dull. I stopped following advances in Intel chips a long time ago. However, if computing architecture is your thing, it is a great time to be alive.


IT Service Management Trends

Do boring speakers really talk for longer?

This headline violates Betteridge’s Law of Headlines — Yes, boring speakers do drone on longer than interesting ones.

Although brief, the article contains some good pointers to keep in mind for my upcoming presentation on Managing IT in 2019.

“Dull talks at conferences can feel interminable.”


Demonstration of Data Analysis with Quantum Computing

It’s hard to miss the news on quantum computing. Breakthroughs in the last few years have demonstrated the opportunities and potential of quantum computing. The question is whether it will scale to more qubits while maintaining stable quantum entanglement. There are detractors, but it is too promising and far-reaching to ignore.

The work that Huang and co have done is to run this algorithm on a quantum computer in a proof-of-principle experiment. The team uses a six-photon quantum processor to analyze the topological features (the Betti numbers) of a network of three data points at two different scales. And the outcome is exactly as expected.

Of course, this example is not so hard for classical computers or even human brains to analyze. But the key point is that the Chinese have made it work on a quantum computer, a device that is set to dramatically outperform conventional computers in the coming years.
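For a sense of how easy this particular task is classically, here is a rough sketch in Python: connect points that fall within a given scale, then read the Betti numbers off the resulting graph (b0 = connected components, b1 = independent cycles). The three points and two scales are made-up stand-ins for the experiment's data:

```python
from itertools import combinations

def betti(points, scale):
    """Betti numbers (b0, b1) of the graph connecting points closer than scale."""
    edges = [(i, j) for i, j in combinations(range(len(points)), 2)
             if abs(points[i] - points[j]) < scale]
    # Union-find to count connected components (b0).
    parent = list(range(len(points)))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for i, j in edges:
        parent[find(i)] = find(j)
    b0 = len({find(i) for i in range(len(points))})
    b1 = len(edges) - len(points) + b0  # independent cycles in a graph
    return b0, b1

pts = [0.0, 1.0, 2.0]
print(betti(pts, 1.5))  # (1, 0): one component, no loop
print(betti(pts, 2.5))  # (1, 1): all pairs connected, one loop
```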

Article in Technology Review


A Critique of Pure Data: Part 2

Please see Part 1 here.

Enter Big Data

In the June 2013 issue of Foreign Affairs (“The Rise of Big Data”), Kenneth Cukier and Viktor Mayer-Schoenberger describe the phenomenon as more than larger sets of data. It is also the digitization of information previously stored in non-digital formats, and the capture of data, such as location and personal connections, that was never previously available.

They describe three profound changes in how we approach data.

  1. We collect complete sets of data, rather than samples that must be interpreted with traditional techniques of statistics.
  2. We are trading our preference for curated, high-quality data sets in favor of variable, messy ones whose benefits outweigh the costs of curating them.
  3. We tolerate correlation in the absence of causation. In other words, we accept the likelihood of what will happen without knowing why it will happen.
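That third change is easy to demonstrate: correlation is pure arithmetic, computable with no causal story attached. A minimal Pearson correlation in Python, on invented data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Classic illustration: strongly correlated, yet neither causes the other
# (both track a third factor, summer heat). Numbers are invented.
ice_cream_sales = [20, 25, 33, 40, 48]
drownings = [3, 4, 5, 6, 8]
print(round(pearson(ice_cream_sales, drownings), 3))  # 0.991
```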

Big data has demonstrated significant gains, and a notable one is language translation. Formal models of language never progressed to a usable point, despite decades of effort. In the 1990s IBM broke through using statistical translation from a French-English dictionary gleaned from high-quality Canadian parliamentary transcripts. Then progress stalled until Google applied massive memory and processing power to much larger and messier data sets of words numbering in the billions. Machine translation is now much more accurate and covers 65 languages (which Google can detect automatically when most humans could not).
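The statistical approach can be sketched in miniature. The toy below counts which target-language words co-occur with each source word across a tiny, invented parallel corpus and picks the partner that rarely appears without it (a crude caricature of the alignment idea, not IBM’s or Google’s actual method):

```python
from collections import Counter, defaultdict

# A tiny invented parallel corpus, standing in for parliamentary transcripts.
parallel = [
    ("the house", "la maison"),
    ("the blue house", "la maison bleue"),
    ("the blue car", "la voiture bleue"),
    ("the car", "la voiture"),
]

cooc = defaultdict(Counter)   # cooc[english][french] = co-occurrence count
f_totals = Counter()          # how often each French word appears overall
for en, fr in parallel:
    f_words = fr.split()
    f_totals.update(f_words)
    for e in en.split():
        cooc[e].update(f_words)

def translate_word(e):
    # Prefer partners that rarely appear *without* the source word.
    return max(cooc[e], key=lambda f: cooc[e][f] / f_totals[f])

print(translate_word("house"))  # maison
print(translate_word("blue"))   # bleue
print(translate_word("car"))    # voiture
```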

Another notable success was the 2011 victory of IBM’s Watson over former winners in the game Jeopardy. Like Google Translate, the victory was based primarily on the statistical analysis of 200 million pages of structured and unstructured content. It was not based on a model of the human brain. Watson falls short of a true Turing Test, but it is significant nonetheless.

The loss of causality is not, by definition, a loss of useful information. UPS uses sensors to diagnose likely engine failures without understanding the cause of failure, reducing time spent on the roadside. Medical researchers in Canada have correlated small changes in large data streams of vital statistics to serious health problems, without understanding why those changes occur.

Given these successes, and the presence of influential political movements that attempt to discredit the validity of scientific models in areas such as evolutionary biology and climate science, it is tempting to announce the death of models. Indeed many pundits of late have written obituaries on causation.

I believe these proclamations are premature. For starters, models in the form of data structures and algorithms are the backbone of big data. The rise of big data derives not only from the increased availability of processing power, memory, and storage, but also from the algorithms that use these resources more efficiently and enable new methods of identifying correlations. Some of these techniques are implicit, such as the rise of NoSQL databases that eliminate structured data tables and table joins. Others are innovative ways to find patterns in the data. Regardless, understanding which algorithms to apply to which data sets requires understanding them as abstract models of reality.
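The NoSQL point can be shown in a few lines: a relational layout answers a question with a join, while a denormalized document keeps the answer inline. The customer data is invented for illustration:

```python
# Relational-style: two "tables" linked by customer_id.
customers = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
orders = [{"customer_id": 1, "item": "laptop"},
          {"customer_id": 1, "item": "mouse"},
          {"customer_id": 2, "item": "desk"}]

def orders_for(name):
    # The lookup-then-filter below is the "join".
    cid = next(c["id"] for c in customers if c["name"] == name)
    return [o["item"] for o in orders if o["customer_id"] == cid]

# Document-style: one denormalized record per customer, no join needed.
docs = {"Alice": {"orders": ["laptop", "mouse"]},
        "Bob": {"orders": ["desk"]}}

print(orders_for("Alice"))      # ['laptop', 'mouse']
print(docs["Alice"]["orders"])  # ['laptop', 'mouse']
```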

As practitioners discover more correlations that were never known before, researchers will ask more questions and better questions about why those correlations exist. We won’t get away from the why entirely, in part because the new correlations will be so intriguing that the causation will become more important. Researchers can not only ask better questions, but they will have new computational techniques and larger data sets with which to establish the validity of new models. In other words, the same advances that enable big data will enable the generation of new models, albeit with a time lag.

Moreover, as we press for more answers from the large data sets, we will find it increasingly harder to establish correlations. Analysts will solve this in part by finding new sets of data, and there will always be more data generated. However, much of the data will be redundant with existing data sets, or of poorer quality. As the correlations become more ambiguous, analysts will have to work harder to ask why. Analysts will inevitably have to establish causation in order to improve the quality of their predictions.

Please note that I don’t discount the successes of big data. This is one of the most important developments in the industry. Instead I conclude the availability of new data sources and means to process them does not mean the death of modeling. It is leading instead to a great renaissance of model creation that advances hand-in-hand with big data.

Modelling Trends

A Critique of Pure Data: Part 1

Rationalism was a European philosophy popular in the 17th and 18th centuries that emphasized discovering knowledge through the use of pure reason, independent of experience. It rejected the assertion of Empiricism that no knowledge can be deduced a priori. At the center of the dispute was cause and effect–whether effects could ever be determined from causes, whether causes could ever be deduced from effects, or whether they had to be learned through experimentation. Kant, a Rationalist, observed that both positions are necessary to understanding.

Modern science descended from Empiricism, but like Kant is pragmatic, neither accepting nor rejecting either position entirely. Scientists observe nature, deduce models, make predictions using the models, and test the predictions against observations. They describe the assumptions and limits of the models, and refine the models to adapt to new observations.

The old quip says all models are wrong, but some are useful. Scientific models are useful only to the extent they are demonstrated useful. At their simplest, they are abstract representations of the real world that are simpler and easier to comprehend than the complex phenomena they attempt to explain. They can be intuited from pure thought, or induced from observation. The benefit of models is their simplicity–they are easier to manipulate and analyze than their real-world counterparts.

Models are useful in some situations and not useful in others. Good models are fertile, meaning they apply to several fields of study beyond those originally envisioned. For example, agent models have demonstrated how cities segregate despite widespread tolerance of variation. Colonel Blotto outcomes can be applied to electoral college politics, sports, legal strategies, and screening of candidates.

To be useful, models are predictive, meaning they can infer effects from causes. For example, a model can predict that a given force (e.g. a rocket) applied to an object of a given mass (e.g. a payload) will cause a given amount of acceleration, which causes an increase in velocity over time. Models predict that clocks in orbit on Earth satellites run slightly faster than those on the surface, a result of the gravitational time dilation predicted by general relativity. Models may be useful in one domain but not appropriate for another. Users have to be aware of a model’s capabilities and limitations.
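The rocket example is just arithmetic (a = F/m, and velocity grows linearly under constant acceleration, v = a*t); the numbers below are invented:

```python
# Newton's second law as a predictive model: a = F / m.
force_n = 50000.0  # thrust in newtons (assumed value)
mass_kg = 10000.0  # vehicle + payload mass in kg (assumed value)

accel = force_n / mass_kg        # acceleration, m/s^2
velocity_after_60s = accel * 60  # v = a * t under constant thrust

print(accel)              # 5.0
print(velocity_after_60s)  # 300.0
```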

Models give us the ability to distinguish causation from correlation. We may correlate schools running equestrian programs with higher academic performance, but we would be unwise to accept causation. We would have to create a model to show how aspects of equestrian activities improve cognitive development, and to discount the relevance of other models that may attribute causation to other factors. We would then search out data that can confirm or deny the effects of equestrian activity on cognition. (It is more likely there are other causal factors acting on both equestrian programs and academic performance.) Whether or not models can show causal connections to all worldly phenomena, they can guide us to better questions.

For this discussion we are interested in computation, and that means Alan Turing who, in 1936, devised a Universal Turing Machine (UTM) that is a simple model for a computer. Turing showed the UTM can be used to compute any computable sequence. At the time this conclusion was astonishing. The benefit of the UTM lies not in its practicality–it is not a practical device–but in the simplicity of the model. In order to prove a problem is computable, you just need to demonstrate a program for the UTM. Separately, Turing also gave us the Turing Test, an approximate model of intelligence.
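The idea is easy to demonstrate: a Turing machine interpreter fits in a dozen lines of Python. The machine below is a toy of my own devising (not Turing's encoding) that flips every bit on its tape and halts:

```python
# Transition rules: (state, symbol) -> (symbol to write, head move, next state)
rules = {
    ("flip", "0"): ("1", 1, "flip"),
    ("flip", "1"): ("0", 1, "flip"),
    ("flip", "_"): ("_", 0, "halt"),  # "_" is the blank symbol: stop
}

def run(tape, state="flip", pos=0):
    """Run the machine until it reaches the halt state; return the tape."""
    tape = list(tape)
    while state != "halt":
        symbol = tape[pos] if pos < len(tape) else "_"
        write, move, state = rules[(state, symbol)]
        if pos < len(tape):
            tape[pos] = write
        pos += move
    return "".join(tape)

print(run("10110"))  # 01001
```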

Those who use models to make predictions have been demonstrated to be more accurate than experts or non-experts using intuition. This last point is the most important, and is the main reason we develop and use them.

The IT Service Management industry lacks academic rigor because it has never been modeled. Most academic research consists of mostly vain attempts to measure satisfaction and financial returns. Lacking a model, it is impossible to predict the effect of an “ITIL Implementation Project” on an organization or how changes to the frameworks will affect industry performance. Is ITIL 2011 any better than ITIL V2? We presume it is, but we don’t know.

Continued in Part 2

IT Governance IT Service Management Project Management Trends

Service Management Is Dead

“Service Management is dead.”

That was my first thought when I read McKinsey Quarterly’s “Capturing value from IT infrastructure innovation” from October 2012.

That was going to be the point of this blog post.

Then I read it again.

Conclusion 1: Innovation is more than just technology.

Conclusion 3: The path to end-user productivity is still evolving.

Conclusion 5: Proactive Engagement with the business is required.

Conclusion 6: Getting the right talent is increasingly critical.

Conclusion 7: Vendor relationships must focus on innovation.

Getting the most from IT infrastructure has never been about technology (though technology is an important capability of IT). Innovating, maximizing productivity, and managing complexity evoke the mundane, at the expense of the sexy.

It engages users.

It demands service.

It depends on process and automation.

It focuses on data and knowledge.

It understands and balances the needs of all stakeholders.

Technology is fun. Where technologists hang out are fun places to be. I know this may sound strange to those outside the industry, but the people who move technology are fascinating.

The most boring business events involve Project Managers and Risk and Compliance Officers. I have been to many meetings, and they are yawners, even for me.

That’s because project managers and auditors focus on the boring stuff.

Who are the stakeholders?

Who makes what decisions?

What do they want?

What kind of data do we have?

What kind of data do we need?

Where is the data?

How do we use the data most effectively?

What are the risks, and how do we mitigate them?


For better or worse, this is the stuff that underpins business value: the foundation on which innovation is built.

Long live Service Management.

COBIT Incident Management IT Service Management ITIL Trends

The Role of COBIT5 in IT Service Management

In Improvement in COBIT5 I discussed my preference for the Continual Improvement life cycle.

Recently I was fact-checking a post on ITIL (priorities in Incident Management) and I became curious about the guidance in COBIT5.

The relevant location is “DSS02.02 Record, classify and prioritize requests and incidents” in “DSS02 Manage Service Requests and Incidents”. Here is what it says:

3. Prioritise service requests and incidents based on SLA service definition of business impact and urgency.

Yes, that’s all it says. Clearly COBIT5 has some room for improvement.
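For contrast, here is one common way to make that single sentence operational: an ITIL-style priority matrix that looks priority up from business impact and urgency. The 3x3 scale below is an illustrative assumption, not quoted from COBIT5 or ITIL:

```python
# (impact, urgency) -> priority, where 1 is the most severe.
# The exact scale and labels are an assumption for illustration.
PRIORITY = {
    ("high", "high"): 1, ("high", "medium"): 2, ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3, ("low", "medium"): 4, ("low", "low"): 5,
}

def prioritize(impact, urgency):
    """Return the priority for an incident given its impact and urgency."""
    return PRIORITY[(impact, urgency)]

print(prioritize("high", "medium"))  # 2
print(prioritize("low", "low"))      # 5
```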

COBIT5 is an excellent resource that complements several frameworks, including ITIL, without being able to replace them. For the record, the COBIT5 framework says it serves as a “reference and framework to integrate multiple frameworks,” including ITIL. COBIT5 never claims to replace other frameworks.

We shouldn’t expect to throw away our ITIL books any time soon. Damn! I was hoping to clear up some shelf space.

IT Service Management ITIL Knowledge Management Tools Trends

HP’s $10 billion SKMS

In August 2011 HP announced the acquisition of enterprise search firm, Autonomy, for $10 billion.

It is possible HP was just crazy and former CEO, Leo Apotheker, was desperate to juice up HP’s stock price. With Knowledge Management.

Within ITSM the potential value is huge. Value can be seen in tailored services and improved usage, faster resolution of Incidents, improved availability, faster on-boarding of new employees, and reduction of turnover. (Ironically, improved access to knowledge can reduce loss through employee attrition).

In 2011 Client X asked me for some background on Knowledge Management. I did prepare some background information on ITIL’s Knowledge Management that was never acted on. It seemed like too much work for too little benefit.

ITIL’s description does seem daunting. The process is riddled with abstractions like the Data —> Information —> Knowledge —> Wisdom lifecycle. It elaborates on diverse sources of data such as issue and customer history, reporting, structured and unstructured databases, and IT processes and procedures. ITIL overwhelms one with integration points between the Service Desk system, the Known Error Database, the Configuration Management Database, and the Service Catalog. Finally, ITIL defines a whole new improvement cycle (Analysis, Strategy, Architecture, Share/Use, and Evaluate), a continuous improvement method distinct from the CSI 7-Step Method.

Is ITIL’s method realistic? Not really. It is unnecessarily complex. It focuses too much on architecture and integrating diverse data sources. It doesn’t focus enough on use-cases and quantifying value.

What are typical adoption barriers? Here are some:

  1. Data is stored in a variety of structured, semi-structured, and unstructured formats. Unlocking this data requires disparate methods and tools.
  2. Much of the data sits inside individual heads. Recording this requires time and effort.
  3. Publishing this data requires yet another tool or multiple tools.
  4. Rapid growth of data and complexity stays ahead of our ability to stay on top of it.
  5. Thinking about this requires way too much management bandwidth.

In retrospect, my approach with Client X was completely wrong. If I could, I would go back and change that conversation. What should I have done?

  1. Establish the potential benefits.
  2. Identify the most promising use cases.
  3. Quantify the value.
  4. Identify the low hanging fruit.
  5. Choose the most promising set of solutions to address the low hanging fruit and long-term growth potential.

What we need is a big, red button that says “Smartenize”. Maybe HP knew Autonomy was on to something. There is a lot of value in extracting knowledge from information, meaning from data. The rest of the world hasn’t caught up yet, but it will soon.


The 17 Step Expert

Originally To A Friend Struggling With Career:

I was chatting with an old friend a few days ago who is struggling with career direction and what she wants to do. Here is my advice, compiled from a variety of “expert” sources and personal experience. Keep in mind these pertain to expertise in a knowledge-based industry.

  1. Decide what you want to do. I am not a fan of 1 year, 2 year, 5 year, 10 year, and 20 year plans. Basically think big–end of lifetime goals, the stuff you would write on your gravestone. Then make short-term plans that get you there. Otherwise, the economy and your current circumstances change too much to predict where you will be in 5 years, or 10 years.
  2. Choose the area in which you want to be an expert. Choose a subject that is sufficiently narrow. “Expert in Information Technology” or “teaching” is too broad. However, “Expert in Agile software methods” is probably right.
  3. It takes one hour per day for 3 years to become an expert on a subject. This represents approximately 1,000 hours of effort.
  4. One hour per day for 5 years will make you a nationally recognized expert. This represents approximately 1,800 hours of effort.
  5. One hour per day for 7 years will make you an internationally recognized expert. This represents approximately 2,500 hours of effort. (Among performers, for example professional musicians or athletes, the general rule of thumb is 10,000 hours of practice. Malcolm Gladwell in Outliers also estimated that Bill Gates spent 10,000 hours programming computers before he started Microsoft. Please note that we are defining a field of study more narrowly than Bill Gates’s. In addition, we are avoiding areas that involve significant eye-hand coordination practice and muscular development.)
  6. Motivation will be an issue. Staying at something daily for this long is a challenge. Try to find ways to reward yourself along the way. If you achieve a certain milestone, then reward yourself with a vacation. This is personal, so take some time to think about this. On the other hand, some aspects of this are self-rewarding.
  7. Buy and read all the books on that subject. Summarize. Condense. Publish book reviews on Amazon or on a Blog about each book. If you are able to read and repeat the contents of three books on the subject, you are probably qualified to present at college-level seminars on the subject.
  8. Present at seminars and conferences.
  9. Find all the academic articles you can. Outline them. Summarize the arguments. Compare and contrast the findings. If possible, write and publish your own academic paper.
  10. Connect with the experts in the field: email, LinkedIn, FaceBook, Twitter, etc. The world of social networks has made it easier than ever to identify and connect with the world’s experts.
  11. Start a weblog. Try to write something twice a week. You won’t make money on a weblog, but that isn’t the point. It is about publishing your thoughts and expertise. Respond to comments. Engage with readers. There are methods for improving the readership and popularity of your weblog. I am not an expert in them, but they are there and you should research them.
  12. Cross-post your Blog posts on the social networks.
  13. Find online chat groups. Participate: ask questions, answer questions.
  14. Identify the conferences on the subject. Attend them if possible.
  15. Even better, present at the conference.
  16. If you are entrepreneurial, start your own company doing just that. If not, it helps to be working in or near that field, even if for someone else.
  17. Invent your own theories and methods. Publish them. Try them out in the real world.
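The effort estimates in steps 3-5 can be sanity-checked with a one-line calculation:

```python
def hours(years, per_day=1):
    """Total hours from practicing per_day hours every day for some years."""
    return years * 365 * per_day

print(hours(3))  # 1095 -- "approximately 1,000 hours"
print(hours(5))  # 1825 -- "approximately 1,800 hours"
print(hours(7))  # 2555 -- "approximately 2,500 hours"
```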

Everyone struggles with money, but try not to worry about that in the short-term. After you have achieved expertise and recognition, the money will follow. But you need to focus every day, at least an hour. And try to do all of the above every week. It is difficult, but don’t let one aspect slip for too long.

Strategy Trends

Empowered: More FLAWs than an Uncelebrated HERO

If nothing else, Empowered, Unleash Your Employees, Energize Your Customers, Transform Your Business has given the world several new FLAWs (four letter acronym words). At last reckoning there were three: HERO, IDEA, and POST, but one of these was introduced in an earlier book, Groundswell.

Empowered has given the world a lot more than that. My title is unfair perhaps, because I liked this book, and the further I read the more I liked it. You cannot read a business book these days that doesn’t introduce a new acronym, and I have come to see it as a substitute for strong knowledge or good writing. Fortunately Bernoff and Schadler are both knowledgeable and good writers, so I wish they wouldn’t resort to gimmicks.

The best part of the book is the specific examples of real companies doing real projects, mostly Forrester customers. Empowered ties together many trends that, although I was aware of them individually, I had not seen so closely interlinked. Social media (Twitter, Facebook, and LinkedIn), mobile computing, project management, information security, and the traditional roles of customer service are among the topics addressed. The hero of the story is, of course, the HERO, or highly empowered and resourceful operative, who is dragging companies, kicking and screaming, into the 21st century.

HERO means more than it seems. Imagine a 2-dimensional matrix forming a quadrant—yes, this quadrant is in the book, but not until chapter 8. On the X-axis (from left to right) is empowerment. On the Y-axis (from bottom to top) is resourcefulness. At the bottom left of the quadrant are disenfranchised employees who are neither empowered nor resourceful, making up approximately one-third of most companies. The next one-third of employees are those who are locked-down—empowered but not resourceful. The smallest percentage, maybe one-eighth, are the rogues, who are resourceful but not empowered. The rest are HEROs. The goal of organizations, then, is not to expand that quadrant as much as possible, but to get the best people into the HERO roles and to get the organization behind them. Easier said than done, but there is a lot of substance in Empowered to help on the journey.
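The quadrant reduces to a tiny function, using the segment names exactly as described above (the labels mirror this review's summary, not Forrester's own definitions):

```python
def segment(empowered, resourceful):
    """Classify an employee by the two axes of the HERO quadrant."""
    if empowered and resourceful:
        return "HERO"
    if empowered:
        return "locked-down"
    if resourceful:
        return "rogue"
    return "disenfranchised"

print(segment(True, True))    # HERO
print(segment(False, False))  # disenfranchised
```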

The book is divided roughly in half. Part one discusses HEROs and HERO projects in detail, including how they have saved organizations and how the lack of a HERO has led to substandard responses and embarrassing situations. Prominent here are the realities of social media and mobile technologies. Part two discusses actions organizations can take to enable the HERO. Similar themes run through the book, and this is not a collection of random blog posts.

Part one did turn me off in many places. The authors seemed to target me, an IT professional, and my colleagues as the chief disablers of HERO behaviors. I hope we can be forgiven. We understand as well as anyone the complexity behind modern businesses, and how frail it really is under the hood. We are the individuals whose heads get beaten whenever a server crashes or data is compromised, regardless of whether we had anything to do with the initial implementation. We’ve been SOX’ed, mandated, legislated, and audited to death. A little more respect would be nice.

Fortunately, the book delivers some of that in part two. It recognizes some of the issues faced by IT and provides some guidance for IT professionals. It spends time on a couple of IT leaders who have reached out to other business units to build creative and innovative solutions. Ultimately this is not about IT, but about business leaders understanding that the borders of the organization are no longer around its physical premises and its high-walled data centers. The borders of the organization are around its people. Employees and customers are using Twitter and YouTube, and the conduits for leakage are unfathomable. Employees have to exercise common sense and be professional. The emphasis of the Information Security office has to migrate from applying technical band-aids to engaging leaders and employees. It will happen, and I predict IT will be a leader in this process, not an inhibitor.