Service Management Is Dead

“Service Management is dead.”

That was my first thought when I read McKinsey Quarterly’s “Capturing value from IT infrastructure innovation” from October 2012.

That was going to be the point of this blog post.

Then I read it again.

Conclusion 1: Innovation is more than just technology.

Conclusion 3: The path to end-user productivity is still evolving.

Conclusion 5: Proactive Engagement with the business is required.

Conclusion 6: Getting the right talent is increasingly critical.

Conclusion 7: Vendor relationships must focus on innovation.

Getting the most from IT infrastructure has never been about technology (though technology is an important capability of IT). Innovating, maximizing productivity, and managing complexity evoke the mundane, at the expense of the sexy.

It engages users.

It demands service.

It depends on process and automation.

It focuses on data and knowledge.

It understands and balances the needs of all stakeholders.

Technology is fun. The places where technologists hang out are fun places to be. I know this may sound strange to those outside the industry, but the people who move technology are fascinating.

The most boring business events involve Project Managers and Risk and Compliance Officers. I have been to many meetings, and they are yawners, even for me.

That’s because project managers and auditors focus on the boring stuff.

Who are the stakeholders?

Who makes what decisions?

What do they want?

What kind of data do we have?

What kind of data do we need?

Where is the data?

How do we use the data most effectively?

What are the risks, and how do we mitigate them?

Yawn.

For better or worse, this is the stuff that underpins business value: the foundation on which innovation is built.

Long live Service Management.

The Role of COBIT5 in IT Service Management

In Improvement in COBIT5 I discussed my preference for the Continual Improvement life cycle.

Recently I was fact-checking a post on ITIL (priorities in Incident Management) and I became curious about the guidance in COBIT5.

The relevant guidance is “DSS02.02 Record, classify and prioritize requests and incidents” in “DSS02 Manage Service Requests and Incidents”. Here is what it says:

3. Prioritise service requests and incidents based on SLA service definition of business impact and urgency.

Yes, that’s all it says. Clearly COBIT5 has some room for improvement.

COBIT5 is an excellent resource that complements several frameworks, including ITIL, without being able to replace them. For the record, the COBIT5 framework says it serves as a “reference and framework to integrate multiple frameworks,” including ITIL. COBIT5 never claims to replace other frameworks.

We shouldn’t expect to throw away ITIL books for a while. Damn! I was hoping to clear up some shelf space.

Incident Prioritization in Detail

One of the advantages of working with BMC FootPrints is the lack of definition “out of the box”.1 The tool makes it easy to configure fields, priorities, statuses, and workflows within multiple workspaces, but there are few defaults (besides sample workspaces that are not very usable). This lack of out-of-the-box configuration has exposed me to the enormous variety of choices used by different organizations.

Priority

One organization used the term Severity. This has the advantage of abbreviating to “Sev”, so incidents can be described as Sev1, Sev2, etc. Nevertheless, most organizations stick with Priority.

I have seen these range all the way from 2 (Critical, Normal) to as many as 7 or 8.

 

| 2 choices | 3 choices | 4 choices | 5 choices | 6 choices | 7 choices |
|---|---|---|---|---|---|
| Critical | High | Critical | Critical | Critical | P1 |
| Normal | Medium | High | High | High | P2 |
| | Low | Medium | Medium | Medium | P3 |
| | | Low | Normal | Low | P4 |
| | | | Project | Service Request | P5 |
| | | | | Normal | |

The table above shows the more common configurations. In my experience the use of terms (High, Medium, Low) is more common than numbering (P1, P2, P3), but the latter is also used.

One of my clients had used numbering, P1 through P5, but they had overused P1 so badly that they had to insert a new P0 to serve the original purpose of P1; fortunately they have since fixed the issue. (This reminds me of the project “prioritization” of a former employer. Everything on the list was “High Priority”. They effectively said everything was the same priority, and so everything was low.)

I encourage the use of “Normal” instead of “Low”, because no user wants their issue perceived as “low priority”. I have also seen a customer take this advice but swap Normal in for Medium instead of Low. Most organizations track Service Requests alongside Incidents, so we usually want some mechanism for differentiating them, but note that new priorities are not required (see Urgency below).

I also find it common to create a separate priority level, Project, for handling projects (or extended service requests) that are scheduled beyond normal Service Level targets. A Project choice is particularly useful when Service Level measurements are tied to Priority (my colleague Evans Martin has written about this already).

I have also seen duplicated sets of Priority choices tied to different Service Levels, depending on which team was assigned the work or on regional organizational differences. For example, a software issue assigned to a development group might have a separate set of service levels but remain in the same Service Desk system for tracking purposes. In this case the choices might be P1, P2, P3, P1-Dev, P2-Dev, and P3-Dev.

Impact

Impact describes the level of business impact. Usually this is described in terms of the number or percentage of users impacted. I had one customer who described the percentage of configuration items (CIs) at its facilities that were impacted (see column 5 below).

 

| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| High | 10+ People | Organization | Entire Company | 80-100% |
| Medium | 2-9 People | Department | One/Multiple Sites | 50-80% |
| Low | 1 Person | Individual | Department | 20-50% |
| | | | VIP | Under 20% |
| | | | Individual | |

I have seen organizations describe the number of people affected (column 2), but most common are the choices ranging from Entire Organization to Individual. The choices in between need to reflect your own organization. One customer who ran fitness outlets needed to distinguish corporate sites from fitness centers.

The default configuration High/Medium/Low (column 1) is too ambiguous in most cases, but I have seen it used.

Many organizations separate VIPs from non-VIP individuals. VIPs often map to Priority in the same way as Departments.

Urgency

Urgency describes how quickly an incident should be resolved. In the simplest case this can be High, Medium, and Low, but as with Impact these terms are usually too ambiguous to be useful.

 

| 1 | 2 | 3 |
|---|---|---|
| High | 0-2 Hours | Down |
| Medium | 3-4 Hours | Affected |
| Low | 4-8 Hours | Service Request |
| | 1-2 Days | Project |
| | Over 3 Days | |

I have also seen Urgency described as resolution time frames. There are two issues with this: the time frames are easily confused with Service Level targets (which they are not), and they are also ambiguous, especially in situations when no downtime is acceptable. I find Down, Affected, and Request to be useful.

The combination of column 4 in Impact and column 3 in Urgency results in choices that read as plain English sentences: the Company is Affected, or the Individual is Down. I like this because the intent is clear.

Mapping Table

The mapping from Impact and Urgency to Priority can usually be described in a table like below. There are no right or wrong answers here, and it varies by organization and by choices for Impact and Urgency. In the table, Impact runs in the first column and Urgency runs in the first row.

 

| | Down | Affected | Request | Project |
|---|---|---|---|---|
| Entire Company | Critical | Critical | High | Project |
| Department | Critical | High | Medium | Project |
| VIP | High | Medium | Normal | Project |
| Individual | Medium | Normal | Normal | Project |

In some cases multiple choices of Impact or Urgency will always map to the same priority. For example, VIP often maps like Department. Although I encourage simplicity, sometimes it makes sense to break them out in order to make the choices clear. (You could also stack choices, such as “Department / VIP”).

You will also need to decide whether to allow overriding these choices. If so, you will need to add a third field (called something like Override or Priority Override) to your mapping table.

Other Issues

  1. Start the discussion with minimal choices for Impact, Urgency, and Priority. Add choices only as necessary.
  2. If the tool has default choices, start with those.
  3. You may have Service Level Agreements tied to your Priority that need to be factored in.
  4. Avoid duplicating terms across fields, such as using High/Medium/Low in both Urgency and Priority.
  5. You need to decide whether customers/users can choose the Priority. I don’t encourage it, because the user may not be qualified to assess the Impact. Moreover, they will always choose Critical. Nevertheless, many organizations do allow it.
  6. Decide if you want default choices for Impact and Urgency. Doing so may limit the usefulness of Priority (IT agents are lazy and often leave the defaults).
  7. As discussed before, you may need a policy for when and whether Priority can be changed.

1 Several customers preferred more options out of the box. I can understand the desire for the “standard configurations” provided by other vendors, but at the time it seemed strange and undesirable.

Changing Incident Priority

The correlation between sanity and LinkedIn Groups is inverse. I joined several groups because I like to stay connected with the industry, but the disinformation (and verbosity) can be infuriating. Recently I read the following, and several people agreed:

The priority of an incident must never be changed, once determined

For the record, here was my response:

Whether and how the priority should change is a policy issue for the organization. I am not aware of any “good practices” that say one way or the other. Some organizations allow the customer or user to provide the initial prioritization. The Service Desk should review the initial prioritization as a matter of good practice (and obvious necessity).

As Stephen suggested, and as described in ITIL 2011, the calculation of Priority will often be based on Urgency and Impact.

If you enforced this policy in the tool, just imagine the consequences of a simple data-entry error that wasn’t detected before saving. Fortunately, few organizations use this policy, and ITIL 2011 is even more liberal:

It should be noted that an incident’s priority may be dynamic — if circumstances change, or if an incident is not resolved within SLA target times, then the priority must be altered to reflect the new situation. Changes to priority that might occur throughout the management of an incident should be recorded in the incident record to provide an audit trail of why the priority was changed.

In my experience few organizations create an audit trail for changes to incident prioritization (although some tools, such as FootPrints Service Core, track these changes in the History). As a general good practice I stand by my original comment.
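For organizations that do want an explicit audit trail, the policy is straightforward to automate. Here is a minimal Python sketch; the Incident class and field names are illustrative, not taken from any real tool (tools like FootPrints keep this in their own History mechanism).

```python
# Hypothetical sketch: record every priority change in the incident record,
# with a reason, to provide the audit trail ITIL 2011 recommends.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Incident:
    number: str
    priority: str
    history: list = field(default_factory=list)  # audit-trail entries

    def change_priority(self, new_priority: str, reason: str) -> None:
        """Change the priority and append an audit-trail entry explaining why."""
        self.history.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "from": self.priority,
            "to": new_priority,
            "reason": reason,
        })
        self.priority = new_priority
```

For example, `inc.change_priority("High", "SLA target at risk")` both raises the priority and records who-knows-why in the history, so a later review can reconstruct the decision.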

I will discuss the details of incident priorities in an upcoming post.

Improvement in COBIT 5

In a previous post I discussed starting your service or process improvement efforts with Continual Service Improvement (or just Improvement).

I prefer COBIT5, and the issue is with ITIL. The good news is that Continual Service Improvement is the shortest of the five core books of ITIL 2011. CSI defines a 7-Step Improvement Process:

  1. Identify the strategy for improvement
  2. Define what you will measure
  3. Gather the data
  4. Process the data
  5. Analyze the information and data
  6. Present and use the information
  7. Implement improvement

This method, as the name suggests, is heavily focused on service and process improvement. It is infeasible in situations where there is no discernible process, a complete absence of metrics, and no activity that could be captured for measurement and analysis. Given this lack of maturity, that describes most services and processes in most organizations.

I find the COBIT5 method more flexible. It also provides 7 steps, but it views them from multiple standpoints, such as program management, change enablement, and the continual improvement life cycle.

For example, the program management view consists of:

  1. Initiate program
  2. Define problems and opportunities
  3. Define road map
  4. Plan program
  5. Execute plan
  6. Realize benefits
  7. Review effectiveness

COBIT5 provides a framework that is more flexible and yet more concise, and it still provides detailed guidance on implementation and improvement efforts in terms of a) roles and responsibilities, b) tasks, c) inputs, and d) outputs, among others.

Therefore I find the COBIT5 framework, particularly the COBIT5 Implementation guide, superior to the Continual Service Improvement book of ITIL 2011.

In addition COBIT5 provides a goals cascade that provides detailed guidance and mapping between organizational and IT-related goals and processes throughout the framework that may influence those goals. The goals cascade is useful guidance for improvement efforts, but alas it is the subject of another discussion.

Starting With Improvement

At last week’s Service Management Fusion 12 conference, I attended a brief presentation on Event Management that left a lot of time for questions and answers. One questioner raised a common concern for organizations starting down the road of “implementing ITIL”: where should we start?1

In this case the speaker demurred with ordinary consultant-speak: it depends on your organization and objectives. Event Management supports Incident Management, and that is where many organizations start their journey. I raised my hand and offered a brief alternative: start with Continual Service Improvement (CSI). I didn’t want to upstage the speaker, so I kept my comment brief and left for another speaker whom I also wanted to see.

The 5 books of ITIL imply a natural flow: Service Strategy leads naturally to Service Design. Services are then ready for testing and deployment as part of Service Transition, and will then require support as defined by Service Operation. Once in production, services can be improved with Continual Service Improvement.

This is a natural life cycle for individual services and processes, but ITIL never says services or processes should be improved (or defined) in this order. In fact, ITIL does not offer much guidance on this at all. Because of this, and because organizations are all unique, each organization needs to define its own road map. CSI is one tool for doing this.

I encourage organizations to assemble a board to oversee the development and improvement of services and processes. The board may consist of stakeholders from IT and from other functional units that depend heavily on IT’s services, as well as the executive management who oversee them. The composition will vary by organization; the board would meet monthly or quarterly.

The board’s agenda will include several items, including upcoming projects (new services), reviews or assessments of service and process maturity (if any), reviews of user satisfaction surveys or interviews, and review of existing implementation and improvement efforts. Most importantly, existing performance metrics should be summarized and reviewed. Care should be taken to avoid making this a project review meeting. Instead the focus is on the assessment and maturity improvement of overall IT services and processes in order to guide future development initiatives.

The board serves several purposes:

  • Ensures the prioritization of implementation and improvement efforts receives feedback from a variety of stakeholders.
  • Ensures there is a method or process for implementing and improving services and processes.
  • Provides a forum for reviewing service and process maturity.
  • Provides a mechanism for reviewing service and process performance metrics with various stakeholders.

The governance board concept presented here may not apply to all organizations; I have applied it to only one organization. But for IT organizations that are challenged by immature service definitions (the lack of a Service Catalog), poor operational dialog with other business units, or poorly understood maturity of services and processes, the board is one mechanism for prioritizing and overseeing improvement efforts.

I emphasize both concepts: implementation and improvement. The practices presented in ITIL v3/2011 are more complete and mature than those of most IT organizations. In fact I have encountered few organizations with maturity in more than a small fraction of them, and even fewer with usable performance metrics. Most of the time we start with implementation, because there is too little in place to engage in improvement, but the improvement board should still oversee and prioritize the implementations.

1 ITIL as a framework cannot be “implemented”. However, we can engage in improvement efforts using the framework as guidance.

 

Process Before Tool (right)?

Tonight at the IT Service Management Fusion 12 conference I ran into an old colleague. It was nice to see him again. We worked together at an IT good practice consultancy, and like me he later moved on to a tool vendor.

This isn’t unusual. Most work in this industry involves the configuration or customization of tools to meet the specific needs of the organization. Consultants need to earn a living and that means going where the work is, or much of it anyway.

He and I still focus on good practices, but now from a different perspective. Operational excellence, consistently executed, usually requires defining the process at some level. Sometimes this definition is informal, in the heads of the stakeholders, and sometimes the process is defined more formally, using Visio diagrams and descriptions of process details and statements of policy.

In many organizations the sum total of the process is expressed in the configuration of the tool. This is not good practice and I don’t advise it. It happens a lot.

Consultants in this space repeat “process before tool” ad nauseam. Another variation is “a fool with a tool is still a fool (or a faster fool).” At conferences and in presentations there is no shortage of this advice, and I expect to hear it repeated several times this week. Tweeting it will make me a faster fool too.

There are, however, some problems with this advice. A process defined entirely in the abstract, devoid of any tool consideration, is unlikely to be useful. It will demand process steps that cannot be readily automated, or cannot be enforced through automation. Or it will demand complex configurations (or customizations) that make the tool brittle. It will ignore the current-state processes implemented in the tool and try to supplant them with something foreign.

We almost never define services devoid of any tool considerations.

The definition and improvement of the services and processes go hand-in-hand with the implementation and configuration of the automation. The industry calls it Continual Service Improvement (CSI), and it is important to get this right sooner rather than later. CSI is internal to the organization and very organic. It is not a binder delivered by credentialed IT Service Management consultants or the tool vendors.

The automation of IT service delivery and process execution is underway. It has been for several years, and new tools are appearing to make this easier and better. Publicly-traded BMC Software acquired Numara Software in February 2012. ServiceNow went public in June 2012.

Not only will the trend continue, it will accelerate. In fact, I believe the “Continuous Highly Automated Service Management” organization will require integrated automation that is several orders of magnitude more effective than today. Crossing that chasm will take a lot of work from vendors and their customers, and we have some hard problems to solve.

And yes, it will be outside-in, as well as outside-out, inside-out, and inside-in. In short, it will be awesome, but we will develop this theme in more detail later.

Key takeaways:

  • Get Continual Service Improvement right first
  • Improve services and process together with the automation
  • Automation of services and processes will accelerate non-linearly and disruptively (a chasm)

ITIL Exam Statistics Updated for July 2012

APM Group has released their ITIL exam statistics through July 2012. I have compiled their statistics and present them with a little more context.

ITIL Foundation

  • Over 148,000 Foundation exams have been administered so far in 2012, resulting in over 132,000 certificates to date.
  • The pass rate is 90% in 2012, up steadily from 85% in 2010.
  • Total results for 2012 are on a trajectory for 10% growth over 2011. That year ended with 250,000 exams taken, resulting in 220,000 Foundation certificates issued.
  • Asia has overtaken Europe in July at 40% of exams taken globally. This is partially attributable to seasonal cycles in both regions, but Asia’s share has risen steadily from around 25% in the first half of 2010.
  • Using unverified but credible data from another source that dates back to 1994, I estimate just under 1.4 million ITIL Foundation certificates have been issued total worldwide.

ITIL Advanced Certificates

  • No V2 or V3 Bridge certifications were issued in 2012.
  • Almost 29,500 intermediate exams were taken in 2012, resulting in over 23,000 intermediate certificates. (Note: a certificate does not imply a unique individual.)
  • Interest in the Lifecycle track continues to rise relative to the Capability track. Adjusting for credit disparities, the Lifecycle track constituted 69% of the certificates in 2012, up from 59% in 2009.
  • Over 2,000 ITIL V3 Experts have been minted thus far in 2012, via the Managing Across the Lifecycle (MALC; alt. MATL) exam.
  • Although interest in the ITIL Expert certification via MALC continues to climb, it will not exceed on an annual basis the 5,000 ITIL V3 Experts minted in 2011 via the Managers Bridge exam until 2014 at the earliest.
  • Europe continues to dominate the advanced ITIL certification market at over 40%. However, Asian interest continues to climb and now constitutes over 30% of the advanced certifications.


Definitive Process Library? Huh?

This morning one of North America’s leaders in IT best practice consulting, PLEXENT, surprised me with a headline: IT Improvement: What is a Definitive Process Library (DPL)?

Besides a marketing term they made up, it made me wonder, what exactly is a Definitive Process Library?

My conclusion after research: it is a marketing term they made up.

ITIL does not define a DPL. ITIL does define a Definitive Media Library (DML) in Service Transition (Release and Deployment Management):

One or more locations in which the definitive and authorized versions of all software CIs are securely stored. The DML may also contain associated CIs such as licenses and documentation. It is a single logical storage area even if there are multiple locations. The DML is controlled by Service Asset and Configuration Management and is recorded in the Configuration Management System (CMS).

Replace “software” with “processes” and you almost have a definition of a DPL, if you chose to define one (for reasons other than marketing and self-promotion). But why would you?

An organization oriented around services supported by processes would be deeply affected at all levels, including:

  • The organizational chart
  • Roles and responsibilities
  • Approval matrices
  • Authorization rights
  • Communication plans
  • Key Performance Indicators and reporting metrics
  • Human capital management
  • Automation tools

To name just a few. ITIL provides a conceptual framework for dealing with these challenges, including the CMDB, the CMS, and the SKMS.

For services ITIL has added the Service Portfolio and the Service Catalog, concepts which for knowledge management purposes could be dealt with through the broader framework of Knowledge Management.

Processes themselves are stored in the CMDB and managed through Change Management. No other consideration is required beyond how you publish, communicate, and manage the downstream impacts (some of which are mentioned above).

In practice I have not observed any outstanding or notable best practices. I have seen processes stored and published on a file share, on SharePoint, on the intranet, in a CMDB, and as email attachments. Have you seen best practices that uniquely stand out? If so, let me know; I would love to hear about it.

Is IT Value Intrinsically Linked to Organizational Strategy?

Revenues of Inditex Group, the Spanish parent of the global Zara fashion store chain, grew from 756 million euros in 19941 to 13.8 billion euros in 2011, a compound annual growth rate (CAGR) of 19%. The number of stores rose from 424 to 5,044 over the same period (a CAGR of 16%). By comparison, the US apparel industry grew 1.9% in 2010 and the entire apparel industry grew 5.83% in 2011.
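The growth rates above follow from the standard CAGR formula, (end / start)^(1 / years) − 1. A quick Python check, using the figures from the text (1994 to 2011 spans 17 years):

```python
# Compound annual growth rate: (end / start) ** (1 / years) - 1.
# Revenue and store figures are those quoted in the text.

def cagr(start: float, end: float, years: int) -> float:
    return (end / start) ** (1 / years) - 1

revenue_cagr = cagr(756e6, 13.8e9, 2011 - 1994)  # ~0.19, i.e. about 19%
stores_cagr = cagr(424, 5044, 2011 - 1994)       # ~0.16, i.e. about 16%
```

Both results round to the rates quoted above.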

Zara’s competitive success is the result of good strategy. In “Good Strategy, Bad Strategy: The Difference and Why It Matters” author Richard P. Rumelt identifies three necessary components of the kernel or core of a good strategy2:

1. A diagnosis of the main competitive challenges,
2. Guiding policies that address the diagnosis, and
3. A coherent set of actions that implement the policies.

Zara, once a low-cost manufacturer, correctly diagnosed that, as the industry moved toward low-cost manufacturing centers in Asia, a shorter supply chain with design and manufacturing remaining in Spain could compete by 1) remaining close to customers and 2) responding rapidly to quickly changing fashion trends.

Zara designed a coherent set of actions across the company to act on these insights. Designers create new designs in two weeks. Manufacturing remained in Spain. Designers work closely with manufacturing to ensure a new design can scale up production quickly with reasonable controls on cost. Store managers watch customer trends closely and enter orders nightly via handheld terminals. Orders ship daily regardless of the percent utilization of the vehicles (other retailers hold shipments until the truck is full). The entire system is designed to keep feedback loops as short as possible.

The company does not advertise in order to shape or explain customer value. Instead Zara listens and responds to changing customer preferences much more rapidly than its competitors. Inventory is kept small, and discounting due to overstocks is the lowest in the industry.

Where is IT in this story? IT is not central to the organization, and IT spend is lower than the rest of the industry. Information Technology is used, for example, to optimize logistics routes to reduce shipping times and CO2 emissions. IT supports operational processes, as at most organizations, and, as at any public company, is used to manage risks. The technical environment is intentionally kept as simple as possible, and the number of applications is minimized in order to reduce costs and risks.

Zara is just one example of how Information Technology supports organizational strategy. It is particularly revealing because the organization actually has such a strategy, but it is not the only example. Wal-Mart Stores has been analyzed in great detail already, but one point worth examining is Wal-Mart’s use of bar-code scanners. The adoption of bar-code scanners is almost synonymous with Wal-Mart Stores, but the firm did not invent them, nor was it even an early adopter. Kmart began adopting bar-code scanners at the same time as Wal-Mart in the early 1980s, and they were in use in grocery stores before that. However, Wal-Mart seemed to benefit more than anyone else. The firm integrated bar-code data into its logistics system faster than its competitors did, and traded its bar-code data with suppliers in return for product discounts.3 The important point is that Wal-Mart’s use of bar-code data integrated with and supported the rest of its logistical system as part of an integrated and self-reinforcing design. The scanners were not one CIO’s pet project, tangential to the rest of the organization.

Organizations try to provide superior value to customers over a sustained period of time. However, this is insufficient. The organization also tries to capture a significant portion of that value in a way that is difficult for competitors to imitate. As an internal Type I or Type II provider, the IT organization in general needs to support and enhance the strategy of the organization. Operational excellence (warranty) is at times necessary but is never sufficient. (Indeed a sole focus on operational excellence is a race to the bottom, as all industry participants have to spend more to produce decreasingly differentiated products.) Utility as defined in ITIL is not, by itself, useful. More utility is not better.

Enough utility to enable, support, and integrate with the organization’s competitive differentiators is what we seek to create. The CIO deserves a seat at the table. As both information and technology change, improve, increase, and differentiate, the need to have the CIO at the table will only grow, both to manage risks and to define and improve the organization’s strategy. In addition, IT will need to execute that strategy as part of a coherent and integrated set of actions across the organization.

1 Spanish pesetas converted at 166.386 ESP/EUR, the official exchange rate set at conversion in 1999.
2 Bad strategy, by contrast, is not the absence of good strategy. Rather, bad strategies are fat analysis documents that fail to focus resources and actions, or performance targets that fail to diagnose the underlying competitive challenges to growth.
3 Good Strategy, Bad Strategy: The Difference and Why It Matters