The great divide between development and operations...

The SLA Cheat Sheet

SLA cheating? Better not, unless you want to damage the relationship with your customers in the long term. So why this blog post? First of all, because cheating to achieve Service Level Agreement targets is very common, especially in the IT outsourcing business. It shows up in various ways; here are some examples:

  • Simply setting requests to ‘customer pending’ when time-related SLA targets are in danger
  • or making service availability targets practically unmeasurable
  • or finally revealing hidden contractual terms & conditions when it comes to paying SLA penalties

Now the question is why this happens. A common argument is that in such situations the SLA is not defined well enough (because it does not reflect customer needs), and that if a KPI is misused, you just need to put additional or different KPIs and rules in place to prevent this in the future (-> balanced KPI set). There is a well-founded and recognized framework behind that. This is fairly basic and might help in many cases, but in other instances it will make things even worse. One thing we have to consider is that IT departments or external service providers sometimes start looking at IT Service Management in a purely Tayloristic way, turning it into nothing more than a system of processes, roles and, finally, KPIs for the governance around it. As history has shown us, in such a limited view everybody ends up as an unmotivated, trained monkey instead of an individual who strives for service excellence and is proud of it (…never underestimate motivation). Sure, we all agree that a common ruleset is important, but if the understanding of the service and the overall context gets lost at the individual level and everything is purely about fulfilling a contract (which will never cover everything), this opens the door to a lot of cheating. Then it’s about finding the most cost-efficient way to reach the goals you have been set:

Dagobert is finally achieving his KPIs

Unfortunately, ITSM frameworks such as ITIL, which I think provides great ideas on how to build and improve your Service Management, are often misinterpreted and reduced to the Tayloristic view. I think it is key to understand that you will never be able to put all the necessary KPIs into an SLA, and on top of that, some important success factors, such as the one in the cartoon above (e.g. the friendliness of the Service Desk agent), are hard to measure. In the end, an SLA should just be a guideline: it helps you to get an agreed level of service delivery between service provider and customer, sets the basis for continuous improvement, and gives an understanding of the criticality of services so that the right priorities can be set. Nothing more. Although contracts, processes and tools are important, it is the people who deliver the services, and it is the service attitude and the relationships that make the difference. Enough talking for now, let’s have a look at the cheats I have come across over the last few years:

The ‘Make a second ticket’ Cheat

Having trouble meeting your incident resolution targets? No problem. As this KPI is always measured per individual ticket, one common trick is to make two tickets out of one: the original ticket is set to status ‘solved’ just before the SLA is violated (-> meaning: SLA is met) and a second one is opened, on which the SLA clock starts ticking again. This gives the Service Desk agent virtually double the time. It can even happen with the agreement of the end user, and with some imagination the Service Desk agent can make two incidents out of one. The Service Managers will end up in endless micromanagement if they want to get that under control…
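To make the effect concrete, here is a minimal sketch of a per-ticket SLA check. The 8-hour target, the timestamps and the function name `within_sla` are assumptions chosen for illustration; they do not come from any particular ticketing tool.

```python
from datetime import datetime, timedelta

# Hypothetical resolution target; real SLAs define their own values.
SLA_TARGET = timedelta(hours=8)

def within_sla(opened, solved, target=SLA_TARGET):
    """Per-ticket check: was this single ticket solved within the target?"""
    return solved - opened <= target

# One real incident, split into two tickets just before the first one breaches.
start = datetime(2014, 6, 2, 8, 0)
ticket_1 = (start, start + timedelta(hours=7, minutes=55))
ticket_2 = (ticket_1[1], ticket_1[1] + timedelta(hours=7))

# What the SLA report sees: two tickets, each checked on its own.
per_ticket_ok = all(within_sla(o, s) for o, s in (ticket_1, ticket_2))

# What the end user experienced: one incident from first open to last solve.
user_wait = ticket_2[1] - ticket_1[0]
end_to_end_ok = user_wait <= SLA_TARGET

print(per_ticket_ok, end_to_end_ok, user_wait)
```

Both tickets pass the per-ticket check, although the user waited almost twice the agreed time. Spotting this would require linking related tickets, which is exactly the kind of micromanagement the Service Managers end up in.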

The ‘Customer pending’ Cheat

As service providers can only be held accountable for things under their own control, the SLA clock is usually stopped as soon as something needs customer clarification or something in the supply chain is outside the provider’s responsibility. While this makes complete sense, it can also be heavily misused. In Incident, Change and other processes you can temporarily stop any time-based SLA (and therefore gain time) by either inventing a question to the other party (relevant or not for further processing) or claiming that the issue is now out of scope of your service responsibility.
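A rough sketch of how such a pausable SLA clock is typically computed shows why the trick works. The function `sla_elapsed` and all numbers are hypothetical, and the sketch assumes the pending intervals do not overlap.

```python
from datetime import datetime, timedelta

def sla_elapsed(opened, resolved, pending_intervals):
    """Elapsed SLA time: wall-clock time minus time spent in 'customer pending'.

    Assumes the pending intervals do not overlap and lie inside [opened, resolved].
    """
    paused = sum((end - begin for begin, end in pending_intervals), timedelta())
    return (resolved - opened) - paused

opened = datetime(2014, 6, 2, 8, 0)
resolved = opened + timedelta(hours=12)

# Two questions sent back to the customer, each stopping the clock for 3 hours.
pending = [
    (opened + timedelta(hours=2), opened + timedelta(hours=5)),
    (opened + timedelta(hours=7), opened + timedelta(hours=10)),
]

measured = sla_elapsed(opened, resolved, pending)  # what the SLA report shows
wall_clock = resolved - opened                     # what the user waited
pending_share = (wall_clock - measured) / wall_clock

print(measured, wall_clock, pending_share)
```

A customer who wants a sanity check could track something like `pending_share` per provider team: a consistently high share of pending time hints at misuse, without proving it.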

The Service Availability Cheat

Service availability is THE key SLA target for customers (they want the services to be available). Customers always have an end-to-end view, not a component view (they perceive the service either as available or not), but the problem is that the SLAs service providers offer almost never take that holistic view. One typical example is Cloud Services. Between the Cloud Service Provider and the customer there are always the Internet and the customer network. As neither is under the control of the Cloud Service Provider, there will never be an end-to-end availability SLA for Cloud Services, and if there is an SLA breach, there is always the argument that something was wrong with the Internet or the customer network. This will end in endless discussions.
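The gap between the component view and the end-to-end view can be estimated with simple arithmetic. The sketch below multiplies the availabilities of a serial chain; the percentages are invented for illustration, and the calculation assumes independent failures, which real networks only approximate.

```python
from math import prod

def end_to_end_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, which is a simplification.
    """
    return prod(component_availabilities)

# Hypothetical figures, not taken from any real contract.
chain = {
    "cloud_service": 0.999,     # the only part the provider's SLA covers
    "internet_path": 0.995,
    "customer_network": 0.995,
}

e2e = end_to_end_availability(chain.values())

hours_per_year = 365 * 24
downtime_sla = (1 - chain["cloud_service"]) * hours_per_year  # per contract
downtime_e2e = (1 - e2e) * hours_per_year                     # as the user sees it

print(f"{e2e:.4f} {downtime_sla:.1f}h {downtime_e2e:.1f}h")
```

With these assumed numbers the contract allows under nine hours of downtime per year, while the user may experience roughly four days. Nobody has breached anything, and that is where the endless discussion starts.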

The Performance Cheat

What is true for availability is also true for performance. Setting maximum response times in an SLA is very common, but again it is very unlikely that this will be an end-to-end SLA. If performance is bad, there will again be a discussion about why.

The Measurement Cheat

In relation to the availability and performance cheats explained above, also consider who measures the SLAs: it is usually the service provider. It’s up to the customer whether to trust that, but there is certainly a lot of cheating potential here.

The ‘Pay per ticket’ Cheat or how to destroy Problem Management

Models like ‘Pay per ticket’ have become very popular in Service Desk outsourcing. They give the customer a clear pricing model: there is a fixed price for each call at the outsourced Service Desk. However, the big issue here is that the Service Desk provider will have absolutely NO interest in bringing more stability and quality into the customer’s service environment, because this would mean FEWER incidents, which turns into LESS profit for the service provider. In fact, the service provider would have an interest in more instability, resulting in more incidents. Problem Management gets completely forgotten.

The Backup / Restore Cheat

Datacenter outsourcing, and indeed any kind of service outsourcing, usually involves Backup/Restore. There is a fixed price per ‘managed’ backup gigabyte, and the customer has to specify exactly what the backup plan on the provider side should look like: for example, the frequency of transactional/daily/weekly/monthly/yearly backups and how long these backups should be kept (for each service). Because of the complexity (providers make it extra complex), it can very easily happen that the customer makes a mistake in the backup plan, so that certain backups are never deleted although they are no longer needed, and the amount of data keeps growing and growing. Suddenly the CIO faces double the costs and wonders why. This is a true financial goldmine for service providers. I consider this cheating, because the service provider often knows EXACTLY that there is room for backup optimization but will NEVER tell the customer, purely for profit reasons.
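A back-of-the-envelope sketch shows how quickly a missing retention rule adds up. It is deliberately simplified: it assumes one full backup of constant size per day, and the 50 GB size, the 30-day retention and the price per gigabyte are made-up values.

```python
def stored_gb(days, daily_backup_gb, retention_days=None):
    """Backup volume held after `days` daily backups.

    retention_days=None models a plan in which old backups are never deleted.
    """
    kept = days if retention_days is None else min(days, retention_days)
    return kept * daily_backup_gb

# Hypothetical price per managed gigabyte per month.
PRICE_PER_GB = 0.10

for days in (30, 180, 365):
    intended = stored_gb(days, 50, retention_days=30)  # what the customer meant
    actual = stored_gb(days, 50)                       # retention rule missing
    print(days, intended * PRICE_PER_GB, actual * PRICE_PER_GB)
```

Under these assumptions the intended plan levels off after 30 days, while the faulty plan grows linearly for as long as nobody notices. A provider who sees both curves knows exactly where the optimization potential lies.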

So, what did we learn from all that? First, there is a lot of cheating going on. Sure, on the service provider side it’s also about money, but cheating (conscious or not) for the sake of money is just short-term thinking. Sure, on the customer side you could put an additional KPI in place for each ‘misused’ KPI and try to fix it. But as explained before, this is in my opinion not the best way to go and might even make things worse. If the service provider doesn’t have the right attitude and customer orientation, he will always try to cheat. My final thought: SLAs are important and need proper design, but if you as a service provider focus only on them, you will forget what it is all about: service attitude, customer collaboration, a strong Service Manager managing the day-to-day relationship, and empowered people. This is the key to success. Or to put it more bluntly: a marriage contract will not make you a happy couple, but if things go terribly wrong, it’s definitely worth looking at it.

1 Comment
  • Bart Van Brabant

    2 June 2014 at 08:29

    Hello Mr. Lichtenberger,

    Very good article – because many people know this is happening (and even far worse practices) and even let it happen.

    Processes and Metrics have no value whatsoever if people do not adhere to them.

    I would like to stay in touch (asked via LinkedIn)

    Bart Van Brabant
    Former ITSMF Chairman & ITSM Researcher
