Keep service level reporting simple - black box it
No need for complex monitoring solutions, at least not initially.
It has always seemed to me that most IT monitoring and measuring tools are very self-serving. They look at the world from the internal IT silo perspective. In ITSM terms they are mildly interesting diagnostic tools for incident and problem resolution, but in terms of service level measurement the only really useful tools are the ones that measure the end user experience.
I know these tools exist, and there are plenty of them: the ones that measure response times from the desktop and treat IT as a black box service provider.
They measure response time and availability. How do you define availability? I think the best definition is that a service is unavailable whenever response times are greater than some agreed value. This includes the case where response times are infinite, which is too often the only IT definition of a service not being available. As far as a user is concerned, a very slow response is generally as useless as none at all.
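To make that definition concrete, here is a minimal sketch in Python. It assumes response times are sampled in seconds from a desktop probe and that a timeout is recorded as None or infinity. The function name, the sample data and the 3-second threshold are all made up for illustration; they do not come from any particular tool.

```python
import math

def availability(samples, threshold_s):
    """Fraction of samples answered within the agreed threshold.

    samples: response times in seconds; None or math.inf means no response.
    threshold_s: the agreed maximum acceptable response time, in seconds.
    """
    if not samples:
        raise ValueError("no samples to measure")
    # A sample counts as "available" only if a response arrived in time.
    ok = sum(1 for t in samples if t is not None and t <= threshold_s)
    return ok / len(samples)

# Hypothetical probe results from one desktop over a reporting period.
probe_times = [0.8, 1.2, 6.5, None, 0.9, math.inf, 1.1, 2.9]
print(f"{availability(probe_times, threshold_s=3.0):.1%}")  # prints 62.5%
```

Note that the 6.5 second response counts against availability exactly as the two timeouts do, which is the whole point: slow and dead look the same to the user.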
These are the only metrics that really map to what the user sees. All the network monitors and server probes and traffic agents and database monitors and snorage (sorry storage has never excited me) consoles... they are toys for the geeks but they don't tell us much about the service.
It is almost always impossible to consolidate their information up into a true depiction of the service, and even when it is possible, it's very hard. I've spoken before about the boundary problem: no matter how good your IT, there is usually something you can't measure adequately somewhere in the chain delivering the service.
True, the user experience tools aren't perfect either. It generally is not practical to put an agent on every desktop and measure every single user's experience. But I believe a representative sampling is enough.
Certainly users do. There is a sigh of relief when they finally see metrics that measure their world instead of the arcane insides of the IT beast.
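What might a representative sampling look like? One rough approach, sketched here under my own assumptions, is to pick a fixed fraction of desktops at random from each site, with at least one per site so that no location goes unmeasured. The site names, desktop names and the 10% fraction are hypothetical.

```python
import random

def sample_desktops(desktops_by_site, fraction, seed=None):
    """Pick a random fraction of desktops per site, at least one per site."""
    rng = random.Random(seed)  # seed only makes the choice repeatable
    chosen = {}
    for site, desktops in desktops_by_site.items():
        if not desktops:
            chosen[site] = []
            continue
        # Round to the nearest whole desktop, clamped to 1..len(desktops).
        k = min(len(desktops), max(1, round(len(desktops) * fraction)))
        chosen[site] = rng.sample(desktops, k)
    return chosen

# Hypothetical inventory: a head office and a small branch.
inventory = {
    "head-office": [f"ho-pc-{n:02d}" for n in range(20)],
    "branch": [f"br-pc-{n:02d}" for n in range(5)],
}
picked = sample_desktops(inventory, fraction=0.1, seed=1)
print({site: len(names) for site, names in picked.items()})
```

With this inventory that gives two probes at head office and one at the branch. Sampling per site rather than across the whole estate keeps a small branch on a slow link from vanishing out of the numbers.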
And yet when you listen to IT users and vendors, these experience monitors don't seem to be seen as the most important IT monitoring tool there is. They are ancillary. Accessories. Plug-ins. Add-ons.
They are not. They are your lead monitoring tool. All the others are the add-ons: the drill-down tools that let you work out why the user experience is outside the bounds of the SLA.
The very first monitoring to put in is the end-user experience (OK OK closely followed by the event console). Quick wins in service level reporting. And clear guidance in prioritising problems.