Posted by: Eric Siegel
And while I'm on a SaaS chain-of-thought, one final
discussion for the day: diagnostics.
What will you do when users at a branch office or out in the
field call and complain about SaaS performance? Your operations center needs a
structured checklist to follow to quickly manage the incident and isolate it to
the responsible party -- a process known as "triage." (See, for
example, the Burton Group report "A Framework for Network Incident
Management.")
Here's where your SLA key performance indicators, or service
level indicators (SLIs), can be useful. Your SaaS provider should be giving you
online access to its performance as a user sitting in its own server room would
see it, and that's clearly the first checkpoint for the operations staff. If that's in
trouble, well, the problem is easy to locate. But if that's OK, then the next
step is to look at your service level indicators at your branch offices,
especially the branch office that's reporting trouble. If that indicator is in
trouble (and that indicator may be a full SaaS web page download time, or it
may be only the TCP round-trip time and DNS lookup time for the SaaS provider),
then, if you have a solid SLA, you should be able to send that evidence to the
SaaS provider and insist on their opening a trouble ticket. It's not really
your job to diagnose the problems between your branch office and the SaaS
server room; it's the SaaS provider's job. And they probably have more leverage
with their Internet service provider (ISP) than you do; they can even demand
better peering with the ISP that your branch office is using.
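For the lighter-weight branch office indicator mentioned above (DNS lookup time plus TCP round-trip time), here is a minimal sketch in Python of what a probe might look like. It is only an illustration under some assumptions: the function name `probe` and the hostname `saas.example.com` are hypothetical, the TCP handshake time is used as a rough stand-in for one round trip, and a real monitoring tool would sample repeatedly and keep history rather than measure once.

```python
import socket
import time


def probe(hostname, port=443, timeout=5.0):
    """Measure DNS lookup time and TCP connect time, in seconds.

    The DNS figure may reflect a local resolver cache, and the TCP
    connect time only approximates a single network round trip.
    """
    # Step 1: time the name resolution on its own.
    start = time.perf_counter()
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        hostname, port, type=socket.SOCK_STREAM
    )[0]
    dns_s = time.perf_counter() - start

    # Step 2: time the TCP handshake to the already-resolved address,
    # so that resolution is not counted twice.
    start = time.perf_counter()
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
        tcp_connect_s = time.perf_counter() - start

    return {"dns_s": dns_s, "tcp_connect_s": tcp_connect_s}


# Example use from a branch office machine (hypothetical hostname):
#   print(probe("saas.example.com"))
```

Run from the branch office that's reporting trouble, numbers like these are the kind of evidence you could attach when asking the provider to open a ticket.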
Of course, it may turn out that it's your ISP having
internal problems. In that case, you and the SaaS provider may need to explore
new ways of connecting to the SaaS server site, through better peering at a
point that doesn't encounter the congestion within your ISP, by a direct link
to your major offices outside of the Internet, or by your moving to a different
ISP for your own access (the threat of doing that might inspire your ISP to
upgrade its systems).
Or it may be that it's your user who is having problems
with his own browser or with the branch office's internal network; you'll know that because your service
level indicators won't see any problems. In that case, it's an incident to be
handled by your service desk; there's no need to file a trouble ticket with the
SaaS provider. (And your relationship and credibility with your SaaS provider will improve when
you're not filing minor reports that turn out to be inside your own system!)
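The checklist described in this post can be sketched as a small decision function. This is a simplified illustration, not a complete incident process: the names `Owner` and `triage` are hypothetical, and each indicator is reduced to a single healthy/unhealthy flag. Note that it can't tell a provider-side path problem from congestion inside your own ISP; that still has to be worked out with the provider once the ticket is open.

```python
from enum import Enum


class Owner(Enum):
    PROVIDER_SITE = "SaaS provider: problem visible in its own server room"
    NETWORK_PATH = "SaaS provider: open a ticket with branch office evidence"
    SERVICE_DESK = "Your service desk: user's browser or branch office LAN"


def triage(provider_sli_ok, branch_sli_ok):
    """Return who should own the incident, checking indicators in order."""
    # First checkpoint: the provider's own view from its server room.
    if not provider_sli_ok:
        return Owner.PROVIDER_SITE
    # Second checkpoint: the indicator at the complaining branch office.
    if not branch_sli_ok:
        return Owner.NETWORK_PATH
    # Both indicators look healthy, so the problem is local to the user.
    return Owner.SERVICE_DESK


# Example: the provider looks fine, but the branch office indicator is bad.
print(triage(provider_sli_ok=True, branch_sli_ok=False).value)
```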
