Exchange migrations tend to be complex. Even smaller organizations running Small Business Server with fewer than 75 users may take a week or more to plan, prepare and execute their email migration.
Any business that’s been through a migration at least once will remember that most of the migration effort was spent in planning. Otherwise they may remember the large mop-up operation and the time spent visiting desktops, recovering mail and rolling aspects of the migration backwards and forwards.
Data loss (what PSTs?), client upgrades and wrongly migrated data tend to come to mind when thinking about what can go wrong, as well as the mail server that crashed during the migration. During a migration a fair amount of change is introduced and additional processing is forced onto both the source and target Exchange platform. For an older platform at the limits of its lifespan or operational capacity, the extra overhead an email migration introduces may be the straw that breaks the camel’s back.
Cloud based email continuity may act as insurance in this regard by enabling client continuity and transactional continuity in case the migration wobbles or breaks. Let’s explore that in a bit more detail.
Migrations are heavily process driven. Before you can migrate, a fair amount of surveying, planning and lab testing needs to be done. It makes sense to use the desktop visits from the survey and planning phase to install the agents required to make client continuity possible.
If an Exchange server in the source or the target organization were to fail during the migration, Outlook clients would be redirected to the cloud, with little or no disruption to service or – crucially – the user experience. The outage can then be addressed, and mail flow and client mail service restored, without the pressure of fighting two fires concurrently – i.e. a broken environment and a broken migration.
Cloud based email continuity also lets you benefit from the scale of the cloud as a side effect of using it for continuity – provided, of course, your users have the network or internet connectivity required to reach it.
In our day to day lives we’re generally quite comfortable with the argument for personal insurance, which guards us against any number of possible scenarios: medical insurance against breaking a leg while skiing, insurance against theft, and so on. All of these boil down to paying a small amount of money to a much larger entity and thereby being guaranteed the benefit of that entity’s scale and reach should something unfortunate happen.
As the idea of cloud on demand becomes more pervasive, insuring your migration in the short term against loss of email continuity makes as much sense as taking out insurance on your car before you take it on the road.
Google and Microsoft have recently been poking holes in each other’s uptime SLAs (Service Level Agreements). The squabble has been summed up here by Paul Thurrott from Windows IT Pro.
In short Google claimed its Google Apps service had achieved 99.984% uptime in 2010 and, citing an independent report, went on to say this was 46 times more available than Microsoft’s Exchange Server. Microsoft retaliated by saying BPOS achieved 99.9% (or better) uptime in 2010 and this was in line with their SLA. Microsoft quite rightly protested at Google’s definitions of uptime and what should or should not be included.
The discussion continues.
Uptime is one of those things included in your service provider’s SLA that you never really give much attention to, unless it’s alarmingly low: 90%, for example. Most Cloud, SaaS or hosted providers will give uptime SLA figures of between 99.9% (three nines) and 99.999% (five nines). Mimecast proudly offers a 100% uptime SLA.
All of these nines represent different levels of ‘guaranteed’ service availability. For example, one nine (90%) allows for 36.5 days of downtime per year. As I said, alarming. Two nines (99%) would give you 3.65 days of downtime per year, three nines (99.9%) 8.76 hours, four nines (99.99%) 52.56 minutes and five nines (99.999%) 5.26 minutes per year. Lastly six nines, which is largely academic, gives a mere 31.5 seconds.
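The arithmetic behind those figures is straightforward. Here is a quick Python sketch (the function name is mine, not part of any vendor's SLA tooling) that converts an availability percentage into allowed downtime per year:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (ignoring leap years)

def downtime_per_year(availability_pct):
    """Seconds of downtime per year permitted by a given availability %."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)

# One nine through six nines, as in the figures above
for pct in (90.0, 99.0, 99.9, 99.99, 99.999, 99.9999):
    secs = downtime_per_year(pct)
    print(f"{pct}% uptime -> {secs / 3600:,.2f} hours of downtime per year")
```

Running this reproduces the numbers above: 90% works out to 36.5 days, three nines to 8.76 hours, and six nines to roughly 31.5 seconds.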
What does all of this mean to you as a consumer of these services? In terms of actual service, very little, unless you happen to be in the minority percentage; that is to say everything has gone dark and quiet and you’re suffering a service outage.
What is much more important is how the vendor treats you in the event they don’t achieve 100%. It is hard for any vendor to absolutely guarantee 100% uptime all of the time, so you must make sure there is a provision for service credits or financial compensation in the event of an outage. If not, the SLA is worthless. Any reputable SaaS or Cloud vendor will have absolute confidence in their infrastructure, so based on historical performance a 100% availability SLA will be justifiable. Mimecast offers 100% precisely for this reason. We have spent a large amount of R&D time on getting the infrastructure right so it can be used to back up our SLA, and as a result we win many customers from vendors whose SLAs have flattered to deceive.
Perhaps a larger issue we ought to consider is highlighted by the arrows Google is flinging in Microsoft’s direction: namely, how do vendors really define uptime? What sort of event do they class as an outage? Does the event have to last a certain length of time to qualify? Is planned downtime included in the calculation? And so on.
There is no standard by which uptime is defined, and common sense isn’t always applied either. In other markets, consumers are reasonably well protected from spurious vendor claims by independent third parties like Consumer Reports or Which?. Not so with the claims tech companies make about the effectiveness of their solutions, and the result is a great deal of spin, which in turn inevitably leads to misinterpretation and confusion.
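To see just how much the definition matters, consider a toy calculation. The outage log and thresholds below are entirely made up for illustration; the point is that the same year of outages yields different "uptime" figures depending on whether planned maintenance counts and whether short blips are ignored:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Hypothetical outage log for one year: (duration in seconds, was it planned?)
outages = [
    (7200, True),   # planned maintenance window
    (3600, False),  # unplanned outage
    (45,   False),  # brief blip
]

def uptime_pct(outages, count_planned=True, min_duration=0):
    """Reported uptime % under a chosen definition of 'outage'."""
    downtime = sum(secs for secs, planned in outages
                   if (count_planned or not planned) and secs >= min_duration)
    return 100 * (1 - downtime / SECONDS_PER_YEAR)

strict = uptime_pct(outages)  # everything counts
lenient = uptime_pct(outages, count_planned=False, min_duration=60)
print(f"strict: {strict:.4f}%  lenient: {lenient:.4f}%")
```

Same service, same year – but the lenient definition (no planned downtime, blips under a minute excluded) reports a noticeably better number than the strict one.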
Fortunately, we’re not the only ones to see the need for standards here. Although it’s early days still, you can get an overview of ongoing current efforts at cloud-standards.org.
Google and Microsoft’s argument is based largely on differences in measurement rather than any meaningful difference in service. In a highly competitive market, any small differentiation can be a perceived bonus (by the vendor), but if we’re all using different tape measures to mark our lines, the only reliable way to tell who comes out on top is to talk to the long-term customers.
This week’s #list is a lighthearted look at the start-to-finish phases of an email outage.
1. Everything is working just fine. It’s been a while since your last outage, if ever.
2. Users are happily sending and receiving email and doing what users do.
3. The odd 200MB attachment causes a slowdown. Thank you, users!
Trouble is Brewing
4. Your mail stores are getting a little full, but you’re not that worried.
5. The mail server or OS software vendor releases a major update.
6. You know this could mean trouble, so you put off applying the update.
7. You find out the update fixes a problem you’ve been having with the system.
8. You tell the CIO you’ll have to apply the update.
9. The CIO tells you downtime is only available at 10pm this Sunday night.
10. You cancel the plans you had for Sunday, and resign yourself to spending the night on the couch as your partner won’t be happy.
You Plan and Prepare – 7 Ps, right?
11. You’ve planned the update, backed up the server, you’re ready.
12. You bounce the server one last time (just to make sure).
13. You start to apply the update, diligently pressing the Next button… next, next, next…
Then it Happens
14. Then the blue progress bar stops moving. Your anxiety level (already high) begins to rise.
15. You wonder whether the vendor has included an anxiety detector, as the longer this takes the more stressed you get, and therefore the longer it takes.
16. You wait… and wait… and 45 minutes later you decide: reboot. Heck, why not?
17. Hey, everything looks ok, it’s back up. No problem.
18. Then you check the services, and the ohnosecond happens.
19. You realize that although everything looks good, a critical service isn’t starting. Maybe a key service for the email server.
20. Panic sets in… This is going to be a long night.
What Happens Next?
21. Support might work; it might not. The rollback plan might work; it might not.
22. You know the users will be connecting soon.
23. Reverting to a previous back-up is a scary prospect.
24. You call the CIO to explain; the CIO tells you to get it working by 8am on Monday and goes back to sleep.
25. You’re in for a long night… You wish you had pushed harder for that email continuity solution.
26. The rest depends on luck, skill, the alignment of the planets and anything else you’d care to channel.
27. You promise yourself to get the CIO to sign off on the email continuity solution.
28. No-one noticed. No-one cares.
29. Many espressos, a four-pack of Red Bull, zero sleep and despair have given you the appearance of a vagrant.
30. Your colleagues tell you to take it easy on the weekends, you need more sleep…
Mike Vizard mentioned something on his blog a few weeks ago that I thought had been missed by many. I had been thinking about a series of posts for this blog under the umbrella of email continuity and was putting together a list of common outages businesses have to deal with; here in the US, for Gulf States in particular, the hurricane is the biggie.
Vizard, like me, had spotted that NOAA are predicting an “above-normal hurricane season” – but he does go on to warn that:
The predictions are rarely on target, but the havoc wrought by Hurricanes Katrina and Andrew prompts people to take the issue seriously.
Which is quite true. Of course the weather is an archetypal example of chaos theory at work, and that makes predicting its patterns and movements almost impossible; but what we do know is that if Danielle, Earl, Fiona or Gaston makes landfall this year, everything in its path will be subject to a new type of chaos.
For many it’s a case of board up and move out. Everything grinds to a halt for a few days until the threat has passed. If you’re running a business this is not good, but you may have already thought about a way to keep your essential services like email up and running. I know of IT managers who simply turn off their Exchange Servers, unplug them and drive them away – and that works, but leaves your users and customers with nothing.
And this is where a cloud based email continuity service would step in. Vizard points out that advances in cloud computing can help you mitigate the impact of any disaster, not just a hurricane. In his words:
The key thing to remember is that servers in the cloud are usually thousands of miles away from the actual disaster, and as long as you can provide people with access to them, you can be back in business…
Admittedly, if I were facing down a large enough threat, I would be telling my users to collect their things and go, and it’s likely all my local services such as email, Internet and power would be unusable anyway. But relying on a continuity solution based, as Vizard points out, thousands of miles away means that once we’re safely inland we can get back on the air.
And that’s the important part: getting back on the air! Telling my customers we’re still in business and still able to respond to them, regardless of the situation outside, means I don’t lose business or, worse, simply vanish.
Keeping a weather eye out is always a challenge, but the last thing you need to do is vanish.
Townsend uses a cloud-based approach for its email environment to reduce email storage and to ensure email business continuity. Its email is now archived off site automatically, and its employees have continuous access to email even during power or network outages.