How the Cloud Helped Us Through a Hurricane
As the path of Hurricane Irene became apparent late in the week of August 22, GreenPages’ Managed Services operations team realized we had some serious planning to do.
On the morning of Thursday, August 25, our Operations Management team reviewed our plan for Hurricane Irene, which was bearing down on our Network Operations Center in Charlestown, MA, as well as GreenPages’ Kittery, ME, headquarters. The topic of conversation: how would we maintain availability of our critical systems during the storm, and how would we take care of our customers if our regular weekend shift engineers lost their connectivity or were overwhelmed by the volume of tickets in the midst of a hurricane?
Our Managed Services business is a 24/7 operation because our clients rely upon us to monitor their environments and remediate issues. Because of the size of our operation, an “on-call” engineer doesn’t cut it for weekends and overnights; we have “eyes on glass” three shifts a day, seven days a week – and they are busy all the time. So we needed to make sure our key systems would be available to our engineers – no matter what.
As it turns out, the systems issues were the easy part because of our virtualized architecture. Two of our critical systems, Kaseya (Remote Monitoring and Management) and AutoTask (Professional Services Automation), are highly resilient SaaS applications, so we are well protected by the cloud architecture there. Our other critical system is our call center, Cisco Call Manager. GreenPages owns this system, but it runs on a virtualized infrastructure distributed across three sites: Kittery, Charlestown and our colocation facility in Boston. This ensures that we can take and make calls even if we lose two of the three sites.
We did a quick review of our contingencies for system availability and then were able to focus most of our energy on our shift coverage strategy. We prepared some of our Monday-Friday engineers (including one engineer in Michigan – far outside the path of Irene’s impact) to jump in over the weekend if our regular weekend guys lost connectivity or if the volume of tickets spiked.
As it turned out, Sunday – the day Irene hit New England – was indeed a busy day for us. While our offices in Boston and Kittery did not end up getting hit too badly, hundreds of thousands across the Northeast lost power and many of our customers (and some of our employees) went offline. We had a much heavier than normal ticket volume, but our extended staff – all of them working remotely – were able to connect to our systems and take calls, business as usual, thanks in large part to the resiliency of our systems infrastructure.
By Michael Halperin, VP of Managed Services