Power Surge That Took BC Offline ‘Very Serious,’ All Systems Now Restored

A local power surge damaged elements of the power infrastructure in St. Clement’s Hall on Brighton Campus on Tuesday at 9 a.m., taking Boston College’s phone services, Wi-Fi capabilities, Agora portal, Google Suite features, and other technology capabilities offline, according to an email sent to the community by Michael Bourque, vice president of information technology services (ITS). St. Clement’s contains approximately 2,000 computer servers that support academic and administrative functions, Bourque wrote.

Service was restored gradually, and BC’s online functionality returned to full capacity at 11:15 a.m. on Wednesday, when an issue with BC Apps was resolved. Most telephone service and Wi-Fi capabilities on Main Campus were restored two hours after the outage began, four hours before power was fully restored in St. Clement’s, according to a follow-up email from Bourque to The Heights. Brighton and Newton networks were not fully restored until 6 p.m. on Tuesday, according to the ITS website. Every service except BC Apps was restored by 6 a.m. on Wednesday.

The recovery took longer because the damage was extensive enough to require ITS to rebuild layers of the software and hardware environment, according to Bourque’s original email to the community. Bourque noted in his follow-up that it was not yet clear how much permanent damage had been done to the server farm, but that ITS and equipment specialists would perform additional diagnostic work to determine whether any repairs or replacements are necessary.

Bourque also said that the high voltage used in St. Clement’s made restoring online capabilities more complicated, since working with such levels of electricity requires extreme care in order to bring capabilities back online without causing further delays or damage to the servers.

The St. Clement’s data center was designed to include “appropriate electromechanical and safety measures,” according to Bourque. BC added a 1,500-kilowatt generator, an automatic transfer switch that controls whether power is running through typical sources or the generator, and modular uninterruptible power supplies, which moderate how power is utilized and distributed.

Such measures were unable to prevent the outage because it was “very serious,” according to Bourque. Until Tuesday, St. Clement’s had never experienced a full outage within the data center since it was constructed in 2006, despite the physical building experiencing many outages over the past 13 years, Bourque said.

To prepare for issues like this, ITS has been performing disaster recovery tests for over 10 years, according to Bourque. ITS employees work to recover equipment at a third-party facility while BC users test the recovery efforts remotely from campus, Bourque explained. Recovery is accomplished by configuring operating systems and security measures, restoring networking capability, and restoring access to various applications.

Bourque said that he believed ITS successfully leaned on that experience and performed admirably, despite the serious nature of the outage. ITS’ recovery plans emphasize quick recovery of key services, such as internet and telephone access, which is why those services came back online on Main Campus relatively quickly, despite some slower progress across the entirety of campus.

One of the slower services to return was the Google Suite, specifically the ability to send and receive emails between bc.edu accounts. Part of the reason was that once ITS had recovered key services and restored connections to Google apps, the servers took hours to catch up on the hours’ worth of emails that had accumulated. While the servers were down, those emails were unable to reach their intended recipients.

The University’s Canvas website, used by students and faculty as a document sharing platform and online discussion area, also took a long time for ITS to bring fully back online and was not part of the initial key services restoration. ITS’ website noted that Canvas and the Agora portal, which provides students and faculty access to proprietary University information as well as course availabilities, billing, and degree audit information, were not fully restored until 12:30 a.m. on Wednesday. Issues with security and integration systems within Canvas and Agora took ITS longer to resolve.

Bourque said that, moving forward, ITS will concentrate on improving resiliency and recovery capabilities so that, if a similar issue occurs in the future, full service can be restored more quickly.

Featured Image by Celine Lim / Heights Editor

March 28, 2019