Sept. 8, 2015 - Report on Data Center Move (comprehensive)

The mail delay and datacenter move of September 5-6, 2015, were the result of a confluence of events. We were in the early stages of a datacenter move, which was expedited significantly.

 

What Happened

By Sept 5, our upstream provider Telia was in the midst of a 48-hour service interruption to a portion of the Los Angeles area. Separately, on Saturday at 4:30 pm PDT (23:30 UTC), the meet-me room at our primary Los Angeles colocation facility experienced a complete power failure. We were able to shield our customers from much of both events, but not entirely.

MailRoute continued to receive and process mail, but a few customers had problems with inbound mail, and some others were unable to relay through us.

When these problems were still evident on Sunday morning, Sept 6, our CEO decided to close that datacenter and move to our new colocation facility immediately. Our new provider worked with our team to pull together network engineers, technicians and pure muscle to move an entire datacenter’s worth of equipment and get it back online between 1:27 pm PDT and 3:53 pm PDT.

 

Mail Delivery

The next 12 hours were spent on cleanup and maintenance – when managing a huge number of servers, some don’t come up quite the way they’re supposed to.

Any sender who couldn’t reach us should have queued email for a later retry, so mail shouldn’t have bounced or been lost. Some servers do return a message to the sender after a temporary deferral, but we cannot control non-MailRoute servers that don’t follow the SMTP protocol.
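For admins who want to see the expected sender behavior spelled out, here is a minimal Python sketch. It is an illustration only, not MailRoute's code or any particular mail server's implementation. The function name sender_action and the four-day give-up window are assumptions chosen for the example; real servers set their own retry schedules. It relies only on the standard SMTP convention that 2xx replies mean success, 4xx replies mean a temporary failure, and 5xx replies mean a permanent one.

```python
from datetime import timedelta
from typing import Optional

# Illustrative give-up window (an assumption for this sketch);
# each sending server is configured with its own value.
GIVE_UP_AFTER = timedelta(days=4)


def sender_action(reply_code: Optional[int], queued_for: timedelta) -> str:
    """Return what a standards-following sending server would do next.

    reply_code is the three-digit SMTP reply from the receiving server,
    or None if the sender could not connect at all.
    queued_for is how long the message has already waited in the queue.
    """
    if reply_code is None or 400 <= reply_code <= 499:
        # Unreachable server or temporary deferral: keep the message
        # queued and retry, unless it has waited past the give-up window.
        return "retry" if queued_for < GIVE_UP_AFTER else "bounce"
    if 200 <= reply_code <= 299:
        return "delivered"
    if 500 <= reply_code <= 599:
        # Permanent failure: return the message to its author.
        return "bounce"
    raise ValueError(f"not a final SMTP reply code: {reply_code}")


# A sender that could not connect during the outage should keep the
# message queued rather than bounce it.
print(sender_action(None, timedelta(hours=3)))
print(sender_action(451, timedelta(hours=3)))
```

A server that bounces on the first 4xx reply or failed connection, rather than retrying, is the non-compliant case described above.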

Today, Sept 8, we isolated two clusters of servers that were backlogged and routed all new incoming mail around those clusters, allowing them to clear. All queued mail has been cleared.

 

Communications

At 8:45 am PDT on Sept 6, we sent a notice to all Tech and Emergency Contacts listed in your account that we would be taking our service down for the datacenter move.

(Important: If you have not listed any contacts in your account, we do not have addresses for sending critical notifications. Admins, please log in here and select the Contacts tab. Add Tech, Emergency and Billing contacts for future important communication. Include one or more addresses that are not serviced by MailRoute, such as a webmail address or personal domain, so we can reach you during outages.)

Announcements and updates were posted periodically to our knowledge base (log in at support.mailroute.net and subscribe to alerts here). Links to these updates were tweeted by @mailrouteinc (follow us on Twitter).

 

In Conclusion

MailRoute took immediate and significant action to eliminate a problem in our overall configuration by moving to a new colocation facility on Sunday, Sept 6, 2015. This resulted in a couple of hours of announced downtime and some anticipated post-move management and cleanup. We have no reports of any continuing issues, and all queued mail has been cleared and delivered.

 

Your feedback is critical to our improvement. Please suggest how we might improve our ways of communicating with you during necessary periods of maintenance or when we experience service issues. As noted, please ensure that you have contact information listed in your account, including some addresses that are not serviced by MailRoute; that you’ve subscribed to alerts from us; and, if you use Twitter, that you follow @mailrouteinc.

 

Thank you. We appreciate your business.
