On Sunday, May 3, 2009, at approximately 2:45 AM, we experienced a network outage affecting the majority of our customers. Both our primary and backup core routers suffered hardware failures that first caused packet loss and then a full loss of connectivity. At approximately 2:35 AM, our monitoring system alerted us to an increase in packet loss, and we began investigating the cause. At approximately 2:45 AM, the monitoring system alerted us to the loss of connectivity, and two technicians were immediately dispatched to the site.
At approximately 3:12 AM, the technicians arrived on-site and began diagnostic work on the routers. They determined that multiple hardware failures had caused both routers to fail, and repair work on the secondary (backup) router began at approximately 3:40 AM. At 4:32 AM, repairs on the secondary router were finished and service was restored. The primary router requires replacement parts that were not on hand at the time, and it has not yet been repaired.
We have taken measures to ensure that future incidents are less severe, including ordering replacements for both routers and ordering spare parts for all router components that are not redundant. With these on hand, we expect to resolve any future issues much more quickly. We apologize to all affected customers, and we promise that we are doing everything reasonable to ensure this doesn’t happen again.