A message from Rich Belanger, ProQuest CIO
On Tuesday, after an outage at our provider Amazon Web Services (AWS), the ProQuest platform resumed normal operations at 8:14 PM EST (01:14 GMT), enabling students and researchers around the world to resume their learning and exploration. We take our role in supporting that important work very seriously, so once service was restored we immediately began reviewing our logs to understand exactly what happened. We also opened cases with our technical counterparts at AWS.
We are awaiting a detailed root-cause analysis from AWS, but we’ve already identified an architectural change to our server configurations that will reduce the impact of an AWS failure on ProQuest systems. We’re at work on it now, and I am confident it will be implemented, tested, and patched within a few weeks. Further, we will continue to work with AWS to develop options should its Simple Storage Service (S3) ever fail again.
When we initially moved into the AWS cloud, we conducted a series of tests to see how the ProQuest platform would handle a wide variety of infrastructure failures. This rare scenario was not one that we tested, so we are also examining our test scenarios to see if there are other gaps that we can test for and fix.
Thank you for your patience as we worked through the issues to restore connectivity on Tuesday. Our goal is 100% uptime for ProQuest services, and we will continue testing and improving our systems to achieve it.