Tuesday, October 17, 2017

Human error behind BA’s ‘tragic’ data centre outage that grounded 75,000 passengers

With a bill that could reach £150m, British Airways is conducting an investigation into the event while facing criticism from industry and insurance bodies.

One of the largest IT meltdowns in aviation history was reportedly triggered by human error.

The “catastrophic” and “tragic” event at a British Airways (BA) data centre in Heathrow, London, occurred after a staff member switched off a power supply unit in “perfect working order”, according to The Times.

The paper reports that an investigation is under way and will likely focus on human error as the cause for the disturbance to the data centre.

By switching off the uninterruptible power supply (UPS) and then running an uncontrolled reboot, the engineer caused “catastrophic physical damage” to the servers due to power overload.

According to sources, the engineer works for CBRE Global Workplace Solutions.

The outage happened around 9:30am on Saturday, May 27, 2017, and is said to have lasted 15 minutes. However, its consequences were harshly felt by more than 75,000 passengers worldwide as 800 to 1,000 flights were cancelled.

Earlier this week, Bill Francis, head of IT at International Airlines Group (IAG), BA’s parent company, sent an email to staff in which he said the outage at the Heathrow data centre had not been caused by an IT failure or software-related problems.

BA’s CEO Alex Cruz initially blamed the outage on a power supply. Days later, IAG’s CEO Willie Walsh also claimed that a power surge had brought down BA’s IT systems.

In a statement, BA said: “There was a loss of power to the UK data centre which was compounded by the uncontrolled return of power which caused a power surge taking out our IT systems. So we know what happened we just need to find out why.

“It was not an IT failure and had nothing to do with outsourcing of IT, it was an electrical power supply which was interrupted.

“We are undertaking an exhaustive investigation to find out the exact circumstances and most importantly ensure that this can never happen again.”

However, energy supplier UK Power Networks, which provides power to the area where the data centre sits, disputed BA’s claims of a power outage, saying that the supply had not been interrupted.

This was followed by reports that maintenance at the data centre, built in the 1980s, was poor and that the facility had been filled with modern computing systems and hardware over the years without the necessary retrofits.

Hours after the incident, the GMB union went as far as claiming that the IT problems happened after BA dismissed “hundreds of dedicated and loyal” IT staff to outsource its IT needs to India in a bid to save money.

Cruz has in the meantime come under fire, and many have called for his resignation. He has dismissed any idea of resigning from the role of CEO, to which he was appointed in April 2016.

Cruz said: “We do apologise profusely for the hardship that these customers of ours have had to go through.

“We absolutely profusely apologise for that and we are absolutely committed to provide and abide by the compensation rules that are currently in place.

“We are absolutely committed to finding the root causes of this particular event and we will make sure nothing like this happens to British Airways ever again.”

Industry analysts have said the outage could cost BA as much as £150m in compensation and reimbursements to those affected.

The airline is now battling with insurers over who is to blame for the incident. The Association of British Insurers (ABI) has already said the company gave passengers the wrong information, which has complicated the claims process.

The ABI told the Financial Times: “Any cover available under travel insurance will usually kick in only if compensation is not available from any other source.

“Those affected should seek compensation, and any refunds of expenses, in the first instance from British Airways.

“People affected by the disruption should be able to claim compensation and refunds for any expenses as simply as possible, not being passed from pillar to post.

“EU flight compensation regulations set out that airline operators should provide compensation to passengers that suffer long delays or cancellations.”


 15:45, 2 June, 2017

CBRE has spoken out, denying rumours that human error was behind the BA data centre downtime.

The company manages the BA Boadicea House data centre, near Heathrow, and said the claims around what caused the outage were “not founded in fact”.

CBRE said: “We are the manager of the facility for our client BA and fully support its investigation. No determination has been made yet regarding the cause of this incident. Any speculation to the contrary is not founded in fact.”