Why did Delta take days to restore normal service after CrowdStrike outage? Experts weigh in.


(NEW YORK) — An outage caused by a software update distributed by cybersecurity firm CrowdStrike triggered a wave of flight cancellations at several major U.S. airlines – but the disruption was most severe and prolonged at Delta Air Lines.

In all, the carrier canceled more than 2,500 flights over a period that stretched from last Friday, when the outage began, into the middle of this week.

The U.S. Department of Transportation opened an investigation into Delta this week over its uniquely severe flight disruptions.

“All airline passengers have the right to be treated fairly,” Transportation Secretary Pete Buttigieg said on Tuesday in a post on X.

In a statement on Tuesday, Delta said it is fully cooperating with the investigation. “Across our operation, Delta teams are working tirelessly to care for and make it right for customers impacted by delays and cancellations as we work to restore the reliable, on-time service they have come to expect from Delta,” the company said.

The company also issued an apology on Wednesday for the outage-related problems.

“Please accept our sincere apologies for the disruption to your recent travel plans caused by a vendor technology outage affecting airlines and companies worldwide,” the airline said in a statement.

“It’s a surprise that a multi-billion-dollar corporation like Delta would allow this to happen,” Henry Harteveldt, a travel industry analyst at Atmosphere Research Group, told ABC News.

“I’m hopeful that the worst is behind us now. While we can breathe a sigh of relief, I think a lot of people are understandably nervous about flying Delta,” Harteveldt added.

Delta did not immediately respond to an ABC News request for comment.

Airline and cybersecurity experts spoke to ABC News about what made the CrowdStrike outage so disruptive, and why it took days for Delta to resume normal service.

What made the CrowdStrike outage so disruptive for Delta?

The CrowdStrike outage hit Delta so hard because of the severity of the IT failure and how far it reached into the airline’s internal systems, experts told ABC News.

“For a company such as Delta, they rely on countless partner services for everything from scheduling pilots and planes to providing meal service and snacks to allowing customers to select their seats,” David Bader, a professor of cybersecurity and the director of the Institute of Data Science at the New Jersey Institute of Technology, told ABC News.

“The CrowdStrike bug disrupted many of those critical services that keep the airline running at full capacity,” Bader added.

Mark Lanterman, the chief technology officer at the cybersecurity firm Computer Forensic Services, said the outage resulted from a faulty software update initiated by CrowdStrike. The resulting bug interrupted core services because of how deeply CrowdStrike’s software is embedded in Delta’s systems, he added.

“The CrowdStrike update is deep inside the operating system. When that was installed, there was bad code inside of this update. And when Windows came across the bad code, it panicked and it crashed,” Lanterman said.

The outage, which affected CrowdStrike clients that use Windows operating systems, disrupted a critical system that ensures each flight has a full crew, Delta said in a statement on Monday.

“Upward of half of Delta’s IT systems worldwide are Windows based,” Delta said.

Why did it take days for Delta to resume normal service?

The recovery took so long because the faulty CrowdStrike update required a manual fix at each individual computer, experts told ABC News. While each fix can be completed in no more than 10 minutes, the vast number of Delta’s digital terminals required significant manpower to address, experts said.

“This isn’t a fix that could be done automatically; IT resources can’t just sit at a computer and push out an update and everything is fixed,” Lanterman said. “It took so long because Delta has a lot of computers and likely they have limited IT resources to go from computer to computer.”

In a statement on Tuesday, the airline acknowledged the challenge posed by the manual fix requirement.

“The CrowdStrike error required Delta’s IT teams to manually repair and reboot each of the affected systems, with additional time then needed for applications to synchronize and start communicating with each other,” Delta said.
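
A rough back-of-the-envelope calculation shows how a 10-minute fix can stretch into days. The sketch below is illustrative only: the machine and technician counts are hypothetical, since neither Delta nor the experts quoted here gave those figures, and the function name is made up for this example. It also assumes every technician works nonstop and ignores travel between sites and the extra synchronization time Delta described.

```python
import math


def manual_fix_hours(machines: int, technicians: int,
                     minutes_per_fix: float = 10.0) -> float:
    """Estimate wall-clock hours to repair every machine by hand.

    Assumes each technician fixes one machine at a time, back to back,
    and that every fix takes the full minutes_per_fix (the article's
    upper bound of 10 minutes is the default).
    """
    if machines < 0 or technicians <= 0 or minutes_per_fix < 0:
        raise ValueError("machines and minutes_per_fix must be >= 0, technicians > 0")
    # The busiest technician handles ceil(machines / technicians) machines.
    fixes_per_technician = math.ceil(machines / technicians)
    return fixes_per_technician * minutes_per_fix / 60.0


# Hypothetical figures, NOT from Delta: 40,000 machines, 200 technicians.
hours = manual_fix_hours(machines=40_000, technicians=200)
print(f"{hours:.1f} hours of continuous hands-on work")
```

Under those assumed numbers, the hands-on repairs alone would take more than a full day of uninterrupted work, before any time spent reaching the machines or restarting the applications that depend on them.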

Copyright © 2024, ABC Audio. All rights reserved.