CrowdStrike Holdings Inc., the cybersecurity company at the center of massive global IT outages, said that a bug in a quality-assurance tool the company uses to check updates for mistakes allowed flawed data to go out to customers, causing last week’s meltdown.
On Friday, the company pushed through an update for Windows machines via a rapid-response mechanism, meant to respond quickly to changing threats. That update contained a critical flaw. CrowdStrike’s “content validator,” which is supposed to test updates for errors before they go out, malfunctioned and let the bug pass through, the company said in an incident report published on Wednesday.
That undetected error crashed Windows systems and kicked off one of the most spectacular rolling IT failures in history. The US company is trying to piece together the series of events that crashed Microsoft Windows computers around the world, taking down airline, banking and stock exchange operations from Australia and Japan to the UK.
Microsoft and CrowdStrike rolled out fixes last week, and many systems have been restored. But for several hours, bankers in Hong Kong, doctors in the UK and emergency responders in New Hampshire found themselves locked out of programs critical to keeping their operations afloat. An estimated 8.5 million Windows devices were affected, according to Microsoft.
CrowdStrike said it is working to improve its testing of Rapid Response Content. A new check “is in process” to address the faulty content validator. The company also said it would give customers greater control over how these updates are delivered to their systems.
The company, which was criticized for mass-deploying the catastrophic update instead of starting with a smaller rollout that would have prevented widespread outages, also said it plans to stagger future updates via “canary deployments,” in which updates are tested piecemeal before bigger rollouts.
These updates will be a “vital step in mitigating any future risks” and could prove to be a useful model for similar companies and create better industry practices, said Nathan Oliver, chief information security officer at Microminder Cyber Security.
Still, the power that this mistake had to hobble critical businesses and services worldwide last week has raised fears about the vulnerability of the global IT system, which is dependent on a handful of dominant tech companies.
“What I would still be concerned about, is these companies are such an intrinsic part of the global supply chain and global infrastructure,” said Saif Abed, a former doctor with the UK’s National Health Service and expert in cybersecurity and public health. “These fixes being proposed today are very particular, but they don’t necessarily provide me with an assurance that something of this catastrophic nature might not happen again for different reasons.”
CrowdStrike’s shares dropped nearly 30% in the aftermath of the outage, slashing billions of dollars from its market value. The US House Committee on Homeland Security requested an appearance from Chief Executive Officer George Kurtz, and lawmakers called on him to explain how the company will mitigate the risk of a similar incident in the future.
Shawn Henry, CrowdStrike’s chief security officer, apologized in a post on LinkedIn on Monday, saying that the company had “failed” its customers.
“The confidence we built in drips over the years was lost in buckets within hours, and it was a gut punch,” he said.
Top photo: Blank digital billboards in Times Square in New York, US, on July 19.