Microsoft Hits Back at Delta in Clash Over System Breakdown (bloomberg.com)
Microsoft said Delta Air Lines turned down repeated offers for assistance following last month's catastrophic system outage, echoing claims by CrowdStrike in an increasingly contentious conflict between the carrier and its technology partners. From a report: Microsoft employees reached out to Delta to give technical support every day from July 19 through July 23, and "each time Delta turned down Microsoft's offers to help," according to a letter Tuesday from the technology giant's attorneys to Delta's representatives. Microsoft Chief Executive Officer Satya Nadella also personally emailed Delta CEO Ed Bastian and never heard back. "Even though Microsoft's software had not caused the CrowdStrike incident, Microsoft immediately jumped in and offered to assist Delta at no charge," according to the letter, which was signed by Mark Cheffo of Dechert LLP. The claims, in response to Delta's hiring of attorney David Boies, heighten the tension after Delta suggested it would try to seek compensation for a breakdown it expects to cost it $500 million this quarter. The airline was slower to recover than competitors after an errant software update from CrowdStrike affected Microsoft systems, creating a cascading effect that led Delta to cancel thousands of flights over several days.
Makes me wonder (Score:5, Interesting)
What might Delta have to hide, that they turned down clear offers from a company that was in a position to provide good, useful help? To me this smells like something a bit shadier than garden-variety c-suite incompetence.
Re:Makes me wonder (Score:4, Informative)
What might Delta have to hide, that they turned down clear offers from a company that was in a position to provide good, useful help? To me this smells like something a bit shadier than garden-variety c-suite incompetence.
Or it could be as simple as a contact flub. Some support personnel at Microsoft makes the auto-generated call request to Delta, gets the wrong support monkey and the conversation is as follows:
"This is Microsoft support contacting you regarding incident (blah blah blah) and checking if you need any assistance."
Support Monkey that's on the phone, because they aren't involved in attempting to rectify the current clusterfuck: "Uh, I don't see anybody around to ask, so I guess we're good?"
"OK." Click.
I mean, I'm sure it could be something nefarious, but assuming so seems counter to the way the real world works. It's probably just too many levels of people too separated from one another to have any idea WTF is going on as it's happening. A post-mortem would be interesting, but with these all being private companies I doubt the public will ever find out the truth.
Re:Makes me wonder (Score:4, Insightful)
It's also not always simple to just bring in outside people. What do policies say about exposing confidential data to non-employees? Depending on the industry, there may be legal or regulatory hurdles.
Keep in mind too that the decision makers who are in a position to suspend such policies are very busy with aspects of crisis management. You might be some Director in IT, getting a call from Microsoft Support, while the VP you need to allow you to "read them in" is currently doing CNN interviews.
Nothing is easy to do in the middle of that kind of shit show. That is WHY everyone makes disaster recovery run-books, etc., which logically don't include "and then Microsoft hopefully calls us and offers help."
Re: (Score:2)
Aw, there you go raining on my conspiracy theory. Makes sense though.
Re: (Score:2)
They were only very busy because the people are incompetent. So incompetent that they can't even check their emails on any other device except their work computers.
Re: (Score:2)
It's also not always simple to just bring in outside people. What do policies say about exposing confidential data to non-employees? Depending on the industry, there may be legal or regulatory hurdles.
Keep in mind too that the decision makers who are in a position to suspend such policies are very busy with aspects of crisis management. You might be some Director in IT, getting a call from Microsoft Support, while the VP you need to allow you to "read them in" is currently doing CNN interviews.
Nothing is easy to do in the middle of that kind of shit show. That is WHY everyone makes disaster recovery run-books, etc., which logically don't include "and then Microsoft hopefully calls us and offers help."
Makes perfect sense - no idea why you were modded down.
Re: (Score:2)
According to the blurb, the head of Microsoft emailed the head of Delta and never heard back.
I mean, I'm sure it could be something nefarious,
Such as incompetence? Stupidity? Ego? Pride? Take your pick.
Re: (Score:3)
Head of Delta: Hmmm....a message from the Head of Microsoft. Looks like a phishing attempt to me. Underling, has anyone from MS called me?
Underling: Nope, we checked the logs, nada.
Head of Delta: Well, if he wants to get a hold of me, he can just use a phone like a normal person instead of sending something as dubious as an email message. Underling, if we get any calls from MS, make sure you find out who it is and we will call them back. We cannot trust an AI-infected world anymore.
Re: Makes me wonder (Score:2)
The email server was of course down due to crowdstrike. Not everyone uses cloud services.
Re: (Score:2)
Or it could be as simple as a contact flub.
According to the blurb, the head of Microsoft emailed the head of Delta and never heard back.
I mean, I'm sure it could be something nefarious,
Such as incompetence? Stupidity? Ego? Pride? Take your pick.
Heads of such diverse companies rarely expect contact from one another. For me, not that I'm a CEO, but anything from a higher up at Microsoft goes right to the spam filter. In fact, Microsoft's spam filter would probably catch it before mine did.
Re: Makes me wonder (Score:2)
Yes, but this is Microsoft and Delta. I'm sure the CEOs had multiple contacts in the course of developing their relationship. I'm sure they're in each other's mobile. I'm also sure, if they really thought it was a Microsoft failure, the CEO would be reaching out with a big ol' WTF?
Re: (Score:2)
Or an assumption that it was a phishing attack.
Re: (Score:2)
Probably emailed something like info @ delta.com or executives @ delta.com. Their call center is probably overwhelmed with emails to executives about refund issues and cancelled flights.
There probably is a senior Microsoft account executive who has a real email address for the Delta CTO.
Re: Makes me wonder (Score:2)
Assuming that the email server did work and wasn't impacted.
Re: (Score:2)
It's more likely that they were doing what they were trained to do when "Microsoft Support" contacts them. You treat it as a phishing attempt, since that's what it usually is. The same could be said of CrowdStrike's attempts to make contact, as they would also look like a phishing attempt to most users.
'Member in the early days of spam filtering with email, how it seemed like you could either get all the spam, or no email at all? We've managed to do that with every form of communication. Progress is awesome!
Re:Makes me wonder (Score:5, Funny)
The eight most terrifying words in the English language are "We're from Microsoft, and we're here to help."
Re: (Score:2)
What can MS do in this situation? It all comes down to man hours. Delta has who knows how many thousand machines and every one needs a person to sit there and enter the bitlocker code so the machine can boot from a flash drive.
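To put rough numbers on the man-hours point, here is a back-of-envelope sketch. Every figure in it (machine count, minutes per machine, number of technicians) is a hypothetical placeholder, not anything reported about Delta, and the function name is made up for illustration.

```python
import math


def recovery_hours(machines: int, minutes_per_machine: float, technicians: int) -> float:
    """Wall-clock hours needed to touch every machine by hand.

    Assumes technicians work in parallel, every machine takes the same
    time, and nobody waits on travel or on looking up recovery keys.
    """
    if machines < 0 or minutes_per_machine < 0 or technicians <= 0:
        raise ValueError("counts must be non-negative and technicians positive")
    # The busiest technician handles this many machines.
    per_technician = math.ceil(machines / technicians)
    return per_technician * minutes_per_machine / 60


# Hypothetical fleet: 40,000 machines, 15 minutes each, 500 people on site.
print(recovery_hours(40_000, 15, 500))  # 20.0
```

Under those made-up inputs the limiting factor is hands on keyboards, which is the commenter's point: outside advice does not shrink the per-machine time much.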
Re: Makes me wonder (Score:4, Informative)
If Delta is seriously considering suing they cannot bring in Microsoft or CrowdStrike like this. You don't invite the fox into the henhouse.
Re: (Score:2)
Microsoft wouldn't have offered to do it for free. What I'm wondering: what was the price tag?
Re: (Score:2)
how's microsoft going to help delta get people and planes to where they're supposed to be right now when they're currently halfway across the country because the previous two trips for that plane got cancelled?
Re: (Score:3)
Exactly. The "argument" MS is trying to make here is completely worthless. They are just posturing for the press.
Re: (Score:2)
Or it could simply be that Delta, holding Microsoft at least partly to blame for creating the problem, didn't want to take a chance that Microsoft would exacerbate rather than ameliorate their difficulties?
One could only hope Delta is not THAT much of a trainwreck of a company to have a person holding these kinds of opinions in a position of any sort of power.
This was not and is not, at any point, "a Microsoft issue".
You are probably beyond even professional medical help if you genuinely believe the people who designed the OS will somehow make your situation worse.
Re: (Score:2)
What might Delta have to hide, that they turned down clear offers from a company that was in a position to provide good, useful help?
"Microsoft support" calls me every couple of weeks telling me that they can fix my computer over the phone. Even though I don't have a Microsoft computer. Don't think it's good, useful help, though.
They probably just hung up on them.
Re: (Score:2)
Perhaps Delta's infrastructure is in worrying shape - turning down Microsoft and CrowdStrike might be a way to avoid showing third parties how bad it actually is.
I mean, there is usually an order to restoring services - usually you'd get the Active Directory server up and running first as directory services is foundational to getting the rest of the systems active and permissions and such.
But then, Microsoft and CrowdStrike will likely ask - where are your AD servers? Are you using a cloud AD as well (which
Re: (Score:2)
What might Delta have to hide, that they turned down clear offers from a company that was in a position to provide good, useful help?
Their own hubris? I've worked for plenty of companies who think they know better than vendors when it comes to the vendor's own products.
Fingers be flying! (Score:5, Insightful)
All three companies are at fault here. Delta for having an insufficient IT infrastructure and disaster recovery plan. Microsoft for its abysmal operating system offering a vector for these types of catastrophes. CrowdStrike for its incredible ineptness and handling of updates and QA procedures. Hell, big tech in general is an absolute shit show and needs regulation now or this is just going to keep happening. This was a big storm but the perfect storm is still on the horizon heading our way.
Re: (Score:2)
Bingo, too many people these days assume there is a single cause for something, a cause they can honk on about 'til we are all sick of hearing from them.
Re: (Score:2)
Exactly. But the human race has this inability to prevent large-scale catastrophes due to abject stupidity. Hence we will get that urgently needed regulation only when the world experiences a week (or more) of shutdown of _everything_. All that crap built on MS trash and around it is incredibly fragile.
Re: (Score:2)
Microsoft for its abysmal operating system offering a vector for these types of catastrophes.
By offering a vector this means allowing kernel drivers to be installed by third parties. I don't agree that places Microsoft at fault for anything. A bug in the Falcon kernel module for Linux could easily have resulted in the exact same outcome.
Re: (Score:2)
All three companies are at fault here. Delta for having an insufficient IT infrastructure and disaster recovery plan. Microsoft for its abysmal operating system offering a vector for these types of catastrophes.
How about you stop with the ridiculous FUD? If Microsoft is to blame for offering "a vector for these types of catastrophes", just how much do you hate Linux? After all, you do realize Linux offers a whole ton MORE of these kinds of vectors and they are publicly available to anyone?
Re: (Score:2)
Microsoft for its abysmal operating system offering a vector for these types of catastrophes.
Yeah sure. Let's just lock you out of your own system so that you can't ever install any software that messes with your machine even if you're an administrator on that machine. That's what you want, isn't it? A walled garden where you're protected from yourself?
Re: (Score:2)
Protection from Microsoft would be preferred.
Delta already ditched WinMo (Score:2)
Helloo, this is mIcrosoft support. (Score:3)
We have detected a problem with your machiine.
Re: (Score:3)
I wonder if they asked for payment in gift cards.
Except Microsoft DID cause the underlying issue. (Score:4, Insightful)
I'll give credit that maybe Microsoft didn't know this was possible, or, some safety failed, or any number of weird IT / Developer voodoo happened. Even if Microsoft had no knowledge of lockup being possible, they still caused it.
The best thing Microsoft can do, own it. Microsoft needs to jump in front of the bullet, admit they share at least 50% of the blame, and provide a real solution going forwards. Microsoft should take this event and turn themselves around, they have an excellent open door to move from hobby / creepy uncle peephole software, to a professional powerhouse software company.
Why should Delta have accepted the offer of help? Has anyone honestly gotten even passable customer service from Microsoft? Look at how they're handling the issue after the fact. Given that they can't accept ownership, do you think they would have actually helped?
Re:Not really (Score:4, Insightful)
CrowdStrike was writing equally shitty kernel modules that crashed Linux. https://www.theregister.com/20... [theregister.com]
How do you mitigate against third party kernel modules other than outright denying access?
This would have absolutely happened if Delta had 10,000 instances of Red Hat.
Re:Not really (Score:4, Informative)
The crash on linux was a linux bug that apparently is now corrected. From your link:
"Updated to add on July 24
We understand now that CrowdStrike's software on Linux crashed due to a kernel bug involving BPF, which will need to be patched as per advisories from distro makers. Falcon Sensor code running at the kernel level was not affected; code at the user level using BPF to do its work was affected."
Re: (Score:2)
Which one? There's two listed in TFA, one is related to an eBPF call that was a kernel bug, and the other is related to loading of a kernel module. Or do you think that it's not possible to kernel panic a Linux machine by loading a kernel module?
That's the whole point of the discussion. Do you want the ability to run code on your machine, or do you want your hardware to be a walled garden of code blessed by the OS vendor (though reading through the comments here people are expecting less walled garden and m
Re: (Score:2, Insightful)
Only that on Linux, this did not prevent boot and you had a few minutes to fix things. Not comparable.
Re: (Score:2)
Pretty sure a kernel panic will interrupt your boot process.
Re: (Score:2)
Not if it requires the system to run for a while to happen.
Re: (Score:2)
Only that on Linux, this did not prevent boot and you had a few minutes to fix things. Not comparable.
Depends on which one you're talking about. Crowdstrike has had a history of multiple Linux related crashes and one of them definitely happened on boot, literally right as their kernel module (you know kernel modules right? as in low level code capable of fucking up anything) was loaded.
Even if you were right and one of the Linux cases didn't result in a kernel panic on boot, are you actually crediting them with pure luck in that it wouldn't (but did) happen?
Re: (Score:2)
I am referring to the BPF problem. And I am aware that was a Linux kernel bug.
Re: (Score:2)
Funny how clueless people like you think Linux can't be made unbootable by loading a buggy kernel module. That is the true definition of inventing shit.
So keen to heap shit on Windows that you don't realise your precious OS gives you far more latitude to fuck shit up. (And that is a good thing, if you want to be locked out of your own hardware use an iPad). Claiming that "MicroShit" is bad because it allows administrators to install code running on kernel level makes me wonder if you ever have even used a c
Re: (Score:2)
The commonly referred to Linux crash in relation to ClownStroke required the kernel to run for a while before it occurred. It did not cause any boot issues. That is a pure fabrication by some Microsoft apologists. I never made any general claim. That is just your overactive imagination.
You seem to get less knowledgeable by the day. Change of meds?
Re: (Score:2)
I read that a very similar thing happened to Debian systems a few months back. It doesn't seem Windows specific, just more systems impacted.
Re: (Score:2)
The larger problem here was still having to enter a 48 digit bitlocker key on every affected system. Yeah MS or CrowdStrike had a quick fix but Delta had to have people physically enter a unique key on every machine. Everything from scheduling to the signage displays at airports.
Delta has the biggest blame here for allowing every single box to auto update at once.
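The alternative to updating every box at once is usually a staged (ring or canary) rollout. The sketch below is a toy illustration in Python, not how CrowdStrike's or Delta's tooling works; `staged_rollout`, the stage percentages, and the failure threshold are all assumptions made up for the example.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class RolloutResult(NamedTuple):
    updated: int                    # how many hosts received the update
    halted_at_stage: Optional[int]  # stage index where rollout stopped, or None


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    stage_percents: Sequence[int] = (1, 10, 100),  # cumulative share of the fleet
    max_failure_rate: float = 0.05,
) -> RolloutResult:
    """Update the fleet in growing stages and stop if a stage looks unhealthy."""
    updated = 0
    for stage, percent in enumerate(stage_percents):
        # Integer ceiling of len(hosts) * percent / 100, capped at the fleet size.
        target = min(len(hosts), (len(hosts) * percent + 99) // 100)
        batch = hosts[updated:target]
        if not batch:
            continue
        for host in batch:
            apply_update(host)
        updated = target
        failures = sum(1 for host in batch if not is_healthy(host))
        if failures / len(batch) > max_failure_rate:
            return RolloutResult(updated, stage)  # halt before the next stage
    return RolloutResult(updated, None)


# Hypothetical fleet and an update that breaks every host it touches.
fleet = [f"host-{i:04d}" for i in range(1000)]
crashed = set()
result = staged_rollout(fleet, crashed.add, lambda host: host not in crashed)
print(result)  # RolloutResult(updated=10, halted_at_stage=0)
```

With these made-up settings a bad update stops after the first 1% of hosts. Whether a customer could have staged this particular kind of content update at all is not something the thread establishes.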
Re: (Score:2)
BitLocker is one of those software products that you honestly have to wonder how anyone thought it was a good idea. BitLocker is a mess of an encryption software, it's trigger-happy, it's unstable, and it can enable itself with little warning about what should do. Being fair, again, if you had to reboot a Linux system with LUKS active, you'd need to enter the password, but at least then you have r
Re: (Score:2)
Sure, Delta should have run a smaller test update, but Windows still shouldn't have locked up the way it did.
This is incorrect. Windows did the correct thing by catching the unhandled exception in a kernel mode driver and halting operation. Continuing after such a failure has occurred would be wholly unacceptable behavior for a general purpose operating system.
Re: (Score:2)
I disagree, it should have halted the service / driver, since it's third party, and kept on going.
This behavior would be disastrous as ignoring exceptions would compromise the integrity of the system. In this case it is doubly so because the halted service imposes security related operational constraints that would no longer be enforced.
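The trade-off being argued here is often described as fail-closed versus fail-open. The following is a user-space analogy in Python, not a model of the Windows kernel; `handle` and `broken_scanner` are hypothetical names invented for the sketch.

```python
from typing import Callable


def handle(payload: str, scan: Callable[[str], bool], fail_closed: bool = True) -> str:
    """Gate a request on a security scan and decide what to do if the scan itself faults."""
    try:
        clean = scan(payload)
    except Exception:
        # The scanner crashed, so nothing is known about the payload.
        # Fail-closed refuses service; fail-open carries on without the check.
        return "refused" if fail_closed else "allowed-unscanned"
    return "allowed" if clean else "blocked"


def broken_scanner(payload: str) -> bool:
    raise RuntimeError("simulated scanner fault")


print(handle("hello", broken_scanner))                     # refused
print(handle("hello", broken_scanner, fail_closed=False))  # allowed-unscanned
```

The fail-open branch keeps the system available but silently drops the protection the scanner was supposed to provide, which is the objection raised in the comment above.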
Re: (Score:2)
The fix would be to remove CrowdStrike from the drivers of the machine, but that would require access to the filesystem. Now, if you had an encrypted filesystem for Linux, you'd have the same issue of, "can't fix it if you are loading drivers into the kernel that are bad".
Re: (Score:2)
Not actually similar. On Linux you could still boot and then fix the issue.
Re:Except Microsoft DID cause the underlying issue (Score:4, Interesting)
Linux had a bug, Windows has a fundamental design flaw.
That's markedly different.
Microsoft would have to resurrect the NT 3.51 driver model that they ditched.
It's possible that everybody would prefer this in 2024, especially for servers, on modern hardware, especially with worker architectures and distributed databases.
The gamers could keep their dangerous drivers for performance.
This is basically what has to happen at this point. Even using virt assist with drivers now.
Crowdstrike should move to Win-eBPF which they use on linux.
Re:Except Microsoft DID cause the underlying issue (Score:4, Interesting)
> I'll give credit that maybe Microsoft didn't know this was possible
Nah - NT 3.51 had separation between drivers and kernel for robustness. It was roughly GUI VMS and got an Orange Book certification.
NT4 moved drivers into kernel space for performance.
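The robustness-versus-performance trade-off in that comment can be illustrated with ordinary processes. This is only an analogy, assuming a Python interpreter is available to spawn; it says nothing about how NT actually loads drivers, and `run_isolated` and `FAULTY_PLUGIN` are made-up names.

```python
import subprocess
import sys

# Simulated component that dies abruptly, the way a faulting driver might.
FAULTY_PLUGIN = "import os; os.abort()"


def run_isolated(code: str) -> int:
    """Run a component in its own process and return its exit status."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


status = run_isolated(FAULTY_PLUGIN)
print("plugin crashed" if status != 0 else "plugin ok")
# The supervisor is still alive because the fault happened in another address space.
print("supervisor still running")
```

Calling `os.abort()` directly in the supervisor would have ended it too. The price of the isolation is a process boundary on every call, which is the performance cost the comment alludes to.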
Re: (Score:2)
Microsoft can't claim innocence, they absolutely were the cause of the issue. They might not have been the root cause of the issue, but their software shit the bed, and locked up.
I'll give credit that maybe Microsoft didn't know this was possible, or, some safety failed, or any number of weird IT / Developer voodoo happened. Even if Microsoft had no knowledge of lockup being possible, they still caused it.
This makes no sense. Why would Microsoft be responsible for failures in other people's code? Windows did exactly what it was designed to do: catch an unhandled exception in someone else's flawed kernel code and halt execution.
Re: (Score:2)
They're not responsible for other people's code,
Other people's code caused the problem.
they're responsible for their kernel. The kernel shit the bed, and it might have shit the bed because of someone else,
The kernel did not fail. It did exactly what it was designed to do.
but now Microsoft have to make sure in the future the same kind of error, only takes out the offending product.
Let's, for the sake of argument, assume Microsoft snapped its fingers and got rid of third party kernel drivers altogether or went with some sort of radically different microkernel architecture able to fully isolate the consequences of failure.
Is it your contention the acceptable course of action would have been for the system to continue operation without the benefit of the failed security software? It shoul
Re: (Score:2)
What should have happened is the system disabling Crowd Strike, and then carrying on like normal.
Uncaught exceptions are fatal in kernel space. The only possible correct answer is halting operation.
One could simply ignore reality and instead argue the equivalent of this: when approaching a cliff, vehicles should automatically sprout wings and fly to safety rather than plummeting to their doom. Never mind that this was never part of the deal and nobody was ever at any point confused about that.
If carrying on like normal is too risky, then disable the related subsystems like networking, but keep the system up and bootable.
Well shit if only the networking stack was disabled Delta wouldn't have suffered any downtime.
The entire issue is that Microsoft entered a failed state where nothing worked, and where the solution required jumping through a lot of headache.
Yea like totally...
Re: (Score:2)
At least then the systems are up, and debugging the issue is simpler than locking into a blue screen and bleeping the system access. However, either way, what happened, happened and now Micr
Re: (Score:2)
What's your argument exactly?
I think I've made myself clear: the only acceptable response to an uncaught exception in kernel mode is to halt operation.
That Windows should have bricked itself
Windows didn't brick itself.
because in 2024 Microsoft can't handle a simple memory exception or driver failure?
That's right it can't and neither can Linux. The reason for that is kernel level operations (including code executed by third party kernel drivers) have global access and so when something unexpected happens there is no way to isolate the consequences or reason about the implications of the unexpected failure. Your choice is to cross your fingers and hope for
So for something like this (Score:2)
The one at fault is crowd strike for not testing. Now I suppose the argument could be made that the compa
Re: (Score:2)
The issue is they're trying hard to push away from having any responsibility for the issue.
Delta is right (Score:2)
Delta was forced into this position. why should they be further forced to accept help from the companies that fucked up their systems in the first place???
Re: (Score:2)
Delta drunkenly walked their own asses into this position. And if one of the companies is responsible, offers free help, and I refuse it or ignore it, then I don't deserve lawsuit money from them either. Everyone that works at Delta is a grade A moron.
This is where CYA comes in handy (Score:4, Interesting)
All those calls made by Microsoft should be preserved online for everyone to see. The email Nadella sent should be posted online (minus the final email address). Microsoft should be showing everything they did to help and Delta either refused or ignored them.
While not a fan of Microsoft, in a case like this it is entirely believable they did what they could to offer assistance. Showing their efforts would put everything to rest in a heartbeat and throw it back on Delta to explain their side of the story.
Re: (Score:2)
If the choice is taking free help from MS or losing at least 500 million dollars and looking like incompetent morons, I think I'd take the help.
Re: (Score:2)
You think that was the choice? In what drug-infused fantasy do you live?
Re: This is where CYA comes in handy (Score:2)
Whether they realized it at the time or not, yes. Those were the two choices they had.
Re: (Score:2)
No, it clearly was not. You being clueless or lying does not change that. For example, just to demonstrate that you are full of shit, they could have asked others for help. And suddenly, there are three choices.
Re: This is where CYA comes in handy (Score:2)
Sure. They could have. Though it wouldn't have improved anything or gotten themselves back up to speed any faster. Plus that requires the other party to accept to help - no guarantees there. So it wasn't a real or practical option. Microsoft was already offering the help for free.
They could also have just had a better process in place to deal with this kind of failure, but they didn't and still won't. Nobody else took as long to recover. It wasn't even difficult, complicated, or time consuming to recover. E
buy more (Score:2)
Microsoft: it's your own fault, you should have bought even more of our crap.
They turned down the offers because (Score:2)
They turned down the offers because the only solution available to them, as far as I know, was to manually update each affected machine with some physical presence. Not sure they can accept much help in that self-inflicted case, right? Or did I miss something here?
Re: (Score:2)
I don't know about Delta, but the servers would have been primarily virtual machine based, and those wouldn't need physical presence to get to. Individual workstations would be the next thing to work on, but one person per location could get it done in a reasonable amount of time. Even with Bitlocker, as long as the keys were available, getting through to knock out the CrowdStrike drivers and remove them doesn't take THAT long.
I'm not saying the effort is trivial, but the, "how do you respond and recove
Doesn't change the liability (Score:2)
Re: (Score:2)
The liability is a trickier issue here, actually.
There is liability on several sides.
CrowdStrike is certainly liable for the initial issue, but they had a fix issued within a couple hours. News reports have said about 60% of Fortune 500 businesses had outages, and more than half had resolved the outages within 12 hours, most within 24 hours. The other airlines around the globe had resolved the technical outage on a similar scale, about 12-24 hours depending on the business to get the technical side figure
Crowdstrike is a crutch for Microsoft's Windows (Score:2)
Satya Nadella never called us (Score:2)
Our 55-employee company was hit too. We never got any offer to help from Microsoft.
I guess Microsoft only reaches out to help customers when they're big enough to generate bad PR for Microsoft.
bfd (Score:2)
"Satya Nadella also personally emailed Delta CEO Ed Bastian and never heard back"
Our CEO has been WhatsApping random employees with "I need your attention for a quick task"
Everyone's ignoring him, because security.
Car analogy (Score:2)
Re: (Score:3)
By the time this is settled in court, I bet at least two of the companies involved will have faded.
As much as I hate defending Microsoft, it's impossible to make an operating system that's unbreakable by a 3rd party app with sufficient privileges. It's like I used to tell family members about not downloading random shit from the internet: I could wipe everything on your computer with a single command. If I can do that, so can something malicious that you've downloaded.
Delta will probably just get a government bailout if it came down to it.
That just leaves Crowdstrike, and to them I say "good riddance."
Re: eMess (Score:2)
Delta and Crowdstrike seem to be first in line for the chopping block.
Re: eMess (Score:2)
Delta may get a bailout, but Clownstroke is going away. There are too many replacements and they cost governments too much money.