Meanwhile a Microsoft employee on how to prevent such an issue under Linux: https://www.phoronix.com/news/systemd-Auto-Boot-Assessment
a Microsoft employee
You’re talking about good ol’ Lenny like he isn’t the author of the most used init and utility system as well as PulseAudio.
I know who that is and he’s also a Microsoft employee these days which makes this a funny sequence of statements:
“EU bad because they made us open up Windows to 3rd party anti-virus vendors. Oh, btw, the fully open Linux operating system can cope with such a problem if properly configured. Here’s the documentation to make that configuration.”
Not only that, he specifically attacked “commercial operating systems” - it’s anyone’s guess which he meant - for not implementing it.
I don’t know enough about Windows 10/11, but aren’t they supposed to boot into a menu that allows you to pick the last known good configuration before it even boots to the GUI?
The problem is with a specific file on the disk, not a misconfiguration
Apparently it’s because CrowdStrike installed their device driver as one that must start when Windows starts.
Explained here: https://youtu.be/wAzEJxOo1ts?feature=shared&t=675
I’ve linked to the specific time where he explains that issue, but tbh the whole video is worth watching.
I don’t use Windows these days but I still enjoy Dave’s channel
It’s been a while since I had such a massive problem under Windows, but the last time I did, you could try to restore one of the most recent restore points, and that usually failed because Windows restore points are/were crap.
Yeah we tried that where I work (I’m not IT) and it failed. Safe mode didn’t work either 'cause it couldn’t authenticate the user for login as the server was down as well.
Oh FFS. I love this era where companies will not accept the blame due to “liability”, even when they are explicitly to blame.
Fuck Microsoft and fuck Windows.
But if you inject hacky bullshit third party code into someone’s OS that breaks things, it’s not the OS’s fault.
But in this case Microsoft certified the driver. If they knew the driver included an interpreter that can run arbitrary code, they shouldn’t have certified it because they cannot fully test it. If they didn’t know, then their certification tests are inadequate. Most of the blame lies with the security software. If Microsoft hadn’t certified it, they would have had zero fault.
Certifying a driver is not an endorsement.
It is a verification that it is legitimately from who it claims to be from. Microsoft has zero fault, period.
I had a read about WHQL (which I assume is what “certified” means). The vendor uses the Windows HLK to run a series of tests, the results are submitted to Microsoft, and only then is the driver signed.
While certification isn’t endorsement, the testing and the resulting certification imply basic compatibility and reliability. And causing boot loops and BSODs is nowhere close to “basic compatibility and reliability.”
Crowdstrike bypassed WHQL because the update was not to the driver, it was to a configuration file that then gets ingested by the driver. It’s deliberate so they can push out updates for developing threats without being slowed down by the WHQL process.
And that means when they decide to just send it on a Friday with a buggy config file, nobody is responsible but Crowdstrike.
Oh wow. Then CS is definitely at fault. What a brilliant idea they had.
The Windows Hardware Certification program (formerly Windows Hardware Quality Labs Testing, WHQL Testing, or Windows Logo Testing) is Microsoft’s testing process which involves running a series of tests on third-party device drivers, and then submitting the log files from these tests to Microsoft for review. The procedure may also include Microsoft running their own tests on a wide range of equipment, such as different hardware and different Microsoft Windows editions.
For the Nth time, crowdstrike circumvented the testing process
Edit: this is not to say that cs didn’t have to in order to provide their services, nor is this to say that ms didn’t know about the circumvention and/or delegate testing of config files to CS. I’ll take any opportunity to rag on MS, but in this case it is entirely on CS.
We all hate Microsoft for turning Windows into an ad platform but they aren’t wrong.
They are legally required to give Crowdstrike or anyone complete low level access to the OS. They are legally required to let Crowdstrike crash your computer. Because anything else means Microsoft is in control and not the software you installed.
It’s no different than Linux in that way. If you install a buggy device driver on Linux, that’s your/the driver’s fault, not Linux.
They are legally required to let Crowdstrike crash your computer.
I call Bullshit.
If it had been Windows NT 3.5, there would have been no bluescreens around the world. It would have stopped the buggy software, given a message accordingly, and continued its job. That Windows was not stupid enough to crash itself just because of a null pointer in another piece of software.
Now you tell me that Windows NT 3.5 is illegal?
I ran 3.5. Yes, a network driver crash would blue screen NT3.5. Graphics were in user space in 3.5 so a video driver couldn’t take NT 3.5 down but networking was in the kernel. https://en.m.wikipedia.org/wiki/Hybrid_kernel
a network driver crash would blue screen NT3.5.
OK, and… Were they legally required to make it crash?
A better comparison would be an iPhone. Apple has locked that down so much that it’s impossible to install something like CrowdStrike falcon, thus it’s not possible for something like this to happen.
Microsoft is saying if the EU would let them, they too could lock down their platform enough to prevent this from happening.
However, I would prefer to maintain control over my device and do what I want with it, instead of just what Apple/Microsoft want; even if that means I might break my device.
Not then, but European anti trust lawsuits resulted in laws that require Microsoft to allow 3rd parties complete access. That means if the 3rd party software is a low level driver, it will crash the system. They are legally required to allow vendors the level of access that can crash the system.
They were legally required to permit third party to install a kernel mode driver.
You could absolutely install software on Windows NT 3.5 that would crash the system.
Can confirm. I’ve crashed most Microsoft products from msdos 5.
My dislike for MS products goes back to 1979 and the BASIC interpreter on the Commodore Pet.
At the time I thought that’s just how computers are but within just a few years I realised that non Microsoft BASIC always seemed to be better…
We all hate Microsoft for turning Windows into an ad platform but they aren’t wrong.
Sorry, how is that related to the stability of the kernel?
I explained in my second sentence.
“They are legally required to give Crowdstrike or anyone low level access to the OS.”
If you install a buggy driver into Linux and it crashes, that’s not a problem with the Linux kernel.
https://www.redhat.com/sysadmin/linux-kernel-panic
I fully agree with you on that front, but ads have nothing to do with kernel access, so how is that relevant to their legal requirements?
I was explaining why everyone hates on Microsoft but the Crowdstrike crash had nothing to do with the reasons people hate MS.
Gotcha.
But what if Windows had something similar to eBPF in Linux, and CS had opted to use it? Would this disaster not have happened at all, or at least have been much smaller in scale and impact?
Crowdstrike managed to fuck up Linux through eBPF just as well.
https://access.redhat.com/solutions/7068083
If you load hacky shit into the kernel it can always find a way to make a nasty surprise. eBPF is a little bit better fence, not some miracle that automatically fixes shitty code.
But these eBPF loader bugs are fixed now. Windows drivers are still causing BSODs and will continue to do so until Microsoft adopts eBPF.
Yeah I saw the article that says they’re legally required but until I can actually read that document where it says “thou shall give everyone ring-0” access I’m gonna call it bullshit.
If it’s not ring 0, it’s not full access. They are legally required to give full access.
I’ll believe it when I read it.
It might not be written literally like that, but if Microsoft stopped letting third party developers write kernel drivers for Windows, it would very quickly be considered an abuse of their market position. The problem isn’t that they allow kernel drivers (that argument is just MS throwing every ball they can); it’s that they certified this very driver as tested and stable. Without this certification most IT teams would’ve been more reluctant to install Crowdstrike’s rootkit on their systems.
The thing is, Microsoft’s virus-scanning API shouldn’t be able to BSOD anything, no matter what third-party software makes calls to it, or the nature of those calls. They should have implemented some kind of error handler for when the calls are malformed.
So this is really a case of both Crowdstrike and Microsoft fucking up. Crowdstrike shoulders most of the blame, of course, but Microsoft really needs to harden their API to appropriately catch errors, or this will happen again.
I’m an idiot. For some reason, I was thinking about the Windows Defender API, which can be called from third-party applications.
I don’t believe there was any specific API in use here, for virus scanning or not. I suppose maybe the device driver API? I am not a kernel developer so I don’t know if that’s the right term for it.
Crowdstrike’s driver was loaded at boot and caused a null pointer dereference error, inside the kernel. In userspace, when this happens, the kernel is there to catch it so only the application that caused it crashes. In kernelspace, you get a BSOD because there’s really nothing else to do.
I stand corrected. For some reason, I was thinking they used the actual Windows Defender API, which can be called programmatically from third-party applications, but you’re correct, it was a driver loaded at boot. Microsoft isn’t at all at fault, here.
Isn’t that API what the article is talking about?
Nope. It’s a lower level kernel API that has to be accessed at boot via a driver. The API I was thinking of - and I use the term “thinking” loosely, here - is an API that userspace applications can take advantage of to scan files after boot is already complete.
I actually agree, I own my computer / OS and I should be able to do what you’re saying (install and break things). But Microsoft is a trillion dollar multi national corporation and I am certainly going to give them grief about this because I owe them less than nothing, let alone any good will.
That doesn’t make any sense. How does arguing against your position do anything but harm it?
Maybe just give them grief over the myriad negative things they do that don’t counter your position?
You are going to give grief to Microsoft for allowing what you want?
???
You are not wrong, but people don’t want to hear it. Do we want to retain control over what goes into kernel space or not? If so, we have to accept that whatever we stuff in there can crash the entire thing. That’s why we have stuff like driver signatures. Which Crowdstrike apparently bypassed with a technical loophole from how I understand it.
even when it was the bears I knew it was regulation and taxes.
why do communists hate free market and liberty?
won’t someone think of the corporations!?
they feed you, shit lord… show some respect for your betters!
This whole thing just exposes that people getting paid big bucks for this shit ain’t really that smart or planning for anything; they are just collecting rent until something blows up lol
I wouldn’t be surprised if the people finding and detecting viruses/malware aren’t the same people responsible for deployment. And anyway, it’s not like smart people make zero mistakes…
I always wonder about that and also all these data “breaches”
Seems like a great way to retire early lol
Economic incentive is there but at least we know that people are honest!
They just pay so when it goes sideways they can hold up their hands and point out a reputable supplier was used and now it’s not their problem or blemish on their career.
Yes, an anarchist guy pointed out to me that in our world responsibility can be delegated via contract, which doesn’t make any sense. The responsible person should still be responsible; the only action they actually take is choosing whom to delegate the obligation to.
Like in Nazi Germany and other fascist states, where they like to emotionally make only the leaders responsible, while with corps they like to make only the last company in the chain responsible.
In fact the whole chain is responsible. Responsibility is fully contagious.
If this was like this in all laws, we’d have a much better world.
Is this even relevant? Wasn’t it a kernel driver module?
It’s a third party kernel module, which Microsoft would love to be able to block, but legally can’t. It’s technically possible to write a virus scanner that runs in user space instead of the kernel, but it’s easier to make sure everything gets scanned if it’s in the kernel.
Personally, I don’t see the issue. Microsoft shouldn’t be responsible for when a third party creates a buggy kernel module.
And when you, as a company, decide to effectively install a low-level rootkit on all your machines in hopes that it will protect you against whatever, you accept the potential side effects. Last week, those side effects occurred.
MS gives them access, so they’re responsible.
I disagree. As someone else in this thread said: if you compile a buggy Linux driver that crashes the system, it’s still the fault of the driver.
I’m not exempting Crowdstrike and I’m not sure the comparison holds: Linux is a kernel, not a corporation.
Try Ubuntu or RedHat, would they be liable?
My answer might surprise you, but no. Your source code, your binary, your responsibility. Not that of the platform, the compiler, or the company that supplies it.
Linux does not certify drivers though. Microsoft does.
It is my understanding that this driver had not been (re) certified by Microsoft, though. So in that case, I stand by my statement.
If it had been, I’d agree with that blame.
I bet you love your locked down iPhone too
Why would I buy an Apple product?
Come on, conform to their baseless assumptions so their insult can stick!
Hard to say yet, if Microsoft is responsible or not. The thing is they certified it, as a stable and tested driver. But it isn’t just a driver, but an interpreter/loader that loads code at runtime and executes it. In kernel mode. If Microsoft knew this they’re definitely responsible for certifying it, but maybe crowdstrike hid this behavior until it was deployed to the customers.
It was my understanding that this wasn’t certified. Crowdstrike circumvented the signing process.
The driver was signed; the issue was with a configuration file that’s not part of the driver.
Maybe it should be. At least part of the package that’s signed.
A configuration file shouldn’t crash the kernel. I don’t understand how this solution could pass the certification. I don’t know the criteria of course, but on the surface it sounds like Crowdstrike created a workaround, and Microsoft either missed or allowed it.
AFAIK, blue screen doesn’t mean kernel crash. Hell, windows crashing isn’t even rare.
Certification doesn’t mean it has Microsoft seal of approval either, only that it comes from a certified and approved vendor, with some checks at best.
Config files are not part of the driver, ever. How do you think you can change the settings of your GPU without asking Microsoft?
But hey, if you are so willing to blame Microsoft for the one time it’s not their fault, may I talk to you about our Lord Savior Linux? In my office we only knew because of the memes.
How would you prove that no input exists that could crash a piece of code? The potential search space is enormous. Microsoft can’t prevent drivers from accepting external input, so there’s always a risk that something could trigger an undetected error in the code. Microsoft certainly ought to be fuzz testing drivers it certifies but that will only catch low hanging fruit. Unless they can see the source code, it’s hard to determine for sure that there are no memory safety bugs.
The driver developers are the ones with the source code and should have been using analysis tools to find these kinds of memory safety errors. Or they could have written it in a memory safe language like Rust.
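For anyone wondering what “fuzz testing” looks like in its simplest form, here is a rough Python sketch. Everything in it is hypothetical: `parse_record` is a toy parser for a made-up format (a 2-byte count followed by that many 4-byte values), not anything resembling a real driver or channel file. The loop throws random bytes at the parser and treats a clean `ValueError` as the acceptable outcome; any other exception escapes and would be a bug worth investigating. As the comment above says, this only catches low hanging fruit, since random inputs cover a tiny part of the search space.

```python
import random
import struct


def parse_record(data: bytes) -> list:
    """Toy parser (hypothetical format): 2-byte little-endian count,
    followed by that many 4-byte little-endian unsigned integers."""
    if len(data) < 2:
        raise ValueError("truncated header")
    (count,) = struct.unpack_from("<H", data, 0)
    if len(data) < 2 + 4 * count:
        raise ValueError("truncated body")
    return list(struct.unpack_from(f"<{count}I", data, 2))


def fuzz(parser, iterations: int = 2000, seed: int = 0) -> int:
    """Feed random byte strings to `parser` and return how many were
    rejected cleanly. Any exception other than ValueError propagates,
    which is how this harness reports a bug."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(iterations):
        length = rng.randint(0, 64)
        blob = bytes(rng.getrandbits(8) for _ in range(length))
        try:
            parser(blob)
        except ValueError:
            rejected += 1
    return rejected


if __name__ == "__main__":
    print(f"cleanly rejected inputs: {fuzz(parse_record)} of 2000")
```

Real fuzzers are coverage-guided rather than purely random, but the contract is the same: malformed input must produce a controlled error, never a crash.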
You don’t need to prove that no input can crash the code. “Exhaustive testing is not possible” is one of the core testing principles, ISTQB teaches that. As far as we know, the input was a file filled with zeroes, and not some subtle configuration or instruction. That can definitely be expected, tested, and handled.
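A rough sketch of what “expected, tested, and handled” could mean in practice, in Python. The file format here is entirely made up (a `CFG1` signature followed by `key=value` lines) and is only an assumption for illustration; it says nothing about how the real channel files are structured. The idea is simply that a loader refuses obviously degenerate inputs and returns `None` instead of blindly trusting the file, and that those degenerate inputs are kept around as explicit test cases.

```python
from typing import Optional

MAGIC = b"CFG1"  # hypothetical 4-byte signature, not a real format


def load_config(data: bytes) -> Optional[dict]:
    """Return the parsed settings, or None if the file is unusable."""
    if len(data) < len(MAGIC) + 1:
        return None  # empty or truncated file
    if data[: len(MAGIC)] != MAGIC:
        return None  # wrong signature; this also rejects an all-zero file
    try:
        text = data[len(MAGIC):].decode("ascii")
    except UnicodeDecodeError:
        return None  # body is not plain text
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or not key:
            return None  # malformed line
        settings[key] = value
    return settings


# Degenerate inputs a test suite can check cheaply.
DEGENERATE_INPUTS = [b"", b"\x00" * 4096, MAGIC, b"CFG1no-equals-sign"]

if __name__ == "__main__":
    for blob in DEGENERATE_INPUTS:
        assert load_config(blob) is None
    print(load_config(b"CFG1mode=block\nlevel=2"))
```

On a rejected file the caller can keep running with the previous known-good configuration, which is the behaviour you would want from anything loaded at boot.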
As far as we know, the input was a file filled with zeroes
CrowdStrike have said that was not the problem:
This is not related to null bytes contained within Channel File 291 or any other Channel File.
That said, their preliminary incident review doesn’t give us much to go on as to what was wrong with the file.
You’re speculating that it was something easy to test for by a third party. It certainly could have been but I would hope it’s a more subtle bug which, as you say, can’t be exhaustively tested for. Source code analysis definitely would have surfaced this bug so either they didn’t bother looking or didn’t bother fixing it.
The document that outlines the agreement between Microsoft and the European Commission is available as a Doc file on Microsoft’s website.
…which seems to be inaccessible. I highly doubt this document specifically said “giv’em ring-0 access”; this is just MS trying to deflect blame and cash in at the same time.
I’m sorry, but competition is good.
Installing some closed blob into your kernel, that’s on you.
The problem is if anything is not enough competition. We just saw a centralized monoculture fall over.
The document states that Microsoft is obligated to make available its APIs in its Windows Client and Server operating systems that are used by its security products to third-party security software makers.
The document does not, however say those APIs have to exist. Microsoft could eliminate them for its own security products and then there would be no issue.
I’m pretty sure that if Microsoft provided a decent way to do what Crowdstrike does, most companies would opt for that.
So… Sucks to suck I guess.
Uhhh they do. Defender for Endpoint. It’s available as both P1 and P2 depending on what you need.
Why should MS do that? I guess if they saw a market value for it, they could. Like how Defender came to be after 20 years of third party anti-virus.
They certainly developed the tech for it - I remember reading about some of their research circa 2000 making the OS and everything on it a database. They’ve kind of been working that direction for years (see MyLifeBits).
I suppose they could provide an add-on tool for this, but I suspect there’s a political barrier (imagine the blowback of MS providing such a tool).
Blaming the EU is stupid. macOS is locked down too; for the EU it’s more about apps and less about the kernel space.
Security software are also “apps”. Since Microsoft is also in the security software business locking down access for their competitors could definitely be seen as anti-competitive practices.
Apple doesn’t have a monopoly with MacOS so other rules apply.
My issue with that is Android is also pretty locked down and most certainly does have a monopoly. In general I think it’s just MS being stupid.