Dave explains the CrowdStrike IT outage, focusing on its role as a kernel-mode driver. For my book on the spectrum, see: https://amzn.to/3XLJ8kY Get the s...
This is likely going to be a textbook case of how not to run a company in a dominant market position that caused worldwide system failures.
Makes you wonder if we should be allowing such consolidation in critical industries. This ain’t even about economics anymore. It’s more of an infrastructure and national security decision.
Or fucking supervise and train people properly… I don’t know. Sounds like management problems.
As someone who works in QA, yeah, they needed something to catch this. I saw someone mention somewhere, without a source, that they missed it because all of their test machines have their full suite of software installed, and in that scenario the computer wasn’t affected. So it seems their QA labs might need to be more in tune with the actual user base.
However, the fact that they were able to push this so quickly worldwide seems like a big process issue. I get that 0-day threats are how they justify it, but deploying to a small subset of customers before going global seems more reasonable.
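To make the “small subset first” idea concrete, here’s a minimal sketch of a staged (canary) rollout: push to a tiny slice of the fleet, watch for failures, then widen. Everything here is invented for illustration, including the host names, thresholds, and the placeholder health check; it is not a description of CrowdStrike’s actual pipeline.

```python
# Hypothetical staged (canary) rollout sketch. All names and thresholds are
# made up for illustration; a real deployment system would hook into real
# telemetry and crash reporting instead of a random "health check".
import random
import time

def healthy(host: str) -> bool:
    """Placeholder health check; simulates ~1% of hosts failing after the update."""
    return random.random() > 0.01

def staged_rollout(hosts: list[str], update_id: str,
                   stage_fractions=(0.01, 0.10, 0.50, 1.0),
                   max_failure_rate=0.002,
                   soak_seconds=0) -> bool:
    """Deploy update_id in expanding waves; halt if failures exceed the threshold."""
    deployed: set[str] = set()
    for fraction in stage_fractions:
        target_count = int(len(hosts) * fraction)
        wave = [h for h in hosts if h not in deployed][: max(target_count - len(deployed), 0)]
        for host in wave:
            print(f"pushing {update_id} to {host}")
            deployed.add(host)
        time.sleep(soak_seconds)  # let the wave "soak" before checking health
        failures = sum(1 for h in deployed if not healthy(h))
        if failures / max(len(deployed), 1) > max_failure_rate:
            print(f"halting rollout of {update_id}: {failures} failures "
                  f"out of {len(deployed)} hosts")
            return False
    return True

if __name__ == "__main__":
    fleet = [f"host-{i:04d}" for i in range(1000)]
    staged_rollout(fleet, "content-update-001")
```

The point of the early, tiny wave is that a crash-on-boot bug shows up after touching 1% of machines instead of 100%; the 0-day argument only justifies shrinking the soak time, not skipping the stages entirely.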
I heard somewhere that the update ignored the staging settings customers had set. So even if companies had it set to only roll out to a subset of their computers, it went everywhere.
Oof. Then that seems more on the ops side of things. Interesting. I can’t wait for them to never share what happened so we can all continue to speculate. 😂
That answered a lot of questions.
I hope they publicly state how they pushed a bad file, but I doubt it.
Seems like someone really didn’t pay attention to what they were doing, and they might have an internal problem with QA.
Their QA worked better than intended; they had tests fail worldwide and tons of results to work off of hahaha
I read somewhere (in the comments on that video) that CS ignored their own customer-configured staggered upgrade settings for some upgrades…
Apparently those settings are only for updates to the software itself, not for updates to the definition files.
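Roughly, the distinction being described looks like the sketch below: the customer-facing staging policy gates the sensor/program build, while definition/content updates skip the check and land everywhere at once. The class names, fields, and “ring” concept are all invented for illustration and are not CrowdStrike’s actual code or API.

```python
# Hypothetical illustration of staging settings that only apply to program
# (sensor) updates, not to definition/content updates. Invented for clarity.
from dataclasses import dataclass

@dataclass
class UpdatePolicy:
    sensor_ring: str      # e.g. "n-1" or "early-adopter"; gates sensor builds only
    defer_days: int = 7   # how long new sensor versions are held back

@dataclass
class Update:
    kind: str             # "sensor" (program update) or "content" (definitions)
    version: str

def should_install_now(update: Update, policy: UpdatePolicy) -> bool:
    if update.kind == "sensor":
        # Sensor builds respect the customer's staging ring / deferral window.
        return policy.sensor_ring == "early-adopter" and policy.defer_days == 0
    # Content/definition updates bypass the policy entirely -- treated as
    # time-critical threat intel, so every host gets them immediately.
    return True

policy = UpdatePolicy(sensor_ring="n-1", defer_days=7)
print(should_install_now(Update("sensor", "7.16"), policy))    # False: held back
print(should_install_now(Update("content", "latest"), policy)) # True: goes everywhere
```

If that is how the policy is scoped, it would explain why even customers with conservative staging settings still got hit by a bad content file.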
They don’t have a lack of quality assurance. They have a lack-of-quality assurance.