I've picked the perfect day to return from vacation. Being greeted by thousands of users being mad at you and people asking for your head on a plate makes me reconsider my career choice. Here's to 12 hours of task force meetings...
Huge sympathies to you. If it's any consolation, because the scale of the outage is SO massive and widely reported, it will quickly become apparent that this was beyond your control, and those demanding your 'head on a plate' are likely to appear rather foolish. Hang in there my friend.
To their credit, the stakeholder that asked for my head personally came to me and apologised once they realised that entire airports have been shut down worldwide. But yeah, not a Friday/funday hahaha
Yeah, and these types make any problem worse. Any technical problem also becomes a social problem: dealing with these lunatics and keeping the house of cards from crumbling.
It's not a management thing, it's very much a personality trait ... that for whatever reason seems to survive in pockets of management in most organisations over a certain size.
It's not a trait that survives well at yard crew level; trade assistants that freak out at spiders either get over it or never make it through apprenticeships to become tradespeople.
In IT, those who deal with failing processes, stopped jobs, smoking hardware, insufficient RAM, and tight deadlines learn to cope or get sidelined or fired (mostly).
To be clear, I've seen people get frazzled at most levels and many job types in various companies.
My thesis is there's a layer of management in which nervous types who utterly lose their cool at the first sign of trouble can survive better than elsewhere in large organisations.
But that's just been my experience over many years in several different types of work domains.
Ohhh absolutely. And it's not just users, it's also management. "How does this affect us? Are we compromised? What are our options? Why didn't we prevent this? How do you prevent this going forward? How soon can you have it back up? What was affected? Why isn't it everyone? Why are things still down? Why didn't X or Y unrelated vendor schlock prevent this?..."
And on and on and on. Just the amount of time spent unproductively discussing this nightmare is going to cost billions.
Nothing is more annoying than having a user ask a litany of questions that are obvious to the person working on the problem, while that person is busy working on the problem and looking for those very answers.
They’re valid for a postmortem analysis. They’re not helpful while you’re actively triaging the incident, because they don’t get you any steps closer to fixing it.
Exactly my thinking. Asking these questions doesn't help us now. But after all the action is done, they should be asked. And really, they should be questions that get asked from time to time, incident or no incident.
The problem is that you are only focusing on making the computers work and not the system.
"We don't know yet" is a valid response and gives the rest something to work with. It shouldn't annoy you that it's being asked, first of all because if they are asking, it's because you are already late.
You have to tell the rest of the team what you know and what you don't know, and update them accordingly.
Until your team says something, the rest don't know if it's a 30-minute thing, the end of the world, or whether we need to start dusting off the faxes.
Your head belongs on the plate for not being able to point back to your recommendations for failover posture enhancement, such as identifying core business systems and core function roles, having fully offline emergency systems, and warning of the dangers of making cloud services your only services, and then to point to the proposed costs of implementing these systems being lower than the damages caused by an outage of core business services.
Move to a new career if you feel you don't have the ability to push right back against this.