Somebody said, “I need to reboot the DB server…” and I immediately felt my eye starting to twitch.
I quickly asked, “Why are you rebooting it?”
“Because it’s unstable,” was his reply.
“But what have you done to identify the cause?” I challenged. “What changes have you made to correct it?”
“Well, none! I’m rebooting it because it’s unstable!”
My concern wasn’t the reboot itself; yes, there would be a brief period of unavailability. My concern was that a reboot wouldn’t address the actual problem.
You see, a reboot isn’t a fix. It’s only a postponement of the inevitable. You’ll still need to address the actual problem that led to the instability. The belief that a reboot will fix something, or that it will buy you time (this time) to fix it properly, is flawed.
And, predictably, doing a reboot only masked the issue.
Temporarily.
The server was rebooted again about two hours later. It wasn’t fixed or troubleshot any further that time either. Someone still needs to understand why it became unstable to begin with, and to address the root cause of the instability.
It’ll be rebooted again. And again.