Tuesday, December 11, 2007

Data Center Meltdown

Denial and the coming “data meltdown” by ZDNet's Michael Krigsman -- Subodh Bapat, Sun Microsystems eco-computing vice president, believes we’ll soon see the first world-class data center meltdown. According to News.com: “You’ll see a massive failure in a year,” Bapat said at a dinner with reporters on Monday. “We are going to see a data center failure of that scale.” “That scale” referred to the problems [...]

Now let's think about that. How can a datacenter fail?

  1. It can lose power for an extended period. It takes a lot of backup generators to keep a 50 megawatt datacenter humming.
  2. A virus or worm can shut it down.
  3. A natural disaster can destroy it or force a shutdown. Often the shutdown is caused by a power failure rather than physical destruction.
  4. It can have its WAN/Internet connection(s) severed. This isn't quite catastrophic unless you're on the other side of the WAN.

Michael has pointed out that Subodh Bapat doesn't name the expected cause of a major data center meltdown; he just says one is coming. That's because it's not really a matter of one cause. There are many risks, and when you multiply them by the growing number of data centers, you find that a major failure is bound to happen soon. We just don't know what the precise cause will be.
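To put rough numbers on that "multiply the risks" argument, here is a minimal sketch. It assumes each data center has the same small, independent chance of a major failure in a given year, which is a simplification, and the 1% figure and the count of 100 data centers are purely illustrative, not taken from Bapat or anyone else.

```python
def prob_at_least_one_failure(p_per_site: float, num_sites: int) -> float:
    """Chance that at least one of num_sites fails in a year.

    Assumes every site fails independently with the same
    probability p_per_site (an illustrative simplification).
    """
    if not 0.0 <= p_per_site <= 1.0:
        raise ValueError("p_per_site must be between 0 and 1")
    if num_sites < 0:
        raise ValueError("num_sites must be non-negative")
    # P(no site fails) = (1 - p)^n, so P(at least one fails) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p_per_site) ** num_sites


if __name__ == "__main__":
    # Hypothetical inputs: 1% yearly risk per site, growing site counts.
    for n in (1, 10, 100, 500):
        print(n, round(prob_at_least_one_failure(0.01, n), 3))
```

With a made-up 1% yearly risk per site, 100 data centers already give roughly a 63% chance that at least one of them has a major failure that year. The individual risk stays small; the growing count is what makes the failure likely.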

Most of the threats involve a geographically localized destruction or disabling of the data center. This means you need off-site recovery, and it probably needs to be fast. So you probably need more than recovery: you need one or more hot sites that can take over the load of one that fails. This is extremely expensive for a "normal" data center. I can hardly imagine how much it would cost for a 50+ megawatt facility. Basically what we have is too many eggs in one basket, with economies of scale pushing us to keep on putting more eggs in the same basket.

What surprises me is that Subodh Bapat didn't say, oh, well, Sun has the solution, considering that Jonathan Schwartz put it forward over a year ago. OK, he didn't exactly suggest Blackbox as a means of easily distributing data centers. But think about it. If you're a large corporation, you are probably already geographically distributed. If you expand your data center in cargo-container-sized units across the nation (or world), you are half... err... maybe one quarter of the way there. You still have to figure out how to make them hot sites for each other, or at least recovery sites. But at least the loss from any single site failing would be limited.
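A back-of-the-envelope sketch of the eggs-and-baskets trade-off follows. It assumes the total load is split evenly across identical sites and that only one site fails at a time; both are my simplifications, and the function names are just for illustration.

```python
def capacity_lost_fraction(num_sites: int) -> float:
    """Fraction of total capacity lost when one of num_sites
    equally sized sites goes down (even-split assumption)."""
    if num_sites < 1:
        raise ValueError("need at least one site")
    return 1.0 / num_sites


def overprovision_factor(num_sites: int) -> float:
    """Total capacity to build, relative to the actual load, so that
    the surviving sites can absorb one failed site's share.

    With n sites, the n - 1 survivors must carry the full load, so
    the whole fleet needs n / (n - 1) times the load.
    """
    if num_sites < 2:
        raise ValueError("a single site has nowhere to fail over to")
    return num_sites / (num_sites - 1)


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, capacity_lost_fraction(n), round(overprovision_factor(n), 3))
```

Under these assumptions, two sites that back each other up means building twice the capacity you use, while ten container-sized sites need only about 11% extra in total and lose a tenth of capacity when one goes dark. That says nothing about the networking and data replication needed to make the failover actually work, which is the hard part.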
