Super Computers: Too Big to Fail?
“Just beware the shiny new servers with outrageous resources. You may ultimately create a monster that is literally too big to fail,” concludes Mark Vaughn in his recent column “High-performance computers: Too much of a good thing?” on SearchServerVirtualization.
This article was music to my ears. Mark Vaughn is spot on when he suggests that some of the great new high-performance computers (e.g. blade servers with unimaginable amounts of memory and disk capacity in tiny footprints) may not actually be the best fit for your virtualised infrastructure. This is the concern I have always had with what I refer to as “aggressive” consolidations. I get the message that you will save on money and cooling, and that the result is ultra flexible. But when that super server has a problem, the VMs running on it will fail. I was also interested in the author’s suggestion of a domino effect, something I hadn’t considered before but that is entirely possible, if not probable.
All is not lost though! Just replace those super servers with fault-tolerant technology. That way, if there is a problem, the server carries on running, albeit in simplex mode (think of it as limp-home mode, but without any loss of performance or functionality). Services can then be live migrated onto other existing nodes and, hey presto, no downtime!
Filed under: Fault-Tolerance
Tags: Andy Bailey, Fault-Tolerance, virtual machine, virtualisation, Virtualization


