Anybody who is involved in running high availability services in a virtualised world ought to read through this very informative article.

It discusses the various techniques used to prioritise the restarting of services when failures occur on a particular host.

Firstly, it is not clear whether the article is about true Disaster Recovery (DR) – by that I mean a geographically separate standby site, ready in case your primary datacentre suddenly fills up with water. I can only conclude that the article is really referring to a local site, as it talks about software lockstepping, partnerships with third parties etc. All these things are only really suitable for a geographically close site because they rely on very low latencies between the various systems.

So at this point let me introduce another idea:

“To avoid having to worry about any restart scenarios, deploy your critical hosts on fault tolerant technology …”


Again, the question of reliability is lurking in this article, with specific reference to virtualisation and Tier 1 apps …

There is a very simple solution to this problem. If your goal is to provide mainframe-class reliability and operations to virtual resources then consider running a fault tolerant server as the host.

Fault tolerant systems have been available for years – the company I work for has just celebrated its 30th birthday. The technology has been successfully ported to industry standard server designs and is capable of running your favourite virtualisation hypervisor without any modifications. These systems are extremely effective and highly suitable for the Tier 1 applications mentioned in this article.

Don’t be frightened – you can move them today and enjoy all the advantages of virtualisation.


Stratus Technologies’ ftServer Offers Smooth Execution of Tiered Applications: Principled Laboratories.

Time to blow one’s own trumpet, as the saying goes. I guess the message here is to state, quite categorically, that it is OK and acceptable to run your Tier 1 applications in a virtualised environment. I know that a lot of clients have been reluctant to do this and there are normally two reasons:

  1. Security, which I am really not qualified enough to answer.
  2. Availability – how do I ensure that all my consolidated applications and services are going to keep on running?

Fortunately I am qualified to answer number 2.

Put them on a host and sign up to a 100% availability guarantee…

Available from the company I work for!


Here’s an article worthy of note that discusses the state of servers.

x86 seems to have taken over the world – but we knew that already, didn’t we?

I note that one survey respondent said:

“The move to virtual servers for high availability and redundancy has become our goal.”

Well, why not raise the stakes and go for gold?

Surely the goal should be systems and services that don’t fail in the first place?

Blades and the latest generation of Intel servers simply cannot achieve this goal. Fault tolerant technology can … and it is based on the x86 architecture, which is just so popular.


A post on vmblog last month brought a great case study to my attention.

I like these types of solutions – in addition to all the great power and cooling savings we have all got used to, the flexibility that we gain when we virtualize is probably the most useful for on-going operations.

It’s great to see this extending to the storage layer, but I was intrigued as to what the article meant by guaranteed uptime and fault tolerance. Having delved into the DataCore architecture, it’s clear that this refers to the storage subsystem. Do I take it then that this guarantee doesn’t extend to the hosts running the hypervisor?

I can only conclude that this is the case and of course, we have a suggestion…

Fault tolerant hardware is now available to run the Hyper-V stack – this means fully duplexed components with no single points of failure.

A very neat suggestion for Total Wine to keep the corks popping.


I am always harping on about fault tolerant technology. Technology is normally associated with hardware, but not always.

So, here is the BUT with reference to the guys I work for. You can do lots of clever stuff with software these days and here is the proof.

For SMB and medium-sized deployments, it’s the bees’ knees or the dog’s you know what.


You may wonder what on earth Avance is …

Well, if you were to take a couple of standard servers running an enterprise class virtualisation software stack, slap on some shared storage, add some (usually expensive) licences to manage things like live migration and monitor/manage, shake it all up to come out with a single product/solution, I guess you would have Avance.


Desktop virtualisation is something that I haven’t really homed in on before.

A valid question to ask is “how important is a single desktop in terms of its availability?” The answer is probably “not very important”, as some recovery time would almost certainly be acceptable.

However, hosts today support an enormous number of desktop sessions, so the real question becomes “what is the impact of 1,000 desktop sessions all failing at the same time?”

The answer to this is, of course, completely different.

My answer is to protect the environment with fault tolerant technology. It goes beyond this though. It is emerging that, in most VDI architectures, the broker server is the most important element. So, even if you don’t run the host environment on protected hardware, the least you should be doing is running the broker application on a bullet-proof platform.
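
To make the contrast between one desktop and a fully loaded host concrete, here is a back-of-the-envelope sketch in Python. The 15-minute recovery time is a purely hypothetical figure of mine, chosen for illustration, and the function name is my own; only the 1,000-session count comes from the question above.

```python
def lost_user_minutes(sessions: int, recovery_minutes: float) -> float:
    """Total user time lost when `sessions` desktops are down together."""
    return sessions * recovery_minutes


# Hypothetical recovery time, for illustration only.
RECOVERY_MINUTES = 15

single = lost_user_minutes(1, RECOVERY_MINUTES)
host = lost_user_minutes(1000, RECOVERY_MINUTES)

print(f"One desktop:    {single:.0f} user-minutes lost")
print(f"1,000 desktops: {host:.0f} user-minutes (about {host / 60:.0f} user-hours) lost")
```

The same outage that is a minor annoyance for one user adds up to weeks of lost working time when it hits every session on the host at once.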


“Uptime, what’s in 0.1%?” What a great question!

Putting some meat on the bones, this article delves into some of the well-known social media sites and produces some interesting statistics, not just about the sites, but about their availability figures. It concludes by stating that 99.99% availability is OK and is a good balance between availability and affordability.

But for tier 1 applications, 99.99% just simply isn’t good enough. These types of systems demand ultimate availability. Sometimes this is for customer satisfaction reasons, sometimes data integrity or maybe because it is really hard work recovering systems after outages. For whatever reason, there is choice and the ability these days to run commercially available operating systems on fault tolerant hardware. Applications do not have to be re-engineered and it truly is an out-of-the-box availability solution.

And the good news is, the solution costs little more than you would have paid otherwise.
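
As a footnote to the question in the title, the short Python sketch below converts an availability percentage into the downtime it permits over a year. The 365-day year and the particular levels printed are my own illustrative choices, not figures taken from the article.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Illustrative availability levels, from "three nines" to "five nines".
    for level in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level}% availability allows {minutes:.1f} minutes of downtime per year")
```

On these assumptions, 0.1% of a year is a little under nine hours, while 99.99% still leaves room for the best part of an hour of outage.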


There are some great new storage software solutions coming onto the market and this one caught my attention as it gives the ability to turn your VMware host into an iSCSI…


The advantage of sharing your local storage is that your vSphere environment can now use the extra features like VMware High Availability, VMware DRS and VMware VMotion and you don’t have to spend money on buying a separate iSCSI storage solution – or fibre channel storage.

As the report highlights though, the major disadvantage is that this puts enormous “pressure”, from an availability perspective, on the main host system. I know the software is also available with a mirroring option, but you can still suffer downtime because the VMs running in the memory of the failing host will of course crash.

But are you seriously going to risk losing everything by putting all your efforts and resources onto one host? A great idea but unfortunately with a fundamental flaw.

Of course, as the Availability Advisor, I have a solution… host on ftServer.


Just been published in Virtualization Journal! See No RAC, No RISC, No Problem | Virtualization Journal.

Having fended off challenges from Linux for several years now, the RISC-Unix platform is now under siege on another front – x86 servers. Long dismissed as workgroup and departmental servers, or as platforms for low-level enterprise applications, x86 servers are making serious inroads into corporate data centers. Earlier this year, analyst firms Gartner and IDC issued studies showing that sales of and revenue from RISC-Unix servers were continuing their slide against x86 boxes. The Gartner study showed a 28.5 percent decline in the number of units shipped and a 26.9 percent drop in revenue from them. The IDC study released in May of this year cited a “perfect storm” of circumstances, including the recession, the purchase of Sun by Oracle, and potential hardware upgrades by other vendors inducing delays in purchase decisions, all leading to the lowest level of spending on RISC servers IDC has ever recorded. Also encouraging the move away from RISC-Unix for many applications is the combination of a gathering virtualization groundswell in the data center and the rapidly improving performance capabilities of the x86 platforms…

For more, please visit No RAC, No RISC, No Problem | Virtualization Journal.