The question of whether to use synchronous or asynchronous replication between storage arrays does not come up often, but I suspect it will as more and more people expand their business continuity infrastructure. It’s an important question because it can have a serious impact on the production environment.
With EMC’s MirrorView/S (sync) there is a distance limitation of between 50km and 200km depending on what fibre optics you are using (short/long wave vs. DWDM). MirrorView/A (async) is more widely used over an IP WAN connection but can also be used over fibre.
- Synchronous – Exact copy of data on production
- Little to no data lost
- Distance limited (60km using short wave GBICs, long wave GBICs or optical extenders, 200km using dense wavelength division multiplexers)
- WAN link more expensive (fibre vs. copper/IP) unless Fibre Channel over IP converters are used, and those are still a little expensive
- Asynchronous – Cheaper WAN link between sites (IP usually)
- Writes to prod don’t have to wait on mirror site to write
- Not distance limited like sync replication
- Data can be lost depending on write intervals from prod to DR site
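To put rough numbers on that trade-off, here is a minimal back-of-the-envelope sketch in Python. It assumes light covers a kilometre of fibre in about 5 microseconds and that each mirrored write needs one round trip (real protocols may need more). The function names, the 15-minute update interval and the 2 MB/s write rate are made up for illustration; they are not MirrorView settings.

```python
# Assumption: ~200,000 km/s in fibre, i.e. about 5 microseconds per km one way.
FIBRE_US_PER_KM = 5.0


def sync_write_penalty_ms(distance_km, round_trips=1):
    """Minimum extra latency per write for sync mirroring (propagation delay only)."""
    return 2 * distance_km * FIBRE_US_PER_KM * round_trips / 1000.0


def async_exposure_mb(update_interval_min, write_rate_mb_per_s):
    """Simplified worst case: everything written since the last completed update."""
    return update_interval_min * 60 * write_rate_mb_per_s


if __name__ == "__main__":
    for km in (10, 60, 200):
        print(f"{km:>4} km sync: +{sync_write_penalty_ms(km):.2f} ms per write")
    # Hypothetical workload: 15-minute update interval, 2 MB/s of writes.
    print(f"async: up to {async_exposure_mb(15, 2):.0f} MB at risk")
```

Under these assumptions, a 200km sync link adds at least 2 ms to every write before any switch or array overhead, while the async example leaves up to 1800 MB exposed between updates. The real figures depend on your equipment and workload, so treat this as a way to frame the conversation rather than a sizing tool.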
What you need to know –
Array-based mirroring is a great way to protect multiple hosts in an environment instead of buying per-server or per-application replication. As I’ve discussed before, the biggest drawback is that it provides a restartable copy, which isn’t the same as active-active clustering or transaction-level application replication (e.g. Oracle Data Guard, Exchange CCR, MySQL Master/Slave replication). Make sure your LAN/WAN line quality is adequate; poor comm lines can cause insanely painful headaches (troubleshooting, added latency, etc). Get line tests done to determine available bandwidth, line quality and latency.
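Once a line test gives you a bandwidth figure, a quick calculation shows whether the link can ship each batch of changes before the next one is due. The sketch below is only that, a sketch: the 70% usable-bandwidth factor, the function names and the example numbers are my own assumptions, so substitute what your line test and change-rate measurements actually show.

```python
def transfer_minutes(changed_mb, link_mbps, efficiency=0.7):
    """Minutes to send changed_mb megabytes over a link_mbps megabit/s link.

    efficiency is an assumed fraction of raw bandwidth left after protocol
    overhead and other traffic; measure your own.
    """
    usable_mb_per_s = link_mbps * efficiency / 8  # megabits -> megabytes
    return changed_mb / usable_mb_per_s / 60


def keeps_up(changed_mb, link_mbps, interval_min, efficiency=0.7):
    """True if one interval's changes can be sent within that interval."""
    return transfer_minutes(changed_mb, link_mbps, efficiency) <= interval_min


if __name__ == "__main__":
    # Hypothetical: 1800 MB of changes every 15 minutes.
    for mbps in (10, 45, 100):
        mins = transfer_minutes(1800, mbps)
        verdict = "ok" if keeps_up(1800, mbps, 15) else "falls behind"
        print(f"{mbps:>4} Mbit/s: {mins:.1f} min per update ({verdict})")
```

If the transfer time is longer than the update interval, the DR copy drifts further behind production with every cycle, which is exactly the kind of surprise a line test is meant to catch early.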
This is no joke. Right now southern Indiana is under water. Some organizations I have worked with are fine but Columbus Regional Hospital and many others aren’t. Their data center was in a basement (which isn’t rare) and took on a surge of water. Within what may have been a few hours, it was completely submerged. They had to evacuate all patients to other area hospitals. They’ve got a lot to overcome in the next couple of weeks but with the proper infrastructure, the pain can be lessened.
It’s true that most causes of data unavailability are human (planned and unplanned) and that natural disasters rank low on the list. However, this is no reason to sit idle.
Things to think about:
1) How long can I survive without my information infrastructure?
2) Do I have a good copy of data offsite that meets my recovery point objectives?
3) What will a recovery look like (local, geo-dispersed, etc.)?
4) Do I have a REAL plan I can act on?
5) What could make recovery easier (VMware SRM, async replication, etc.)?
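Question 2 is one you can check mechanically. As a minimal sketch, assuming you can find out when your newest good offsite copy was taken, the comparison against your recovery point objective is a one-liner. The function name, timestamps and one-hour RPO below are hypothetical.

```python
from datetime import datetime, timedelta


def meets_rpo(last_good_copy, now, rpo):
    """True if the newest offsite copy is no older than the RPO."""
    return now - last_good_copy <= rpo


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)   # hypothetical "disaster" time
    rpo = timedelta(hours=1)            # hypothetical objective
    for copy_time in (datetime(2024, 1, 1, 11, 30), datetime(2024, 1, 1, 8, 0)):
        age = now - copy_time
        status = "meets RPO" if meets_rpo(copy_time, now, rpo) else "misses RPO"
        print(f"copy is {age} old: {status}")
```

The hard part is not the arithmetic but knowing the real age of your last good copy, which is worth verifying regularly rather than assuming.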
The point isn’t to ask you a comprehensive list of questions. The point is to get YOU to think about the fact that it’s only a matter of time before something happens. Columbus Regional Hospital was unlucky. I’m not sure of the level of DR plans they have but I do know that many of the folks there have been working hard to improve this prior to the flooding that just occurred. We don’t have enough hours/days/minutes to be completely prepared but it would serve everyone well to sit down and think about the details of their business continuity plans.
I’ve been with enough organizations to know when people are confident about their DR/BCP plans and when they’re just closing their eyes and filling out binders of material with the thought, “This is the best we’re going to do, there is so much other stuff that has to get done first that this BCP plan won’t even matter.” There are many local vendors around that can help you plan. Just think how easy it would be to recover with a few servers running VMware and virtual machines ready to turn on the instant something happens. It may not be the whole piece to your DR strategy but it would get you up and running a lot quicker.