Sync or async replication?
The question of whether to use synchronous or asynchronous replication between storage arrays does not come up often, but I suspect it will as more people expand their business continuity infrastructure. It’s an important question because the choice can have a serious impact on the production environment.
With EMC’s Mirrorview/S (sync) there is a distance limit of between 60km and 200km, depending on the fibre optics you are using (short/long wave vs. DWDM). Mirrorview/A (async) is more commonly run over an IP WAN connection but can also be used over fibre.
Mirrorview/S -
Pros:
- Synchronous – Exact copy of data on production
- Little to no data lost
Cons:
- Distance limited (60km using short wave GBICs, long wave GBICs or optical extenders; 200km using dense wavelength division multiplexers)
- WAN link more expensive (fibre vs. copper/IP) unless Fibre Channel over IP converters are used, and those are still fairly expensive
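To get a feel for why distance matters for sync mirroring, here is a rough back-of-the-envelope sketch in Python. It assumes light travels through fibre at roughly 200,000 km/s (about 5 microseconds per km one way) and that each write needs one round trip to the remote array. Real links add switch, protocol and array overhead, so treat the result as a lower bound. The function name and parameters are made up for illustration and are not part of any EMC tool.

```python
# Assumption: ~5 microseconds per km one-way propagation delay in fibre.
FIBRE_DELAY_US_PER_KM = 5.0

def sync_write_penalty_ms(distance_km, round_trips=1):
    """Minimum latency added to each write by synchronous mirroring, in ms.

    round_trips is an assumption: how many times the write exchange has to
    cross the link before the host gets its acknowledgement.
    """
    one_way_us = distance_km * FIBRE_DELAY_US_PER_KM
    return 2 * one_way_us * round_trips / 1000.0

# Compare the two distance limits quoted above.
for km in (60, 200):
    print(f"{km} km: at least {sync_write_penalty_ms(km):.1f} ms added per write")
```

Under these assumptions the link alone adds about 0.6 ms per write at 60km and about 2 ms at 200km, before any array or protocol overhead.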
Mirrorview/A -
Pros:
- Cheaper WAN link between sites (IP usually)
- Writes to prod don’t have to wait on mirror site to write
- Not distance limited like sync replication
Cons:
- Data written since the last update can be lost, depending on the update interval from prod to DR site
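A rough way to reason about that exposure: if the mirror is updated on a fixed interval, the worst case is a failure just before an update finishes, so you can lose up to one interval plus the time it takes to ship the changes. The Python sketch below assumes a simple periodic-update model with a steady change rate. The names and numbers are hypothetical, not Mirrorview settings.

```python
def transfer_minutes(changed_gb, link_mbps):
    """Minutes needed to ship changed_gb over a link of link_mbps (megabits/s)."""
    megabits = changed_gb * 8 * 1000  # decimal units: 1 GB = 8000 megabits
    return megabits / link_mbps / 60

def worst_case_loss_minutes(interval_min, change_gb_per_hour, link_mbps):
    """Worst-case window of lost writes, or None if the link cannot keep up."""
    changed_gb = change_gb_per_hour * interval_min / 60
    ship_min = transfer_minutes(changed_gb, link_mbps)
    if ship_min >= interval_min:
        return None  # each update takes longer than the interval, so the mirror falls behind
    return interval_min + ship_min

# Hypothetical example: 30 minute updates, 10 GB/hour of changes, 100 Mbps link.
print(worst_case_loss_minutes(30, 10, 100))
```

If the function returns None, the link is too small for the change rate at that interval, which is a sign to either lengthen the interval or buy more bandwidth.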
What you need to know -
Array-based mirroring is a great way to protect multiple hosts in an environment instead of buying per-server or per-application replication. As I’ve discussed before, the biggest drawback is that it provides a restartable copy, which isn’t the same as the transaction-level replication of an active-active application cluster (e.g. Oracle Data Guard, Exchange CCR, MySQL master/slave replication). Make sure your LAN/WAN line quality is adequate: poor comm lines can cause insanely painful headaches (troubleshooting, added latency, etc.). Get line tests done to determine available bandwidth, line quality and latency.
Comments welcome.
Very good posting! Another key consideration is the impact on host I/O. With synchronous replication, the host does not receive acknowledgement of the I/O until the I/O has been replicated; thus host I/O performance is directly tied to the available bandwidth and latency of the replication infrastructure. Asynchronous replication decouples this from the source host.
CCR is not an Active/Active cluster. It has one active node and one passive.
Brian, good catch. I struggled to find a good term for application clusters which provide near-zero RPOs via transaction log shipping. The spirit of the comparison is merely that async array-level replication doesn’t provide that.