
Why Proof of Concept projects fail

This may seem obvious to some readers but I haven’t seen a good list of considerations to help ensure a successful PoC project.  Here are some training wheels to make sure you don’t crash and burn.

  1. Lack of requirements – All key stakeholders involved should sign off on a detailed requirements document.  It doesn’t have to be in blood; an email response with a “yes” will suffice unless there are contractual obligations.  I hear, “We just want to see if it will work” all the time.  When you’re doing a PoC, be as specific as possible in defining “it”.  Unless a solution is completely unbaked, think about how you would envision it working in your environment.  Talk to people and ask them how it works in their environment as you come up with requirements.  Be as transparent as possible with the vendor so there is no hidden agenda or confusion.
  2. Lack of a leader – Designate a lead.  I’d be rich if I got a nickel for every PoC that failed because of a lack of a leader.  You need someone to keep track of the requirements, vendor involvement and testing.  PoCs easily get lost in the fray because there aren’t obvious penalties for the customer who doesn’t see a PoC through.  Conditional POs instead of freebie PoCs are becoming more common.
  3. Lack of experience with the product – Let the vendor show you how a product was meant to be used.  If you’ve never touched a product before, why would you want to run a PoC all by yourself?  Seriously, your parents had to teach you how to velcro your shoes.  That leads me to the next point, which usually comes up after someone says “We couldn’t get it to work this way so we tried X, Y and Z”.
  4. No documentation – Document your setup and any changes as they’re made.  There are a ton of variables in your environment; document them.  I can’t stress this enough.  First make sure you deployed the product according to best practices.  If you need exceptions, run them by someone who knows what they’re doing and note them.
  5. Not asking for help – If you must, call and allow time for help. Yes, you might anger someone but it’s worth calling in for help before declaring a project a failure.  I can’t promise that a white knight will come in and save the day as the deadline for your PoC approaches but call for help anyway.

When I come in to help out with PoCs that are in trouble in their 11th hour, two things are usually immediately apparent.  First, there was no leader, or everyone went off in their own direction without accountability.  Second, there was a lack of familiarity with the product.  This list isn’t for my benefit; it’s all to help others have successful PoCs.  If you have suggestions, send them in!

sync or async replication?

The question of whether to use synchronous or asynchronous replication between storage arrays does not come up often, but I suspect it will as more and more people expand their business continuity infrastructure.  It’s an important question because it can have a serious impact on the production environment.

With EMC’s MirrorView/S (sync) there is a distance limitation of between 50km and 200km depending on what fibre optics you are using (short/long wave vs. DWDM).  MirrorView/A (async) is more widely used over an IP WAN connection but can also be used over fibre.

MirrorView/S -

Pros:

  • Synchronous – Exact copy of data on production
  • Little to no data lost

Cons:

  • Distance limited (60km using short wave GBICs, long wave GBICs or optical extenders, 200km using dense wavelength division multiplexers)
  • WAN link more expensive (fibre vs. copper/IP) unless Fibre Channel over IP converters are used, and those are still a little expensive
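
Distance matters for sync beyond the hardware limits, because every write has to wait on the round trip to the mirror site. Here is a rough sketch of that floor in Python. It assumes light moves through fibre at about 5 microseconds per kilometre one way, and it ignores switch, converter and array processing time, so real numbers will be higher. The function name and the `round_trips` parameter are made up for illustration and are not part of any EMC tool.

```python
# Assumption: roughly 5 microseconds per km one way in optical fibre
# (light travels at about 200,000 km/s in glass).
FIBRE_US_PER_KM = 5.0


def sync_write_penalty_ms(distance_km, round_trips=1):
    """Minimum extra latency (ms) added to each synchronous write by distance alone.

    round_trips is a hypothetical knob: some protocols need more than one
    round trip per write, which multiplies the penalty.
    """
    one_way_us = distance_km * FIBRE_US_PER_KM
    return one_way_us * 2 * round_trips / 1000.0


if __name__ == "__main__":
    for km in (10, 60, 200):
        print(f"{km:>4} km: at least +{sync_write_penalty_ms(km):.2f} ms per write")
```

A fraction of a millisecond sounds small, but it lands on every single write, so compare it against the write latency your applications see today.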

MirrorView/A -

Pros:

  • Cheaper WAN link between sites (IP usually)
  • Writes to prod don’t have to wait on mirror site to write
  • Not distance limited like sync replication

Cons:

  • Data can be lost depending on write intervals from prod to DR site
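
To put a number on that con, a back-of-the-envelope estimate is simply the change rate multiplied by the update interval. The sketch below assumes a steady average write rate, which real workloads rarely have, and the 5 MB/s and 15 minute figures are invented for the example.

```python
def data_at_risk_mb(write_rate_mb_per_s, update_interval_s):
    """Rough upper bound on data (MB) not yet at the DR site if prod is lost
    just before the next update, assuming a steady write rate."""
    return write_rate_mb_per_s * update_interval_s


if __name__ == "__main__":
    # Hypothetical workload: 5 MB/s of changed data, updates every 15 minutes.
    at_risk = data_at_risk_mb(5, 15 * 60)
    print(f"Up to {at_risk} MB could be missing at the DR site")
```

If that number makes the business uncomfortable, the options are a shorter interval, which needs more bandwidth, or sync replication.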

What you need to know -

Array-based mirroring is a great way to protect multiple hosts in an environment instead of buying per-server or per-application replication.  As I’ve discussed before, the biggest drawback is that it provides a restartable copy, which isn’t the same as active-active clustering or application transaction-level replication (e.g. Oracle Data Guard, Exchange CCR, MySQL master/slave replication).  Pay attention to LAN/WAN line quality; poor comm lines can cause insanely painful headaches (troubleshooting, added latency, etc.).  Get line tests done to determine available bandwidth, line quality and latency.
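
Once the line test results are in, a quick sanity check is whether the link can ship one interval’s worth of changes before the next interval starts. The sketch below is only that, a sanity check. The 0.7 efficiency factor is my assumption to allow for protocol overhead and other traffic sharing the link, and the function names and sample figures are made up.

```python
def transfer_time_s(changed_mb, link_mbit_per_s, efficiency=0.7):
    """Seconds needed to move changed_mb megabytes over the link.

    efficiency is an assumed fraction of the measured bandwidth that
    replication actually gets; tune it to your own line test results.
    """
    usable_mb_per_s = link_mbit_per_s * efficiency / 8  # megabits -> megabytes
    return changed_mb / usable_mb_per_s


def link_keeps_up(changed_mb, interval_s, link_mbit_per_s, efficiency=0.7):
    """True if one interval's changes can be sent within one interval."""
    return transfer_time_s(changed_mb, link_mbit_per_s, efficiency) <= interval_s


if __name__ == "__main__":
    # Hypothetical: 4500 MB changed per 15 minute interval.
    for mbit in (45, 100):
        ok = link_keeps_up(4500, 900, mbit)
        print(f"{mbit} Mbit/s link keeps up: {ok}")
```

If the link can’t keep up, the async copy falls further behind with every cycle, and the amount of data at risk grows with it.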

Comments welcome.