We’re off to a pretty good start here at EMC World. I’ve gotten to meet up with many other twitter folks at the ZDNet Blogger’s Lounge organized by @lendevanna. Last year social media was just a small lunch tweetup but this year we have the lounge and a lot more networking going on. There have been so many good conversations going on and social media is creeping into EMC more every day thanks to the hard work of @lendevanna, @stu, @davidkspencer, @gminks, @davegraham and many others.
I’ve spent quite a bit of time talking with my fellow GestaltIT.com bloggers (@storagenerve and @chrismevans) and others including the folks above and @basraayman, @storagezilla, @mike_fishman, @storageanarchy and many others.
Joe Tucci’s keynote was more or less the same as it has been in years past, but he spent a lot of time talking about cloud computing. EMC’s cloud view is the same as what VMware is pitching: lots of talk about private and public clouds, with federation between them for traditional IT applications.
V-Max:
I attended a couple of V-Max sessions with @storagenerve on architecture and Enginuity. The architecture really is built to scale but I’m not sure who will be scaling beyond 8 or 16 engines. What will probably be more common is more V-Max engines that are able to federate data between systems instead of having one large global system. Federation will probably be a big focus for EMC because most customers aren’t running the same modular but monolithic array for 5-10 years; they usually roll them after 3-4 for technology and financial reasons.
We also saw a lot of numbers on IOPS and performance that I had never seen before for both DMX and V-Max. I’ve always had the perception that EMC doesn’t publish many numbers, if any, but that seems to be changing, either because of openness or because V-Max has good numbers and there isn’t much ambiguity about what is faster than what. The numbers we saw were more about architectural limits than benchmarking.
Powerpath/Powerpath VE:
Powerpath is getting some licensing changes where there will be an option of using a license server so licenses can be much more easily managed. EMC did say that Powerpath VE for VMware will be released on May 21st. As some admins may already know, multipathing for storage in VMware is manual and difficult today. VMware vSphere Enterprise Plus will be required if you want to use Powerpath VE. It will do multipathing across VMs, load balancing and EMC array optimization.
It’s obvious VMware and virtualization are playing a huge role in cloud computing from the perspective of Infrastructure as a Service (IaaS). VMware’s lead Cloud Architect, Mike Dipetrillo, was gracious enough to provide some great insight into VMware’s strategy.
IaaS is where most managed service providers are focusing today for their cloud offerings. We discussed that developing a self-service infrastructure offering as a service provider is tricky. The folks who have done infrastructure understand the glue that is needed for provisioning and allowing users control over their own environment. Giving users the ability to turn the knobs that control things takes a lot of work. Today that is VMware Lab Manager under the covers with some special glue for provisioning. Lab Manager has some challenges today because it wasn’t designed for a multi-tenant environment. Over the next year you’ll start to see products come out that address this for service providers. VMware is also heavily focused on delivering more APIs which allow companies like RightScale to hook into VMware to provision and manage virtual machines. VMware categorizes all of its cloud computing initiatives under the vCloud umbrella. This will include all of their cloud-focused products and APIs. The roadmap has developed rapidly over the last year or two.
Helping small, medium and large businesses build out their internal clouds has been a big focus as well. It needs to be easy for people to move things between the internal and external cloud. One of the questions I get asked the most is “How can I move my applications out into the cloud?” At the moment it is a lot cheaper and easier to virtualize your existing software stack than to rewrite things to fit on an exotic platform-as-a-service offering.
Another thing we discussed is how enterprise companies don’t really like “elastic” or “usage-based” billing models. They actually prefer allocation-based models where they are billed in a consistent fashion. I’ve never given a lot of thought to this but it makes sense. A lot of companies do business the way they do because it works well for them.
Today VMware has over 500 service providers who are either in the process of getting a cloud offering off the ground or have one today. The team at VMware is small when compared to the rest of the organization but things have been progressing fast.
We discussed competition briefly and both agreed that things are changing rapidly. It was obvious that VMware is aggressively ramping up its vCloud offering and the internal structure to go along with it. The benefit they have as a company is that they’re able to leverage so many of their existing products and IP.
Many thanks to Mike for sparing some time to discuss VMware’s vCloud initiative.
Why should you care about how you lay your storage out? Maybe because it’s your job or because it’s the right thing to do. Perhaps it’s because your application performance isn’t acceptable or your boss won’t let you buy shelves full of 15K RPM disks anymore. It’s not uncommon for pure frustration to stream out of a CIO’s mouth regarding how expensive enterprise storage is and that they’re “sick of throwing fibre channel disks at a problem”.
Even if your array does this “automatically” or you’ve got performance to spare, here are some things to keep in mind as you scale:
1. Analytics tools are your best friend - If you have no instrumentation, you’re flying blind. Your storage should allow you to see what’s going on underneath the covers so you can track down performance issues. Third-party tools to do this are available but make sure you buy the analytics tools when you purchase an array. You want to know if latency is horrible or if IOPS are high but throughput is low.
2. Workloads on the same RAID groups should be complementary (caveat, see #3) - If you’ve got SQL and Exchange, try putting SQL log LUNs on the Exchange data LUN RAID group and Exchange log LUNs on the SQL data RAID group. Don’t put two of the same type of workloads in the same RAID group and expect harmony.
3. Pick an array that has some sort of QoS - If you’ve got the space and want to put the video storage on the same RAID group as SQL logs, do it but make sure you can put some restrictions on video if SQL should get better performance.
4. Monitor performance periodically and move LUNs to different tiers - If you’re using a ton of the expensive fibre channel disk space for an app that doesn’t need the performance, move it to more dense fibre channel or SATA disks.
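To make points 1 and 4 a little more concrete, here is a minimal Python sketch of the kind of check an analytics export lets you run. The field names, the 20 ms latency limit and the 16 KB / 128 KB block-size cutoffs are assumptions for illustration, not values from any particular array or tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LunSample:
    """One averaged measurement for a LUN (hypothetical export format)."""
    name: str
    iops: float             # I/O operations per second
    throughput_mb_s: float  # megabytes per second
    latency_ms: float       # average response time in milliseconds


def avg_io_size_kb(sample: LunSample) -> float:
    """Average I/O size implied by throughput and IOPS."""
    if sample.iops <= 0:
        return 0.0
    return sample.throughput_mb_s * 1024 / sample.iops


def describe(sample: LunSample,
             latency_limit_ms: float = 20.0,
             small_io_kb: float = 16.0,
             large_io_kb: float = 128.0) -> List[str]:
    """Return plain-English notes about a LUN; all thresholds are assumptions."""
    notes = []
    if sample.latency_ms > latency_limit_ms:
        notes.append("high latency")
    size_kb = avg_io_size_kb(sample)
    if sample.iops > 0 and size_kb <= small_io_kb:
        # High IOPS with low throughput shows up here as a small average I/O.
        notes.append("small-block I/O")
    elif size_kb >= large_io_kb:
        notes.append("large-block I/O")
    return notes


if __name__ == "__main__":
    sql_log = LunSample("sql_log", iops=8000, throughput_mb_s=62.5, latency_ms=25.0)
    video = LunSample("video", iops=400, throughput_mb_s=200.0, latency_ms=8.0)
    for lun in (sql_log, video):
        print(lun.name, round(avg_io_size_kb(lun), 1), "KB", describe(lun))
```

Running something like this on periodic samples is one way to spot LUNs that are candidates for a different tier or a different RAID group; the real thresholds depend on your array and workloads.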
If you have a finite budget and need to be more mindful of storage costs, this will all start to mean something. If you’re lost and don’t know how to begin monitoring then ask a storage systems engineer for help or call your SAN vendor’s support line.
We think of web apps as what belongs in the “cloud”. Virtualization is changing this so that both small and enterprise apps are a fit. To me there can be an internal cloud and an external cloud. As virtualization continues to evolve, we will see the lines blur between both.
I recently led a session at CloudCampIndy on “App Scalability in the Cloud”. Many of those who participated were app developers as well as general business people. We talked about understanding your application regardless of who developed it and the impact that cloud computing would have. For now, application scaling will be similar in both. The difference will be how you add and pay for capacity.
Here are some points made:
- Pick or develop apps with scaling in mind from the start
- Virtualization is changing how you deploy your apps
- Horizontal scaling apps do better in the cloud
- Vertical scaling works but is more limited
- Developers benefit from knowing their app’s impact on the underlying infrastructure (Is my app read or write intensive? Does it cache well?)
- Caching is a cheap way to improve database performance
- Database replication (master/slave) or sharding is another way to scale
- Have at least two providers if you need disaster recovery capabilities (1 could be yourself)
- Products like VMware’s vCenter AppSpeed will make scaling out an easy automated process
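The sharding point in the list above can be sketched in a few lines of Python. This is a minimal illustration of hash-based shard routing, assuming a fixed number of shards and string keys; the function name and the example keys are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard number in the range 0..shard_count-1.

    A cryptographic hash is used only because it is stable across
    processes and machines, unlike Python's built-in hash() for strings.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    for key in ("user:1001", "user:1002", "user:1003"):
        print(key, "-> shard", shard_for(key, 4))
```

The catch with this simple modulo approach is that changing the shard count remaps most keys, so adding a shard means moving a lot of data. Consistent hashing is the usual alternative when shards are added often.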
Whether you’re running SAP or some web application that needs to scale, you need to understand the bottlenecks in a system and ways to resolve them. Disk is usually the slowest component in an architecture. However, before you go spend $150k on an expensive SAN, make sure you’ve optimized your application and added caching where useful to speed things up. If you’re in the cloud (Amazon EC2, Bluelock, Slicehost, Joyent etc.) you will pay for the resources you use so it is wise to optimize your architecture in the beginning.
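As a rough illustration of adding caching before buying more disk, here is a minimal Python sketch using the standard library’s `functools.lru_cache`. The `slow_query` function, the customer data and the 10 ms delay are stand-ins I made up for a real database call.

```python
import functools
import time

# Counts how often the simulated database is actually hit.
CALLS = {"db": 0}


def slow_query(customer_id: int) -> tuple:
    """Hypothetical stand-in for a query that has to go to disk."""
    CALLS["db"] += 1
    time.sleep(0.01)  # pretend this is disk latency
    return (customer_id, "customer-%d" % customer_id)


@functools.lru_cache(maxsize=1024)
def get_customer(customer_id: int) -> tuple:
    """Read-through cache: only the first lookup per id reaches the database."""
    return slow_query(customer_id)


if __name__ == "__main__":
    for _ in range(100):
        get_customer(7)
    print("database calls:", CALLS["db"])
    print(get_customer.cache_info())
```

A cache like this only helps read-heavy data and needs invalidating when the underlying row changes (here that would be `get_customer.cache_clear()`), which is exactly why it pays to know whether your app is read or write intensive.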
The question of whether or not to do synchronous or asynchronous replication between storage arrays does not come up often but I suspect it will as more and more people expand their business continuity infrastructure. It’s an important question because it can have a serious impact on the production environment.
With EMC’s Mirrorview/S (sync) there is a distance limitation of between 50km and 200km depending on what fibre optics you are using (short/long wave vs. DWDM). Mirrorview/A (async) is more widely used over an IP WAN connection but can also be used over fibre.
Mirrorview/S -
Pros:
- Synchronous - Exact copy of data on production
- Little to no data lost
Cons:
- Distance limited (60km using short wave gbics, long wave gbics or optical extenders, 200km using dense wave division multiplexors)
- WAN link more expensive (fibre vs. copper/ip) unless Fibre Channel over IP converters are used and those are still a little expensive
Mirrorview/A -
Pros:
- Cheaper WAN link between sites (IP usually)
- Writes to prod don’t have to wait on mirror site to write
- Not distance limited like sync replication
Cons:
- Data can be lost depending on write intervals from prod to DR site
What you need to know -
Array-based mirroring is a great way to protect multiple hosts in an environment instead of buying per-server or per-application replication. As I’ve discussed before, the biggest drawback is that it provides a restartable copy, which isn’t the same as an active-active cluster or application transaction-level replication (e.g. Oracle Dataguard, Exchange CCR, MySQL Master/Slave replication). Make sure your LAN/WAN line quality is adequate; poor comm lines can cause insanely painful headaches (troubleshooting, added latency, etc). Get line tests done to determine available bandwidth, line quality and latency.
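To put the sync vs. async trade-off above into rough numbers, here is a back-of-the-envelope Python sketch. It assumes light travels about 200,000 km/s in fibre (roughly 5 microseconds per km one way) and ignores switch, array and protocol overhead, so treat the results as lower bounds. The function names, the write rate and the update interval are made up for illustration and are not Mirrorview settings.

```python
# Approximate speed of light in optical fibre (about two thirds of c).
FIBRE_KM_PER_S = 200_000.0


def sync_round_trip_ms(distance_km: float, round_trips_per_write: int = 1) -> float:
    """Minimum latency added to each write by synchronous replication.

    The production array has to wait for the remote acknowledgement,
    so every write pays at least one round trip over the link.
    """
    if distance_km < 0:
        raise ValueError("distance_km must not be negative")
    one_way_s = distance_km / FIBRE_KM_PER_S
    return 2 * one_way_s * round_trips_per_write * 1000.0


def async_worst_case_loss_mb(write_rate_mb_s: float, update_interval_min: float) -> float:
    """Data written since the last completed update, i.e. what a disaster could cost."""
    return write_rate_mb_s * update_interval_min * 60.0


if __name__ == "__main__":
    for km in (60, 200):
        print(km, "km adds at least", round(sync_round_trip_ms(km), 2), "ms per write")
    print("async exposure:", async_worst_case_loss_mb(5, 15), "MB")
```

The shape of the result is the point: sync costs every write a little latency that grows with distance, while async costs nothing per write but leaves a window of data at risk that grows with the update interval and write rate.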
Comments welcome.
Tags: dr, mirrorview, replication, srm
This is no joke. Right now southern Indiana is under water. Some organizations I have worked with are fine but Columbus Regional Hospital and many others aren’t. Their data center was in a basement (which isn’t rare) and took on a surge of water. Within what may have been a few hours, it was completely submerged. They had to evacuate all patients to other area hospitals. They’ve got a lot to overcome in the next couple of weeks but with the proper infrastructure, the pain can be lessened.
It’s true that most causes of data unavailability are human (planned and unplanned) and that natural disasters rank low on the list. However, this is no reason to sit idle.
Things to think about:
1) How long can I survive without my information infrastructure?
2) Do I have a good copy of data offsite that meets my recovery point objectives?
3) What will a recovery look like (local, geo-dispersed, etc.)?
4) Do I have a REAL plan I can act on?
5) What could make recovery easier (VMware SRM, async replication, etc.)?
The point isn’t to ask you a comprehensive list of questions. The point is to get YOU to think about the fact that it’s only a matter of time before something happens. Columbus Regional Hospital was unlucky. I’m not sure of the level of DR plans they have but I do know that many of the folks there have been working hard to improve this prior to the flooding that just occurred. We don’t have enough hours/days/minutes to be completely prepared but it would serve everyone well to sit down and think about the details of their business continuity plans.
I’ve been with enough organizations to know when people are confident about their DR/BCP plans and when they’re just closing their eyes and filling out binders of material with the thought, “This is the best we’re going to do, there is so much other stuff that has to get done first that this BCP plan won’t even matter.” There are many local vendors around that can help you plan. Just think how easy it would be to recover with a few servers running VMware and virtual machines ready to turn on the instant something happens. It may not be the whole piece to your DR strategy but it would get you up and running a lot quicker.
Tags: bcp, disaster recovery, dr
EMC worked with IDC to make a Worldwide Information Growth Ticker as seen here:
One thing I’ve noticed in all this talk about explosive information growth is how most vendors are sticking to a strategy of how to store it and manage it. A lot of these vendors make a lot of money storing content but I’m beginning to wonder how good being a bunch of digital “pack rats” truly is. Even if we build systems to manage the information, how much value can we extract out of the digital “junk” we keep? It’s not the responsibility of companies to figure out the value of the information for us but it would be nice to know, along with the calculator, how much that information truly costs. I think as the information grows, we’ll start to see people come to terms with how they manage that information and what they decide to consume or store.
Here’s an example: My digital camera (Fuji Finepix S5 Pro) shoots 25MB raw files and I choose to shoot raw because it’s a “digital negative”. Now compared to Canon and Nikon, Fuji’s raw format is horribly inefficient. On some days I can run through an 8GB flash card which gets me roughly 260 pictures. That gives me 84 days of pictures assuming I fill up a card. That’s a lot of pictures to shoot but even if I shoot half as much, I could end up filling up a 750GB SATA drive in a couple of years given how often I take pictures. That is a ton of data to create, manage and protect. Pile on all the mp3s and movies people download and it’s even easier to see how people fill up 500GB hard drives in a year’s time. Now maybe I’m an extreme case but the point is that even if the average user creates data at 1/8th of my rate, it isn’t cheap. Most consumers aren’t used to buying new drives every couple of years and also figuring out how to protect that kind of data.
I don’t think technology is keeping up with data generation, at least from a consistent cost perspective. Part of my reasoning is that people are now placing much more value on their data than they used to. How will the average joe handle this cost and growth?
I skipped out on some of the technical sessions yesterday to meet with some of the bloggers and folks on twitter. I think a lot of people will agree that the social aspect is just as valuable if not more so than the technical sessions.
I had lunch with Bill Petro, Joyce Tompsett, Jon Collins, David Spencer, and Jason Benway. We discussed the benefits of transparency and social media for companies. A great book to read is the Cluetrain Manifesto which talks about how companies benefit from genuine conversation with their prospects and customers. Jon made a great point that Cluetrain is not the solution but rather a feature or ingredient that corporate social media must have. The main point I made as an EMC outsider was that pointing my customers to genuine conversations within EMC, be they technical or business-oriented, is much easier than me saying, “Trust me, they are listeners and truly care.” One of my biggest challenges aside from competition has been convincing skeptics that EMC is not The Big Evil Machine(tm).
Later on I met up with Mark Twomey and Scott W. and talked with them for almost two hours. Mark and Scott have the inside track and do a great job of blogging about EMC’s technology and how it honestly stacks up against the competition. They’re not a marketing machine but rather two passionate individuals who go to bat for what they believe but take critical feedback. No kool-aid there folks.
Overall a great last couple of days.
Tags: EMCWorld2008 storage
EMCWorld 2008 is well underway. The keynote was much like last year’s keynote in that there was talk about how information growth is continuing to explode. Unfortunately cloud computing was touched on only briefly and specific EMC strategy wasn’t discussed.
I did meet with Ryan Johnson who is the product manager for EMC’s Lifeline software. Lifeline is “Network storage OEM software for the SOHO and Prosumer market.” In a nutshell, this is home centralized storage done right. You can store your music, movies and even surveillance camera data all on one device that will support remote backup to EMC’s Mozy online backup service. The software is at release 1 today but a ton more features will be coming in version 2. The Intel demo was really slick with about 4-5 HD videos streaming simultaneously to a TV, an iMac and an Xbox 360. Currently Intel has a product that holds 4 drives and is starting to ship today. Iomega will have a device with 2 drives shipping in August. The biggest challenge for EMC has been making an easy-to-use interface while still giving the device a lot of features. I did mention silent data corruption and ZFS to Ryan and he said they were looking at innovating in the data integrity area.
I attended a lot of VMware-specific architectural and performance engineering sessions since that seems to be my focus with my customers right now. Some of the information was new but a lot of it I had heard last year. Interestingly enough, it seemed that there were some mixed messages emerging from VMware folks who work on the same team.
A lot of my customers are just getting into centralized storage for VMware and are having a hard time deciding if they should do fibre channel, iSCSI or even NFS. There are no inherent performance differences between the storage protocols themselves (iSCSI, fibre channel or NFS). There is, however, a throughput difference between a 1 gigabit iSCSI link and a 4 gigabit fibre channel link. Most importantly, if you’re going to consolidate a lot of hosts and could push past the 1 gigabit barrier, 4 gigabit fibre channel makes things a little easier because you don’t have to aggregate lots of smaller links.
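For a feel of the link arithmetic, here is a small Python sketch. It uses nominal figures only: about 125 MB/s for a 1 gigabit Ethernet link and about 400 MB/s for a 4 gigabit fibre channel link, ignoring protocol overhead. The 300 MB/s peak is a made-up example, and real aggregated links rarely scale perfectly, so read the link count as a best case.

```python
import math

# Nominal per-link throughput in MB/s, before protocol overhead (assumed figures).
NOMINAL_MB_S = {
    "1Gb iSCSI": 125.0,
    "4Gb FC": 400.0,
}


def links_needed(target_mb_s: float, link_mb_s: float) -> int:
    """Smallest number of links whose combined nominal throughput covers the target."""
    if link_mb_s <= 0:
        raise ValueError("link_mb_s must be positive")
    return math.ceil(target_mb_s / link_mb_s)


if __name__ == "__main__":
    peak_mb_s = 300.0  # hypothetical peak for a consolidated group of hosts
    for name, speed in NOMINAL_MB_S.items():
        print(name, "links needed:", links_needed(peak_mb_s, speed))
```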
That’s all for now, on to day 3.
Tags: EMCWorld2008 storage
I just upgraded to WordPress 2.5. So far everything has gone smoothly but I’m sure some bugs will crop up. If you notice anything, let me know.
There are a ton of new features and if you’re interested you can find them here.