Category: zfs

Data Dedupe comes to ZFS

It’s official… data deduplication has been added to ZFS (read the link if you’re new to data deduplication). Hats off to Jeff Bonwick and Bill Moore, who did a ton of the work, along with Mark Maybee, Matt Ahrens, Adam Leventhal, George Wilson and the rest of the ZFS team.  The implementation is a synchronous, block-level one which deduplicates data immediately as it is written.  This is analogous to how Data Domain does it in their dedupe appliances.

What’s interesting is that dedupe will now be available for *free*, unless Oracle does something stupid.  Sun’s implementation is complementary to the already-existing filesystem compression.  I’m not sure how much of an issue this is yet, but the current iteration cannot take advantage of the SHA256 acceleration in the SPARC Niagara2 CPUs; eventually we should see hardware acceleration implemented.

When will it be available? It should land in the OpenSolaris dev branches in the next couple of weeks, as the code was just committed for snv_128.  General availability in Solaris 10 will take a bit longer, until the next update ships.

For OpenSolaris, you change your repository and switch to the development branches – should be available to public in about 3-3.5 weeks time.  Plenty of instructions on how to do this on the net and in this list.  — James Lever on the zfs-discuss mailing list
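For reference, switching an OpenSolaris box to the dev branch looked roughly like this at the time (a sketch; the publisher URL and `pfexec` usage reflect 2009-era OpenSolaris and may differ on your build):

```shell
# Point the opensolaris.org publisher at the development repository.
pfexec pkg set-publisher -O http://pkg.opensolaris.org/dev opensolaris.org
# Pull down the new packages into a fresh boot environment.
pfexec pkg image-update
# Reboot into the updated boot environment when the update completes.
```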

How do I use it? If you haven’t built an OpenSolaris box before, you should start by looking at this great blog post here.  I wouldn’t get things rolling until dedupe is in the public release tree.

Ah, finally, the part you’ve really been waiting for.

If you have a storage pool named ‘tank’ and you want to use dedup, just type this:

zfs set dedup=on tank

That’s it.

Like all zfs properties, the ‘dedup’ property follows the usual rules for ZFS dataset property inheritance. Thus, even though deduplication has pool-wide scope, you can opt in or opt out on a per-dataset basis.

— Jeff Bonwick
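A quick sketch of that per-dataset opt-out (the dataset names here are hypothetical):

```shell
# Enable dedup at the pool's root dataset; child datasets inherit it.
zfs set dedup=on tank
# Opt a specific dataset out, e.g. one full of already-unique data.
zfs set dedup=off tank/scratch
# Show what each dataset ended up with and where the setting came from.
zfs get -r dedup tank
```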

What does this mean to me? Depends.  For people who like to tinker, you can build your own NAS or iSCSI server with dedupe *and* compression turned on.  Modern CPUs keep increasing in speed and can handle this.  This is huge.  Now, should you stop considering the commercial dedupe appliances shipping today?  Not if you want a production solution, as this won’t be officially supported until it’s rolled into the next Solaris update.  For commercial dedupe technology vendors, this is another mark on the scorecard for the commoditization of dedupe.

What things do I need to be aware of? The bugs need to be worked out early on, so apply standard caution.  READ JEFF’S BLOG POST FIRST! There is a verification feature; use it if you’re worried about your data or if you’re using fletcher4 as the checksum algorithm to speed up dedupe performance (zfs set dedup=verify tank or zfs set dedup=fletcher4,verify tank).
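Once dedup is on, you can watch how well it’s working (a sketch, assuming a pool named tank as above):

```shell
# Enable dedup with byte-for-byte verification (the safest option).
zfs set dedup=verify tank
# The pool-wide dedup ratio shows how much duplicate data was collapsed.
zpool get dedupratio tank
# zpool list also reports a DEDUP column on dedup-capable builds.
zpool list tank
```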

How do I stay up to date on ZFS in general? Subscribe to the zfs-discuss mailing list (also available in forum format).  It can be high volume, but it’s worth it if you want to stay on top of all things ZFS.

Can and when will SSDs + SATA replace FC/SAS?

Simon Seagrave asked, “How long do you think it’ll be before SSD will overtake SCSI as primary tier 1 SAN storage? Giving a new SSD and SATA tiered mix.”

Short answer: yes, it will be SSDs + SAS, and within 2 years.

The real question is when 15K RPM high-speed drives will be replaced with SSDs and 7.2K RPM high-capacity drives.  SAS will probably end up replacing both FC and SATA in the majority of mid-range storage, but the jury is still out on whether this will happen in high-end arrays.

What are we talking about here? From an array design perspective, frequently accessed blocks of data should be served from RAM and SSD.  SSDs have a much faster response time (microseconds vs. milliseconds) than traditional hard disk drives, which is what makes this possible.  The Sun Unified Storage platform was one of the first to do this all in one array with its Hybrid Storage Pool design.  There are also some new appliances coming out, like the XcelaSAN from Dataram.  NetApp offers read acceleration through their PAM 1 (Performance Acceleration Modules).  EMC will also start the transition at the LUN level with their implementation of FAST, as described by Devang Panchigar.  This list is not meant to be comprehensive, as I’m sure I have left out vendors and their roadmaps.

When will this happen? Based on the current development cycles of most major storage vendors, I expect the majority of them to implement this type of solution at the block level within the next 2 years.  It will take some longer than others because of their architectures, and it will be a key differentiating feature between vendors.  Beyond basic tiering between SSDs and high-capacity disks, we should see more advanced algorithms for deciding which data to move and where to move it.  Having followed the journey of Sun’s ZFS on the zfs-discuss mailing lists over the last year, I’ve noticed that peculiar performance behaviors (like write pulsing) have required fine-tuning.

Solaris-based NAS and Virtualization at home

I run Zimbra’s mail server suite in an Ubuntu VM on my Mac. My only problem is that it eats up 512MB of my Mac Pro’s memory. I want to move it off, so the first step is finding a new home for a Linux VM. I also want to move music and other archival data to something I don’t have to back up all the time.

Meet Solaris Nevada, the open-source, community-developed successor to Solaris 10, which includes Sun’s new xVM technology. Xen (a competitor to VMware’s virtualization) is built into Solaris Nevada, which means I can set up a virtual server on a Solaris x86 machine. I also get to reap the benefits of ZFS. Using an 8-port PCI SATA controller (Supermicro AOC-SAT2-MV8) and five 250GB SATA drives, I’ve got a RAID-Z (RAID 5-style) protected SATA ZFS filesystem that can do NFS, CIFS and iSCSI.
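The pool setup is only a handful of commands. This is a sketch; the device names and dataset names below are hypothetical (check `format` for yours), and iSCSI sharing via the shareiscsi property reflects Nevada-era builds:

```shell
# Create a single-parity RAID-Z pool across the five SATA drives.
zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0
# Carve out a filesystem for media and export it over NFS and CIFS.
zfs create tank/media
zfs set sharenfs=on tank/media
zfs set sharesmb=on tank/media
# Create a 20GB zvol and expose it as an iSCSI target for a VM disk.
zfs create -V 20g tank/vmdisk
zfs set shareiscsi=on tank/vmdisk
```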

Putting all the bits and pieces together gives me a multi-purpose box that can now function as a fault-tolerant fileserver and a host for virtual machines. Why did I pick Solaris? Because it’s free and ZFS is one of the best filesystems out there. I can make periodic snapshots of my ZFS filesystems and use the send/receive functionality to replicate them. Could I have done the same thing with Linux? ZFS isn’t out for Linux yet, and Solaris has a number of other advantages. By the way, ZFS is in the Sun-supported version of Solaris 10 too.
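That snapshot-and-replicate workflow is a short pipeline. A sketch, where the dataset names, snapshot names and the backup host are all hypothetical:

```shell
# Take a point-in-time snapshot of the media filesystem.
zfs snapshot tank/media@2009-11-02
# Replicate the full snapshot into a second pool.
zfs send tank/media@2009-11-02 | zfs receive backup/media
# Later, send only the changes since the previous snapshot,
# piping over ssh to a remote box.
zfs send -i tank/media@2009-11-02 tank/media@2009-11-09 | \
    ssh backuphost zfs receive backup/media
```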

On my list of TODOs: finish the migration from Ubuntu on the Mac to CentOS running on the Solaris box, and move my iTunes library as well.