I’ve been upgrading my storage every 2 years or so for at least the last 10 years, and it was time to get serious.
The last two upgrades I made were 2 x 1TB drives in a RAID1, and then up to 2 x 2TB drives in a RAID1. Both configurations were attached to a mid-level RAID card and offered “ok” performance and “ok” redundancy, since it was a simple RAID1.
Enter, “The problem”: In addition to my storage needs growing faster, largely because my wife and I were downloading a lot of HD TV content, I had also started leveraging VMware a lot more, and disk I/O was becoming a real bottleneck on the pair of 320GB drives running in that box (another simple RAID1). It had only been a year since I moved up to 2 x 2TB HD’s, and 3TB HD’s would have barely been a band-aid.
My goal was to solve both problems with 1 solution:
Build a SAN that provided higher I/O capabilities than the current local storage in my VMware box, AND provide bulk storage for the increasing amount of videos we download, pictures we take, music we collect, etc.
I had heard about ZFS years ago, but when I dug deeper, it seemed like it wasn’t ready for prime-time. Well, things have changed. ZFS is now widely used at the enterprise level and considered stable for production. ZFS works much like a basic RAID controller, except that a whole x86 computer acts as the controller, with ZFS playing the role of the “firmware” running on it. ZFS offers verification and protection against silent data corruption, support for very high storage capacities, snapshots, copy-on-write clones, continuous integrity checking with automatic repair, and iSCSI through COMSTAR at the OS level.
ZFS running on a moderately priced x86 box will far outperform high-end RAID cards that alone cost more than this whole build. Since ZFS does the “work” you previously counted on the RAID card for, you DON’T NEED a high-end RAID card to attach your drives to in a ZFS box. You do want “fast” HBA’s, but they’re still cheap, especially if you’re willing to buy used, since these simple HBA’s are what typically ships with base-model servers and are frequently “pulled” and sold on eBay. The HBA in my recipe below is an LSI 1068E with a Dell part #. Several vendors make HBA’s with this chip, so if you can’t readily find the Dell part, try looking for cards branded under Lenovo or HP.
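To give a flavor of what that buys you, here’s a quick sketch of the kind of one-liners ZFS gives you out of the box (the pool name “tank” and filesystem “tank/media” are just examples, not anything from this build):

# take an instant, space-efficient snapshot of a filesystem
zfs snapshot tank/media@nightly-2011-12-01
# spin up a writable copy-on-write clone of that snapshot
zfs clone tank/media@nightly-2011-12-01 tank/media-scratch
# walk every block in the pool, verifying checksums and repairing bad copies from redundancy
zpool scrub tank
# check pool health and scrub progress
zpool status tank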
Now, the recipe:
1x NORCO 10-bay Hot Swap Server Chassis = $240
http://www.newegg.com/Product/Product.aspx?Item=N82E16811219037
6x Western Digital WD20EARS 2TB SATA HD’s = $540
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136514
2x 1M SAS 32-pin to 4 SATA = $40
http://www.provantage.com/cables-go-10249~7CBT9212.htm
1x LSI 1068E PCIe SATA/SAS Controller = $19 on eBay
http://accessories.dell.com/sna/products/Hard_Drives_Storage/productdetail.aspx?c=ca&l=en&s=biz&cs=calca1&sku=313-8239
Oracle Solaris 11 Express = FREE
http://www.oracle.com/technetwork/server-storage/solaris11/downloads/index.html
TOTAL = $839
This build does expect that you have some general experience with storage and x86 PC’s, and a few parts laying around you can re-use. You’ll notice the above recipe doesn’t include the mobo, CPU, memory, or power supply. Those are all things I had laying around from other builds, and they don’t have to be fancy. Anything dual-core with at least 2GB of memory and a PCIe slot that works with storage adapters will be just fine. MOST motherboards will work with an HBA in their PCIe slot, but it’s a good idea to do a few Google searches before you pull the trigger on a purchase to see if anyone else has tried that specific board. I used a Gigabyte mobo and had no issues. You will also want to make sure the power supply has enough oomph to run the system and all the drives; I had a good-quality 550W unit in my parts pile and used that.
If you don’t have any of the above items, figure another $200-$250 for the build, which still only puts the total at around $1,000. If you’re buying a new mobo and CPU, buy something low power. The difference in this role between a dual-core i3 and a quad-core i7 will be very little. Spend your money on memory instead, which will make a far greater impact on performance than raw CPU speed. The memory in your home-built SAN is used much like the cache on a classic RAID card: the more you have, the faster it goes.
If you plan on building a SAN similar to the one above with ~6 drives, you’ll be fine with a single HBA and the cable linked above. In addition to the drives you’ll use for your storage, you’ll want a drive or two for the OS. One drive is all that’s needed, but if you have two laying around, it’s a good idea to have redundancy for Solaris Express, since if the OS drive dies, your data will be inaccessible until you reload Solaris on a fresh drive and re-import your data pools.
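If you do go the two-OS-drive route, one way to handle it (just a sketch; the c8t0d0s0/c8t1d0s0 device names are placeholders and will differ on your box) is to mirror the root pool after the install:

# attach the second disk to the root pool, turning it into a mirror
zpool attach rpool c8t0d0s0 c8t1d0s0
# watch the resilver until it completes
zpool status rpool
# put a boot loader on the second disk so the box can actually boot from it (x86)
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c8t1d0s0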
So, build your rig, install Solaris Express (instructions available at the same site linked above for download), and make sure Solaris can “see” all of the hardware in your box. The next step is configuration. I opted for configuring via a GUI, but ZFS does have a full-featured command line implementation. The GUI I used is napp-it, available here (for free):
http://www.napp-it.org/downloads/index_en.html
You can actually install it as easily as running the below one-liner from a root shell on your SAN:
wget -O - www.napp-it.org/nappit | perl
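While you’re in that root shell, it’s worth double-checking that Solaris really sees the HBA and all six data drives before you start building pools. A couple of stock Solaris commands will do it:

# list every disk Solaris can see (hit Ctrl-C to exit without selecting one)
format
# show per-device details and error counters, including drive models
iostat -En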
napp-it is basically a simple web app that interfaces with ZFS underneath, but presents a nice GUI in your browser for building and managing storage pools and their accessibility. There are LOTS of different ways you can build a storage pool or pools with the 6 drives suggested above, but I would suggest a RAID-Z2. This works similarly to classic RAID-6, in that you can lose up to two drives without losing any data. I won’t go into the other pool types and/or their merits here, but the ZFS wikipedia page has tons of info:
http://en.wikipedia.org/wiki/ZFS
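For reference, the command-line equivalent of what napp-it does when you build a RAID-Z2 pool looks roughly like this (the pool name and the c#t#d# device names are placeholders; plug in whatever the format command showed on your box):

# create a 6-disk, double-parity pool named tank
zpool create tank raidz2 c8t0d0 c8t1d0 c8t2d0 c8t3d0 c8t4d0 c8t5d0
# create a filesystem on it for bulk storage
zfs create tank/media
# confirm the layout and that all six disks are online
zpool status tank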
A 6-disk RAID-Z2 built from 2TB drives will yield about 7.1 terabytes of usable space after formatting (two drives’ worth of capacity go to parity, so you keep roughly 4 x 2TB, minus a little ZFS overhead). The same 6 disks in a RAID-Z1 would yield ~9 terabytes formatted, and a simple volume with no redundancy would yield over 10 terabytes. The reward of up to 30% more usable space is NOT worth the increased risk of disk failures wiping out all your data. I would strongly suggest the RAID-Z2 option.
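And since half the point of this box is fast storage for VMware, here’s a rough sketch of carving out an iSCSI LUN with COMSTAR once the pool exists (the names and the 500GB size are examples, the GUID placeholder below stands in for whatever stmfadm actually prints, and napp-it can drive most of this from the GUI as well):

# enable the COMSTAR framework and the iSCSI target service
svcadm enable stmf
svcadm enable -r svc:/network/iscsi/target:default
# carve a 500GB block volume (zvol) out of the pool for the VMware datastore
zfs create -V 500g tank/vmware-lun0
# register the zvol as a SCSI logical unit; stmfadm prints the LU GUID
stmfadm create-lu /dev/zvol/rdsk/tank/vmware-lun0
# make the LU visible to all initiators (replace the placeholder with the real GUID)
stmfadm add-view 600144F0EXAMPLE0000000000000000
# create an iSCSI target for the ESXi software initiator to log into
itadm create-target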
Go build!
Here’s my home network rack with the completed SAN on the bottom shelf:
NOTE: The price for the 2TB drives linked above was $89.99/ea when I put this article together. With the flooding in Thailand, prices for all drives skyrocketed, but they are starting to settle down again. If you’re not in a hurry, wait until prices stabilize. They *will* come back down to the $89.99 price point, if they haven’t already by the time you’re reading this.