Michael Altfield's gravatar

*Cheap*, Redundant, Multi-TB, Storage Solution

Storage is getting so cheap these days. So cheap, in fact, that multi-terabyte home servers are now economically feasible.

The emergence of cheap 1 terabyte hard drives and ZFS perfectly complement each other. Like others, I've embraced these two technologies to build myself a redundant, multi-TB disk array with 3x1TB drives running in a RAIDZ on OpenSolaris for about $300.

Introduction

ZFS

ZFS is the new Sun filesystem that was designed for the 21st century. It's extremely easy to implement, even easier to maintain, insanely resilient, and quite scalable. This post isn't about ZFS, so I'll refrain from going (too much) into its features, but here are the most important aspects (obtained from Wikipedia):

  • Open Source and Free
  • Integrity Checking and Automatic Repair
  • Software RAID(Z)
  • Compression
  • Portable
  • Easy
  • Automagic

Pitfall

When it was all said and done, I did have (just) one gripe with ZFS: it currently doesn't allow adding drives to virtual devices (such as a RAIDZ).

In ZFS, you have a pool that contains virtual devices, each of which contains physical disks. A virtual device is a collection of physical disks in one of the following configurations: striped, mirrored, or RAIDZ (single or double parity). In my setup, I have a single RAIDZ virtual device made up of 3x1TB drives.

Currently, you can only add more virtual devices to the zpool. This means that if you have a 3 drive RAIDZ in your zpool and want to add 1 drive to your RAIDZ, you can't. The only thing you can do (if you want the new virtual device to be a RAIDZ with the same redundancy and roughly 67% usable space as the existing one) is add 3 _more_ drives. For more info, see this post about a proposed Expand-O-Matic RAID-Z.
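In command form, growing the pool means adding a whole new RAIDZ virtual device next to the existing one. A minimal sketch, assuming the pool name vault used later in this post and three hypothetical new disks called c7d0, c7d1, and c7d2 (made-up names, not part of this build). The `run` and `grow_pool` helpers are illustrative, and the script only prints the command unless DRY_RUN is set to 0:

```shell
#!/bin/sh
# Sketch: grow a pool by adding a second 3-disk RAIDZ virtual device.
# c7d0, c7d1, c7d2 are hypothetical device names; substitute your own.
DRY_RUN=${DRY_RUN:-1}   # default: print the command instead of running it

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

grow_pool() {
    pool=$1; shift
    # 'zpool add <pool> raidz <disks...>' appends a new RAIDZ vdev;
    # the existing vdev keeps its original three disks.
    run zpool add "$pool" raidz "$@"
}

grow_pool vault c7d0 c7d1 c7d2
```

The dry-run default is deliberate: adding a virtual device to a pool is not something you can casually undo, so it's worth reading the printed command before running it for real.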

OpenSolaris

IMHO, the best way to get and use the latest version of ZFS is through the OpenSolaris OS. OpenSolaris is driven by Sun, so it's going to have the latest and greatest (both bleeding edge *and* stable) revisions of ZFS.

BSD

ZFS has also been ported to BSD, so you could use BSD instead--but you'll probably be (a little) behind in functionality.

Linux

Due to licensing issues, ZFS is having difficulty being ported to Linux distros. You *could* use it, but I'd be very hesitant to do so.

Hardware

Here's what I threw in a pot to brew my rig:

I bought the drives and the SATA controller card off Newegg, and I picked the server up at a local computer recycle shop for free. If you want a free server, I highly recommend stopping by a local computer recycle shop. This Dell 2450 is considered 'unusable' in the business world because it's ages old and out of warranty. Mine _only_ has 2x1GHz processors and 2GB of RAM--not enough for most enterprise corporations, but perfect for my needs. Anyway, most recycle shops are being flooded with these machines. A lot of the time, they just ship them off to China to be melted down. The guy I got it from didn't mind sparing a couple--he said he'd rather see it being recycled for computing than recycled for scraps.

So, all in all, I spent about $300 for 3TB of raw storage (actually: 2.73TB) and 2TB of usable storage (actually: 1.79TB), since a 3-drive RAIDZ gives up one drive's worth of space to parity.
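The "actually" figures are mostly a units effect: drives are sold in powers of 10, while ZFS reports sizes in powers of 1024. A quick sketch of the arithmetic, assuming a single-parity RAIDZ that nominally leaves N-1 drives' worth of space for data (the function name `raidz1_capacity` is made up for illustration):

```shell
#!/bin/sh
# Sketch: nominal single-parity RAIDZ capacity for N equal drives,
# reported in units of 1024^4 bytes (what 'zpool list' labels "T").
# Assumes one drive's worth of space goes to parity.
raidz1_capacity() {
    drives=$1        # number of drives in the virtual device
    bytes=$2         # marketed size of one drive, in bytes
    awk -v n="$drives" -v b="$bytes" 'BEGIN {
        tib = 1024 ^ 4
        printf "raw=%.2fT usable=%.2fT\n", n * b / tib, (n - 1) * b / tib
    }'
}

raidz1_capacity 3 1000000000000   # prints: raw=2.73T usable=1.82T
```

The raw figure matches the 2.73T that zpool list shows for the pool. The small remaining gap between 1.82T and the 1.79TB seen on the real system is presumably ZFS's own overhead, which this sketch does not model.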

OpenSolaris Install

First things first, I downloaded the latest version of OpenSolaris (at the time of writing: 2008.11), and booted to live CD to kick off my server install.

No GUI-less Server Install

To my surprise, there was *not* an option to install OpenSolaris without a GUI. By default, it comes with a GNOME environment. This sucks. My guess is that Sun wants Solaris to be for servers and OpenSolaris to be for workstations, perhaps like Red Hat does with Fedora.

Pitfall

33% through the install process, I closed the case of the server, and the entire live CD froze. I had to hard-reboot and redo the whole install process from scratch. Meh.

ATA Timeout Fail

After the install, I rebooted, removed the CD (although this step was ambiguous because the live CD didn't mention whether I should leave the CD in or not), and let GRUB default to the GUI-based boot. After this, I sat at an OpenSolaris "loading" window for ~1 hour before I decided something was broken, gave up, and rebooted.

This time, I started in OpenSolaris' "text" mode. Here, I saw:

SunOS Release 5.11 Version snv_101b 32-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.

A little further along, I also got these errors when (not) booting.

WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: abort request, target=0 lun=0
WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: abort device, target=0 lun=0
WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: reset target, target=0 lun=0
WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: reset bus, target=0 lun=0

May 22 13:17:25 svc.startd[7]: svc:/network/physical:nwam: Method or service exit timed out. Killing contract 8.
May 22 13:17:25 svc.startd[7]: svc:/network/physical:nwam: Method "/lib/svc/method/net-nwam start" failed due to signal KILL.

..and these:

SunOS Release 5.11 Version snv_101b 32-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: apoc
Reading ZFS config: done.
Mounting ZFS filesystems: (6/6)

May 23 14:16:01 apoc scsi: WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: abort request, target=0 lun=0
May 23 14:16:01 apoc scsi: WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: abort device, target=0 lun=0
May 23 14:16:01 apoc scsi: WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: reset target, target=0 lun=0
May 23 14:16:01 apoc scsi: WARNING: /pci@0,0/pci-ide@f,1/ide@0 (ata0):
   timeout: reset bus, target=0 lun=0
...

The 4 WARNING messages continued to spit out every 15 minutes or so.

I found promising solutions here and there, but neither one of them worked. I ended up just yanking out the IDE device on the server--the (unnecessary) CD-ROM drive.

Once I did this, my system came up in 6 minutes (well, 3 minutes if you start counting at the GRUB bootloader). It's not great, but--because I rarely anticipate shutting down a server--it's fine.

Keymap Fail

I quickly discovered that the following keys regrettably do not work OOTB in OpenSolaris:

  1. home
  2. end
  3. page down
  4. page up
  5. delete
  6. insert

PCI SATA Controller Card

Every time I plugged in the SATA controller card, the server would hang indefinitely.

I found it was a problem with the BIOS version that gets shipped on the card. I had to flash the update--which turned out to be less trivial than one would expect. This process is described in my other post where I review the 'MASSCOOL High Speed PCI Controller Card'/Model XWT-RC040/4 port SATA Card.

ZFS Install

First, I had to display the list of drives currently connected to the computer:

root@apoc:~# echo |format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c4t0d0 
          /pci@1,0/pci1028,3@2,1/disk@0,0
       1. c5d1 
          /pci@0,0/pci-ide@4/ide@0/cmdk@1,0
       2. c6d0 
          /pci@0,0/pci-ide@4/ide@1/cmdk@0,0
       3. c6d1 
          /pci@0,0/pci-ide@4/ide@1/cmdk@1,0
Specify disk (enter its number): Specify disk (enter its number): 
root@apoc:~# 

I was able to decipher from this output that my three 1TB drives were the c5d1, c6d0, and c6d1 devices.
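If you'd rather not pick the device names out by eye, a small awk filter can do it. This is a sketch that assumes the listing keeps the shape shown above (an index like "1." followed by the device name); `list_disks` is a hypothetical helper name, and a pasted sample stands in for the real command so the snippet runs anywhere:

```shell
#!/bin/sh
# Sketch: pull the device names out of 'format' output.
# Matches lines shaped like "       1. c5d1 " and prints the second field.
list_disks() {
    awk '$1 ~ /^[0-9]+\.$/ { print $2 }'
}

# On the server this would be:  echo | format | list_disks
# Here, a pasted sample of the listing stands in for the real command.
list_disks <<'EOF'
AVAILABLE DISK SELECTIONS:
       0. c4t0d0 
          /pci@1,0/pci1028,3@2,1/disk@0,0
       1. c5d1 
          /pci@0,0/pci-ide@4/ide@0/cmdk@1,0
EOF
```

This only lists names; working out which ones are the data drives (as opposed to the boot disk) still takes a look at the controller paths.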

If you've done any research into ZFS, then you've heard it all before. Setting up my RAIDZ was unbelievably easy! I'll admit, getting OpenSolaris up and running, configuring the OS, and figuring out how to get my cheap 4-port SATA controller card to give JBOD to the OS was a bitch, but actually creating the ZFS stuff was a total breeze:

root@apoc:~# zpool create vault raidz c5d1 c6d0 c6d1
root@apoc:~# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool  16.9G  2.91G  14.0G    17%  ONLINE  -
vault  2.73T   150K  2.73T     0%  ONLINE  -
root@apoc:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  c4t0d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: vault
 state: ONLINE
 scrub: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	vault       ONLINE       0     0     0
	  raidz1    ONLINE       0     0     0
	    c5d1    ONLINE       0     0     0
	    c6d0    ONLINE       0     0     0
	    c6d1    ONLINE       0     0     0

errors: No known data errors
root@apoc:~# 

That was it! That first command--followed by ~30 seconds of 'automagicness'--was all it took!

Note: The 'rpool' is the root pool created by the installer--it contains the OS's files. I was quite amazed by this functionality; I didn't expect it to be already built into the distro, but it was for 2008.11!
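With the pool online, a plausible next step is to carve out a filesystem, turn on compression, and kick off a scrub to exercise the integrity checking. This is a hedged sketch rather than a record of what was run on this server: vault/backups is a made-up dataset name, `run` and `setup_dataset` are illustrative helpers, and the script only prints the commands unless DRY_RUN is set to 0:

```shell
#!/bin/sh
# Sketch: first steps on a fresh pool. 'backups' is a hypothetical
# dataset name; nothing runs for real unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

setup_dataset() {
    pool=$1; name=$2
    run zfs create "$pool/$name"               # new filesystem in the pool
    run zfs set compression=on "$pool/$name"   # compress data written from now on
    run zpool scrub "$pool"                    # verify checksums across the pool
}

setup_dataset vault backups
```

Progress of a scrub shows up on the "scrub:" line of zpool status, the same line that reads "none requested" in the output above.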

Misc Notes

Installing Needed Packages

There were a few applications that I personally _require_ for system administration that did *not* come with OpenSolaris by default:

root@apoc:~# pkg install SUNWscreen
PHASE                                          ITEMS
Indexing Packages                            554/554 
DOWNLOAD                                    PKGS       FILES     XFER (MB)
Completed                                    1/1       30/30     0.51/0.51 

PHASE                                        ACTIONS
Install Phase                                  48/48 
Reading Existing Index                           9/9 
Indexing Packages                                1/1
root@apoc:~# 

More info on the pkg command:

  • http://dlc.sun.com/osol/docs/content/IPS/cmdref.html
  • http://opensolaris.org/sc/src/pkg/gate/src/man/pkg.5.txt
