Cheap Storage Server – Part Two

Not all bits are created equal

There are many reasons to pay for the reliability, rich feature set, and fail-over capabilities of our NetApp. The NetApp is an excellent product. However, there is also a significant amount of data at Thayer that just doesn’t need all the advanced features. For instance, we have research groups with multi-terabyte data sets that just want their data easily accessible. If the data were occasionally unavailable, it wouldn’t be a huge deal. Other features, such as snapshots, are not critical for this data either.

If we can store these larger data sets on a cheap storage alternative, we can minimize the size and cost of our NetApp.

Design philosophy

After stalling for a couple years, hoping something would come along, we have decided to move forward with a plan that makes many compromises, but should meet the basic requirements for many of our storage needs.

We initially started the search to replace the NetApp altogether feature for feature. However, over time we lowered our standards to requirements that meet most of the needs for simple storage of lots of data. As our research continued, our philosophy on what product to choose evolved. In the end, the three main philosophical items we looked for when choosing a cheap storage solution were:

  1. Widely used technologies and standards – If lots of people are already using the technology, it is more likely to be stable. It is also likely to be around in a few years. While little startups may have cool technology, we really don’t want to trust our data to a half-baked product. Little companies can also get gobbled up or go out of business overnight, never to be heard from again.
  2. Healthy communities and companies that are committed to the technology – In case we do have questions, or run into an issue, we want an active community of people who will be able to knowledgeably answer our questions. It isn’t enough that the technology says it supports CIFS and Active Directory. If it doesn’t work quite right in _our_ environment, and nobody seems to know why, we just can’t be comfortable moving forward.
  3. Simplicity – Simple solutions have less chance of breaking and are easier to fix in the event that they do break. It is easy to build several layers of redundancy, but end up with a system that is more likely to break because of all the moving pieces.

What we considered

ZFS on OpenSolaris/Solaris

We actually own a Sun Fire X4540 (Thor) running Solaris. We are currently using it as our online backup for the NetApp. It seems like a nice piece of hardware and the ZFS feature set is amazing. Snapshots, filesystem compression, expandability, checksumming, and the rest are all great.

  • Technically a home run.
  • Doesn’t score so well on our three design philosophy requirements.
  • Reasonable, researched questions in forums go unanswered.
  • Sun Engineers try to be helpful, but can disappear for weeks at a time.
  • Sun shunned the Samba project and instead decided to implement their own CIFS server.
  • Projects lack focus. There are several different editions of Solaris. They appear to be attempting to clear this up, but I really wish their “Project Indiana” broke free of a lot of the legacy issues weighing them down.
  • Storage systems must be purchased fully populated.
  • We could not get AD group integration working and ID mapping is complex.
  • Even the latest build of OpenSolaris makes you feel like you are a Sys Admin from 1999.
  • Sun laid off many employees. They have lots of projects they are trying to maintain, but seem to be having trouble keeping up with them.
  • To make things worse, they got gobbled up by Oracle. How important Solaris is to Oracle remains to be seen.

Sun Storage 7000 Unified Storage (Amber Road)

Sun realized there were admins like us who wanted ZFS but didn’t want to administer Solaris, so they wisely made a storage appliance. They did a great job of putting a pretty face on top of ZFS and DTrace and then went and charged way too much for it.

  • Price is lower than NetApp and EMC, but not low enough to build a decent-sized user base and reach critical mass.
  • Can’t replicate from an Amber Road device to our existing Sun Fire X4540.

ZFS on Nexenta

Nexenta seemed like it could be a good compromise for some of our Solaris concerns. Like Amber Road, it hides all/most of the Solaris administration and allows you to use ZFS like an appliance. We set up Nexenta on test systems on multiple occasions and it shows a lot of promise. Again a home run technically, but falls apart on our design philosophy.

  • The company is very small without a very big user base.
  • If you have any Solaris specific technical problems, they basically say, “go talk to Sun”.
  • The depth and breadth of their documentation need major improvement.
  • Fairly inexpensive (good educational discounts) but when building cheap storage, every additional dollar pushes up the cost per gigabyte.
  • They are working like mad to add advanced features like failover.

Came very close to giving them a shot, but they just need a little more time to fully bake.

Clustered Samba

I hope this project gains momentum as we are really drawn to an open system that we can grow on demand by adding another server full of hard drives as our storage needs grow in fits and starts.

  • Still early and very little documentation
  • Uses clustered file systems, which are still maturing and poorly supported on Ubuntu

Netgear ReadyNAS

Netgear sells the ReadyNAS, a decent Linux-based Network Attached Storage device. We bought a 6-bay ReadyNAS Pro, which we currently use for network-based Time Machine backups for a few Macs. Basically, we wanted to get some mileage with a Linux-based NAS.

  • Decent web management interface with easy AD integration.
  • Seems to have better throughput than most small business NAS systems.
  • Tech Support is friendly but very bureaucratic. We ran into a bug when trying to add user and group permissions to a share, and getting it escalated has been very difficult. You basically have to give them your first born child in order to be raised to the higher level of support. They really need to say, “yes that is a bug, we’ll fix it and get back to you”. Instead, they had me do dozens and dozens of unrelated tweaks to no avail. When I was finally supposed to get the next level of tech support, my ticket somehow got closed and I basically had to do the dozens of tasks all over again under a new ticket (which is still unresolved).
  • The largest product they sell is 12 bays. Probably not big enough for us unless we used multiple systems.
  • It uses the ext3 filesystem, so quotas are user and group based instead of directory/project based (this can be worked around with lots of groups).
  • Dollar per gigabyte is very good (currently around $0.53 per usable GB for the 3200 with 12 2TB drives in RAID 6 with 1 hot spare).
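
The dollar-per-gigabyte figure in the last bullet can be sanity checked with a little arithmetic. The sketch below is only an illustration, not how we actually priced anything. It assumes the hot spare is one of the 12 bays, that RAID 6 costs two drives' worth of parity, that 1 TB = 1000 GB, and it ignores filesystem overhead. The function names are made up for this example, and the implied system price is simply backed out of the $0.53 figure, not a quoted price.

```python
def usable_gb(total_drives, drive_tb, parity_drives=2, hot_spares=1):
    """Usable capacity in decimal GB for one RAID group.

    Assumptions (not from the vendor): hot spares and parity drives
    hold no user data, 1 TB = 1000 GB, no filesystem overhead.
    """
    data_drives = total_drives - parity_drives - hot_spares
    if data_drives <= 0:
        raise ValueError("no drives left over for data")
    return data_drives * drive_tb * 1000


def cost_per_usable_gb(total_price, total_drives, drive_tb,
                       parity_drives=2, hot_spares=1):
    """Total system price divided by usable capacity in GB."""
    capacity = usable_gb(total_drives, drive_tb, parity_drives, hot_spares)
    return total_price / capacity


# 12 bays of 2 TB drives, RAID 6 (2 parity) with 1 hot spare:
# 9 data drives * 2 TB = 18 TB usable.
capacity = usable_gb(12, 2)

# Back out the total price implied by $0.53 per usable GB.
implied_price = round(0.53 * capacity)

print(f"usable capacity: {capacity} GB")
print(f"implied system price: ${implied_price}")
```

Under these assumptions the 12-bay unit yields 18 TB usable, so every extra parity drive or spare raises the cost per usable gigabyte noticeably on a box this small.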

Btrfs

A new, GPL-licensed filesystem originally developed by Oracle. All the fancy features that we want, baked into the Linux kernel. Fingers crossed that in a couple years this will have stabilized to the point that it is ready for production use. It seems to have the most promise of competing with NetApp’s WAFL and with ZFS. Oddly enough, ZFS is now owned by Oracle too, since Oracle bought Sun. Will Oracle continue to develop both ZFS and Btrfs?

Other

We looked at numerous other technologies in varying levels of detail. None seemed to quite have the momentum at this point, and several were just not designed for a typical file server. We’ll keep an eye on them.

Some of the projects include:

Read part three… what we actually chose to go with (warning, it is pretty anticlimactic).