All storage costs money. As anyone who has worked in the industry for very long knows, that 1TB drive for $110 at NewEgg is not what we use at work. Or if it is, we use a lot of them and throw RAID into the equation. We do not get to offer storage at $0.107 a gigabyte, that much is clear.
What we use at work is usually not consumer-grade SATA drives. We use U320 SCSI. Fibre Channel. SAS. The high-performance stuff runs faster than 7200 RPM and has better failure rates. Direct-attached storage is almost always attached to a RAID controller. Network-attached storage (Ethernet or Fibre Channel) adds its own overhead, as the storage has to live inside a dedicated unit of some kind.
Figuring out the cost
The true cost of storage involves several factors, especially for network-attached storage.
- The cost of the array divided by the redundant storage available (array-cost-per-GB)
- The cost of the disks, divided by the redundant storage available (disk-cost-per-GB)
Backup costs figure in as well:
- Tape library cost, divided by the redundant storage available (tape-library-cost-per-GB)
- The total information stored in the backup rotation
- The number of tape media needed for a full backup rotation (media-number)
- The cost of each tape, multiplied by the number needed for the full rotation, and divided by the redundant storage available (media-cost-per-GB)
- The cost of the backup software needed to support the backup rotation, divided by the redundant storage available (software-cost-per-GB)
Of course this assumes a pure tape-backup scheme. Disk-backup involves its own costs. Things like de-duplication technology can help reduce the storage footprint of a whole rotation, thus reducing the disk-space consumed by the backup rotation. Hybridized systems that use disk and tape for backup get even more complex. For this post I’ll keep it to tape.
This also doesn’t include an apportioned percentage of any Fibre Channel infrastructure that may be in place (assuming FC), or any Ethernet infrastructure investments made to support high-speed storage networking.
The cost of the disk storage (primary storage) is array-cost-per-GB plus disk-cost-per-GB.
The cost of the tape backup system is tape-library-cost-per-GB plus media-cost-per-GB plus software-cost-per-GB.
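The two formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function and parameter names are invented for this sketch, and it assumes every cost is apportioned over the same redundant (post-RAID) capacity, as in the lists above.

```python
def per_gb(total_cost, redundant_gb):
    """Spread a purchase cost over the redundant (usable) capacity in GB."""
    return total_cost / redundant_gb


def primary_cost_per_gb(array_cost, disk_cost, redundant_gb):
    """array-cost-per-GB plus disk-cost-per-GB."""
    return per_gb(array_cost, redundant_gb) + per_gb(disk_cost, redundant_gb)


def backup_cost_per_gb(library_cost, tape_price, media_number,
                       software_cost, redundant_gb):
    """tape-library-cost-per-GB plus media-cost-per-GB plus software-cost-per-GB."""
    # Media cost is the price of one tape times the tapes in a full rotation.
    media_cost = tape_price * media_number
    return (per_gb(library_cost, redundant_gb)
            + per_gb(media_cost, redundant_gb)
            + per_gb(software_cost, redundant_gb))
```

With made-up placeholder figures (not our real purchase prices), primary_cost_per_gb(10000, 20000, 1000) works out to 30.0 dollars per GB.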
Real world example
We use HP storage arrays at my job. Right now we have two of them, an EVA6100 and an EVA4400. The EVA6100 is an upgraded EVA3000 that we’ve had for several years, and the EVA4400 is a new purchase. The EVA4400 is filled with FATA drives, because it is supposed to be used for ‘near-line’ storage needs, such as backup-to-disk, or long term archives. The EVA6100 is filled with Fibre Channel drives and is our primary storage array.
The cost-per-GB on the EVA6100 is no less than $16.22.
The cost-per-GB on the EVA4400 is no less than $3.02.
The theoretical cost-per-GB of a full backup rotation on the EVA6100 is $9.46.
The cost of the EVA6100 is actually higher than that; this number includes only the costs I can isolate and account for. An unknown percentage of the EVA3000 hardware was re-used to build the EVA6100, so I can’t correctly apportion the cost of the EVA3000 over the current storage totals. This number also doesn’t include the amount of money we’ve spent on support contracts over the lifetime of this array.
Why does the EVA6100 cost so vastly more than the EVA4400 on a per-GB basis?
- The EVA4400 uses 7.2K RPM, 1TB FATA drives that cost under $1000 each, whereas the EVA6100 uses 10K RPM, 300GB FC drives that cost around $1800 each.
- A majority of the current 300GB drives in the EVA6100 were purchased back when they cost over $2K each.
- The EVA6100 hardware is more expensive than the EVA4400 hardware on a per-chassis and per-enclosure basis.
How Dynamic Storage Technology can help
For this example I’ll use a hypothetical OES2-based volume containing 2TB of data, representing a volume shared between multiple departments.
That 2TB of space on the EVA6100 represents $33,219 in primary storage costs and a further $19,374 in backup costs, for a total storage cost of $52,593. This is a lot of money for just 2TB, considering that 2TB is now the size of a moderate home media server.
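For anyone who wants to check the arithmetic, here is a small Python sketch. The per-GB figures are the ones quoted earlier in this post; treating 1TB as 1024GB is an assumption of the sketch, chosen because it reproduces the totals above, and the variable names are purely illustrative.

```python
GB_PER_TB = 1024          # assumption: binary terabytes
EVA6100_PER_GB = 16.22    # primary storage cost-per-GB quoted above
BACKUP_PER_GB = 9.46      # full backup rotation cost-per-GB quoted above

volume_gb = 2 * GB_PER_TB

primary = volume_gb * EVA6100_PER_GB
backup = volume_gb * BACKUP_PER_GB
total = primary + backup

print(f"primary: ${primary:,.0f}")   # primary: $33,219
print(f"backup:  ${backup:,.0f}")    # backup:  $19,374
print(f"total:   ${total:,.0f}")     # total:   $52,593
```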
But what happens if I throw DST into the mix?
Looking at the Volume Inventory data, I see that 64.5% of the data housed on that volume hasn’t been modified in the last 12 months (this is the real percentage of one of my volumes right now). So of that 2TB of space, 1321 GB hasn’t been modified in 12 months or longer. If I put a Dynamic Storage Technology policy in place to migrate files not modified in the last 12 months to the EVA4400, this leaves 727 GB on the EVA6100 and 1321 GB on the EVA4400. The primary storage cost now drops to $15,781 ($11,792 in EVA6100 storage and $3,989 in EVA4400 storage). This is less than half the pure-EVA6100 storage cost of $33,219.
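The split is easy to reproduce. The sketch below is just the arithmetic from the paragraph above written in Python; the function name and its parameters are hypothetical, 1TB is assumed to be 1024GB, and the per-GB costs are the ones quoted earlier in this post.

```python
def tiered_primary_cost(volume_gb, cold_fraction, fast_per_gb, slow_per_gb):
    """Split a volume into hot and cold data and price each tier separately."""
    cold_gb = round(volume_gb * cold_fraction)   # data migrated by the policy
    hot_gb = volume_gb - cold_gb                 # data left on the fast array
    return hot_gb, cold_gb, hot_gb * fast_per_gb, cold_gb * slow_per_gb


volume_gb = 2 * 1024   # assumption: 1TB = 1024GB
hot_gb, cold_gb, fast_cost, slow_cost = tiered_primary_cost(
    volume_gb, cold_fraction=0.645, fast_per_gb=16.22, slow_per_gb=3.02)

untiered_cost = volume_gb * 16.22
tiered_cost = fast_cost + slow_cost

print(hot_gb, cold_gb)                       # 727 1321
print(f"${fast_cost:,.0f} + ${slow_cost:,.0f} = ${tiered_cost:,.0f}")
print(f"{tiered_cost / untiered_cost:.1%} of the untiered cost")
```

Swapping in your own volume size, unmodified-data percentage, and per-GB costs gives a rough first estimate for your environment. It covers primary storage only, not backup.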
The change to the backup costs depends on a number of things, but one cost is pretty easy to estimate. The full backup rotation costs for the EVA6100 will go down dramatically. While the incremental backup sizes will stay the same, the full backups will be much smaller. This will markedly reduce the amount of data in the backup rotation, and thus reduce media costs significantly.
The data on the EVA4400 doesn’t change very often at all. Incremental backups would catch only the files migrated as part of the DST policy and those few old files that get modified by users, making the incrementals a much smaller percentage of the full backup size than they are for the actively used data on the EVA6100. Also, it is possible to do full backups less often than with the active data, which further reduces the size of the total backup rotation.
The cost of primary storage for this data would be cut by more than half by using DST, even though two storage arrays are involved in the environment. In this era of increasingly squeezed IT budgets, this would increase the effective lifetime of the EVA6100 and lengthen the time between disk purchases for it.
Some further notes on cost
The backup cost I quote above is probably less than yours. This reflects our usage of a third-tier backup vendor, older backup hardware, and a relatively scant backup-retention period. If you have to retain full backups for 12 months, your costs will be higher than these. If you are using a top-tier backup vendor (NetBackup, Legato) your costs could be quite a bit more than mine. If you are using a true tape library, such as a Quantum Scalar i2000, your costs will be higher.
There are some storage types that DST can’t touch, such as GroupWise data. That’s storage that’ll probably stay on your fast storage devices. DST is designed for “unorganized file data,” which is a code-phrase for “file server.” In my case 70% of our storage is consumed by “unorganized file data,” so we could realize some serious gains through DST. Your environment may be different.
And finally, one cost I ignored here is the cost of power and cooling. I do know the EVA4400 sucks less juice than the EVA6100 even though they have the same number of drive enclosures, but I could not tell you how much less.