Data Domain Goes Nearline

 

When Data Domain entered the market, it introduced a solution that reduced backup and restore windows and made disk a practical medium for storing backup data. Now, with its nearline release, Data Domain can reduce your ongoing investment in primary storage while further reducing backup windows.


Data Domain perfected data de-duplication in the disk-to-disk backup space, and the most recent release of its operating system lets users extend that ability to other secondary storage workloads. For users, this means a single de-duplication platform can support backup, archive and other nearline functions.


This article will show you how to use this new capability to reduce capital outlay on primary storage while further reducing your backup burden.


First, it is important to understand that this is simply an upgrade to the current OS. The new capabilities are available to all Data Domain customers, going back several generations of product. Every Data Domain system is now a storage platform that can be used for archive, manual file migrations and, of course, backups, with data de-duplication applied globally across all of these uses.


An important requirement for any nearline or archive process is the ability to track multiple versions of a file as it ages. With tape you achieve this by copying data to separate pieces of media. Data Domain delivers this ability by implementing snapshots in this new release of the OS. Snapshots add versioning to the data as it is stored. Customers can use them to preserve existing data sets before a major update to those files. Then, if an older version of a file is needed, you can roll back to it.


The snapshot functionality does not rely on the copy-on-write or block-incremental methods that many storage vendors use. Those methods limit the number of snapshots you can keep because of the space and overhead they require as data changes. Instead, the snapshots are driven by the same incremental differencing technology that is the foundation of the de-duplication process. Combined with de-duplication, this means you can maintain an almost unlimited number of snapshots over periods that can span years. In addition, if you are using replication, you can run a different snapshot schedule at your DR location than at your primary site.
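One way to picture why snapshots built on a de-duplicated store are cheap is the toy model below. It is a conceptual sketch only, not a description of Data Domain's internals: the class name, the fixed 4096-byte segment size and the in-memory dictionaries are all assumptions made for illustration. Each file is kept as a list of segment fingerprints, so a snapshot only has to record those lists and never copies the data itself.

```python
import hashlib


class SegmentStore:
    """Toy de-duplicated store (hypothetical, for illustration only)."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size  # assumed fixed size, to keep the sketch short
        self.segments = {}   # fingerprint -> segment bytes, each stored once
        self.files = {}      # path -> list of fingerprints (the live view)
        self.snapshots = {}  # snapshot name -> {path: tuple of fingerprints}

    def write(self, path, data):
        """Split data into segments and store only the ones not seen before."""
        recipe = []
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            self.segments.setdefault(fingerprint, segment)
            recipe.append(fingerprint)
        self.files[path] = recipe

    def snapshot(self, name):
        """Record the current fingerprint lists; no segment data is copied."""
        self.snapshots[name] = {p: tuple(r) for p, r in self.files.items()}

    def read(self, path, snapshot=None):
        """Rebuild a file from the live view or from a named snapshot."""
        view = self.snapshots[snapshot] if snapshot else self.files
        return b"".join(self.segments[fp] for fp in view[path])

    def stored_bytes(self):
        """Physical bytes held, counting each unique segment once."""
        return sum(len(s) for s in self.segments.values())
```

In this model, taking a snapshot, overwriting half of a file and then reading the snapshot returns the original contents, while the store grows only by the segments that actually changed.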


Using Data Domain nearline

The first area to consider is using the Data Domain appliance as a target for archiving applications such as email archiving. Implementing an email archive can address the problems of email storage management and PST growth. There is significant added value in archiving email directly to an intelligent device like Data Domain’s as opposed to dumb disk. With Data Domain you get data de-duplication. Don’t be fooled by email archive applications that claim de-duplication. They typically eliminate only the storage of duplicate files. Unlike Data Domain's variable-length segment de-duplication, they cannot de-duplicate when there are small changes in the data or file. Data Domain also provides data protection features such as RAID 6, data integrity checking and data replication.
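The difference between duplicate-file elimination and variable-length segment de-duplication can be sketched in a few lines. The code below is a generic content-defined chunking example, not Data Domain's actual algorithm: the rolling hash, the lookup table, the mask and all function names are assumptions chosen for illustration. A segment boundary is placed wherever the low bits of a rolling hash are zero, so each boundary depends only on the last few bytes of content and not on the position in the file.

```python
import hashlib
import random

# Fixed pseudo-random table for the rolling hash (illustrative values only).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def split_segments(data: bytes, mask: int = 0x3F) -> list:
    """Cut data into variable-length segments at content-defined boundaries.

    A cut is made after any byte where the low bits of the rolling hash are
    all zero. With a 6-bit mask the decision depends only on the last six
    bytes, and segments average roughly 64 bytes on random input.
    """
    segments = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if (h & mask) == 0:
            segments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])
    return segments


def fingerprint(segment: bytes) -> str:
    return hashlib.sha256(segment).hexdigest()


def count_new_segments(old_segments, new_segments) -> int:
    """Number of distinct segments in the new version not already stored."""
    known = {fingerprint(s) for s in old_segments}
    return len({fingerprint(s) for s in new_segments} - known)
```

If a single byte is inserted into the middle of a large file, the whole-file hash changes, so a duplicate-file scheme stores the entire file again. With the segmenting above, only the handful of segments around the edit are new, and everything else matches what is already stored.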


The second area to consider is optimizing the backup process to reduce its bandwidth requirement. Because Data Domain can de-duplicate and store both backup data and other nearline workloads in the same system, users can make a number of powerful changes in the way they manage data across the enterprise. Traditionally, primary or online storage is meant for transactional and active data. Tape is for offline data, which is not immediately accessible. Nearline is for data that is not being continually accessed and has been archived to a near-permanent state.


The problem is that, in practice, primary storage contains mostly inactive data. Most studies cite 80% of primary storage as being consumed by inactive data. Some vendors tried mixing SATA into a primary storage array to address inactive data, but the approach was not well received because it was not efficient: moving 10TB of inactive data from primary storage to 10TB of SATA storage saves no capacity and is not as cost-effective as you would like. With Data Domain’s nearline and data de-duplication capabilities, you can move that 10TB of inactive data from primary storage to a Data Domain appliance where, because of de-duplication, it may require only a fraction of that capacity.
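The capacity arithmetic is simple enough to write down. The sketch below assumes a de-duplication ratio that you supply yourself; the article does not state one, and actual ratios depend entirely on the data, so the function names and the ratio in the usage note are purely illustrative.

```python
def required_capacity_tb(logical_tb: float, dedup_ratio: float) -> float:
    """Physical capacity needed to hold logical_tb at a given ratio.

    dedup_ratio is an assumption supplied by the caller, e.g. 4 for 4:1.
    """
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return logical_tb / dedup_ratio


def capacity_saved_tb(logical_tb: float, dedup_ratio: float) -> float:
    """Capacity avoided compared with a 1:1 copy to plain disk."""
    return logical_tb - required_capacity_tb(logical_tb, dedup_ratio)
```

For example, `required_capacity_tb(10, 4)` gives 2.5, meaning that at a hypothetical 4:1 ratio the 10TB of inactive data would occupy 2.5TB, whereas the same move to plain SATA disk corresponds to a ratio of 1 and still needs the full 10TB.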


By moving inactive data off primary storage, you greatly reduce your investment in that storage. By using a nearline device that also performs data de-duplication, you reduce the investment in your secondary storage. This is a double win for customers, cutting costs at both ends of the equation!


All this inactive data in the primary data set has a negative impact on the backup process. In some environments it can amount to millions of files and multiple terabytes of capacity. Data de-duplication has helped considerably, in that backing up this unchanging data each week no longer consumes additional capacity on the backup disk target, but your backup software and network still suffer the effects.


By identifying those files, permanently moving them out of the backup path and storing them on a nearline archive, you can greatly reduce the amount of data that needs to be brought across the network and significantly lower the resources required by the backup application.
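Identifying candidate files can be as simple as walking the file system and checking timestamps. The sketch below treats a file as inactive if it has not been modified for a given number of days. That criterion, the 180-day default and the function name are assumptions for illustration; your own policy might use last-access time, file type or ownership instead.

```python
import os
import time
from pathlib import Path


def find_inactive_files(root, days=180, now=None):
    """Yield (path, size_in_bytes) for files not modified in `days` days.

    `now` can be passed in for repeatable runs; it defaults to the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                yield path, info.st_size
```

Summing the sizes that this yields gives a rough estimate of how much data would leave the weekly backup if those files were moved to the nearline archive.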

While you can use a software migration tool to help move this data, the fact that the target is disk and not tape makes it practical to move the data manually. The Data Domain unit simply appears as a network share that can be copied to, read from or copied from. There is no need for a software application that organizes tapes. This is a file system; navigate it and find the file just as you would on any other file system.
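Because the appliance appears as an ordinary network share, a manual migration needs nothing beyond standard file operations. The following is a minimal sketch that copies a file to the archive, verifies the copy by checksum and only then removes the original. The function names are hypothetical, and the mount point in the usage note is an assumed example path, not a product default.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def migrate_file(src, source_root, archive_root):
    """Move one file to the archive share, keeping its relative path.

    The original is deleted only after the archived copy has been verified.
    """
    src = Path(src)
    dest = Path(archive_root) / src.relative_to(source_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)  # copy2 also preserves timestamps
    if sha256_of(src) != sha256_of(dest):
        dest.unlink()
        raise OSError(f"verification failed for {src}")
    src.unlink()
    return dest
```

A call might look like `migrate_file("/data/projects/2004/plan.doc", "/data", "/mnt/datadomain/archive")`, where `/mnt/datadomain/archive` stands in for wherever the share is mounted in your environment. Keeping the relative path intact means the archived file can later be found by browsing the share the same way you would browse the original location.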

With Data Domain's replication support you can then replicate this archived data. Once replication is in place, you can consider the data archived, and there is no longer any need to back it up, so you achieve the goal of removing it from the backup process. The data is safer on the Data Domain appliance than it is on tape. Data Domain offers complete RAID 6 protection and constant data integrity checking, neither of which is possible with tape.

With the exception of transactional data, the Data Domain appliance can be used in a variety of solutions that reduce investment in primary and secondary storage and shrink backup windows further than ever. Its openness and standard network mount point make it easy to use while opening up a wide range of storage possibilities.

 

Monday, November 12, 2007

 
 