Permabit
In our recent white paper we outlined the capabilities that end users demand from enterprise-strength disk-based archive technology. The four key requirements of these systems are scalability, cost efficiency, availability, and protection. We recently reviewed the Permabit Enterprise Archive™ Data Center Series system from Permabit Technology Corporation and found that it scores very high in each of these categories. After completing our internal review, we spoke to a variety of users of the technology to compare our results with their real-world applications. The user feedback was very positive and mirrored our internal test results.
As stated in our white paper, the need for enterprise disk-based archiving is wide ranging. Some customers use these solutions in the classic sense, as a target for email archiving or file data migration. What really impressed us, though, is the variety of other scenarios that these systems are used for, as well as the variety of customers that are using them. Permabit is certainly not locked into a niche role; the appeal of its system is very diverse.
Architecture
The Permabit Data Center Series is a disk-based archive that uses a grid storage architecture. The grid is made up of 1U Intel-based servers that act as nodes, and these nodes perform one of two functions. Storage Nodes are the nodes in the grid that hold storage. Today, each of those servers holds four 500GB SATA drives. The Storage Nodes are connected via their own internal network in the grid. Today, up to 48 Storage Nodes can exist per storage grid, netting a raw disk capacity of 96TB per grid. Up to 32 grids can be managed centrally with the Storage Pool Expander add-on module, providing a total of 3PB of raw disk storage capacity.
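The capacity figures above follow from simple multiplication. The short Python sketch below reproduces them; the constants are the ones quoted in this article, while the function name and the use of decimal units (1TB = 1000GB) are our own assumptions for illustration.

```python
# Figures quoted in the article. Decimal units (1 TB = 1000 GB) are assumed.
DRIVES_PER_NODE = 4
DRIVE_GB = 500
MAX_STORAGE_NODES_PER_GRID = 48
MAX_GRIDS = 32


def raw_capacity_tb(storage_nodes: int) -> float:
    """Raw (pre-protection, pre-reduction) capacity of a grid in TB."""
    return storage_nodes * DRIVES_PER_NODE * DRIVE_GB / 1000


grid_tb = raw_capacity_tb(MAX_STORAGE_NODES_PER_GRID)  # one fully populated grid
pool_tb = grid_tb * MAX_GRIDS                          # 32 grids managed centrally

print(f"Per grid: {grid_tb:.0f} TB raw")
print(f"Per pool: {pool_tb:.0f} TB raw (about {pool_tb / 1000:.0f} PB)")
```

These are raw numbers only. Usable capacity will differ once data protection overhead and data reduction are taken into account.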
The other node type is an Access Node that acts as a gateway between the IP Network and the Storage Nodes. The Access Nodes perform the filesystem fingerprinting as well as provide CIFS, NFS, and WebDAV access to the Storage Nodes. They are typically paired for high availability. The Access Nodes scale linearly and each addition adds to the overall grid performance.
A starter system will use 8 Storage Nodes and 2 Access Nodes. Storage capacity and access (ingestion) performance can be scaled independently. The nodes are all installed in the same rack, along with the provided switches for the internal network. To the servers and users of this storage, there is only one grid to reference, essentially a single mount point. The individual nodes are all managed by the Permabit OS. Despite the number of individual components, the system is quite easy to install, and most customers are writing data to the system within a few hours of opening the boxes.
While the initial thought may be to use the Permabit system to replace the archive role of tape or optical in the data center, the system's true value comes from being able to broaden and enable the use of archive storage. In most data centers, the reality is that data is never really archived off of primary storage. There may be a copy of it on tape or disk, but the data is still on primary storage as well. This is because optical and tape are difficult to access and cannot continuously validate data integrity over the course of many years. By contrast, as stated above, access to the Permabit system is a simple NFS or CIFS mount. If you know how to mount a network drive, you know about 90% of what you need to know to use the Permabit system. All the enterprise capabilities work well in their default out-of-the-box settings, allowing for fast starts. The system can be fine-tuned as your understanding of the product matures. For example, you can initially create volumes that are read/write and later change them to WORM if corporate or regulatory requirements demand it.
Why not just use more primary disk or cheap disk?
The obvious question is why you would use an Enterprise Disk Archive as opposed to just adding disk capacity to your current storage platform. Or why not use one of the cheap SATA RAID arrays now commonly available?
It’s Safer
Your data is actually safer on a Permabit system than on primary storage and certainly it is safer than on tape or optical. The Permabit system is specifically designed for long term retention of data, while primary data stores are not. Protection of this long term data is of paramount importance. The data written to the Permabit archive will likely outlast the current primary storage platform, the disk-to-disk backups that protect the primary store and the tapes that were put on a shelf for the same purpose. In many cases, the data you place in an archive may be the last copy of that data available. Redundant protection is critical.
Part of that protection is the grid architecture, which is more reliable and more redundant than most primary storage systems. As stated earlier, the Access Nodes are delivered in an HA fashion, and the Storage Nodes improve on RAID 5 and RAID 6 by utilizing a new disk protection approach called RAIN-EC. The RAIN-EC technologies delivered in the Permabit Enterprise Archive Data Center Series represent a fundamentally new way of protecting data. RAIN-EC was designed from the ground up for the unique data retention challenges of a petabyte-scale archive. Automatic, unlimited recovery from multiple simultaneous failures provides the security required for an enterprise repository of hundreds of terabytes to petabytes in size.
Safety is more than just redundancy; it is also knowing that the data you have written to the storage archive will be readable for years to come. Permabit verifies every write by leveraging the file system. As data comes in, it is segmented into data blocks, and each block is fingerprinted. That fingerprint should never change. Not only is the fingerprint checked for accuracy when the initial write occurs, but during idle times the system rechecks older data blocks to see if there has been any degradation and notifies you accordingly.
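To make the fingerprint-and-recheck idea concrete, here is a minimal Python sketch. It is not Permabit's implementation: the article does not say which fingerprint function or block size the product uses, so SHA-256, the 4096-byte block size, and all function names below are assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical; the product's real block size is not stated


def fingerprint(block: bytes) -> str:
    """Fingerprint a block. SHA-256 is an assumed stand-in."""
    return hashlib.sha256(block).hexdigest()


def write_blocks(data: bytes) -> list:
    """Segment incoming data and record (fingerprint, block) at write time."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [(fingerprint(b), b) for b in blocks]


def scrub(records: list) -> list:
    """Recompute every fingerprint; return indices of blocks that changed."""
    return [i for i, (fp, block) in enumerate(records)
            if fingerprint(block) != fp]


records = write_blocks(b"archive payload " * 1000)  # 16,000 bytes -> 4 blocks
clean_pass = scrub(records)                          # nothing has degraded yet

# Simulate silent corruption of one byte in block 1, then scrub again.
stored_fp, block = records[1]
records[1] = (stored_fp, b"\x00" + block[1:])
damaged = scrub(records)
print("Damaged blocks:", damaged)
```

In a real system the scrub pass would run in the background during idle time, and a mismatch would trigger repair from redundant data as well as a notification.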
Finally, the entire system is designed to be replicated, so in the event of a site failure you are still protected. Permabit uses a feature called WAN Optimized Data Replication to optimize the replication process. If you have similar data in three sites, only the unique blocks are replicated to the DR system. This can substantially reduce bandwidth requirements.
The true test of these safety claims is in the real world. Thanks to the resiliency of these systems, we don’t know of a single customer that has lost data.
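The bandwidth saving comes from sending a block only when the DR system does not already hold its fingerprint. The Python sketch below illustrates that principle under our own assumptions; the article does not describe the actual replication protocol, and the SHA-256 fingerprint and every name here are hypothetical.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Assumed fingerprint function (SHA-256 as a stand-in)."""
    return hashlib.sha256(block).hexdigest()


def blocks_to_send(site_blocks: list, dr_fingerprints: set) -> dict:
    """Return only the blocks whose fingerprints the DR system lacks."""
    pending = {}
    for block in site_blocks:
        fp = fingerprint(block)
        if fp not in dr_fingerprints:
            pending[fp] = block
    return pending


# Three sites holding overlapping data, replicating to one DR system.
sites = [
    [b"x", b"y"],
    [b"y", b"z"],
    [b"x", b"z", b"w"],
]

dr_fingerprints = set()
sent_per_site = []
for site_blocks in sites:
    pending = blocks_to_send(site_blocks, dr_fingerprints)
    dr_fingerprints.update(pending)  # DR now holds these blocks
    sent_per_site.append(len(pending))

total_blocks = sum(len(s) for s in sites)
print(f"Sent {sum(sent_per_site)} of {total_blocks} blocks over the WAN")
```

In this toy example only four of the seven blocks cross the WAN, because the other three are already present at the DR site.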
More Scalable
With most RAID systems, especially the inexpensive type, scaling capacity to keep up with the archive data set is going to be a challenge. In an enterprise archive, 40TB can be considered small. Extrapolate 40TB out just a few years, and we think many customers will be well over 1PB of archive data. The ability to scale, and to do so quickly, is critical. With the Permabit system, capacity can be added in a matter of minutes with no downtime or impact on system operation. Simply add a new Storage Node and the additional capacity is automatically available. Many of the Permabit customers we spoke to have added capacity to the system on a quarter-by-quarter basis with no upgrade interruptions, and have cut their primary storage acquisitions by half or more.
More Secure
The Permabit system offers both Write Once Read Many (WORM) file system capability and encryption. These capabilities are critical in environments where the need to show a chain of custody or to lock down data is important. As mentioned earlier, what makes the WORM capability more practical is that it can be enabled later without any additional software or different hardware, and you can mix WORM volumes and normal volumes on the same grid. This is important because in many cases you may not know what should be WORM protected and what should not. In addition, legal requirements will continue to evolve, and you may want the flexibility to make a volume WORM compliant to address the changing requirements. The ability to enable WORM later allows for a safe, quick start of the initial system, avoiding the long data classification phase that derails many archive and retention projects.
More Cost Effective
Tape and optical technology are among Permabit's main competitors, but in actuality so is the practice of simply maintaining and growing primary storage - in essence, doing nothing with the most costly storage you have. The value of the Permabit solution is that it provides a platform that makes creating, accessing and managing an archive practical enough that users are actually, and finally, moving old data off of their primary storage systems as well as legacy tape and optical deployments. Cost reductions come from two areas. First, the system itself uses industry standard hardware and SATA drives while leveraging its grid architecture to make those components safe for enterprise consumption.
Second, the Permabit Enterprise Archive uses Scalable Data Reduction™ (SDR), which is a combination of data deduplication and compression to optimize disk capacity. With data deduplication, redundant segments of data are identified before being written to disk, and instead of writing those redundant segments a pointer is made to the original segment. This technology is now becoming commonplace in disk-to-disk backup, but it is fairly new to archive. Because archive data tends to be more unique than backup data, in many cases an archive will not see the same level of reduction that a backup deduplication appliance will. Still, expectations of 3X to 5X are not uncommon. Two exceptions are database archiving and VMware archiving. Both of these scenarios will result in significantly higher levels of reduction. In conjunction with deduplication, compression will provide possibly another 50% optimization of storage capacity.
The result of these two factors is a cost per gigabyte approaching that of tape, compared to $15-40 per GB on primary storage from the leading suppliers.
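The Python sketch below illustrates the general technique of deduplication plus compression, and how the two reduction factors might combine. It is not SDR itself: the segment size, the SHA-256 fingerprint, the use of zlib, and all class and function names are our assumptions. We also read "another 50%" as compression halving the data that remains after deduplication, which is an interpretation rather than a vendor figure.

```python
import hashlib
import zlib

SEGMENT_SIZE = 4096  # hypothetical segment size


class DedupStore:
    """Toy store: unique segments are compressed once; files keep pointers."""

    def __init__(self):
        self.segments = {}  # fingerprint -> compressed segment
        self.files = {}     # file name -> ordered list of fingerprints

    def write(self, name: str, data: bytes) -> None:
        pointers = []
        for i in range(0, len(data), SEGMENT_SIZE):
            segment = data[i:i + SEGMENT_SIZE]
            fp = hashlib.sha256(segment).hexdigest()
            if fp not in self.segments:       # new segment: store it once
                self.segments[fp] = zlib.compress(segment)
            pointers.append(fp)               # duplicate: pointer only
        self.files[name] = pointers

    def read(self, name: str) -> bytes:
        return b"".join(zlib.decompress(self.segments[fp])
                        for fp in self.files[name])


def effective_capacity_tb(raw_tb: float, dedup_ratio: float,
                          compression_saving: float) -> float:
    """Logical data that fits in raw_tb, assuming the factors multiply."""
    return raw_tb * dedup_ratio / (1 - compression_saving)


store = DedupStore()
payload = bytes(range(256)) * 64        # 16,384 bytes of repeating content
store.write("report_v1", payload)
store.write("report_v2", payload)       # identical copy adds only pointers

print("Unique segments stored:", len(store.segments))
print("Effective TB per 96TB grid at 3X dedup + 50% compression:",
      effective_capacity_tb(96, 3, 0.5))
```

Under these assumptions, a 3X deduplication ratio combined with 50% compression behaves like a 6X overall reduction, which is why the two techniques are usually deployed together.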
The bottom line is that if you are planning to buy additional primary storage capacity you may be better served by using Permabit. Moving data off of primary and on to a Permabit Archive can be dramatically less expensive than purchasing new primary storage. In an economy where budgets are going flat or shrinking, freeing up budget dollars earmarked for primary storage by using archive storage allows for other critical projects to be funded.
Friday, July 25, 2008
Product Spotlight