Storage@home


Storage@home was a distributed data store project designed to store massive amounts of scientific data across a large number of volunteer machines. The project was developed by members of the Folding@home team at Stanford University from about 2007 through 2011.

Function

Scientists such as those running Folding@home deal with massive amounts of data, which must be stored and backed up at considerable expense. Traditionally, such data is stored on RAID servers, but this becomes impractical for research budgets at this scale. Pande's research group was already storing hundreds of terabytes of scientific data. Professor Vijay Pande and student Adam Beberg drew on their experience with Folding@home and began work on Storage@home. The project was designed around the distributed file system known as Cosm and around the workload and analysis needed for Folding@home results.
Folding@home volunteers could easily participate in Storage@home, but creating a robust network required much more disk space from each user than Folding@home did. Each volunteer was to donate 10 GB of storage space, which would hold encrypted files, and would earn points as a reward for reliable storage. Ideally, users would participate for a minimum of six months and would alert the Storage@home servers before certain changes on their end, such as a planned move of a machine or a bandwidth downgrade.
Each file saved on the system was to be replicated four times, with each replica spread across 10 geographically distant hosts. Redundancy also extended across different operating systems and time zones. If the servers detected the disappearance of an individual contributor, the data blocks held by that user would automatically be duplicated to other hosts. Stored data was thus maintained through redundancy and monitoring, with repairs done as needed. Through careful application of redundancy, encryption, digital signatures, and automated monitoring and correction, large quantities of data could be retrieved reliably and easily. The goal was a robust network that would lose as little data as possible.
The storage project most similar to Storage@home is Storage Resource Broker.
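
The replication and repair behaviour described in this section can be illustrated with a minimal sketch. It is not the Storage@home implementation, whose internals are not described here. It assumes a simplified model in which each block has four replicas, each held by one host, and it ignores the spreading of each replica over 10 hosts as well as encryption and signatures. The names Host, Store, store and host_lost, and the region and os labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

REPLICAS = 4  # replication factor stated in the text


@dataclass(frozen=True)
class Host:
    name: str
    region: str  # hypothetical label standing in for geography / time zone
    os: str      # hypothetical label for the operating system


@dataclass
class Store:
    hosts: set = field(default_factory=set)
    placement: dict = field(default_factory=dict)  # block id -> set of hosts

    def _pick(self, current, needed):
        """Greedily pick up to `needed` new hosts for a block.

        Hosts in a region not yet used are preferred, then hosts with an
        operating system not yet used; ties are broken by name.
        """
        chosen = []
        for _ in range(needed):
            used = current | set(chosen)
            pool = [h for h in self.hosts if h not in used]
            if not pool:
                break  # not enough hosts left; the block stays under-replicated
            regions = {h.region for h in used}
            oses = {h.os for h in used}
            best = min(pool, key=lambda h: (h.region in regions, h.os in oses, h.name))
            chosen.append(best)
        return chosen

    def store(self, block_id):
        """Place a new block on up to REPLICAS distinct hosts."""
        self.placement[block_id] = set(self._pick(set(), REPLICAS))

    def host_lost(self, host):
        """Drop a vanished host and re-replicate every block it held."""
        self.hosts.discard(host)
        for holders in self.placement.values():
            if host in holders:
                holders.discard(host)
                holders.update(self._pick(holders, REPLICAS - len(holders)))
```

In this sketch, calling store places a block on four hosts chosen for diversity, and calling host_lost for one of them restores the replica count from the remaining hosts when any are available. A real system would also have to detect disappearance through monitoring and verify the copied data, which is outside the scope of this example.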

Status

Storage@home was first made available on September 15, 2009, in a testing phase. Initially, it monitored availability data and other basic statistics on the user's machine, which would be used to create a robust and capable storage system for massive amounts of scientific data. However, it became inactive later that year, despite initial plans for further development. On April 11, 2011, Pande stated that his group had no active plans for Storage@home.