
New Data Silos to Expand SLAC Scientific Data Capacity

(Photo - old data silos)
The data silo "Johnny5" is set for retirement after about two decades. (Photo courtesy SLAC Computing.)

SLAC has certain things in great abundance. One of them is data, particularly from scientific experiments such as BaBar. Much of SLAC's data is stored on disks, but these are expensive, power-hungry and prone to failure.

To cope with SLAC's vast arsenal of facts and figures, the computing department houses six data silos that together contain around twenty-two thousand 200-gigabyte tapes and around five thousand 20-gigabyte tapes. Though slower than disks, tapes are more durable and better suited to long-term archiving. Now, the six tape-based data silos are being replaced with newer, more compact storage structures.
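As a rough check on those figures, the nominal capacity of the old silos can be tallied from the tape counts. The short sketch below assumes decimal units (1 gigabyte = 10^9 bytes) and that every tape is counted at its full rated capacity; the article states neither, so the result is an upper bound rather than the amount of data actually stored.

```python
# Nominal capacity of the six old silos, using the approximate tape counts
# quoted in the article.
# Assumptions (not from the article): decimal units, and every tape counted
# at its full rated capacity.
GB = 10**9
PB = 10**15
SILOS = 6

tape_pools = [
    (22_000, 200 * GB),  # about 22,000 tapes of 200 GB each
    (5_000, 20 * GB),    # about 5,000 tapes of 20 GB each
]

total_tapes = sum(count for count, _ in tape_pools)
total_bytes = sum(count * size for count, size in tape_pools)

print(f"tapes: {total_tapes} (about {total_tapes / SILOS:.0f} per silo)")
print(f"nominal capacity: {total_bytes / PB:.1f} PB")
```

Under these assumptions the six silos come to roughly 4.5 petabytes of nominal capacity across about 27,000 tapes.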

"We're going through a big change," said Randy Melen, from the computing department. As SLAC science grows in scope, particularly with the Linac Coherent Light Source, the computing department must adjust its resources. "Before the end of the year, we'll get rid of all of [the old silos]."

All data from one data silo was already transferred at the end of May. Then the silo was disassembled and removed from the computing building. The days of the remaining five silos are numbered. These silos have been around so long (more than twenty years) that two of them have names: Johnny and Carlos. Nearing the end of their expected lifespan, the old silos are ready for retirement.

(Photo - inside a data silo)
Inside one of the original data silos. (Photo courtesy SLAC Computing.)

"There's a desire to have a more sustainable computing environment," said Norm Ringgold from the computing department. Considering the ever-expanding science going on at SLAC, the need for data storage will continue to balloon. These silos also hold data from the Linac Coherent Light Source and Fermi Telescope, in addition to backup tapes for Unix and Windows systems. "This gains us a whole bunch of computing space and power."

These circular data warehouses each hold more than four thousand tapes. A robot, stationed in the center, pulls out tapes according to location, and delivers them to a cabinet where the data can be accessed. Once the data transfer is complete, the robot returns the tape to an empty space inside the silo.

Before the remaining five silos are decommissioned and hauled away, their data must be transferred to new silos. That means transferring two petabytes of data, equivalent to 2,000 terabytes or 2 × 10^15 bytes. Currently, there are six data streams between the old and new silos, with the new system gradually soaking up two decades' worth of SLAC science. The entire transfer project will take nearly a year.
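To put that timescale in perspective, the average throughput implied by these figures can be estimated. The sketch below is only a back-of-the-envelope calculation: it treats "nearly a year" as a full 365 days, and assumes the transfer runs continuously and is split evenly across the six streams. None of those assumptions comes from the article, and a real migration would also spend time on tape mounts and verification.

```python
# Average throughput implied by moving two petabytes in about a year.
# Assumptions (not from the article): 365 days of continuous transfer,
# evenly divided across the six data streams, decimal units.
TB = 10**12
PB = 10**15
MB = 10**6

data_bytes = 2 * PB
streams = 6
duration_s = 365 * 24 * 3600  # "nearly a year" taken as 365 days

aggregate_rate = data_bytes / duration_s   # bytes per second, all streams
per_stream_rate = aggregate_rate / streams  # bytes per second, one stream

print(f"data volume: {data_bytes // TB} TB")
print(f"aggregate:   {aggregate_rate / MB:.1f} MB/s")
print(f"per stream:  {per_stream_rate / MB:.1f} MB/s")
```

On these assumptions the average works out to a little over 60 megabytes per second in total, or roughly 10 megabytes per second per stream.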

In addition to having a more compact design, the new silos will require less power than their predecessors and house a total of 13,000 tapes. A mesh screen allows a peek inside to see the robots buzzing from one end of the U-shaped structure to the other. There are four rows of robots, which deliver tapes to a cabinet. Quicker tape drives and robots will make accessing data four times faster than it was in the old data silos.

While around the same size as the old tapes, the new ones will hold one terabyte of data rather than 200 gigabytes, increasing the storage capacity of each tape by a factor of five.
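The factor of five applies to each individual tape. For comparison, the nominal capacity of the whole system can be worked out from the tape counts given earlier. The sketch below assumes decimal units and that all 13,000 slots in the new silos hold one-terabyte tapes; the article does not say so, so treat the totals as illustrative.

```python
# Per-tape and whole-system capacity, old silos versus new.
# Assumptions (not from the article): decimal units, every tape counted at
# rated capacity, and all 13,000 new slots filled with 1 TB tapes.
GB = 10**9
TB = 10**12
PB = 10**15

old_total = 22_000 * 200 * GB + 5_000 * 20 * GB  # old tape counts from the article
new_total = 13_000 * 1 * TB

per_tape_factor = (1 * TB) // (200 * GB)
system_factor = new_total / old_total

print(f"per-tape capacity: x{per_tape_factor}")
print(f"old system: {old_total / PB:.1f} PB, new system: {new_total / PB:.1f} PB")
print(f"whole-system capacity: x{system_factor:.1f}")
```

On these assumptions, the new silos would offer about 13 petabytes of nominal capacity on fewer than half as many tapes, close to three times the nominal capacity of the old system.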

Removing these silos will also clear about 1,500 square feet of space, which is designated for expanding existing scientific projects and adding computing resources for new researchers and future projects.

"We've got plenty that will gobble up that space," Ringgold said.

—Julie Karceski
  
SLAC Today, June 17, 2010