SLAC Today is available online at:

In this issue:
Preserving the Data Harvest
Financial Brown Bag Tomorrow

SLAC Today

Tuesday - January 5, 2010

Preserving the Data Harvest

(Photo - woman scanning shelf of canned data)
Canning, pickling, drying, freezing—physicists wish there were an easy way to preserve their hard-won data so future generations of scientists, armed with more powerful tools, can take advantage of it. They've launched an international search for solutions. (Photo: Reidar Hahn, Fermilab.)

When the BaBar experiment at SLAC National Accelerator Laboratory shut down in April 2008, it brought an end to almost nine years of taking data on the decays of subatomic particles called B mesons. But that was hardly the end of the story for the 500 scientists working on the experiment. In November they celebrated the publication of their 400th paper, and they expect the next few years will yield at least 100 more.

These BaBar results and discoveries stem from more than two million gigabytes of data. As impressive as this number is, it's only a fraction of the data that will come out of the next generation of high-energy physics experiments. For instance, the ATLAS detector at CERN's Large Hadron Collider will produce a whopping 320 megabytes of data every second, surpassing BaBar's total output within three months.

BaBar's treasure trove of data, which may contain answers to questions we don't even know how to ask yet, raises an increasingly important question in high-energy physics: When the party's over, what do you do with the data?

In the past, this was not much of a concern. New experiments came along in a steady drumbeat, each superseding the last in what could be done with the data it produced. Today, as experiments get bigger, more complex, and much more expensive, the drumbeat has slowed considerably, and physicists are starting to realize the value of wringing as much insight out of every experiment as they possibly can.

But without a conscious effort to preserve them, data slowly become the hieroglyphs of the future. Data preservation takes a lot of work and, with it, a lot of resources. Researchers have to think not only about where to store the data, but also about how to preserve them so that they can still be used as technology and software change and as experts familiar with the data move on or retire.

"Preserving the bits for all time is probably not difficult, but the data themselves become very, very rapidly an arcane, dead language," says Richard Mount, SLAC's head of scientific computing. "Preserving the ability to fully understand the nuances of a dead language is not without its cost."

It's an investment, though, that a growing number of physicists and collaborations are seriously considering. A study group known as DPHEP, for Data Preservation and Long Term Analysis in High Energy Physics, has been holding workshops to look at the issue. The BaBar collaboration has also emerged as an important force in the effort to solve the puzzle, with members striving to provide a working model of how data preservation can be done. 

Read more in Symmetry magazine...

Financial Brown Bag Tomorrow

The January 6 Financial Brown Bag will cover "Overview of WBS/Web Charta Reporting Tools." Alan Hansen from the PeopleSoft team will be there to discuss current and longer-term PeopleSoft projects. The session will be held in the Building 41 Yellow Room from noon to 1 p.m. Please register on the Training Registration Web site to attend the session.
