New tools help biologists integrate complex datasets
It's not your high school biology lab. Today's burgeoning field of systems biology takes researchers away from traditional one-gene-at-a-time bench experiments in favor of combining technologies from fields such as genomics, mass spectrometry, imaging and informatics to advance their understanding of biological systems.
The advent of high-throughput (HTP) technologies, such as transcriptomics (microarrays) and proteomics, has fueled a revolution in biology by enabling systems-level analysis. These HTP approaches are especially promising for characterizing biomolecules at a global scale; however, the large, heterogeneous datasets they produce make interpretation difficult. Microarrays, which measure the expression levels of thousands of genes simultaneously, produce data with very high dimensionality and substantial variability from experiment to experiment. Proteomics, the study of protein expression patterns in organisms, is a vast, complex field that requires powerful separation methods and mass spectrometers, as well as advanced algorithms to automate data processing.
The lack of computational capabilities to analyze the bulky datasets that HTP techniques generate is a bottleneck for this new era of biology research. As a consequence, many investigators whose experiments generate massive amounts of data find few options for analyzing them, and the full potential of the studies is never realized. Combining several HTP measurements in a given experiment may provide more information, but it may also exacerbate the analysis problem. A global challenge for biology is integrating HTP data into computational models that predict cell response.
"There's a need for heterogeneous data. Most analytical technologies provide only a single dimension of data," says Steven Wiley, Director of Systems Biology at the U.S. Department of Energy's Pacific Northwest National Laboratory (PNNL) in Richland, WA. "Complex processes involve multi-step transformations, and you can't understand a process without being able to obtain and integrate data on all of its dimensions."
One example is a change in protein expression. There are multiple levels at which protein production can be controlled, not to mention post-translational modifications that often dictate protein function. To understand how the abundance and activity of a given protein are regulated, a biologist has to be able to analyze changes in the different forms of the molecules over various time scales.
But data analysis doesn't end with...