About the Authors:
Martin Carlsen
Affiliation: Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
Patrice Koehl
Affiliation: Department of Computer Science and Genome Center, University of California Davis, Davis, CA, United States of America
Peter Røgen
* E-mail: [email protected]
Affiliation: Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
Introduction
Proteins are the essential macromolecules inside cells that perform nearly all cellular functions. As with macroscopic tools, their shape is a key feature in defining their function. Structural biologists have embarked upon the challenge of finding the structures of all proteins, in the hope of unraveling this relationship between geometry and biological activity and of learning in the process how cells function. Determining the structure of a protein experimentally at the atomic level, however, is not yet an easy task: this can be deduced indirectly from the fact that we currently know millions of protein sequences but fewer than one hundred thousand protein structures. Predicting the structure of a protein from first principles is not much easier: direct applications of the ideas that have been used for modeling small molecules have not yet been successful on these much larger molecules. Recent reports on the advancement of ab initio techniques clearly show that the protein structure prediction community is making progress, but that the quality of the models generated does not yet meet the stringent accuracy requirements needed to be useful to biologists [1]. Interestingly, the series of Critical Assessment of protein Structure Prediction (CASP) meetings has highlighted that while the methods for generating models of protein structures have improved significantly [2], identifying the native-like conformations among the large collections of model structures (also called decoys) remains a significant challenge [3], [4]. In this paper we focus on this problem.
Anfinsen's thermodynamic hypothesis states that the native structure of a protein is determined only by its amino acid sequence [5]. Structural and computational biologists translate this postulate into the statement that, under physiological conditions, the native state of a protein is a unique, stable minimum of the free energy. The key to solving the protein structure prediction problem therefore amounts to finding an accurate representation of this free energy function, and several methods have been proposed to...