ABSTRACT
A number of computational approaches have been developed to reengineer promising chimeric proteins one at a time through targeted point mutations. In this article, we introduce the computational procedure IPRO (iterative protein redesign and optimization procedure) for the redesign of an entire combinatorial protein library in one step using energy-based scoring functions. IPRO relies on identifying mutations in the parental sequences that, when propagated downstream into the combinatorial library, improve the average quality of the library (e.g., stability, binding affinity, or specific activity). Residue and rotamer design choices are driven by a globally convergent mixed-integer linear programming formulation. Unlike many of the available computational approaches, the procedure allows for backbone movement as well as redocking of the associated ligands after a prespecified number of design iterations. As a limiting case, IPRO can also be used to redesign a single sequence or a handful of individual sequences. The application of IPRO is highlighted through the redesign of a 16-member library of Escherichia coli/Bacillus subtilis dihydrofolate reductase hybrids, both individually and through upstream parental sequence redesign, to improve the average binding energy. Computational results demonstrate that it is indeed feasible to improve the overall library quality, as exemplified by binding energy scores, through targeted mutations in the parental sequences.
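The idea that a single parental mutation propagates into many library members can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two eight-residue parents, the three crossover positions (which yield 2^4 = 16 hybrids, one possible way to obtain a 16-member library), and the `toy_energy` function, which is a placeholder and not an energy-based scoring function of the kind IPRO uses. The sketch does not model the mixed-integer linear programming step, backbone movement, or ligand redocking.

```python
from itertools import product


def build_library(parents, boundaries):
    """Enumerate every hybrid formed by taking each fragment from one parent.

    `boundaries` are hypothetical crossover positions that split the aligned,
    equal-length parental sequences into fragments.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parental sequences must be aligned to equal length")
    cuts = [0, *boundaries, length]
    fragments = list(zip(cuts[:-1], cuts[1:]))
    library = []
    # One parent index per fragment; all combinations give the full library.
    for choice in product(range(len(parents)), repeat=len(fragments)):
        library.append(
            "".join(parents[p][start:end] for p, (start, end) in zip(choice, fragments))
        )
    return library


def toy_energy(seq):
    """Placeholder score (lower is better): rewards each 'W' residue.

    Purely illustrative; a real scoring function would be energy-based.
    """
    return -seq.count("W")


def average_score(library, score):
    """Average library quality under the given scoring function."""
    return sum(score(s) for s in library) / len(library)


def mutate(seq, position, residue):
    """Return a copy of `seq` with a point mutation at `position`."""
    return seq[:position] + residue + seq[position + 1:]


# Hypothetical parents and crossover positions (4 fragments, 16 hybrids).
parent_a, parent_b = "AAAAAAAA", "GGGGGGGG"
boundaries = [2, 4, 6]

library = build_library([parent_a, parent_b], boundaries)
before = average_score(library, toy_energy)

# A single upstream mutation in parent A, inside the second fragment.
mutated_a = mutate(parent_a, 3, "W")
new_library = build_library([mutated_a, parent_b], boundaries)
after = average_score(new_library, toy_energy)

carriers = sum("W" in s for s in new_library)
print(len(library), carriers, before, after)
```

In this toy setting, the mutation reaches every hybrid that inherits the second fragment from parent A, which is half of the library, so one upstream change shifts the library average without redesigning each hybrid separately.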
BACKGROUND AND INTRODUCTION
The ability to proactively modify protein structure and function through a series of targeted mutations is an open challenge that is central to many different applications. These include, among others, enhanced catalytic activity (1-3) and stability (4,5), creation of gene switches for the control of gene expression for use in gene therapy and metabolic engineering (6,7), signal transduction (8,9), genetic recombination (10), motor protein function, and regulation of cellular processes (see Bishop et al. (11) for a review). This task is complicated by the fact that proteins rely on complex networks of subtle interactions to enable function (12-14). The effect of a mutation is therefore difficult to assess a priori, because doing so requires capturing its direct and indirect effects on many neighboring amino acids. As a result, most protein engineering paradigms involve the synthesis and screening of multiple protein candidates (a protein library) to improve the odds of identifying proteins with the desired level of functionality. These directed evolution design paradigms (15-20)...