Abstract
Recent developments in high-throughput biology have enabled the systematic exploration of the relationship between genomic variants and phenotypes. The immense amount of data generated by high-throughput experiments, however, poses challenges to researchers. New statistical and computational approaches are needed to use the data efficiently and to draw biologically meaningful conclusions. In this thesis research, we developed new methods that take advantage of high-throughput biological data to tackle three important problems: analyzing genome-wide association studies between human genomic variants and human phenotypes, finding co-complexed proteins in protein interaction networks, and estimating the false-positive and false-negative rates of two-hybrid protein-protein interaction screens. We also present a database designed to compile, and perform preliminary analyses of, systematic mutations of yeast histones.
The new gene-based association test that we developed has improved power compared to previous methods because it merges multiple weak associations within a gene into a stronger combined signal. Applying the new approach to ECG traits recovered two additional genome-wide significant loci beyond the four identified by traditional methods. Both new findings were validated in a meta-analysis using a larger population.
Protein complexes are basic functional units in biological processes. Finding proteins that reside in the same complex can provide important information for understanding disease mechanisms. We reviewed current methods and proposed new methods for finding co-complexed proteins from 'seed' proteins using confidence-weighted networks of physical protein interactions. We systematically evaluated all approaches and explored how different confidence metrics affect their performance.
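As a rough illustration of how several weak variant-level associations within a gene can combine into one stronger gene-level signal, the sketch below uses Fisher's combined probability test. This is a classical stand-in, not the test developed in the thesis. It assumes the variant-level p-values are independent, which is generally not true for variants in linkage disequilibrium. The function name and the example p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Assumption: the p-values are independent. Variants within one gene are
    often correlated, so this is only a didactic stand-in for a real
    gene-based test.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p), chi-square with 2k degrees
    # of freedom under the null. We keep X / 2 for convenience.
    half_stat = -sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function has
    # the closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


if __name__ == "__main__":
    # Hypothetical p-values for three variants in one gene; none would be
    # genome-wide significant on its own.
    variant_p = [0.01, 0.02, 0.03]
    print(fisher_combined_p(variant_p))
```

With these made-up inputs the combined p-value is smaller than any single variant's p-value, which is the sense in which weak signals reinforce one another.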
To help improve the protein physical interaction network, we extended capture-recapture theory to estimate protein-specific false-positive and false-negative rates in yeast two-hybrid screens. Analysis of yeast, worm and fly protein-protein interaction data indicated that 25% to 45% of the reported interactions are likely false positives. The overall false-negative rate ranges from 75% for worm to 90% for fly, which arises in part from a roughly 50% false-negative rate due to statistical under-sampling.
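For readers unfamiliar with capture-recapture, the sketch below shows the classical two-sample idea using Chapman's bias-corrected form of the Lincoln-Petersen estimator. It is not the protein-specific extension described above. It assumes two independent screens that sample the same set of true interactions, and it ignores false positives entirely. The function names and the counts are hypothetical.

```python
def chapman_estimate(n1, n2, m):
    """Estimate the total number of true interactions from two screens.

    n1, n2: interactions reported by screen 1 and screen 2
    m:      interactions reported by both screens
    Assumption: the screens are independent samples of the same population
    and every reported interaction is real.
    """
    if min(n1, n2, m) < 0 or m > min(n1, n2):
        raise ValueError("need 0 <= m <= min(n1, n2)")
    # Chapman's bias-corrected form of the Lincoln-Petersen estimator
    # n1 * n2 / m; it stays finite when the overlap m is zero.
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def false_negative_rate(n_found, n_total):
    """Fraction of the estimated true interactions that a screen missed."""
    if n_total <= 0:
        raise ValueError("estimated total must be positive")
    return 1.0 - n_found / n_total


if __name__ == "__main__":
    # Hypothetical counts: 60 and 50 reported interactions, 20 in common.
    total = chapman_estimate(60, 50, 20)
    print(total, false_negative_rate(60, total))
```

A small overlap between two screens implies a large unseen population, and therefore a high false-negative rate for each screen.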
Histones are the basic protein components of nucleosomes. They are among the most conserved proteins and are subject to a plethora of post-translational modifications. We designed a database of systematic histone mutations. The database combines histone mutant phenotypes with information about sequences, structures, post-translational modifications and evolutionary conservation. Preliminary analyses confirm that mutations at highly conserved residues and at modifiable residues are more likely to generate phenotypes.