Introduction
Over the past 5 years, massively parallel DNA sequencing platforms have become widely available. 1 As a result, variant data on genomes from healthy subjects and patients are being generated at an unprecedented rate. However, the development of bioinformatics tools for handling these data lags behind, creating a gap between the generation of massive data sets and the ability to fully exploit their biological content. To meet this urgent demand, we previously developed the ANNOVAR (ANNOtate VARiation) software for functional annotation of genetic variants from sequence data. 2 ANNOVAR efficiently uses up-to-date information to annotate genetic variants detected from diverse genomes with user-specified versions of genome builds. Although ANNOVAR has become one of the most widely used annotation tools for sequencing data, the requirement to type command-line arguments makes it inaccessible to many biologists and clinicians who would otherwise benefit from its extensive functionality.
Therefore, we developed a web server called wANNOVAR to facilitate web-based personal genome annotation, using ANNOVAR as the back-end annotation engine. Users simply submit a list of variants (even whole-exome or whole-genome variants), and wANNOVAR processes the submission and generates HTML-based result pages. It offers flexibility by permitting users to select customised filtering criteria and identify a subset of prioritised variants from thousands or even millions of input variants. Below, we describe the implementation of the wANNOVAR server and illustrate its utility using two high-throughput sequencing data sets on Mendelian diseases.
Methods
The web server is composed of a web interface and a background program for executing annotation tasks. Our tests indicated that the server performed well under a light load of user queries. For example, annotating an exome with ~20 000 SNPs and indels takes only a few minutes on the server. The subroutines for handling user queries were written in Perl using the Common Gateway Interface module (CGI.pm). The static and dynamic HTML pages have been tested in different versions of the Internet Explorer, Firefox and Google Chrome browsers.
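The split between a web interface and a background program can be illustrated with a minimal sketch. It is written in Python for brevity and is not the wANNOVAR implementation, which uses Perl and CGI.pm. The names `Job`, `annotate`, `worker` and `run_demo` are hypothetical, and `annotate` is only a placeholder that counts variants per chromosome instead of calling ANNOVAR.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class Job:
    """A submitted annotation task (hypothetical structure)."""
    sample_id: str
    variants: list  # tuples of (chromosome, position, ref allele, alt allele)
    result: dict = field(default_factory=dict)


def annotate(variants):
    """Placeholder for the annotation engine: counts variants per chromosome."""
    counts = {}
    for chrom, _pos, _ref, _alt in variants:
        counts[chrom] = counts.get(chrom, 0) + 1
    return counts


def worker(jobs, done):
    """Background program: take jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        job.result = annotate(job.variants)
        done.append(job)


def run_demo():
    """Stand-in for the web interface: enqueue one submission and wait for it."""
    jobs = queue.Queue()
    done = []
    background = threading.Thread(target=worker, args=(jobs, done))
    background.start()
    jobs.put(Job("sample1", [("1", 100, "A", "G"),
                             ("1", 200, "C", "T"),
                             ("2", 50, "G", "A")]))
    jobs.put(None)  # sentinel: no more submissions
    background.join()
    return done
```

Decoupling submission from execution in this way lets the front end return immediately while a long-running annotation proceeds in the background, which is one plausible reason for the two-part design described above.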
Input fields for the wANNOVAR server include a sample identifier, an email address, a variant file, the reference genome build, the gene definition system and optionally a disease model for running the 'variants reduction' pipeline. The default input...