The availability of next‐generation sequencing (NGS) technologies and improved computational tools has revolutionized the field of plant molecular systematics (reviewed in Cronn et al., 2012; McCormack et al., 2013; Soltis et al., 2013). Access to genome‐scale data presents exciting opportunities for researchers to develop hundreds or potentially thousands of informative, taxon‐specific loci from nuclear genomes—large, multilocus data sets that can potentially resolve relationships at any phylogenetic scale (e.g., Godden et al., 2012).
Recently, there has been much interest in developing single‐copy nuclear (SCN) loci from new or existing NGS resources such as transcriptomes (i.e., sequences representing the expressed portion of the genome; see Bräutigam and Gowik, 2010; Strickler et al., 2012) or genome skimming data (i.e., low‐coverage genome sequencing; see Straub et al., 2012), and a few pioneering studies have reported great success in developing large sets of orthologous SCN loci with elaborately designed bioinformatic pipelines (e.g., Straub et al., 2011; Rothfels et al., 2013; Weitemier et al., 2014; Tonnabel et al., 2014; Pillon et al., 2014). Nevertheless, SCN locus discovery from NGS data remains a complex process for many researchers with limited bioinformatics training and access to computational resources. To address these challenges, we developed MarkerMiner 1.0, a fully automated, open‐access bioinformatic workflow to aid plant researchers in the discovery of putative orthologous SCN loci and to facilitate downstream marker development activities such as primer or probe design with user‐friendly output.
METHODS AND RESULTS
Overall design of the application
Transcriptome sequencing is a useful approach for acquiring new data for phylogenetic marker development, and it might offer some advantages over genome skimming approaches. For example, the high output of NGS platforms, coupled with the reduced representation afforded by transcriptome sequencing, permits multiplexing of more samples from a clade of interest. This provides a more comprehensive a priori survey of phylogenetic utility across both gene space and the clade of interest than genome skimming on a fixed budget. Moreover, researchers may find that expressed sequence tags (ESTs) or de novo transcriptome assemblies already exist for many groups of angiosperms (e.g., transcriptomes available through the 1000 Plants [oneKP] project; see
MarkerMiner is a novel, command line–based computational workflow that identifies putative orthologous SCN loci present in two or more user‐provided angiosperm transcriptome assemblies and outputs detailed tabular results and sequence alignments for downstream assessment of phylogenetic utility, locus selection, intron‐exon boundary prediction, and primer or probe development for targeted sequencing (see Figs. 1–3). The tool features a user‐configurable command line interface that is backed by a computational pipeline, and its job submission graphical user interface is accessible to researchers with limited bioinformatics training. Moreover, MarkerMiner is freely available via the iPlant cloud computing infrastructure (
MarkerMiner's fully automated workflow (Figs. 1 and 2) is implemented in Python and makes use of specific open‐source bioinformatic software to perform the following data filtering and processing steps: transcript length filtering, putative ortholog filtering, putative SCN locus filtering, secondary transcript reporting, transcript clustering and reorientation, DNA multiple sequence alignments, and DNA profile alignments with protein‐coding reference sequences (CDS) containing masked introns. The tool offers convenient functions with regard to user‐specified filtering parameters and reference CDS, and these are described in more detail below.
Filtering transcriptomes using minimum length parameters
As a first step, MarkerMiner filters each user‐provided transcriptome assembly using a minimum length parameter. By default, the application removes transcripts shorter than 900 bp, but users can specify an alternative minimum length to suit their research needs. Lowering the threshold below the 900‐bp default retains more transcripts for downstream filtering steps, whereas raising it above 900 bp may result in discovery of fewer orthologs between sampled taxa.
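The length filter described above can be illustrated with a minimal Python sketch. This is not MarkerMiner's own code: the function names `parse_fasta` and `filter_by_length` are hypothetical, and the sketch assumes that a transcript of exactly the minimum length is retained.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Tuple[str, str]],
                     min_length: int = 900) -> Dict[str, str]:
    """Keep transcripts whose length is at least min_length (assumed inclusive)."""
    return {name: seq for name, seq in records if len(seq) >= min_length}


# Toy assembly: one transcript at the default threshold, one just below it.
example = ">t1\n" + "A" * 900 + "\n>t2\n" + "A" * 899 + "\n"
kept_default = filter_by_length(parse_fasta(example))
kept_relaxed = filter_by_length(parse_fasta(example), min_length=500)
```

With the default threshold only `t1` is kept; relaxing the threshold to 500 bp keeps both transcripts, which mirrors the trade-off described in the text.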
Filtering putative ortholog pairs with reciprocal BLAST queries
MarkerMiner employs independent reciprocal BLAST (Altschul et al., 1990, 1997) queries on each filtered transcriptome assembly to identify putative orthologs. By default, the application uses the Arabidopsis thaliana (L.) Heynh. proteome from the PLAZA 2.5 database (Van Bel et al., 2012) as a reference. However, we offer the flexibility to use one of 15 additional reference options (see Box 1), and MarkerMiner is updated periodically as new references become available. Under the default settings, the filtered transcripts from each assembly are aligned against Arabidopsis proteins with NCBI‐BLASTX using an E‐value threshold of 0.01 and, conversely, the Arabidopsis proteins are aligned against the filtered transcripts from each assembly with TBLASTN using the same E‐value threshold. The reciprocal top hits from each of the BLAST analyses are retained if they meet the following criteria, respectively: a minimum of 70% of the transcript length is aligned with a reference protein with at least 70% sequence similarity (BLASTX), and a minimum of 80% of the protein length is aligned to a transcript with at least 70% sequence similarity (TBLASTN). These stringency criteria for parsing BLAST output are default parameters, but users have the option to specify alternative criteria.
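The reciprocal top-hit logic can be sketched in Python as follows. This is an illustrative reading of the description above, not the tool's implementation. The `Hit` record, its field names, and the use of bit score to rank hits are assumptions, and the E-value threshold is assumed to have been applied during the BLAST searches themselves.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Hit:
    """One parsed BLAST hit (hypothetical record; values precomputed upstream)."""
    query: str
    subject: str
    bitscore: float
    query_coverage: float  # percent of the query length covered by the alignment
    similarity: float      # percent sequence similarity of the alignment


def best_hit_per_query(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Return the highest-scoring hit for each query (ranking by bit score is assumed)."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


def reciprocal_pairs(blastx_hits: Iterable[Hit],
                     tblastn_hits: Iterable[Hit],
                     min_transcript_cov: float = 70.0,
                     min_protein_cov: float = 80.0,
                     min_similarity: float = 70.0) -> List[Tuple[str, str]]:
    """Return (transcript, protein) pairs that are reciprocal top hits and pass the thresholds."""
    forward = best_hit_per_query(blastx_hits)    # transcript -> protein
    reverse = best_hit_per_query(tblastn_hits)   # protein -> transcript
    pairs = []
    for transcript, fwd in forward.items():
        rev = reverse.get(fwd.subject)
        if rev is None or rev.subject != transcript:
            continue  # not reciprocal
        if (fwd.query_coverage >= min_transcript_cov
                and fwd.similarity >= min_similarity
                and rev.query_coverage >= min_protein_cov
                and rev.similarity >= min_similarity):
            pairs.append((transcript, fwd.subject))
    return sorted(pairs)


# Toy data with made-up identifiers: t1 passes, t2 fails the transcript coverage criterion.
blastx = [Hit("t1", "P1", 500, 85, 90), Hit("t1", "P2", 300, 90, 95),
          Hit("t2", "P2", 400, 60, 90)]
tblastn = [Hit("P1", "t1", 480, 88, 91), Hit("P2", "t2", 390, 95, 92)]
pairs = reciprocal_pairs(blastx, tblastn)
```

Passing different values for the three threshold arguments corresponds to the user-specified stringency criteria mentioned in the text.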
Filtering putative single‐copy nuclear genes
De Smet et al. (2013) reported a carefully curated list of SCN genes as part of a gene family analysis that included 17 genomes broadly distributed across angiosperm phylogeny (i.e., five monocots and 12 eudicots). Of the SCN genes identified by the study, 177 were “strictly single‐copy” in all 17 genomes, and 2809 were “mostly single‐copy” (i.e., single‐copy in most of the genomes, with duplicates detected in at least one to as many as three other genomes) (De Smet et al., 2013). As the evolution of these SCN genes is largely uninfluenced by gene duplication, their sequence evolution is expected to act in concordance with species evolution, making them an invaluable resource in mining for SCN loci from transcriptomes.
MarkerMiner employs a user‐specified SCN gene reference set curated by De Smet et al. (2013) as a final data filter. Putative ortholog pairs whose transcripts have top reciprocal BLAST hits against SCN reference proteins are retained and classified as putative single‐copy ortholog pairs.
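As a rough illustration of this final filter, the sketch below keeps only ortholog pairs whose reference protein appears in a single-copy reference set. The function name, the dictionary layout, and the identifiers are hypothetical; only the "strictly" and "mostly" labels come from the classification described above.

```python
from typing import Dict, Iterable, List, Tuple


def filter_single_copy(pairs: Iterable[Tuple[str, str]],
                       scn_classes: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Keep (transcript, protein) pairs whose protein is in the SCN reference set.

    scn_classes maps a reference protein ID to its single-copy class
    ("strictly" or "mostly"); pairs with other proteins are dropped.
    """
    return [(transcript, protein, scn_classes[protein])
            for transcript, protein in pairs
            if protein in scn_classes]


# Made-up identifiers for illustration only.
scn_reference = {"REF_GENE_A": "strictly", "REF_GENE_B": "mostly"}
ortholog_pairs = [("t1", "REF_GENE_A"), ("t2", "REF_GENE_C"), ("t3", "REF_GENE_B")]
single_copy_pairs = filter_single_copy(ortholog_pairs, scn_reference)
```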
Secondary transcript reporting
There may be cases in which a single‐copy protein has more than one transcript passing the BLAST filtering criteria. However, as previously indicated, only the transcript with the top scoring alignment is reported by MarkerMiner as a putatively orthologous single‐copy transcript. For some researchers, information about additional transcripts with lower scores (which also align uniquely to a single‐copy protein) may be of particular interest. These “secondary transcripts” may represent splice isoforms, putative paralogs, or partially assembled transcripts, although their characterization is difficult in the absence of a reference genome.
Box 1. Reference options available in MarkerMiner 1.0. The default option is indicated with an asterisk (*). Reference genomes and their corresponding annotations were downloaded from the PLAZA 2.5 database (Van Bel et al., 2012).
Arabidopsis lyrata (L.) O'Kane & Al‐Shehbaz
Arabidopsis thaliana (L.) Heynh.*
Brachypodium distachyon (L.) P. Beauv.
Carica papaya L.
Fragaria vesca L.
Glycine max (L.) Merr.
Malus domestica Borkh.
Manihot esculenta Crantz
Medicago truncatula Gaertn.
Oryza sativa L.
Populus trichocarpa Torr. & A. Gray
Ricinus communis L.
Sorghum bicolor (L.) Moench
Theobroma cacao L.
Vitis vinifera L.
Zea mays L.
MarkerMiner reports information about secondary transcripts in additional tabular output. Users can use these results to guide decisions about which loci to pursue for downstream marker development or to investigate further the duplication status of secondary transcripts for particular genes of interest.
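The distinction between the reported transcript and its secondary transcripts can be sketched as below. The input layout (protein ID, transcript ID, alignment score) and the function name are assumptions made for illustration; the sketch assumes every input hit has already passed the BLAST filtering criteria.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_primary_secondary(
    passing_hits: Iterable[Tuple[str, str, float]]
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Split transcripts per single-copy protein into the top-scoring one and the rest.

    passing_hits holds (protein_id, transcript_id, score) tuples. Ties are broken
    alphabetically by transcript ID so the result is deterministic.
    """
    by_protein = defaultdict(list)
    for protein, transcript, score in passing_hits:
        by_protein[protein].append((score, transcript))

    primary: Dict[str, str] = {}
    secondary: Dict[str, List[str]] = {}
    for protein, scored in by_protein.items():
        ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
        primary[protein] = ranked[0][1]
        secondary[protein] = [transcript for _, transcript in ranked[1:]]
    return primary, secondary


# Made-up identifiers: P1 has one primary and two secondary transcripts.
hits = [("P1", "t7", 310.0), ("P1", "t2", 520.0), ("P1", "t9", 150.0), ("P2", "t4", 400.0)]
primary, secondary = split_primary_secondary(hits)
```

Whether a secondary transcript is a splice isoform, a paralog, or an assembly fragment cannot be decided from scores alone, which is why the text recommends further investigation.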
Clustering, reorientation, and alignment of single‐copy transcripts and output
After the transcripts corresponding to SCN loci are filtered from all assemblies, MarkerMiner clusters transcripts by reference protein ID (Fig. 2). The transcripts within each of the resulting SCN gene clusters (or orthogroup sets) are reverse‐complemented as necessary to ensure identical sequence orientation prior to multiple sequence alignment; the corresponding DNA reference sequence of A. thaliana (or an alternative, user‐specified reference) is used to reorient sequences. Next, MarkerMiner outputs a detailed tabular report that includes the following details for each SCN locus detected: a reference gene ID, a single‐copy classification (e.g., “strictly” or “mostly”) according to De Smet et al. (2013), a gene functional description, the number of putative orthologs detected across all assemblies, and a scaffold ID for each of the transcriptome assemblies included in the analysis (Fig. 2; see also the user manual [available at
MarkerMiner outputs two types of alignments to aid researchers with downstream assessments of phylogenetic utility, locus selection, intron‐exon boundary prediction, and primer or probe development. First, a multiple sequence alignment is performed for each gene cluster with MAFFT (Katoh et al., 2002, 2009) using the --quiet and --auto parameters, and alignment files are reported in FASTA format (Figs. 2 and 3). Users can edit these alignments, assess phylogenetic utility among detected loci, infer preliminary phylogenies (if appropriate), or proceed with downstream development of individual loci for phylogenetic applications (Figs. 2 and 3). Second, MarkerMiner aligns the user‐specified reference CDS with intronic regions masked with the character ‘N’ to their respective MAFFT multiple sequence alignments (Fig. 3) by using MAFFT's ‘--add’ functionality (Katoh and Frith, 2012); the intron coordinates correspond to data extracted from the PLAZA 2.5 database.
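The clustering, reorientation, and alignment steps can be sketched as follows. This is a simplified illustration rather than MarkerMiner's code: it assumes the strand of each transcript relative to the reference is already known (for example from a BLAST hit), and all function names and file paths are hypothetical. The MAFFT commands are only constructed, not executed; MAFFT writes its alignment to standard output, so a caller would redirect that stream to a file.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def cluster_and_orient(
    records: Iterable[Tuple[str, str, str, str]]
) -> Dict[str, Dict[str, str]]:
    """Group transcripts by reference ID and put them all in the reference orientation.

    records holds (reference_id, transcript_id, sequence, strand) tuples, where
    strand is "+" or "-" relative to the reference (assumed to be known upstream).
    """
    clusters: Dict[str, Dict[str, str]] = defaultdict(dict)
    for ref_id, transcript_id, seq, strand in records:
        clusters[ref_id][transcript_id] = reverse_complement(seq) if strand == "-" else seq
    return dict(clusters)


def mafft_commands(cluster_fasta: str, aligned_fasta: str,
                   masked_reference_fasta: str) -> Tuple[List[str], List[str]]:
    """Build the two MAFFT command lines; each writes its alignment to stdout."""
    align = ["mafft", "--quiet", "--auto", cluster_fasta]
    add_reference = ["mafft", "--quiet", "--add", masked_reference_fasta, aligned_fasta]
    return align, add_reference


# Made-up identifiers and sequences.
clusters = cluster_and_orient([
    ("REF_GENE_A", "sample1_t1", "ATGGCC", "+"),
    ("REF_GENE_A", "sample2_t5", "GGCCAT", "-"),
    ("REF_GENE_B", "sample1_t8", "TTTAAA", "+"),
])
align_cmd, add_cmd = mafft_commands("geneA.fasta", "geneA.aln.fasta", "geneA.ref_masked.fasta")
```

Each command list could be passed to `subprocess.run` with `stdout` pointed at an open output file, provided MAFFT is installed and on the search path.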
MarkerMiner provides all alignment output in FASTA format. The alignments can be useful for prediction of putative intron‐exon boundaries and approximate intron size, which will facilitate design of primers or probes for amplification or capture of complete or partial intronic regions. For example, intronic regions can be recovered completely using exon‐anchored primer pairs and PCR amplification (Lemmon and Lemmon, 2013; Pillon et al., 2014). Alternatively, intronic regions can also be recovered with hybrid enrichment approaches (e.g., sequence capture; see Lemmon and Lemmon, 2013), whereby probes are designed in the flanking exonic regions of targeted introns (e.g., close to the intron‐exon junction). These probes will facilitate capture of partial or complete intronic regions along with their exonic counterparts during a hybridization step, followed by PCR enrichment and sequencing on NGS platforms. With current sequencing technologies capable of generating read lengths up to 2 × 300 bp (Illumina MiSeq; see
Many of the SCN loci identified by De Smet et al. (2013) correspond to “housekeeping” genes. Due to their wide conservation across eukaryotes, the exonic regions of these genes may offer limited utility at shallow phylogenetic scales (Calonje et al., 2009). Fast‐evolving intronic regions may represent more desirable choices for phylogenetic studies of closely related, recently derived, and rapidly diverging angiosperm lineages (see Godden et al., 2012). MarkerMiner's intron‐exon boundary predictions are based on a user‐specified reference CDS; the accuracy of intron‐exon boundaries and intron sizes will depend on the level of divergence between the user‐specified reference and the taxa under study.
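To make the boundary prediction concrete, the sketch below reads putative intron positions from an alignment row in which the reference has its introns masked with ‘N’. It is an illustrative approach under stated assumptions, not the tool's own procedure: the function names are hypothetical, gap characters are assumed to be ‘-’, and any unmasked ‘N’ in the reference would be misread as intron.

```python
from typing import List, Tuple


def intron_column_spans(aligned_reference: str,
                        mask_char: str = "N") -> List[Tuple[int, int, int]]:
    """Locate masked introns in an aligned reference row.

    Returns (start_column, end_column, masked_length) per intron, with columns
    0-based and end-exclusive. Gaps inside a masked run are tolerated;
    masked_length counts only the mask characters.
    """
    spans = []
    start, last, count = None, None, 0
    for column, char in enumerate(aligned_reference):
        upper = char.upper()
        if upper == mask_char:
            if start is None:
                start, count = column, 0
            last = column
            count += 1
        elif upper == "-":
            continue  # a gap neither opens nor closes a masked run
        elif start is not None:
            spans.append((start, last + 1, count))
            start = None
    if start is not None:
        spans.append((start, last + 1, count))
    return spans


def boundary_in_transcript(aligned_transcript: str, column: int) -> int:
    """Count the transcript bases to the left of an alignment column."""
    return sum(1 for char in aligned_transcript[:column] if char != "-")


# Toy alignment: the reference carries a 6-bp masked intron.
reference_row = "ATGGCCNNNNNNTTTAAA"
transcript_row = "ATGGCC------TTTAAA"
spans = intron_column_spans(reference_row)
junction = boundary_in_transcript(transcript_row, spans[0][0])
```

Here the predicted exon junction falls after the sixth transcript base. As noted above, the masked length reflects the reference genome, so intron sizes in the study taxa may differ.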
Accessibility and high‐performance computing
MarkerMiner is open‐source and is made freely accessible to the research community for use in a local computing environment as well as via the iPlant Collaborative Atmosphere cloud‐computing infrastructure (
Tests of MarkerMiner using oneKP transcriptomes
We evaluated the performance of MarkerMiner and tested its efficacy for SCN locus discovery with four data sets comprising transcriptome assemblies from the oneKP project: Lamiales (n = 77), Amaryllidaceae s.l. (n = 7), Draba L. (n = 6), and Solanum L. (n = 6) (see Appendix 1 for a list of samples). The selected data sets represent groups broadly distributed across angiosperm phylogeny (e.g., asterids, rosids, and monocots sensu APG III [2009]) and actual marker development projects (or test cases) focused on resolving relationships at different phylogenetic scales (e.g., interfamilial [Lamiales], intrafamilial [Amaryllidaceae s.l.], and intrageneric [Draba and Solanum]).
The total number of distinct, putative SCN loci detected by MarkerMiner (Fig. 4A) for each clade ranged from 666 (Draba) to 1993 (Lamiales) (mean = 1217, median = 1106, standard deviation = 560), with a mean of 535 loci detected per transcriptome accession across the four test cases (median = 584, standard deviation = 226, range = 0–909; results for individual data sets are reported in Fig. 4B). The distribution of shared SCN loci identified across all sampled accessions within each of the four test cases showed a negative trend (Fig. 4C); few loci were shared by all accessions, and most loci were detected in only one to three accessions. Nevertheless, at least 13% (Solanum) to 22% (Lamiales and Draba) of the SCN loci were shared by at least half of the sampled accessions in each test case (mean = 18%, median = 18%, and standard deviation = 5% across all four test cases), providing adequate data for downstream assessments of phylogenetic utility and primer or probe development.
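The sharing statistic reported above (the proportion of loci present in at least half of the accessions) can be computed from per-accession locus lists as in the sketch below. The function name and data layout are hypothetical, the toy data are not the oneKP results, and "at least half" is assumed to round up for odd numbers of accessions.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def sharing_summary(loci_by_accession: Dict[str, Iterable[str]]) -> Dict[str, float]:
    """Summarize how widely each detected locus is shared across accessions."""
    counts = Counter(locus
                     for loci in loci_by_accession.values()
                     for locus in set(loci))
    n_accessions = len(loci_by_accession)
    threshold = math.ceil(n_accessions / 2)  # "at least half", rounded up
    shared = sum(1 for c in counts.values() if c >= threshold)
    return {
        "distinct_loci": len(counts),
        "shared_by_half": shared,
        "fraction_shared_by_half": shared / len(counts) if counts else 0.0,
    }


# Toy example with four accessions and four loci.
toy = {
    "acc1": ["L1", "L2", "L3"],
    "acc2": ["L1", "L2"],
    "acc3": ["L1"],
    "acc4": ["L4"],
}
summary = sharing_summary(toy)
```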
The phylogenetic utility of putative single‐copy genes amplified using primers developed via a preliminary version of MarkerMiner (developed by S. Chamala) was documented in Metrosideros Banks ex Gaertn. (Pillon et al., 2014). Intron regions were amplified by designing primers on flanking exons using putative intron‐exon boundary information determined by aligning cDNA sequences with those of Arabidopsis genes.
Researchers should be aware that loci detected by MarkerMiner might not be single‐copy in their clade of study. Evaluation of the single‐copy status of genes is needed within the clade of interest, for example using phylogenetic (e.g., Pillon et al., 2013) or other (e.g., Duarte et al., 2010) approaches.
CONCLUSIONS
MarkerMiner, as demonstrated by our tests with oneKP data, represents an easy‐to‐use and effective tool for phylogenetic marker development. Researchers with limited bioinformatics training and limited access to high‐performance computing resources can use MarkerMiner to identify hundreds of putative SCN genes for phylogenomic analyses of any angiosperm group of interest. While we acknowledge that transcriptomic approaches to marker development may result in large numbers of missing loci across the surveyed samples (as demonstrated by each of our four test cases with oneKP data), the cautionary emphasis placed on individual gene absences may be overstated. First, most of the putative single‐copy genes detected by MarkerMiner have general “housekeeping” functions (Duarte et al., 2010; De Smet et al., 2013). Thus, individual gene absences across surveyed transcriptomes are more likely to represent differences in sequencing quality and coverage across samples than actual gene losses. These differences can be mitigated with careful sample preparation and planning of marker development projects involving NGS (e.g., standardized tissue collection practices and realistic limits to multiplexing). Second, our MarkerMiner results indicated that a large proportion of the putative SCN loci are generally shared by at least half of the surveyed transcriptomes. Despite missing data across our oneKP transcriptomes, MarkerMiner was able to recover ample data for assessments of phylogenetic utility and downstream marker development applications with as few as six transcriptomes.
The downstream processes for selecting and developing markers for targeted sequencing are more or less the same for approaches that use either transcriptomic or genome skimming data, with the caveat that the phylogenetic utility of noncoding loci cannot be assessed a priori from transcriptome data. Nevertheless, as suggested by our results, transcriptomic approaches using MarkerMiner are both economical and efficient, and MarkerMiner's multipurpose output can facilitate marker development projects targeting coding and noncoding regions.
Appendix 1.
Transcriptome assemblies from the 1000 Plants (oneKP) project used for the development and testing of MarkerMiner 1.0. Four test cases are shown: (1) Amaryllidaceae s.l., (2) Lamiales (including outgroups from Boraginales, Gentianales, and Solanales), (3) Draba, and (4) Solanum.
APG III clade | Order | Family | Taxon | oneKP sample ID |
Amaryllidaceae s.l. | ||||
Monocots | Asparagales | Amaryllidaceae s.l. | Allium sativum L. | GJPF |
Monocots | Asparagales | Amaryllidaceae s.l. | Agapanthus africanus (L.) Hoffmanns. | PRFO |
Monocots | Asparagales | Amaryllidaceae s.l. | Narcissus viridiflorus Schousb. | IQYY |
Monocots | Asparagales | Amaryllidaceae s.l. | Phycella cyrtanthoides (Sims) Lindl. | DMIN |
Monocots | Asparagales | Amaryllidaceae s.l. | Rhodophiala splendens (Renjifo) Traub | JDTY |
Monocots | Asparagales | Amaryllidaceae s.l. | Traubia modesta (Phil.) Ravenna | ZKPF |
Monocots | Asparagales | Amaryllidaceae s.l. | Zephyranthes treatiae S. Watson | DPFW |
Lamiales | ||||
Core eudicots/asterids/lamiids | Boraginales | Boraginaceae | Ehretia acuminata R. Br. | EMAL |
Core eudicots/asterids/lamiids | Boraginales | Boraginaceae | Lennoa madreporoides La Llave & Lex. | SMUR |
Core eudicots/asterids/lamiids | Boraginales | Boraginaceae | Mertensia paniculata (Aiton) G. Don | DKFZ |
Core eudicots/asterids/lamiids | Boraginales | Boraginaceae | Phacelia campanularia A. Gray | YQIJ |
Core eudicots/asterids/lamiids | Boraginales | Boraginaceae | Pholisma arenarium Nutt. | HANM |
Core eudicots/asterids/lamiids | Gentianales | Gentianaceae | Exacum affine Balf. f. | KPUM |
Core eudicots/asterids/lamiids | Gentianales | Rubiaceae | Galium boreale L. | WQRD |
Core eudicots/asterids/lamiids | Lamiales | Acanthaceae | Anisacanthus quadrifidus (Vahl) Nees | PCGJ |
Core eudicots/asterids/lamiids | Lamiales | Acanthaceae | Ruellia brittoniana Leonard | AYIY |
Core eudicots/asterids/lamiids | Lamiales | Acanthaceae | Sanchezia Ruiz & Pav. | NBMW |
Core eudicots/asterids/lamiids | Lamiales | Acanthaceae | Strobilanthes dyeriana Mast. | WEAC |
Core eudicots/asterids/lamiids | Lamiales | Bignoniaceae | Kigelia africana (Lam.) Benth. | QKEI |
Core eudicots/asterids/lamiids | Lamiales | Bignoniaceae | Kigelia africana (Lam.) Benth. | SVQC |
Core eudicots/asterids/lamiids | Lamiales | Bignoniaceae | Mansoa alliacea (Lam.) A. H. Gentry | TKEK |
Core eudicots/asterids/lamiids | Lamiales | Bignoniaceae | Tabebuia umbellata (Sond.) Sandwith | UTQR |
Core eudicots/asterids/lamiids | Lamiales | Byblidaceae | Byblis gigantea Lindl. | GDZS |
Core eudicots/asterids/lamiids | Lamiales | Calceolariaceae | Calceolaria pinifolia Cav. | DCCI |
Core eudicots/asterids/lamiids | Lamiales | Gesneriaceae | Saintpaulia ionantha H. Wendl. | RWKR |
Core eudicots/asterids/lamiids | Lamiales | Gesneriaceae | Sinningia tuberosa (Mart.) H. E. Moore | DTNC |
Core eudicots/asterids/lamiids | Lamiales | Gratiolaceae | Bacopa caroliniana (Walter) B. L. Rob. | CLRW |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Agastache rugosa (Fisch. & C. A. Mey.) Kuntze | PUCW |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Ajuga reptans L. | UCNM |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Lavandula angustifolia Mill. | FYUH |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Leonurus japonicus Houtt. | SNNC |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Marrubium vulgare L. | EAAA |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Melissa officinalis L. | TAGM |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Clinopodium serpyllifolium (M. Bieb.) Kuntze subsp. fruticosum (L.) Bräuchler | WHNV |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Nepeta cataria L. | FUMQ |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Oxera neriifolia (Montrouz.) Beauvis. | GNPX |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Oxera pulchella Labill. | RTNA |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Pogostemon cablin (Blanco) Benth. | GETL |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Poliomintha bustamanta B. L. Turner | XMBA |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Prunella vulgaris L. | PHCE |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Pycnanthemum tenuifolium Schrad. | DYFF |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Rosmarinus officinalis L. | FDMM |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Salvia L. | EQDA |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Scutellaria montana Chapm. | ATYL |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Plectranthus scutellarioides (L.) R. Br. | BAHE |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Teucrium chamaedrys L. | LRRR |
Core eudicots/asterids/lamiids | Lamiales | Lamiaceae | Thymus vulgaris L. | IYDF |
Core eudicots/asterids/lamiids | Lamiales | Lentibulariaceae | Pinguicula agnata Casper | MXFG |
Core eudicots/asterids/lamiids | Lamiales | Lentibulariaceae | Pinguicula caudata Schltdl. | JCMU |
Core eudicots/asterids/lamiids | Lamiales | Lentibulariaceae | Utricularia L. | HRUR |
Core eudicots/asterids/lamiids | Lamiales | Oleaceae | Chionanthus retusus Paxton | KTAR |
Core eudicots/asterids/lamiids | Lamiales | Oleaceae | Forestiera segregata (Jacq.) Krug & Urb. | UEEN |
Core eudicots/asterids/lamiids | Lamiales | Oleaceae | Ligustrum sinense Lour. | MZLD |
Core eudicots/asterids/lamiids | Lamiales | Oleaceae | Olea europaea L. | TORX |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Conopholis americana (L.) Wallr. | FAMO |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Epifagus virginiana (L.) W. P. C. Barton | URZI |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Epifagus virginiana (L.) W. P. C. Barton | XMOG |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Lindenbergia philippinensis Benth. | WUZV |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Lindenbergia philippinensis Benth. | ZVFS |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Orobanche fasciculata Nutt. | PHOQ |
Core eudicots/asterids/lamiids | Lamiales | Orobanchaceae | Orobanche fasciculata Nutt. | VTOK |
Core eudicots/asterids/lamiids | Lamiales | Paulowniaceae | Paulownia fargesii Franch. | UMUL |
Core eudicots/asterids/lamiids | Lamiales | Pedaliaceae | Uncarina grandidieri (Baill.) Stapf | ZRIN |
Core eudicots/asterids/lamiids | Lamiales | Plantaginaceae | Antirrhinum majus L. | EBOL |
Core eudicots/asterids/lamiids | Lamiales | Plantaginaceae | Antirrhinum majus L. | TPUT |
Core eudicots/asterids/lamiids | Lamiales | Plantaginaceae | Antirrhinum braun‐blanquetii Rothm. | YRHD |
Core eudicots/asterids/lamiids | Lamiales | Plantaginaceae | Digitalis purpurea L. | GNRI |
Core eudicots/asterids/lamiids | Lamiales | Plantaginaceae | Plantago maritima L. | YKZB |
Core eudicots/asterids/lamiids | Lamiales | Plantaginaceae | Plantago virginica L. | PTBJ |
Core eudicots/asterids/lamiids | Lamiales | Rehmanniaceae | Rehmannia glutinosa Steud. | OWAS |
Core eudicots/asterids/lamiids | Lamiales | Schlegeliaceae | Schlegelia parasitica (Sw.) Miers ex Griseb. | GAKQ |
Core eudicots/asterids/lamiids | Lamiales | Schlegeliaceae | Schlegelia parasitica (Sw.) Miers ex Griseb. | CWLL |
Core eudicots/asterids/lamiids | Lamiales | Schlegeliaceae | Schlegelia violacea Griseb. | EDXZ |
Core eudicots/asterids/lamiids | Lamiales | Scrophulariaceae | Anticharis glandulosa Asch. | EJBY |
Core eudicots/asterids/lamiids | Lamiales | Scrophulariaceae | Buddleja L. | GRFT |
Core eudicots/asterids/lamiids | Lamiales | Scrophulariaceae | Buddleja lindleyana Lindl. | XRLM |
Core eudicots/asterids/lamiids | Lamiales | Scrophulariaceae | Celsia arcturus Jacq. | SIBR |
Core eudicots/asterids/lamiids | Lamiales | Scrophulariaceae | Verbascum L. | XXYA |
Core eudicots/asterids/lamiids | Lamiales | Tetrachondraceae | Polypremum procumbens L. | COBX |
Core eudicots/asterids/lamiids | Lamiales | Verbenaceae | Lantana camara L. | PSHB |
Core eudicots/asterids/lamiids | Lamiales | Verbenaceae | Phyla dulcis (Trevir.) Moldenke | MQIV |
Core eudicots/asterids/lamiids | Lamiales | Verbenaceae | Verbena hastata L. | GCFE |
Core eudicots/asterids/lamiids | Solanales | Convolvulaceae | Ipomoea pubescens Lam. | EMBR |
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum ptychanthum Dunal | DLJZ |
Draba | ||||
Core eudicots/rosids/malvids | Brassicales | Brassicaceae | Draba aizoides L. | HABV |
Core eudicots/rosids/malvids | Brassicales | Brassicaceae | Draba hispida Willd. | GTSV |
Core eudicots/rosids/malvids | Brassicales | Brassicaceae | Draba magellanica Lam. | UVQL |
Core eudicots/rosids/malvids | Brassicales | Brassicaceae | Draba oligosperma Hook. | LAPO |
Core eudicots/rosids/malvids | Brassicales | Brassicaceae | Draba ossetica (Rupr.) Sommier & Levier | LJQF |
Core eudicots/rosids/malvids | Brassicales | Brassicaceae | Draba sachalinensis Trautv. | BXBF |
Solanum | ||||
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum cheesmaniae (L. Riley) Fosberg | UGJI |
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum dulcamara L. | GHLP |
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum lasiophyllum Humb. & Bonpl. ex Dunal | DLAI |
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum ptychanthum Dunal | DLJZ |
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum sisymbriifolium Lam. | NMDZ |
Core eudicots/asterids/lamiids | Solanales | Solanaceae | Solanum virginianum L. | LQJY |
© 2015. This work is published under http://creativecommons.org/licenses/by-nc/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Premise of the study:
Targeted sequencing using next‐generation sequencing (NGS) platforms offers enormous potential for plant systematics by enabling economical acquisition of multilocus data sets that can resolve difficult phylogenetic problems. However, because discovery of single‐copy nuclear (SCN) loci from NGS data requires both bioinformatics skills and access to high‐performance computing resources, the application of NGS data has been limited.
Methods and Results:
We developed MarkerMiner 1.0, a fully automated, open‐access bioinformatic workflow and application for discovery of SCN loci in angiosperms. Our new tool identified as many as 1993 SCN loci from transcriptomic data sampled as part of four independent test cases representing marker development projects at different phylogenetic scales.
Conclusions:
MarkerMiner is an easy‐to‐use and effective tool for discovery of putative SCN loci. It can be run locally or via the Web, and its tabular and alignment outputs facilitate efficient downstream assessments of phylogenetic utility, locus selection, intron‐exon boundary prediction, and primer or probe development.
1 Department of Biology, University of Florida, Gainesville, Florida, USA
2 Department of Biology, University of Florida, Gainesville, Florida, USA; Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA; Facultad de Ciencias Forestales y Conservación de la Naturaleza, Universidad de Chile, Santiago, Chile
3 Department of Biology, University of Florida, Gainesville, Florida, USA; Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA; Rancho Santa Ana Botanic Garden, Claremont, California, USA
4 Plant Genomics, J. Craig Venter Institute, Rockville, Maryland, USA
5 Department of Biology, Bucknell University, Lewisburg, Pennsylvania, USA; Jepson and University Herbaria, University of California, Berkeley, Berkeley, California, USA
6 Department of Plant Systems Biology, Vlaams Instituut voor Biotechnologie, 9052 Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
7 Department of Biology, University of Florida, Gainesville, Florida, USA; Genetics Institute, University of Florida, Gainesville, Florida, USA
8 Department of Biology, University of Florida, Gainesville, Florida, USA; Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA; Genetics Institute, University of Florida, Gainesville, Florida, USA
9 Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA; Genetics Institute, University of Florida, Gainesville, Florida, USA