1. Introduction
Data generated by genomic sequencing projects from a wide variety of species now allow for the assembly of combined protein sequence data sets to reconstruct the universal tree of life (e.g., [1]). On the other hand, it is still an open question whether a universal common ancestor (UCA) of all extant life on Earth existed. Although molecular phylogenetic methods automatically construct a tree whenever a sequence data set is provided, the inferred tree does not guarantee the existence of UCA, because molecular phylogenetics usually assumes its existence implicitly from the outset.
The theory of UCA is supported by a compelling list of circumstantial evidence, as summarized by Theobald [2]. However, there had been no attempt to test the UCA hypothesis among the three domains (or superkingdoms) of life, that is, eubacteria (Bacteria), archaebacteria (Archaea), and eukaryotes (Eukarya), by using molecular sequences until Theobald [2] addressed this problem with a formal statistical test. By using the sequence data sets compiled by Brown et al. [1] and the model selection criterion AIC [3], he showed that the UCA hypothesis is much superior to any independent origin hypothesis, and he concluded that the UCA theory holds. While the UCA hypothesis postulates that eubacteria, archaebacteria, and eukaryotes descended from a single common ancestor called UCA, the independent origin hypotheses include scenarios such as eubacteria having a different origin from that of archaebacteria/eukaryotes, or the three domains having different origins from each other. His attempt is the first step towards the goal of establishing the UCA theory on solid statistical ground. However, his methodology contains some problems for establishing the UCA theory, as discussed by us [4], and, in this communication, we give further details of our arguments.
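The kind of AIC comparison referred to above can be illustrated with a minimal sketch. It assumes the standard definition AIC = -2 ln L + 2k, where ln L is the maximized log-likelihood and k is the number of free parameters, and it assumes that under an independent origin hypothesis the log-likelihoods and parameter counts of the separate trees are simply summed. The function names and all numerical values are hypothetical and chosen only for illustration; they are not taken from Theobald [2].

```python
def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: AIC = -2 ln L + 2k (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * num_params


def aic_independent_origins(trees: list[tuple[float, int]]) -> float:
    """AIC for a hypothesis made of several independent trees.

    Assumption: the trees are statistically independent, so their
    log-likelihoods add and their parameter counts add.
    Each entry of `trees` is (log_likelihood, num_params).
    """
    total_log_likelihood = sum(lnl for lnl, _ in trees)
    total_params = sum(k for _, k in trees)
    return aic(total_log_likelihood, total_params)


# Hypothetical values, for illustration only.
aic_uca = aic(log_likelihood=-1000.0, num_params=20)
aic_indep = aic_independent_origins([(-600.0, 12), (-450.0, 10)])

# A positive difference favors the single-tree (UCA) model.
delta_aic = aic_indep - aic_uca
print(f"AIC(UCA) = {aic_uca}, AIC(independent) = {aic_indep}, delta = {delta_aic}")
```

With these made-up numbers the single-tree model has the smaller AIC. The comparison only ranks the models fitted to a given alignment; it does not by itself show that the alignment is free of assumptions.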
The most serious problem with Theobald’s analysis is that he used the aligned sequences compiled by Brown et al. [1], who were interested in resolving the phylogenetic relationships among archaebacteria, eubacteria, and eukaryotes, including whether each domain of life constitutes a monophyletic clade. They therefore assumed the existence of UCA a priori. Indeed, alignment is a procedure based on the assumption that the sequences have diverged from a common ancestral sequence. Brown et al. wrote “Individual protein families were first computer aligned and then...