© 2025 Martínez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The genetic code, a unifying principle in biology, ensures that all organisms, stemming from a Last Universal Common Ancestor (LUCA), share fundamental rules for translating DNA into proteins. However, codon usage varies across the tree of life, influenced not only by GC-content and proteome composition but also by complex, often less understood rules dependent on each species’ evolutionary trajectory. To better understand these rules, we segregated codons into their functional parts and applied Shannon’s information-theoretic measures to 1,434 species from eight diverse taxonomic groups. We provide robust evidence that the first codon base plays a central role in amino acid determination, while the third base serves an accessory function. Using conditional entropy measures, we rigorously quantified this relationship, universally confirming the greater informational variability of the third base across all sampled species for the first time at this scale. Our analysis revealed significant heterogeneity in coding strategies across different taxonomic groups. Notably, the unique variability observed in Archaea, in contrast to the more constrained patterns in Eukaryotes and Bacteria, underscores the profound influence of evolutionary pressures and distinct life histories on genetic information processing. The identification of outlier species, exhibiting distinct informational profiles, highlights specific instances where unusual lifestyles or ecological niches may have driven unique adaptations in codon usage and underlying informational dependencies. These informational patterns offer a complementary perspective to traditional phylogenetic analyses, further revealing a hierarchical organization of informational dependencies among codon components that sheds light on the intricate grammar of genetic information. 
We also rigorously investigated the relationship between GC-content and our informational measures, concluding that these entropy measures provide valuable insights that cannot be obtained from GC-content alone. This work not only offers a novel framework for quantifying informational properties of codon usage but also reveals previously unappreciated aspects of how genetic information is encoded and processed across life’s domains.
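The abstract does not spell out its formulas, so the following is only a minimal sketch of the kind of measure it names, not the authors' implementation. It assumes codon usage is available as a table of counts and estimates the Shannon entropy of each codon position, plus the conditional entropy of the third base given the first two, obtained through the chain rule H(B3 | B1, B2) = H(B1, B2, B3) - H(B1, B2). The function names and the toy counts are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a distribution given as a mapping of counts."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def codon_position_entropies(codon_counts):
    """Entropy of each codon position and H(B3 | B1, B2), all in bits.

    codon_counts maps three-letter codons to observed counts (assumed input format).
    """
    b1, b2, b3, b12 = Counter(), Counter(), Counter(), Counter()
    for codon, n in codon_counts.items():
        b1[codon[0]] += n       # marginal counts of the first base
        b2[codon[1]] += n       # marginal counts of the second base
        b3[codon[2]] += n       # marginal counts of the third base
        b12[codon[:2]] += n     # joint counts of the first two bases
    # Chain rule: H(B3 | B1, B2) = H(B1, B2, B3) - H(B1, B2)
    h_b3_given_b12 = entropy(codon_counts) - entropy(b12)
    return {
        "H_B1": entropy(b1),
        "H_B2": entropy(b2),
        "H_B3": entropy(b3),
        "H_B3_given_B1B2": h_b3_given_b12,
    }


# Hypothetical toy counts: the four alanine codons used equally often.
toy_counts = {"GCT": 10, "GCC": 10, "GCA": 10, "GCG": 10}
print(codon_position_entropies(toy_counts))
```

In this toy case the first two bases are fixed, so all the variability sits in the third base. Real codon usage tables would give intermediate values that can be compared across species.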

Details

Title
Sampling informational properties of codon usage through the tree of life
Author
Martínez, Octavio; Reyes-Valdés, Manuel Humberto; Ochoa-Alejo, Neftalí
First page
e0335824
Section
Research Article
Publication year
2025
Publication date
Nov 2025
Publisher
Public Library of Science
e-ISSN
1932-6203
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3276035162