Plants have acquired the ability to adapt and respond to varying environmental conditions through modifications in their developmental programs. This adaptability relies on the plant’s capacity to sense environmental cues and respond via diverse signal transduction pathways and transcriptional regulation. Transcription factors are central in these processes, orchestrating specific gene expression in both developmental and stress responses. In Arabidopsis thaliana, 91% of transcription factors contain large intrinsically disordered regions (IDRs). The structural flexibility in these regions is critical in protein-protein interactions and contributes to functional versatility across different cell types. MADS-domain transcription factors constitute a eukaryotic protein family involved in a diversity of developmental processes and stress responses. Using bioinformatic tools, we found that most Arabidopsis MADS-domain proteins contain IDRs (≥30 residues) in their C-terminal region, with a higher proportion of global disorder in Type II compared to Type I MADS-domain proteins. Remarkably, orthologous proteins from non-plant species in the Eukarya domain (Drosophila melanogaster, Saccharomyces cerevisiae, and Homo sapiens) also present disordered C-terminal regions, containing longer IDRs than those found in Arabidopsis or the other plant species analyzed. Additionally, conserved motifs were identified within the C-terminal IDRs of Arabidopsis Type I and Type II MADS-domain proteins, suggesting interactions with co-regulatory partners. We also identified putative activation domains in the C-terminal region of Type I and Type II MADS-domain proteins. The involvement of IDRs in selecting co-regulators is further supported by the identification of Molecular Recognition Features (MoRFs) in Type II MADS-domain proteins.
The conserved structural disorder in the C-terminal region of MADS-domain proteins, which includes specific motifs, across diverse domains of life provides valuable insights into their structural properties and mechanisms of action as transcriptional regulators.
Introduction
Plants have developed a wide range of mechanisms to adapt to changing environmental conditions. Their ability to sense various signals and respond accordingly, through diverse and complex signaling pathways that achieve accurate gene regulation, allows them to exhibit phenotypic plasticity. This plasticity enables plants to optimize their growth and development in response to factors such as light, temperature, water availability, nutrient levels, and biotic stresses. Remarkably, different plant species share plastic traits associated with their specific habitats (e.g., [1]), highlighting the adaptive role of plastic responses and underscoring the functional role of the genes involved and their regulation [2].
Transcription factors (TFs) are central regulators in all organisms, playing key roles in diverse biological processes, such as growth, differentiation, hormone signaling, stress responses, and immune defense. By integrating signals from different pathways, TFs orchestrate the complex regulatory networks that underlie plant development and adaptation, enabling plants to respond dynamically to changing environmental conditions [3–9].
MADS-domain TFs belong to a eukaryotic protein family that participates in several developmental processes and stress responses [10,11]. The name MADS-box is derived from the four founding members of this family: MINICHROMOSOME MAINTENANCE 1 (MCM1) in yeast, AGAMOUS (AG) in Arabidopsis thaliana (from now on Arabidopsis), DEFICIENS (DEF) in Antirrhinum majus (snapdragon), and SERUM RESPONSE FACTOR (SRF) in humans [12]. The MADS (M) domain is highly conserved and present in all members of the family and is typically located in the N-terminal region of the protein. However, there are exceptions to this pattern, such as in SRF in humans and MCM1 in yeast, where the MADS domain is found in different positions within the protein [13,14]. While some organisms have just a few MADS-box genes, such as Saccharomyces cerevisiae (yeast) with four, Drosophila melanogaster with two, and Homo sapiens with five [15–18], plants exhibit a considerably higher abundance of these genes (52–167 in 21 angiosperms) [19–22]. MADS-domain proteins are classified into two main types: Type I (Serum Response Factor (SRF)-like), and Type II (Myocyte Enhancer Factor (MEF)-like) based on their sequence characteristics, genomic organization, and functional roles [23,24]. In yeast, MADS-domain proteins are involved in key processes such as arginine metabolism, osmotic stress response, and mating type regulation. In Drosophila, MEF2 genes participate in muscle differentiation, while in humans the different MEF2 genes also contribute to heart and neural development, and the response to various diseases [13,16,25–27]. In plants, MADS-domain proteins play several roles in development, encompassing the transition to flowering, flower organ development, fruit ripening, root development, and vegetative phase change, among others [24]. 
In Arabidopsis, Type I MADS-domain proteins have been primarily implicated in female gametogenesis and seed development, while Type II MADS-domain proteins play central roles in controlling floral organ identity and flower development in angiosperms and are involved in almost all Arabidopsis developmental processes [24,28]. Type I MADS-domain proteins typically consist of three domains: M, I (intervening), and C (C-terminal domain), whereas Type II proteins possess four domains, forming the acronym MIKC [29–33]. In both cases, the “M” domain is responsible for DNA binding but also participates in protein-protein interactions. MADS-domain proteins bind to a specific DNA sequence known as the “CArG” box located in the regulatory regions of target genes. The “I” domain is located between the “M” and the “K” domains. While it is less conserved than the “M” domain, it also participates in DNA-binding and dimerization specificity [33,34]. The “K” domain derives its name from its structural resemblance to proteins known as keratins. This domain is involved in protein-protein interactions and contributes to the formation of higher-order complexes. Finally, the “C” domain is located at the C-terminal of the protein and exhibits variations in length and sequence among different MADS-domain proteins. For some MADS-domain proteins, it has been shown that this domain participates in transactivation and in the formation of higher-order complexes [35].
A phylogenetic analysis of 107 Arabidopsis MADS-box sequences revealed that these genes can be grouped into two main lineages (Type I and Type II) or five subfamilies: Mα, Mβ, Mγ, MIKCc, and MIKC*. The Mα, Mβ, and Mγ subfamilies belong to Type I MADS-domain proteins, whereas the MIKCc and MIKC* subfamilies are classified as Type II [24,36]. The genomic distribution of MIKC genes, along with evidence from genome history, suggests that these genes existed before the Arabidopsis genome polyploidization event and are distinct from the Mα, Mβ, and Mγ subfamilies [23,36].
In Arabidopsis, 91% of TFs are characterized by large intrinsically disordered regions (IDRs), crucial for diverse cellular functions [37]. IDRs are segments of proteins that lack a fixed or stable three-dimensional structure under physiological conditions. Their unique physicochemical properties arise from the specific nature of their amino acid sequences and the characteristics of the individual amino acid residues within these sequences. Intrinsically disordered proteins (IDPs) and/or IDRs often lack a significant number of hydrophobic residues typically associated with folded protein domains whereas they are enriched in amino acids that allow their flexibility [38]. The IDP amino acid sequences enable a structural dynamic that depends on the physicochemical properties of their environment and the interactions with other molecules [39,40]. This flexibility plays a pivotal role in mediating IDP multiple protein-protein interactions and in their participation in different developmental processes, such as cell cycle, transcriptional control, and responses to different stress conditions [41,42]. However, only a limited number of TF families have been analyzed, highlighting the presence of IDRs and their association with their role in their respective functions [43–47].
Interestingly, IDRs often contain specific sequences known as Molecular Recognition Features (MoRFs); these short, transiently folding sequences play a key role in determining site-partner specificity [48,49]. MoRFs show distinctive physicochemical characteristics that may aid in protein interaction through the hydrophobicity of their amino acid composition (rich in proline and methionine) and the hydrophilic nature of IDRs [50,51]. The charge of individual amino acids and their electrostatic interactions affect the conformational structure of the protein, which in turn affects its binding specificity and stability. This has led to the hypothesis that MoRFs are key participants in the multi-partner binding capacity of hub proteins [52–54]. The dynamic interconnectivity of IDPs/IDRs has also been associated with their ability to aggregate through liquid-liquid phase separation (LLPS), a process that plays a critical role in protein and RNA organization [55]. LLPS is now recognized as a fundamental organizational and regulatory principle across all organisms. It enables the concentration of specific proteins and nucleic acids into biomolecular condensates, i.e., membrane-less organelles that regulate cellular activity by localizing particular proteins, accelerating enzymatic reactions, and favoring selective interactions. This dynamic regulation supports the precise control of a diversity of processes involved in growth, development, and responses to environmental changes and pathogens [56].
In this work, we used bioinformatic tools and publicly available datasets to investigate the presence of IDRs in MADS-domain proteins from plants and other taxa. We found that Arabidopsis Type I and Type II MADS-domain TFs present a high level of disorder, ranging from 20 to 80%, with longer IDRs in their C-terminal domain. This characteristic was found to be conserved in orthologs of these proteins from other plant species as well as from Drosophila melanogaster, Saccharomyces cerevisiae, and Homo sapiens. Of note, within the C-terminal region of proteins in both the SOC1 and the FLC clades [23,24], we found two different motifs common to all members of the SOC1 and FLC clades, suggesting functional restriction and phylogenetic conservation within these groups. Interestingly, the SOC1 motif is conserved across SOC1 orthologs from different plant species, and in other Arabidopsis Type II MADS-domain proteins. Many Arabidopsis MADS-domain proteins also contain MoRFs that overlap with or are located just a few amino acids away from the SOC1 and the FLC motifs in the last residues of the C-terminal region. We mapped putative activation domains (ADs) across all clades of Type I and Type II MADS-domain proteins on their C-terminal region. Interestingly, Type I showed a higher proportion of ADs compared to Type II MADS-domain proteins. Finally, our in silico analysis predicts that some MADS-domain proteins may form condensates, potentially leading to different conformational arrangements and expanding their functional roles. The information in this work adds valuable insights to better understand the MADS TFs molecular mechanisms involved in the control of plant growth, morphogenesis, and responses to environmental changes.
Materials and methods
Complete sequence retrieval and domain description
Individual protein sequences of Arabidopsis thaliana were retrieved from the TAIR database [57], except for STK (AGL11), which was retrieved from UniProt (Q38836). Protein domains were identified using the UniProt database [58]. The I domain was defined based on the AtSOC1 protein obtained from the supplementary information in Lai et al. (2021) [33] for both Type I and Type II MADS-domain proteins. In this study, the C-terminal region was defined as the sequence located downstream of the I domain (for Type I MADS-domain proteins) or downstream of the K domain (for Type II MADS-domain proteins). Protein sequences from other plants and non-plant organisms were retrieved from GenBank, the Rice Annotation Project DataBase, and UniProt [58–60] (S1 Table). A total of 143 sequences were collected, including 100 from Arabidopsis, nine from rice, and at least one homolog from basal plants and other angiosperms (S1 Table). The selection of the Type I MADS-domain proteins was based on Bemer et al. (2010) [28], while the selection of rice sequences covering both types of MADS-domain protein orthologs was based on Arora et al. (2007) [61].
Sequence alignment and phylogenetic analysis
After retrieving the individual protein sequences, they were aligned with the MAFFT algorithm at the MAFFT web server [62,63]. The parameters used were L-INS-i and the mafft-homolog function with UniRef. After aligning the 143 sequences, we visually inspected the alignment for ambiguous or misaligned regions with the AliView program [64]. Using this alignment, we determined the position of the I domain for Type I and Type II MADS-domain proteins. The M and the K domains for all MADS-domain proteins were retrieved from UniProt. This approach allowed us to accurately map the C-terminal region along with its corresponding IDRs. By assigning approximately 190 amino acid residues up to the K-domain, we found a better alignment of conserved regions among the selected protein sequences. The alignment with the MIK domains was used to recover a phylogeny by applying a Maximum Likelihood approach with the RAxML algorithm, following a JTT substitution model, at the CIPRES website [65–68].
Motif and Molecular Recognition Features (MoRFs) identification
The MEME algorithm [69,70] was used to identify motifs within the whole protein sequence or the C-terminal regions. Molecular Recognition Features (MoRFs) are small IDRs potentially involved in the initial events of molecular recognition during protein-protein interactions. These regions undergo disorder-to-order transitions upon binding. MoRFs were identified in the complete protein sequences using the fMoRFpred tool that uses the physicochemical properties of amino acid residues to fit a Support Vector Machine (SVM) model for predicting the presence of MoRFs within IDRs [49,71]. To map the predicted MoRFs, their locations were aligned across the protein sequences alongside the MEME motifs.
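The mapping step described above amounts to comparing residue intervals. The following is a minimal sketch, assuming that MoRF and motif positions have already been exported as 1-based, inclusive (start, end) pairs; the function names, the `max_gap` parameter, and the coordinates are hypothetical and are not part of fMoRFpred or MEME.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive residue positions


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of residues shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def gap_between(a: Interval, b: Interval) -> int:
    """Residues separating two intervals; 0 when they overlap or are adjacent."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]) - 1)


def morfs_near_motif(morfs: List[Interval], motif: Interval, max_gap: int = 5):
    """Return (morf, overlap, gap) for MoRFs that overlap the motif
    or lie within max_gap residues of it (max_gap is an illustrative choice)."""
    hits = []
    for morf in morfs:
        ov = overlap_length(morf, motif)
        gap = gap_between(morf, motif)
        if ov > 0 or gap <= max_gap:
            hits.append((morf, ov, gap))
    return hits


# Hypothetical coordinates, for illustration only (not taken from the paper)
example_motif = (190, 208)
example_morfs = [(2, 4), (203, 214), (212, 214)]
example_hits = morfs_near_motif(example_morfs, example_motif)
```

With these made-up coordinates, the N-terminal MoRF is discarded, the second MoRF overlaps the motif by six residues, and the third lies three residues downstream of it.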
Mapping of activation domains within the C-terminal region of MADS-domain proteins
Potential activation domains (ADs) in MADS-domain proteins were retrieved directly from Morffy et al. (2024) [72] (S2 Table). In that study, the authors experimentally identified activation domains in various plant TFs using a comprehensive library. This library consisted of overlapping 40 amino acid fragments, spanning the entire set of plant TFs, with a step size of 10 amino acids, resulting in a total of 68,441 fragments. These fragments were screened in yeast to assess their transcriptional activation capacity. Based on this experiment and subsequent normalization, a Plant Activation Domain Identification (PADI) score was assigned to each fragment. Among the fragments showing transcriptional activation activity, some corresponded to MADS-domain TFs. The PADI scores included in our manuscript are those reported in that study, and the localization of activation domains (ADs) within MADS-domain TFs was inferred from the results obtained in Morffy et al. (2024) [72], where the authors applied a neural network-based algorithm known as the transcriptional activation domain activity (TADA) network. Their work integrated multiple layers of analysis, including the construction of a feature matrix and the use of methods to assess the impact of both individual input features and broader local and global interactions predicted by TADA. Additionally, these results were further analyzed using deep learning to identify key properties relevant to the prediction of ADs. These authors also applied a Shapley additive explanations (SHAP) analysis to capture non-linear and linear correlations, thereby uncovering complex patterns. This was followed by additional deep-learning steps that culminated in the development of a tool capable of predicting potential ADs. Using the Morffy et al. (2024) [72] dataset, we filtered the information for MADS-domain proteins to localize the “NOT-AD” and the “AD” fragments within their C-terminal IDR.
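To make the fragment layout concrete, the sketch below tiles a sequence into 40-residue windows with a step of 10 residues and keeps the tiles that fall entirely within a C-terminal region. It only illustrates the tiling geometry: how Morffy et al. handled sequence ends or proteins shorter than 40 residues is not stated here, so dropping incomplete trailing windows is an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple

Tile = Tuple[int, int, str]  # (start, end, fragment), 1-based inclusive


def tile_sequence(seq: str, window: int = 40, step: int = 10) -> List[Tile]:
    """Split seq into overlapping full-length windows.
    Assumption: windows that would run past the end of seq are dropped."""
    tiles = []
    for start in range(0, len(seq) - window + 1, step):
        tiles.append((start + 1, start + window, seq[start:start + window]))
    return tiles


def tiles_in_region(tiles: List[Tile], region_start: int) -> List[Tile]:
    """Keep tiles that begin at or after region_start (e.g. the first
    residue of the C-terminal region)."""
    return [t for t in tiles if t[0] >= region_start]


# Illustration with a placeholder 100-residue sequence (not a real protein)
demo_tiles = tile_sequence("A" * 100)
demo_c_terminal = tiles_in_region(demo_tiles, region_start=41)
```

For a 100-residue sequence this yields seven tiles (1–40, 11–50, ..., 61–100), of which those starting at residue 41 or later are retained for the hypothetical C-terminal region.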
IDR prediction and structural disorder score of the complete proteins
The majority of Arabidopsis IDRs were retrieved from the AlphaFold section of the MobiDB database [73] (accessed in November 2023). The IDR prediction was conducted with the AlphaFold Colab Notebook with default values or directly in the AlphaFold web server [74] (accessed in January 2024). The resulting structures were visually inspected using the PDB file together with the Predicted Aligned Error (PAE) of each protein in the ChimeraX molecular visualizer [75]. We also checked the IDRs for each prediction in Jalview and accounted for the low Temperature Factor regions, which correspond to the low pLDDT values [76]. Disordered regions predicted by AlphaFold are those with a pLDDT < 50, usually seen as ribbon-like structures [77,78]. The structural disorder analysis was performed using the RIDAO platform [79], which includes various intrinsic disorder predictors [79,80]. Protein sequences in FASTA format were used as input for the analysis; the amino acid sequence of the C-terminal region was manually extracted from the original full-length protein sequences and formatted in FASTA. The RIDAO platform provides two key metrics for each protein: the Average Disorder Score (ADS) and the Percent of Predicted Intrinsic Disorder Residues (PPIDR). The ADS indicates the overall propensity of a protein to be intrinsically disordered, allowing a comprehensive analysis of structural disorder. The PPIDR represents the proportion of amino acids predicted to be disordered, considering those with a significant intrinsic disorder score (> 0.5), relative to the total number of residues in the protein [81].
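The definitions above can be written down compactly. The sketch below assumes that per-residue disorder scores (0 to 1) and per-residue pLDDT values are already available as plain lists; it is not the RIDAO or AlphaFold code. The 0.5 and pLDDT < 50 thresholds come from the text, the 30-residue minimum length follows the IDR size used in this work, and the function names are hypothetical.

```python
from typing import List, Tuple


def average_disorder_score(scores: List[float]) -> float:
    """ADS: mean of the per-residue disorder scores."""
    return sum(scores) / len(scores)


def ppidr(scores: List[float], threshold: float = 0.5) -> float:
    """PPIDR: percentage of residues whose disorder score exceeds threshold."""
    return 100.0 * sum(s > threshold for s in scores) / len(scores)


def disordered_segments(plddt: List[float], cutoff: float = 50.0,
                        min_len: int = 30) -> List[Tuple[int, int]]:
    """Contiguous runs with pLDDT < cutoff that span at least min_len
    residues, returned as 1-based inclusive (start, end) pairs."""
    segments = []
    start = None
    for i, value in enumerate(plddt, start=1):
        if value < cutoff:
            if start is None:
                start = i          # a low-confidence run begins here
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None           # run ended at the previous residue
    if start is not None and len(plddt) - start + 1 >= min_len:
        segments.append((start, len(plddt)))
    return segments


# Made-up values, for illustration only
demo_scores = [0.2, 0.6, 0.8, 0.4]
demo_plddt = [90.0] * 60 + [40.0] * 35 + [80.0] * 5 + [30.0] * 10
demo_idrs = disordered_segments(demo_plddt)
```

In this toy profile only the 35-residue low-confidence stretch (residues 61–95) qualifies as an IDR; the final 10-residue stretch is below the length cutoff.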
Physicochemical properties of disordered proteins
Parameters related to the amino acid residue charge (NCPR: net charge per residue, and FCR: fraction of charged residues) and their distribution (patterning parameter kappa) [52] across the C-terminal regions of MADS-domain proteins were calculated using tools available in the CIDER web server [82,83].
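As an illustration of the two charge parameters, the sketch below computes FCR and NCPR directly from a sequence. It assumes that Lys and Arg count as positive, Asp and Glu as negative, and His as neutral; this charge assignment is an assumption of the sketch, the function is not the CIDER implementation, and the patterning parameter kappa is not computed here.

```python
from typing import Dict

POSITIVE = "KR"  # assumed positively charged residues
NEGATIVE = "DE"  # assumed negatively charged residues


def charge_parameters(seq: str) -> Dict[str, float]:
    """Return the fraction of charged residues (FCR) and the
    net charge per residue (NCPR) of a protein sequence."""
    seq = seq.upper()
    n_pos = sum(seq.count(r) for r in POSITIVE)
    n_neg = sum(seq.count(r) for r in NEGATIVE)
    length = len(seq)
    return {
        "FCR": (n_pos + n_neg) / length,
        "NCPR": (n_pos - n_neg) / length,
    }


# Placeholder peptide, for illustration only
demo_charge = charge_parameters("KKDDEEAAAA")
```

For the placeholder peptide (two positive and four negative residues out of ten), FCR is 0.6 and NCPR is −0.2.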
Post-translational modifications (phosphorylation)
Phosphorylation of serine, threonine, and tyrosine residues within the complete protein sequences was predicted using the NetPhos algorithm [84–86]. Additionally, experimentally validated data were obtained from the ATHENA [87] (accessed in February 2024) and EPSD [88] (accessed in May 2025) databases.
Condensate formation tendency
Condensate propensity was calculated for the complete MADS-domain proteins and their C-terminal region sequences using the FuzDrop algorithm [89,90]. Protein sequences in FASTA format were used as input for the analysis; the amino acid sequence of the C-terminal region was manually extracted from the original full-length protein sequences and formatted in FASTA.
Data analyses and image rendering
We evaluated differences in disorder (ADS and PPIDR) among MADS-domain proteins with a Wilcoxon test under a permutation approach, using the coin R package [91]. We also compared the raw number of phosphorylation sites among Arabidopsis MADS-domain protein types with a Chi-squared test. All analyses and graphics were produced in R [92]. Protein diagrams were made with the drawProteins library from the Bioconductor repository, and the final rendering was done with the GIMP program [93] (version 3.0.4; GPLv3 license). Statistical graphs were made with the tidyverse libraries, including ggplot2 and its companion libraries (S3 Table). Other R packages used here are cited in S3 Table.
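The logic of a permutation-based Wilcoxon rank-sum comparison can be sketched as follows. This is a generic Monte Carlo version written in Python for illustration; it is not the coin implementation used in this work, and the number of permutations, the seed, and the function names are arbitrary choices of the sketch.

```python
import random
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def permutation_rank_sum_test(x: Sequence[float], y: Sequence[float],
                              n_perm: int = 10000, seed: int = 0) -> float:
    """Two-sided Monte Carlo p-value for the rank-sum statistic of x."""
    rng = random.Random(seed)
    ranks = average_ranks(list(x) + list(y))
    n_x = len(x)
    expected = n_x * (len(ranks) + 1) / 2   # rank sum of x under the null
    observed = abs(sum(ranks[:n_x]) - expected)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)                  # random relabelling of the groups
        if abs(sum(ranks[:n_x]) - expected) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Made-up values, for illustration only
p_separated = permutation_rank_sum_test(range(1, 9), range(11, 19))
p_interleaved = permutation_rank_sum_test([1, 3, 5, 7], [2, 4, 6, 8])
```

Two fully separated groups give a very small p-value, whereas interleaved groups do not; in practice the per-protein ADS or PPIDR values of Type I and Type II proteins would take the place of the toy inputs.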
Results
The C-terminal domain of Arabidopsis MADS-domain transcription factor family presents structural disorder propensity
The function of MADS-domain proteins in development is conserved across diverse plant species and other taxa. Given the diversity of cell types and conditions in which these transcription factors regulate gene expression, structural disorder appears to be an advantageous property, enabling efficient and versatile functionality. In this study, we evaluated the occurrence of structural disorder in 58 Type I and 42 Type II MADS-domain proteins from Arabidopsis to gain insight into their protein structure and its relationship to their function. To evaluate the degree of disorder, the Average Disorder Score (ADS) and the Percent of Predicted Intrinsic Disorder Residues (PPIDR) were calculated in the RIDAO platform, for both the complete proteins and their C-terminal regions. For Type I MADS-domain proteins, both the mean ADS and mean PPIDR of the complete proteins and their C-terminal regions are similar (Fig 1A and 1C, left panel). In contrast, for Type II MADS-domain proteins, both the mean ADS and mean PPIDR are larger for the C-terminal region than for the complete proteins (Fig 1A and 1C, right panel). Also, the relation between the protein length and ADS or PPIDR values revealed a broader length distribution (100–450 amino acid residues) for Type I MADS-domain proteins and only a minor inverse correlation between protein length and global disorder (Fig 1B and 1D, left panel). In contrast, Type II proteins exhibited a narrower length range (200–300 amino acid residues) but followed a positive trend between length and disorder (Fig 1B and 1D, right panel). Moreover, Type II MADS-domain proteins present a significantly higher disorder (ADS and PPIDR) compared to Type I proteins (permutation Wilcoxon test; ADS: Z = −3.1497, p = 0.0016; PPIDR: Z = −3.897, p = 1 × 10⁻⁴; Fig 1).
AlphaFold-based analysis of the structural disorder distribution revealed that this property is predominantly located in the C-terminal regions for both types: in Type I this region follows the I-domain, while in Type II proteins it is found beyond the K-domain (Fig 2A, S4 Table). Notably, the longest IDRs in both types were consistently located within the C-terminal region, and the IDRs at the C-terminal region in Type I MADS-domain proteins were longer than in the Type II MADS-domain proteins (permutation Wilcoxon test, Z = 4.1552, p < 1 × 10⁻⁴; Fig 2, S4 Table). The number of IDRs containing at least 30 amino acid residues was similar between the two groups (S4 Table, Type I total count = 51, Type II total count = 42, χ2 = 0.87, df = 1, p = 0.351). Nevertheless, in nine Type I MADS-domain proteins (AGAMOUS-LIKE91 [AGL91], AGL29, AGL58, AGL64, AGL87, AGL101, AGL59, AGL102, AGL85) we only detected IDRs shorter than 30 amino acid residues, whereas all Type II MADS-domain proteins contain at least one IDR with ≥30 amino acids. Interestingly, most MADS-domain proteins present an IDR at the beginning of the N-terminal, in the first residues of the M-domain.
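The chi-squared comparison of IDR counts can be reproduced by hand. The sketch below assumes a goodness-of-fit test of the two counts (51 and 42) against equal expected counts, which is an assumption about how the test was set up; under that assumption the statistic and p-value agree with those reported above (χ2 ≈ 0.87, p ≈ 0.35). The function name is hypothetical.

```python
import math
from typing import Tuple


def chi_square_equal_counts(count_a: int, count_b: int) -> Tuple[float, float]:
    """Goodness-of-fit chi-squared test of two counts against equal
    expected counts (df = 1). Returns (statistic, p_value)."""
    expected = (count_a + count_b) / 2
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For df = 1, the chi-squared survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts of IDRs with at least 30 residues, as given in the text
idr_stat, idr_p = chi_square_equal_counts(51, 42)
```

The closed-form p-value relies on the single degree of freedom; a comparison with more categories would need a general chi-squared distribution function.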
[Figure omitted. See PDF.]
(A) Distribution of ADS in MADS-domain TFs. (B) Scatterplots showing the association between protein length and ADS. (C) Distribution of PPIDR values in MADS-domain TFs. (D) Scatterplots showing the association between protein length and PPIDR. Red dots indicate the mean of each value, and black dots indicate the raw values of either ADS or PPIDR per protein. Statistical analysis using the Wilcoxon test shows a significant difference between Type I and Type II proteins (p = 0.0002).
[Figure omitted. See PDF.]
(A) Schematic representation of Type I and Type II MADS-domain proteins indicating their distinctive domains: MADS (M = yellow), Intervening (I = green), Keratin-like (K = blue), and Intrinsically Disordered Regions (IDR = red). Numbers below indicate the amino acid positions. Dots above the domains indicate putative phosphorylation sites. The columns to the right of each protein diagram indicate their corresponding values for PPIDR and ADS. Proteins are arranged in descending order according to their PPIDR. (B) Distribution of predicted phosphorylation sites in the N-terminal and C-terminal regions of Type I and Type II MADS-domain proteins. Whiskers indicate ±1.5 times the interquartile range (IQR) according to the Tukey method; the middle line denotes the median.
The flexibility and dimensions of IDRs depend on their amino acid composition, where the charge content and hydrophobicity are key factors. Because the composition of most IDPs includes positive and negative charges, some of their characteristics can be described by the fraction of charged residues (FCR) and net charge per residue (NCPR). Nevertheless, the best descriptors for IDP conformational properties are the FCR and the distribution of oppositely charged residues, defined by a patterning parameter named kappa (k) [52]. On average, polypeptides with low kappa-values (closer to 0) are predicted to adopt more extended conformations, while sequences with higher kappa-values (closer to 1) are expected to form more compact, hairpin-like structures [52]. The kappa-values obtained for the C-terminal of MADS-domain proteins showed that these regions in Type I and Type II proteins are prone to adopting extended conformations, likely due to the counterbalance of the interchain electrostatic interactions resulting from the more even distribution of oppositely charged residues. This result aligns with a mean NCPR close to zero and a mean FCR value at the boundary between weak and strong polyampholytes obtained for these IDRs (S1 Fig). Although this conformational analysis was applied to the IDRs in isolation from the rest of the protein and under certain conditions, which may influence their overall conformational properties, these correlations further reinforce the potential impact of these physicochemical properties on their structural organization and dimensions.
The C-terminal region of Arabidopsis MADS-domain proteins has a high propensity for phosphorylation
As phosphorylation is closely associated with protein disorder [94], we predicted the phosphorylation propensity of Type I and Type II MADS-domain proteins using the NetPhos algorithm. When comparing the C-terminal sequences of both MADS-domain protein types, we found that the Type II C-terminal region contains a number of predicted phosphorylation sites similar to that of the Type I C-terminal region (number of phosphorylation sites/C-terminal length, permutation Wilcoxon test, Z = −1.8163, p = 0.069) (Fig 2B). In contrast, Type I MADS-domain proteins exhibited a greater abundance of predicted phosphorylation sites in regions outside of their C-terminal region (Fig 2B).
To further investigate the significance of the phosphorylation sites between the N-terminal and the C-terminal regions of MADS-domain proteins, we examined experimentally validated phosphorylation sites in Arabidopsis using two actively curated databases, ATHENA and EPSD. Compared with the large number of predicted sites from NetPhos, experimentally confirmed sites are less represented in Arabidopsis MADS-domain proteins. For Type I MADS-domain proteins, we found 48 experimentally verified sites in the N-terminal region and 17 sites in the C-terminal region. For Type II MADS-domain proteins, there were 18 sites in the N-terminal and 13 in the C-terminal regions (S5 Table).
The disordered C-terminal is a conserved feature of MADS-domain proteins across diverse taxa
Type I and Type II Arabidopsis MADS-domain proteins contain two or three domains, respectively, which we grouped into the N-terminal region (Fig 2A). As both types have a large, disordered C-terminal region (Fig 2A), we investigated whether this characteristic is more broadly conserved across organisms within the Eukarya domain. As a proof of concept, we only examined MADS-domain proteins from three of the four kingdoms within the Eukarya domain: Plantae, Fungi, and Animalia [95]. We selected sequences from Homo sapiens (n = 5), Drosophila melanogaster (n = 2), and Saccharomyces cerevisiae (n = 4) and used Arabidopsis MADS-domain protein sequences as a reference due to their extensive characterization. These sequences were compared with those of various phyla within Plantae including Chlorophytes, Charophytes, Gymnospermae, and Angiospermae. We extended this comparison to model species within Animalia (Chordata and Arthropoda) and Fungi (Ascomycota) (Fig 3 and S2 Fig). This analysis showed that all the selected MADS-domain proteins of non-plant species contain a disordered C-terminal region and that, at least for these proteins, these regions are considerably longer than in most plant MADS-domain proteins (Fig 3 and S2 Fig). Moreover, regardless of the taxonomic group to which a species belongs, the C-terminal region consistently presents a higher level of disorder than the full-length protein, whether measured by ADS or PPIDR (S3 Fig).
[Figure omitted. See PDF.]
The phylogenetic tree presented here is a simplified and rescaled version of the original presented in S2 Fig, with branch lengths adjusted for clarity. Deeper nodes within Type I MADS-domain protein clades exhibit low phylogenetic support, and their relationships should be interpreted with caution. In contrast, the more derived nodes, particularly among Type II proteins, show stronger phylogenetic support, consistent with previously published phylogenies. PPIDR and ADS values for each protein are shown in the columns to the right of the diagram.
The disordered C-terminal regions of Arabidopsis MADS-domain proteins contain conserved motifs and MoRFs
To further characterize the C-terminal region of the MADS-domain proteins, we looked for conserved motifs within this region and identified three distinct motifs both in Type I and in Type II MADS-domain proteins (Fig 4). Among the Arabidopsis MADS-domain proteins, there is a conserved pattern of motif distribution, with motifs shared in a subfamily-specific manner. Within the β subfamily of Type I MADS-domain proteins [24], 10 out of 19 proteins (Fig 4) contain both motif 1 and motif 2, whereas seven proteins from both the γ and β subfamilies possess only motif 2 (Fig 4). In the γ subfamily [24], 10 out of 17 proteins have motif 3 (Fig 4). Our analysis shows that none of the α subfamily of Type I MADS-domain proteins contain any distinctive motif.
[Figure omitted. See PDF.]
Motifs were detected using the MEME algorithm [121] based on the C-terminal sequences of Type I (left panel) and Type II (right panel) MADS-domain proteins. Type I MADS-domain proteins do not contain molecular recognition features (MoRFs; purple rectangles) near or within the identified motifs. In contrast, in Type II MADS-domain proteins, MoRFs (purple rectangles) overlap with the predicted motifs, supporting their potential functional relevance. MoRFs were predicted using the fMoRFpred algorithm [123]. Consensus sequences of the identified motifs are shown in the lower panel.
Among Type II MADS-domain proteins, three motifs are shared by members of different clades. In the SOC1 clade (SOC1, XAL2, AGL42, AGL19, AGL71 and AGL72), all members contain motif 4, which is also present in proteins outside this clade, including SEP3, AGL24, SEP2, AGL15, and SVP (Fig 4, right panel). Within the FLC clade, all members share motif 5, suggesting functional constraints and convergent acquisition (Fig 4). Motif 6 is exclusive to CAL and AP1, which are partially redundant in floral meristem determination and belong to the AP1 clade [96]. Additionally, we selected various SOC1 orthologs and members of the FLC clade to further characterize their conserved motifs. For SOC1 orthologs, the consensus motif is predicted to span 19 residues, with a core of 10 highly conserved residues across angiosperms, except in Populus trichocarpa (S4A Fig). The FLC consensus motif is 21 amino acids long and contains at least eight conserved residues, primarily located in the latter portion of the sequence [24] (S4B Fig).
We also evaluated the Molecular Recognition Features (MoRFs) across the full-length MADS-domain proteins to determine whether the ubiquitous IDRs in the C-terminal region coincide with any MoRFs. Since MoRFs are known to mediate molecular recognition and are proposed to facilitate specific interactions among proteins, we hypothesized that the C-terminal region would contain at least one predicted MoRF. Indeed, we found that several MADS-domain proteins, regardless of their species of origin, have a MoRF within the last 10 amino acids of their C-terminal region (S5 Fig). Moreover, several MADS-domain proteins show MoRFs of 1–4 amino acids at the beginning of their N-terminal region. One of these proteins is AG, which also has an IDR of at least 30 amino acids in the N-terminal region. Notably, the MoRFs in SOC1, AGL24, AGL42 and SVP were found associated with the SOC1 motif (Fig 4). Similarly, in all proteins of the FLC clade, the FLC motif shows predicted MoRFs spanning the last 5–7 amino acids of the motif. In contrast, Type I MADS-domain proteins do not present any MoRFs coinciding with the identified motifs (Fig 4).
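The comparison described above (whether a predicted MoRF coincides with a conserved motif, or falls near the C-terminal end of the protein) comes down to an interval-overlap check. The following is a minimal sketch under stated assumptions, not the procedure used in the study: residue positions are taken to be 1-based and inclusive, a MoRF is counted as terminal if it touches the last 10 residues, and all coordinates and function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive residue positions (assumption)


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive residue intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def morfs_overlapping_motif(morfs: List[Interval], motif: Interval) -> List[Interval]:
    """Keep only the predicted MoRFs that overlap the motif interval."""
    return [m for m in morfs if overlaps(m, motif)]


def has_terminal_morf(morfs: List[Interval], protein_length: int, window: int = 10) -> bool:
    """True if any MoRF touches the last `window` residues of the protein."""
    tail = (protein_length - window + 1, protein_length)
    return any(overlaps(m, tail) for m in morfs)


# Hypothetical coordinates for illustration only (not taken from the study).
example_morfs = [(2, 4), (188, 196), (207, 214)]
example_motif = (185, 203)
example_length = 214

shared = morfs_overlapping_motif(example_morfs, example_motif)
terminal = has_terminal_morf(example_morfs, example_length)
```

With real data, the MoRF intervals would come from the predictor output and the motif interval from the motif-discovery output, both mapped onto the same full-length sequence numbering.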
The disordered C-terminal regions of MADS-domain proteins contain potential activation domains
Activation domains (ADs) in TFs play a central role in the function of these proteins, as they constitute the recruiting sites of coactivator complexes to activate transcription [97]. A connection between IDRs and ADs has been established in several transcription factor families, highlighting the importance of structural flexibility in recognizing diverse molecular targets according to the cellular conditions [98,99]. The highly conserved presence of a C-terminal IDR in Type I and Type II MADS-domain proteins prompted us to search for potential ADs in this region. Using the plant activation domain identification (PADI) score developed by Morffy et al. (2024) [72] to analyze these C-terminal regions, we identified high-scoring regions in 34 (58.3%) Type I MADS-domain proteins, indicating a high proportion of potential ADs, compared with 9 (22.5%) Type II MADS-domain proteins with a high PADI score (Fig 5 and S2 Table).
[Figure omitted. See PDF.]
The Predictive Activation Domain Index (PADI), or scaled activation score, predicts the likelihood of putative activation domains (ADs), with fragments with a PADI score ≥1 classified as potential ADs [64]. In the graphs, dots indicate the positions of 40-amino-acid fragments predicted to contain ADs, while black stars mark experimentally validated ADs. Numbers along the horizontal lines show regions within the protein that are enriched in putative ADs. Colored boxes represent the defined domains in Type I and Type II MADS-domain proteins: MADS domain (yellow), I domain (green), K domain (blue), and C-terminal region (red).
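The PADI cutoff can be combined with the disorder criterion given in the S2 Table legend (PADI ≥1 with mean disorder >0.5 is "AD"; PADI ≥1 with mean disorder ≤0.5 is "Maybe"; PADI <1 is "Not AD") into a simple three-way classifier. The sketch below only illustrates that rule; the `Fragment` class, the function names, and the example scores are hypothetical and are not part of the original analysis.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    """One scored fragment; all values used below are hypothetical."""
    start: int            # 1-based start position of the fragment
    padi: float           # PADI (scaled activation) score
    mean_disorder: float  # mean predicted disorder over the fragment


def classify_fragment(frag: Fragment,
                      padi_cutoff: float = 1.0,
                      disorder_cutoff: float = 0.5) -> str:
    """Apply the three-way rule: 'AD', 'Maybe', or 'Not AD'."""
    if frag.padi < padi_cutoff:
        return "Not AD"
    if frag.mean_disorder > disorder_cutoff:
        return "AD"
    return "Maybe"


def putative_ad_starts(fragments: List[Fragment]) -> List[int]:
    """Start positions of the fragments classified as 'AD'."""
    return [f.start for f in fragments if classify_fragment(f) == "AD"]


# Made-up scores for illustration only (not data from the study).
example_fragments = [
    Fragment(start=171, padi=1.4, mean_disorder=0.82),
    Fragment(start=181, padi=1.1, mean_disorder=0.45),
    Fragment(start=191, padi=0.6, mean_disorder=0.90),
]
labels = [classify_fragment(f) for f in example_fragments]
```

Keeping the two cutoffs as parameters makes it easy to see how the number of putative ADs changes if a stricter or looser threshold is assumed.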
Although not all MADS-domain proteins exhibited potential ADs, numerous regions with a high PADI score were found in some of them. To look for possible AD distribution patterns, we graphed the localization of the high PADI scoring regions across the C-terminal region of the MADS-domain proteins. Given the large number of Type I proteins, we grouped them by subfamily (Mα, Mβ, and Mγ), as previously described for Type I MADS-domain proteins [24,36], to improve clarity in the analysis. The C-terminal region of Type I Mα MADS-domain proteins showed a higher abundance of potential ADs (residues 208–341). In contrast, the highest abundance of putative ADs in Type I Mβ was found between residues 111 and 324. For the Type I Mγ subfamily, the highest AD abundance was found in two distinct sections (161–278 and 281–339) of their central region. In the case of Type II MADS-domain proteins, potential ADs were more evenly distributed across the region between residues 171 and 255 (Fig 5 and S2 Table).
Additionally, for Type I MADS-domain proteins, we found that putative ADs are associated with motifs previously identified in one member of the Mβ subfamily (AGL81) and one member of the Mγ subfamily (PHE1) [36]. Similarly, for Type II MADS-domain proteins, some ADs are associated with different motifs identified in one member of the SOC1 clade (AGL19), two members of the SQUA clade (CAL, AP1), and two members of the SEP clade (SEP2 and SEP3) [24]. The overlap of specific motifs within particular phylogenetic groups of MADS-domain proteins supports the functional significance of these potential ADs.
MADS-domain proteins show a propensity for liquid-liquid phase separation
Some proteins can form compartments within cells where certain proteins, RNA, and metabolites concentrate to orchestrate specific cellular processes. These compartments are generated via liquid-liquid phase separation (LLPS), a process in which certain molecules concentrate to form a new liquid phase distinct from the surroundings. Some proteins are considered droplet drivers according to their predicted probability of forming a droplet state (pLLPS). In this state, protein interactions can occur in different binding configurations, making IDPs common components of these structures. To explore the propensity of MADS-domain proteins to spontaneously undergo LLPS and form condensed cellular states, we used the FuzDrop server [89]. The FuzDrop algorithm defines a probability threshold (pLLPS ≥ 0.60) to identify proteins capable of phase separation and of driving droplet formation. This analysis showed that, among Type I MADS-domain proteins and their C-terminal domains, LLPS propensity is widely distributed across proteins with low or high ADS or PPIDR. We found 21 full-length proteins with LLPS propensity (AGL103, AGL93, AGL89, AGL53, AGL74, AGL48, AGL77, AGL102, AGL60, AGL64, AGL23, AGL56, AGL92, AGL98, AGL75, AGL52, AGL76, AGL45, AGL81, AGL43, AGL29). When only the C-terminal region was analyzed, three additional proteins showed LLPS propensity (AGL91, AGL99, AGL86), whereas six of the initial 21 proteins (AGL60, AGL64, AGL98, AGL52, AGL76, AGL81) did not (Fig 6 and S4 Table). Regarding Type II MADS-domain proteins, only four full-length proteins showed LLPS propensity (AGL104, AGL79, AGL66, GOA), with ADS values between 0.4 and 0.6 (S4 Table). This number increased by 17 when only the C-terminal region was analyzed (including SEP1, SEP4, AGL13, AGL15, AGL18, FLC, MAF1, AP1, AG, AGL19, AGL67, AGL24, SHP2, MAF5, AGL17, AGL72, MAF4) (Fig 6 and S4 Table).
[Figure omitted. See PDF.]
The LLPS probability index (pLLPS) was calculated using FuzDrop, while the ADS and PPIDR were derived from RIDAO predictions. The scatterplots illustrate the relationship between the pLLPS index and ADS (A) or PPIDR (B) for the full-length proteins (dark diamonds) and the C-terminal region (light grey diamonds), for Type I and Type II MADS-domain proteins.
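The comparison between full-length and C-terminal predictions amounts to applying the pLLPS ≥ 0.60 threshold to two sets of scores and taking set differences. The following sketch assumes the scores have already been retrieved from the prediction server and stored as dictionaries; the protein names, scores, and function names are hypothetical placeholders, not values from S4 Table.

```python
from typing import Dict, Set

PLLPS_THRESHOLD = 0.60  # cutoff stated in the text for droplet-driving proteins


def droplet_drivers(scores: Dict[str, float],
                    threshold: float = PLLPS_THRESHOLD) -> Set[str]:
    """Names of proteins whose pLLPS score reaches the threshold."""
    return {name for name, p in scores.items() if p >= threshold}


def compare_regions(full_length: Dict[str, float],
                    c_terminal: Dict[str, float],
                    threshold: float = PLLPS_THRESHOLD) -> Dict[str, Set[str]]:
    """Split proteins by where they pass the threshold: both, full length only, C-terminal only."""
    full_hits = droplet_drivers(full_length, threshold)
    cterm_hits = droplet_drivers(c_terminal, threshold)
    return {
        "both": full_hits & cterm_hits,
        "only_full_length": full_hits - cterm_hits,
        "only_c_terminal": cterm_hits - full_hits,
    }


# Hypothetical protein names and scores, for illustration only.
full_scores = {"protA": 0.72, "protB": 0.41, "protC": 0.65, "protD": 0.30}
cterm_scores = {"protA": 0.80, "protB": 0.66, "protC": 0.52, "protD": 0.35}
summary = compare_regions(full_scores, cterm_scores)
```

In this toy example, `only_c_terminal` plays the role of the proteins gained when only the C-terminal region is analyzed, and `only_full_length` the role of those that pass the threshold only as full-length sequences.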
Discussion
The Arabidopsis MADS-domain proteins participate in nearly all developmental processes and are also involved in many different stress responses [11,29,100]. These proteins are divided into two groups based on their phylogenetic relationship and protein domains: Type I with three domains (M, I, and C), and Type II with four domains (M, I, K, and C) [24,35,101,102].
In this work, we demonstrated through in silico analyses that the C-terminal regions of 100 Arabidopsis MADS-domain proteins listed in UniProt are enriched with IDRs (≥30 residues), with no significant differences in the number of IDRs between Type I and Type II proteins. Although we were unable to find information on the functional characterization of these regions in Type I proteins, several examples underscore the significance of the C-terminal domain in the function of Type II MADS-domain proteins. In particular, specific phenotypes and altered protein-protein interactions in Type II MADS-domain proteins have been associated with point mutations or deletions within IDRs in the C-terminal region, highlighting their functional importance (Fig 7 and S6 Table). For instance, in Arabidopsis, Raphanus sativus, Nicotiana sylvestris and N. tabacum, the C-terminal regions of APETALA 1 (AP1) and its orthologs have been shown to mediate transcriptional activation in yeast and mammalian cells [101]. Additionally, three AP1 loss-of-function mutants (ap1-4, ap1-6 and ap1-8), all with mutations in the C-terminal region, exhibit different phenotypes [103]. Similarly, the C-terminal domains of GLOBOSA (GLO) and DEFICIENS (DEF) in Antirrhinum majus are critical for the interaction between GLO and DEF and between DEF and SQUAMOSA (SQUA) [103]. Furthermore, the C-terminal domain plays a central role in mediating interactions between MADS-domain proteins and non-MADS-domain proteins. For instance, the co-repressors SEUSS (SEU) and LEUNIG (LUG) interact with AP1 or SEP3 through their C-terminal domains [104]. In Arabidopsis, the K and C-terminal domains of AGAMOUS (AG) are also indispensable for DNA binding [105,106]. Interestingly, a small IDR in the N-terminal region of AG, located before the MADS domain (Fig 1), is essential for its function, as constructs lacking this IDR exhibit a phenotype like that of an AP2 mutant.
Moreover, overexpression of AG without its C-terminal domain results in a phenotype like that of the AG loss-of-function mutant (ag), indicating that the C-terminal domain participates in AG functions [106]. For SEP3, the interaction between helices in the N-terminal domain and those in the C-terminal domain of different partner proteins creates a hydrophobic interface that facilitates dimerization [107]. This supports the hypothesis that the C-terminal domain participates in the stabilization of protein complexes. The dimerization of TFs has an important role in regulating heterodimerization, enabling dynamic temporal responses to changes in protein concentrations, among other functions [108]. Given that most MADS-domain proteins function as homo- or heterodimers [107], the role of IDRs in their C-terminal region is of particular interest.
[Figure omitted. See PDF.]
Top left panel – Schematic representation of Type I and Type II MADS proteins. Different functional domains are color-coded, with the C-terminal region illustrated as a red ribbon. The C-terminal IDR contains Activation Domains (ADs), Molecular Recognition Features (MoRFs) or motifs, and phosphorylation sites (see color codes to the left of the diagrams). These features are associated with promoting protein–protein interactions both between MADS-domain proteins and with other regulatory partners. Some MADS-domain proteins display a high propensity for liquid–liquid phase separation, suggesting their involvement in the formation of biomolecular condensates. Such transcription factor condensates have been reported to enhance target gene expression and, in some cases, to recruit components of the transcriptional machinery (e.g., RNA Pol II). In some instances, deletion of the C-terminal region of MADS-domain proteins disrupts their protein–protein interactions, potentially impairing dimer formation and resulting in reduced activation of target genes.
The analysis conducted in this study also revealed that structural disorder in the C-terminal region of MADS-domain proteins is widely conserved across diverse taxa, including Drosophila, yeast, and human MADS-domain proteins. In humans, there are four MEF2-type MADS-domain proteins (MEF2A-D), mainly involved in neural development, muscle formation, heart development, and carcinogenesis. Of note, as in some Arabidopsis MADS-domain proteins, the C-terminal domains of human MEF2B and MEF2D are required for transactivation [109,110]. Furthermore, phosphorylation at specific sites within the C-terminal domain of MEF2D has been shown to inhibit its transcriptional activity [111]. Consistent with these observations, our analysis also found a high frequency of predicted phosphorylation sites in the C-terminal domain of both Type I and Type II Arabidopsis MADS-domain proteins. This finding also aligns with previous observations showing a higher phosphorylation propensity in IDRs compared to ordered regions across entire proteomes [94]. It underscores the functional relevance of these post-translational modifications and highlights the prevalence of multisite phosphorylation within disordered regions. Although some of the sites predicted by NetPhos may not be functional and/or still await experimental testing, the relatively balanced distribution of experimentally tested sites across both regions in Type II proteins suggests functional relevance in both the N-terminal and C-terminal regions. Furthermore, the higher representation of sites in the N-terminal regions (M, I, and K domains) may partly reflect a historical research focus on the DNA-binding function of the M domain. In contrast, the C-terminal region is more variable, has not been thoroughly studied, and might play an important but underappreciated role in MADS-domain protein function.
The presence of multiple phosphorylation sites in IDRs has been associated with their role in modulating gradual cellular responses and mediating protein-protein interactions, emphasizing the regulatory role of phosphorylation within disordered domains [112–115]. Further investigation into the function of the predicted phosphorylation sites within the C-terminal disordered region of MADS-domain proteins will enhance our understanding of the mechanisms by which these TFs regulate gene expression.
By searching for conserved motifs within the disordered C-terminal region of MADS-domain proteins, we identified three distinct and specific motifs in Type I and three in Type II MADS-domain proteins. Interestingly, motif 3 identified in Type I MADS-domain proteins corresponds to a conserved region previously reported by De Bodt et al. (2003) [116]. Furthermore, some of these motifs were found in all the members of different clades, suggesting that they are associated with particular functions mediated by specific interactions and shared between the related proteins. This is the case for two motifs found in Type I MADS-domain proteins that are shared among some MADS-domain proteins of the Mγ and Mβ clades [24]. Motifs 4 and 5, identified in Type II MADS-domain proteins (Fig 4B), are conserved across all SOC1-clade and FLC-clade proteins, respectively, highlighting their potential functional significance. Conducting motif-swapping experiments among SOC1 and FLC clade members and MADS-domain proteins that naturally lack these motifs would provide valuable insights into their functional significance. The presence of clade-specific motifs in MADS-domain proteins may be attributed to the highly conserved interaction networks within different plant MADS-domain protein clades [117]. Interactions among MADS-domain proteins are restricted by their highly conserved domains in the complete proteins [117]. Conserved motifs within the variable C-terminal region identified in this study might also be involved in facilitating and stabilizing the interactions between MADS-domain proteins.
Several examples in plants demonstrate that transcriptional activation depends on the recruitment of coactivators. In MADS-domain proteins, tetramer formation increases the DNA regions available for transcriptional binding, enhancing their regulatory capacity [107,118]. Using the data on activation domains (ADs) obtained by Morffy et al. (2024) [72], we showed that a significant proportion of the potential ADs for MADS-domain TFs are in their C-terminal IDR. Moreover, we found that some of the identified ADs overlap with a conserved motif in the MADS-domain TFs of the SOC1 clade. This is particularly evident in the C-terminal region of Type II MADS-domain proteins. The conserved motif in the C-terminal IDR of SEP homologs across different angiosperms [119], which corresponds to motif 4 in the SOC1-clade TFs, coincides with one of the ADs identified by Morffy et al. (2024) [72]. This overlap strongly suggests that this motif has functional significance. We made similar observations for AP1 TFs, where ADs, characterized by the presence of acidic, proline-rich, and glutamine-rich subdomains, have been experimentally identified in their C-terminal IDRs [101].
Interestingly, Type I MADS-domain proteins exhibit more putative ADs than Type II proteins, and this could reflect specific roles for Type I MADS-domain proteins not only in female gametophyte development but also in seed development [28,120–124]. Unfortunately, we were unable to find any study specifically analyzing the functional relevance of the C-terminal domains or any other regions of these proteins. This gap adds further interest to the findings presented here and encourages future research into the role of the potential ADs in mediating the association between MADS-domain proteins and their co-regulators.
Finally, our analysis revealed that some MADS-domain proteins have a propensity to undergo liquid-liquid phase separation (LLPS). Nevertheless, even among those with high disorder scores, particularly within Type II MADS-domain proteins, not all appear capable of undergoing LLPS. This discrepancy may be influenced by the fact that our analysis was based solely on protein primary structures, without accounting for possible posttranslational modifications such as phosphorylation. As we show, MADS-domain TFs could be phosphorylated at multiple sites. These posttranslational modifications, whether occurring at one or several sites, could be implicated in the promotion of LLPS, with the extent of phase separation likely dependent on specific cellular conditions. Additionally, the LLPS propensity obtained in this analysis is in agreement with findings showing that the capacity to drive LLPS is not determined solely by the disordered nature of the sequence. Instead, it depends on specific sequence features, such as the distribution and patterning of aromatic and charged residues. These sequence-encoded patterns are essential for enabling the multivalent interactions required for the formation of biomolecular condensates [125]. When only the C-terminal region of MADS-domain proteins is analyzed for LLPS propensity, the effect of intrinsic disorder shifts the scores upward. A greater number of proteins present LLPS scores above 0.6 compared to analyses of the full-length proteins, supporting the contribution of IDRs to protein condensation propensity. Furthermore, some studies have suggested that ADs can drive TF phase separation, leading to the formation of transcriptional condensates associated with chromatin [126–128]. However, recent findings indicate that this mechanism is not universal, as the recruitment of activators can also occur independently of phase separation. 
Important events that enhance transcriptional activation include the multivalent interactions mediated by the ADs, which increase the residence time of TFs on chromatin and thereby promote the recruitment of coactivators [97].
This study highlights common characteristics shared by MADS-domain proteins, not only in plants but also across organisms from other domains of life. Notably, the presence of a disordered C-terminal region in Type I and Type II MADS-domain TFs stands out. The functional significance of this region is strongly supported by the identification of several potential ADs and phosphorylation sites. Furthermore, we identified not only putative protein-protein interaction sites within this region but also conserved motifs specific to evolutionarily related MADS-domain proteins, further supporting their role in the transcriptional regulatory function of these TFs.
The remarkable conservation of the structurally disordered C-terminal region in MADS-domain TFs suggests specific biological functions for this region. Some of these may be associated with the presence of conserved motifs and/or phosphorylation sites. However, these elements correspond to short segments within a broader region that, based on primary sequence alignments, does not appear to be under strong evolutionary constraint. To date, the persistence of IDRs across the proteomes of all analyzed organisms remains an open question. Some evolutionary approaches have attempted to identify conserved molecular features (e.g., NCPR, kappa, and FCR) in the amino acid sequences of IDRs [129]. Their findings suggest that, while certain molecular features are linked to known functions in yeast IDRs and may reflect a mechanism of IDR evolution, this pattern does not appear to extend to multicellular organisms like Drosophila [130]. These observations show that, although IDRs may follow unique patterns of amino acid substitutions, intrinsic disorder itself is subject to dynamic evolutionary processes, shaped by more complex evolutionary constraints across evolving properties of different domains of life.
Considering the well-established role of MADS-domain TFs in regulating diverse developmental processes and stress responses, the conservation of a structurally disordered C-terminal region across all family members, along with the presence of conserved motifs, potential activation domains and phosphorylation sites, suggests a shared regulatory mechanism. Based on these findings, we propose a mechanistic model describing the functional role of specific structural elements within the C-terminal domain of both Type I and Type II MADS-domain TFs. The structural flexibility provided by the IDRs in the C-terminal domain, combined with the presence of MoRFs, ADs, and a high phosphorylation propensity, points to a regulatory role in modulating MADS-domain TF activity within transcriptional complexes (Fig 7, S6 Table). In this model, these IDRs confer both structural flexibility and modularity, facilitating the formation of dynamic protein complexes. This may occur either through a high propensity for liquid-liquid phase separation (LLPS) or by promoting transient interactions with other proteins, both of which support the formation of transcriptional condensates (Fig 7). In the present study, we identified MoRFs and ADs within the C-terminal IDRs, where MoRFs likely mediate specific, transient interactions with regulatory proteins or other MADS-domain TFs. In concert with ADs, these elements may modulate the assembly and stability of transcriptional complexes. Previous reports suggest that the formation of condensates enriched in transcription factors enables fine-tuned regulation of transcriptional activity, particularly in response to physiological or developmental cues [131,132]. Within such condensates, a variety of proteins can interact with TF IDRs through their ADs and LLPS-driven mechanisms, further supporting the functional significance of the C-terminal IDRs in MADS-domain TFs.
Overall, this study provides valuable insights for a deeper understanding of the relationship between protein structure and function for MADS-domain TFs. We believe this information will encourage and support further experimental studies by researchers working on MADS-domain proteins in diverse biological systems, especially to investigate the functional relevance of these conserved regions.
Supporting information
S1 Table. MADS-domain proteins of Arabidopsis, other plant species, and non-plant species used in this study.
https://doi.org/10.1371/journal.pone.0330098.s001
(DOCX)
S2 Table. MADS-domain proteins analyzed by Morffy et al. (2024) [64].
AD = activation domain. The AD, Maybe and Not AD categories are defined based on the PADI (plant activation domain identification) score. Fragments with PADI score ≥1 and mean disorder >0.5 are defined as “AD”; those with PADI score ≥1 and mean disorder ≤0.5 are defined as “Maybe”; those with PADI score <1 are defined as “Not AD”.
https://doi.org/10.1371/journal.pone.0330098.s002
(DOCX)
S3 Table. R packages used in this research.
These packages are available from CRAN (https://CRAN.R-project.org/) or Bioconductor (Huber et al., 2015).
https://doi.org/10.1371/journal.pone.0330098.s003
(DOCX)
S4 Table. Arabidopsis MADS-domain proteins, disordered regions and phosphorylation sites.
https://doi.org/10.1371/journal.pone.0330098.s004
(XLSX)
S5 Table. Experimentally tested phosphorylation sites obtained from two different databases Athena (https://athena.proteomics.wzw.tum.de/master_arabidopsisshiny/), and EPSD (https://epsd.biocuckoo.cn/Browse.php).
Sites were mapped on domains, according to UniProt and Liu et al., 2021 definition. See main text for a detailed description.
https://doi.org/10.1371/journal.pone.0330098.s005
(DOCX)
S6 Table. Experimental evidence supporting the functional relevance of the C-terminal region in MADS-domain proteins across different species.
https://doi.org/10.1371/journal.pone.0330098.s006
(DOCX)
S1 Fig. Physicochemical properties of Type I and Type II MADS-domain proteins.
(A) Net charge per residue (NCPR), (B) Charge distribution (Kappa), and (C) Fraction of Charged Residues (FCR). Whiskers indicate ±1.5*IQR based on Tukey test. The middle line represents the median.
https://doi.org/10.1371/journal.pone.0330098.s007
(TIF)
S2 Fig. A complete Maximum Likelihood (ML) phylogeny of Arabidopsis MADS-domain proteins including MADS-domain proteins of other plant species and non-plant species.
Oryza sativa japonica (Os), Solanum dulcamara (SD), Mangifera indica (MI), Populus trichocarpa (PT), Amborella trichopoda (AMB), Pinus radiata (Prad), Selaginella moellendorffii (SM), Chara braunii (CB), Chlorella desiccata (CD), Saccharomyces cerevisiae (SC), Drosophila melanogaster (DM), and Homo sapiens (HS). Yellow-shaded branches cover Type I MADS grouped with SRF-like and MEF-like MADS-domain proteins. Purple-shaded branches cover most Type II MADS-domain proteins. Numbers adjacent to nodes represent bootstrap support. Orange dots at particular nodes indicate the putative ancestral motif for that specific clade.
https://doi.org/10.1371/journal.pone.0330098.s008
(TIF)
S3 Fig. Structural disorder in the C-terminal region and the full-length proteins of MADS-domain transcription factors from different organisms.
Boxplots showing the ADS and PPIDR values for the C-terminal region and the full-length proteins from: (A) Plant species analysed in this study, excluding Arabidopsis. (B) Saccharomyces cerevisiae, (C) Homo sapiens, and (D) Drosophila melanogaster. Whiskers indicate ±1.5*IQR according to the Tukey test. The middle line represents the median.
https://doi.org/10.1371/journal.pone.0330098.s009
(TIF)
S4 Fig. Distinctive motifs in MADS-domain proteins of SOC1 (A) and FLC (B) clades.
The phylogenetic tree was derived from the complete MADS-domain protein tree shown in S2 Fig. To enhance the visualization of phylogenetic relationships among the proteins, branch lengths were rescaled and truncated. The conserved amino acid residues of the SOC1 motif and FLC motif are highlighted in bold within the consensus motif.
https://doi.org/10.1371/journal.pone.0330098.s010
(TIF)
S5 Fig. Identified MoRFs in MADS-domain proteins.
MADS-domain protein sequences from Arabidopsis, other plants, and non-plant organisms are shown, with predicted MoRFs highlighted in yellow. Putative MoRFs were identified using the fMoRFpred algorithm [123], based on the analysis of full-length protein sequences.
https://doi.org/10.1371/journal.pone.0330098.s011
(TIF)
Acknowledgments
The first authors wish to thank Consejo Nacional de Humanidades, Ciencias y Tecnología (CONAHCyT) for the postdoctoral scholarships granted (E.R.A., CVU number 413896; T.N.R., CVU 501149).
References
1. Wells CL, Pigliucci M. Adaptive phenotypic plasticity: the case of heterophylly in aquatic plants. Perspect Plant Ecol Evol Syst. 2000;3:1–18.
2. Sultan SE. Phenotypic plasticity for plant development, function and life history. Trends Plant Sci. 2000;5(12):537–42. pmid:11120476
3. Spitz F, Furlong EEM. Transcription factors: from enhancer binding to developmental control. Nat Rev Genet. 2012;13(9):613–26. pmid:22868264
4. Schwechheimer C, Bevan M. The regulation of transcription factor activity in plants. Trends Plant Sci. 1998;3:378–83.
5. Mosa KA, Ismail A, Helmy M. Plant Stress Tolerance. Springer International Publishing. 2017.
6. Hoang XLT, Nhi DNH, Thu NBA, Thao NP, Tran LSP. Transcription factors and their roles in signal transduction in plants under abiotic stresses. Curr Genomics. 2017;18:483–97.
7. Kaufmann K, Muiño JM, Jauregui R, Airoldi CA, Smaczniak C, Krajewski P, et al. Target genes of the MADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biol. 2009;7(4):e1000090. pmid:19385720
8. Yu L-H, Miao Z-Q, Qi G-F, Wu J, Cai X-T, Mao J-L, et al. MADS-box transcription factor AGL21 regulates lateral root development and responds to multiple external and physiological signals. Mol Plant. 2014;7(11):1653–69. pmid:25122697
9. Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol. 2004;14(3):283–91. pmid:15193307
10. Melzer R, Theissen G. MADS and more: transcription factors that shape the plant. Methods Mol Biol. 2011;754:3–18. pmid:21720944
11. Castelán-Muñoz N, Herrera J, Cajero-Sánchez W, Arrizubieta M, Trejo C, García-Ponce B. MADS-Box genes are key components of genetic regulatory networks involved in abiotic stress and plastic developmental responses in plants. Front Plant Sci. 2019;10:853.
12. Schwarz-Sommer Z, Huijser P, Nacken W, Saedler H, Sommer H. Genetic Control of Flower Development by Homeotic Genes in Antirrhinum majus. Science. 1990;250(4983):931–6. pmid:17746916
13. Wu W, de Folter S, Shen X, Zhang W, Tao S. Vertebrate paralogous MEF2 genes: origin, conservation, and evolution. PLoS One. 2011;6(3):e17334. pmid:21394201
14. Acton TB, Zhong H, Vershon AK. DNA-binding specificity of Mcm1: operator mutations that alter DNA-bending and transcriptional activities by a MADS box protein. Mol Cell Biol. 1997;17(4):1881–9. pmid:9121436
15. Dichoso D, Brodigan T, Chwoe KY, Lee JS, Llacer R, Park M, et al. The MADS-Box factor CeMEF2 is not essential for Caenorhabditis elegans myogenesis and development. Dev Biol. 2000;223(2):431–40. pmid:10882527
16. Lilly B, Galewsky S, Firulli AB, Schulz RA, Olson EN. D-MEF2: a MADS-box transcription factor expressed in differentiating mesoderm and muscle cell lineages during Drosophila embryogenesis. Proc Natl Acad Sci. 1994;91:5662–6.
17. Pollock R, Treisman R. Human SRF-related proteins: DNA-binding properties and potential regulatory targets. Genes Dev. 1991;5(12A):2327–41. pmid:1748287
18. Alvarez-Buylla ER, Liljegren SJ, Pelaz S, Gold SE, Burgeff C, Ditta GS, et al. MADS-box gene evolution beyond flowers: expression in pollen, endosperm, guard cells, roots and trichomes. Plant J. 2000;24(4):457–66. pmid:11115127
19. Becker A, Theissen G. The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol Phylogenet Evol. 2003;29(3):464–89. pmid:14615187
20. Meng D, Cao Y, Chen T, Abdullah M, Jin Q, Fan H, et al. Evolution and functional divergence of MADS-box genes in Pyrus. Sci Rep. 2019;9(1):1266. pmid:30718750
21. Shore P, Sharrocks AD. The MADS-box family of transcription factors. Eur J Biochem. 1995;229(1):1–13. pmid:7744019
22. 22. Qiu Y, Li Z, Walther D, Köhler C. Updated Phylogeny and Protein Structure Predictions Revise the Hypothesis on the Origin of MADS-box Transcription Factors in Land Plants. Mol Biol Evol. 2023;40(9):msad194. pmid:37652031
* View Article
* PubMed/NCBI
* Google Scholar
23. Alvarez-Buylla ER, Pelaz S, Liljegren SJ, Gold SE, Burgeff C, Ditta GS, et al. An ancestral MADS-box gene duplication occurred before the divergence of plants and animals. Proc Natl Acad Sci U S A. 2000;97(10):5328–33. pmid:10805792
24. Smaczniak C, Immink RGH, Angenent GC, Kaufmann K. Developmental and evolutionary diversity of plant MADS-domain factors: insights from recent studies. Development. 2012;139(17):3081–98. pmid:22872082
25. Chen X, Gao B, Ponnusamy M, Lin Z, Liu J. MEF2 signaling and human diseases. Oncotarget. 2017;8(67):112152–65. pmid:29340119
26. Mead J, Bruning AR, Gill MK, Steiner AM, Acton TB, Vershon AK. Interactions of the Mcm1 MADS box protein with cofactors that regulate mating in yeast. Mol Cell Biol. 2002;22(13):4607–21. pmid:12052870
27. Messenguy F, Dubois E. Role of MADS box proteins and their cofactors in combinatorial control of gene expression and cell development. Gene. 2003;316:1–21. pmid:14563547
28. Bemer M, Heijmans K, Airoldi C, Davies B, Angenent GC. An atlas of type I MADS-box gene expression during female gametophyte and seed development in Arabidopsis. Plant Physiol. 2010;154:287–300.
29. Theißen G, Kim JT, Saedler H. Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes. J Mol Evol. 1996;43:484–516.
30. Krizek BA, Meyerowitz EM. Mapping the protein regions responsible for the functional specificities of the Arabidopsis MADS domain organ-identity proteins. Proc Natl Acad Sci U S A. 1996;93(9):4063–70. pmid:8633017
31. Riechmann JL, Krizek BA, Meyerowitz EM. Dimerization specificity of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA, and AGAMOUS. Proc Natl Acad Sci U S A. 1996;93(10):4793–8. pmid:8643482
32. Yang Y, Jack T. Defining subdomains of the K domain important for protein-protein interactions of plant MADS proteins. Plant Mol Biol. 2004;55(1):45–59. pmid:15604664
33. Lai X, Vega-Léon R, Hugouvieux V, Blanc-Mathieu R, van der Wal F, Lucas J, et al. The intervening domain is required for DNA-binding and functional identity of plant MADS transcription factors. Nat Commun. 2021;12:4760.
34. Fan HY, Hu Y, Tudor M, Ma H. Specific interactions between the K domains of AG and AGLs, members of the MADS domain family of DNA binding proteins. Plant J. 1997;12(5):999–1010. pmid:9418042
35. Riechmann JL, Meyerowitz EM. MADS domain proteins in plant development. Biol Chem. 1997;378(10):1079–101. pmid:9372178
36. Parenicová L, de Folter S, Kieffer M, Horner DS, Favalli C, Busscher J, et al. Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world. Plant Cell. 2003;15(7):1538–51. pmid:12837945
37. Salladini E, Jørgensen MLM, Theisen FF, Skriver K. Intrinsic Disorder in Plant Transcription Factor Systems: Functional Implications. Int J Mol Sci. 2020;21(24):9755. pmid:33371315
38. Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradović Z. Intrinsic disorder and protein function. Biochemistry. 2002;41(21):6573–82. pmid:12022860
39. Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, et al. Intrinsically disordered protein. J Mol Graph Model. 2001;19(1):26–59. pmid:11381529
40. Radivojac P, Obradovic Z, Smith DK, Zhu G, Vucetic S, Brown CJ. Protein flexibility and intrinsic disorder. Protein Sci. 2004;13:71–80.
41. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005;6(3):197–208. pmid:15738986
42. Tompa P. Intrinsically disordered proteins: A 10-year recap. Trends Biochem Sci. 2012;37:509–16.
43. O’Shea C, Kryger M, Stender EGP, Kragelund BB, Willemoës M, Skriver K. Protein intrinsic disorder in Arabidopsis NAC transcription factors: transcriptional activation by ANAC013 and ANAC046 and their interactions with RCD1. Biochem J. 2015;465(2):281–94. pmid:25348421
44. Stender EG, O’Shea C, Skriver K. Subgroup-specific intrinsic disorder profiles of Arabidopsis NAC transcription factors: Identification of functional hotspots. Plant Signal Behav. 2015;10:e1010967.
45. Sun X, Jones WT, Rikkerink EHA. GRAS proteins: the versatile roles of intrinsically disordered proteins in plant signalling. Biochem J. 2012;442(1):1–12. pmid:22280012
46. Valsecchi I, Guittard-Crilat E, Maldiney R, Habricot Y, Lignon S, Lebrun R, et al. The intrinsically disordered C-terminal region of Arabidopsis thaliana TCP8 transcription factor acts both as a transactivation and self-assembly domain. Mol Biosyst. 2013;9(9):2282–95. pmid:23760157
47. Tarczewska A, Greb-Markiewicz B. The Significance of the Intrinsically Disordered Regions for the Functions of the bHLH Transcription Factors. Int J Mol Sci. 2019;20(21):5306. pmid:31653121
48. Oldfield CJ, Cheng Y, Cortese MS, Romero P, Uversky VN, Dunker AK. Coupled folding and binding with alpha-helix-forming molecular recognition elements. Biochemistry. 2005;44(37):12454–70. pmid:16156658
49. Yan J, Dunker AK, Uversky VN, Kurgan L. Molecular recognition features (MoRFs) in three domains of life. Mol Biosyst. 2016;12(3):697–710. pmid:26651072
50. Mohan A, Oldfield CJ, Radivojac P, Vacic V, Cortese MS, Dunker AK, et al. Analysis of molecular recognition features (MoRFs). J Mol Biol. 2006;362(5):1043–59. pmid:16935303
51. Vacic V, Oldfield CJ, Mohan A, Radivojac P, Cortese MS, Uversky VN, et al. Characterization of molecular recognition features, MoRFs, and their binding partners. J Proteome Res. 2007;6(6):2351–66. pmid:17488107
52. Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc Natl Acad Sci U S A. 2013;110(33):13392–7. pmid:23901099
53. Bhattarai A, Emerson IA. Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci. 2020;45:29. pmid:32020911
54. Dunker AK, Cortese MS, Romero P, Iakoucheva LM, Uversky VN. Flexible nets. FEBS Journal. 2005;272:5129–48.
55. Alberti S. The wisdom of crowds: regulating cell function through condensed states of living matter. J Cell Sci. 2017;130(17):2789–96. pmid:28808090
56. Alberti S, Hyman AA. Biomolecular condensates at the nexus of cellular stress, protein aggregation disease and ageing. Nat Rev Mol Cell Biol. 2021;22(3):196–213. pmid:33510441
57. TAIR - Home. Available from: https://www.arabidopsis.org/.
58. UniProt. Available from: https://www.uniprot.org/.
59. National Center for Biotechnology Information. National Center for Biotechnology Information. Available from: https://www.ncbi.nlm.nih.gov/.
60. RAP-DB | HOME. Available from: https://rapdb.dna.affrc.go.jp/.
61. Arora R, Agarwal P, Ray S, Singh AK, Singh VP, Tyagi AK, et al. MADS-box gene family in rice: genome-wide identification, organization and expression profiling during reproductive development and stress. BMC Genomics. 2007;8:242. pmid:17640358
62. MAFFT alignment and NJ/ UPGMA phylogeny. Available from: https://mafft.cbrc.jp/alignment/server/.
63. Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20(4):1160–6. pmid:28968734
64. Larsson A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics. 2014;30(22):3276–8. pmid:25095880
65. Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8(3):275–82. pmid:1633570
66. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. pmid:24451623
67. Miller MA, Pfeiffer W, Schwartz T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: 2010 Gateway Computing Environments Workshop, GCE. 2010.
68. Portal | CIPRES. Available from: https://www.phylo.org/index.php/site.
69. Meme - Submission Form. Available from: https://meme-suite.org/meme/tools/meme.
70. Bailey TL, Johnson J, Grant CE, Noble WS. The MEME Suite. Nucleic Acids Research. 2015;43:W39–49.
71. fMoRFpred - fast Molecular Recognition Feature predictor. Available from: http://biomine.cs.vcu.edu/servers/fMoRFpred/.
72. Morffy N, Van den Broeck L, Miller C, Emenecker RJ, Bryant JA Jr, Lee TM, et al. Identification of plant transcriptional activation domains. Nature. 2024;632(8023):166–73. pmid:39020176
73. Piovesan D, Del Conte A, Clementel D, Monzon AM, Bevilacqua M, Aspromonte MC, et al. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res. 2023;51:D438–44.
74. AlphaFold. Google Colab. Available from: https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb.
75. Meng EC, Goddard TD, Pettersen EF, Couch GS, Pearson ZJ, Morris JH, et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 2023;32(11):e4792. pmid:37774136
76. Annotation from 3D structure data. Available from: https://www.jalview.org/help/html/features/xsspannotation.html.
77. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. pmid:34265844
78. Wilson CJ, Choy WY, Karttunen M. AlphaFold2: A role for disordered protein/region prediction? Int J Mol Sci. 2022;23:4591.
79. RIDAO: Rapid Intrinsic Disorder Analysis Online. Available from: https://ridao.app/users/sign_in.
80. Dayhoff GW 2nd, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci. 2022;31(12):e4496. pmid:36334049
81. Shukla S, Lastorka SS, Uversky VN. Intrinsic Disorder and Phase Separation Coordinate Exocytosis, Motility, and Chromatin Remodeling in the Human Acrosomal Proteome. Proteomes. 2025;13(2):16. pmid:40407495
82. Pappu Lab. Available from: https://pappulab.wustl.edu/CIDERinfo.html.
83. Holehouse AS, Das RK, Ahad JN, Richardson MOG, Pappu RV. CIDER: Resources to Analyze Sequence-Ensemble Relationships of Intrinsically Disordered Proteins. Biophys J. 2017;112(1):16–21. pmid:28076807
84. NetPhos 3.1 - DTU Health Tech - Bioinformatic Services. Available from: https://services.healthtech.dtu.dk/services/NetPhos-3.1/.
85. Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 1999;294:1351–62.
86. Blom N, Sicheritz-Pontén T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004;4(6):1633–49. pmid:15174133
87. ATHENA. Available from: https://athena.proteomics.wzw.tum.de/master_arabidopsisshiny/.
88. Lin S, Wang C, Zhou J, Shi Y, Ruan C, Tu Y, et al. EPSD: a well-annotated data resource of protein phosphorylation sites in eukaryotes. Brief Bioinform. 2021;22(1):298–307. pmid:32008039
89. Hatos A, Tosatto SCE, Vendruscolo M, Fuxreiter M. FuzDrop on AlphaFold: visualizing the sequence-dependent propensity of liquid-liquid phase separation and aggregation of proteins. Nucleic Acids Res. 2022;50(W1):W337–44. pmid:35610022
90. FuzDrop. Available from: https://fuzdrop.bio.unipd.it/predictor
91. Hothorn T, Van De Wiel MA, Hornik K, Zeileis A. Implementing a class of permutation tests: The coin package. J Stat Softw. 2008;28:1–23.
92. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. 2024.
93. GIMP Development Team. GNU Image Manipulation Program (GIMP), Version 3.0.4. Community. 2025.
94. Iakoucheva LM, Radivojac P, Brown CJ, O’Connor TR, Sikes JG, Obradovic Z, et al. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32(3):1037–49. pmid:14960716
95. Margulis L, Chapman MJ. Kingdoms and Domains: An Illustrated Guide to the Phyla of Life on Earth, Fourth Edition. Elsevier. 2009.
96. Alvarez-Buylla ER, García-Ponce B, Garay-Arroyo A. Unique and redundant functional domains of APETALA1 and CAULIFLOWER, two recently duplicated Arabidopsis thaliana floral MADS-box genes. J Exp Bot. 2006;57(12):3099–107. pmid:16893974
97. Trojanowski J, Frank L, Rademacher A, Mücke N, Grigaitis P, Rippe K. Transcription activation is enhanced by multivalent interactions independent of phase separation. Mol Cell. 2022;82(10):1878–1893.e10. pmid:35537448
98. Liu J, Perumal NB, Oldfield CJ, Su EW, Uversky VN, Dunker AK. Intrinsic disorder in transcription factors. Biochemistry. 2006;45(22):6873–88. pmid:16734424
99. Dyson HJ, Wright PE. Role of Intrinsic Protein Disorder in the Function and Interactions of the Transcriptional Coactivators CREB-binding Protein (CBP) and p300. J Biol Chem. 2016;291(13):6714–22. pmid:26851278
100. Vandenbussche M, Theissen G, Van de Peer Y, Gerats T. Structural diversification and neo-functionalization during floral MADS-box gene evolution by C-terminal frameshift mutations. Nucleic Acids Res. 2003;31(15):4401–9. pmid:12888499
101. Cho S, Jang S, Chae S, Chung KM, Moon YH, An G, et al. Analysis of the C-terminal region of Arabidopsis thaliana APETALA1 as a transcription activation domain. Plant Mol Biol. 1999;40(3):419–29. pmid:10437826
102. Kaufmann K, Melzer R, Theissen G. MIKC-type MADS-domain proteins: structural modularity, protein interactions and network evolution in land plants. Gene. 2005;347(2):183–98. pmid:15777618
103. Egea-Cortines M, Saedler H, Sommer H. Ternary complex formation between the MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. EMBO J. 1999;18(19):5370–9. pmid:10508169
104. Sridhar VV, Surendrarao A, Liu Z. APETALA1 and SEPALLATA3 interact with SEUSS to mediate transcription repression during flower development. Development. 2006;133(16):3159–66. pmid:16854969
105. Shiraishi H, Okada K, Shimura Y. Nucleotide sequences recognized by the AGAMOUS MADS domain of Arabidopsis thaliana in vitro. Plant J. 1993;4(2):385–98. pmid:8106084
106. Mizukami Y, Ma H. Determination of Arabidopsis floral meristem identity by AGAMOUS. Plant Cell. 1997;9(3):393–408. pmid:9090883
107. Puranik S, Acajjaoui S, Conn S, Costa L, Conn V, Vial A, et al. Structural basis for the oligomerization of the MADS domain transcription factor SEPALLATA3 in Arabidopsis. Plant Cell. 2014;26(9):3603–15. pmid:25228343
108. Amoutzias GD, Robertson DL, Van de Peer Y, Oliver SG. Choose your partners: dimerization in eukaryotic transcription factors. Trends Biochem Sci. 2008;33(5):220–9. pmid:18406148
109. Martin JF, Miano JM, Hustad CM, Copeland NG, Jenkins NA, Olson EN. A Mef2 gene that generates a muscle-specific isoform via alternative mRNA splicing. Mol Cell Biol. 1994;14(3):1647–56. pmid:8114702
110. Molkentin JD, Li L, Olson EN. Phosphorylation of the MADS-Box transcription factor MEF2C enhances its DNA binding activity. J Biol Chem. 1996;271(29):17199–204. pmid:8663403
111. Wang X, She H, Mao Z. Phosphorylation of neuronal survival factor MEF2D by glycogen synthase kinase 3beta in neuronal apoptosis. J Biol Chem. 2009;284(47):32619–26. pmid:19801631
112. Darling AL, Uversky VN. Intrinsic disorder and posttranslational modifications: The darker side of the biological dark matter. Front Genet. 2018;4:9.
113. Jin F, Grater F. How multisite phosphorylation impacts the conformations of intrinsically disordered proteins. PLoS Comput Biol. 2021;17:e1008939.
114. Nosella ML, Forman-Kay JD. Phosphorylation-dependent regulation of messenger RNA transcription, processing and translation within biomolecular condensates. Curr Opin Cell Biol. 2021;69:30–40. pmid:33450720
115. Shnitkind S, Martinez-Yamout MA, Dyson HJ, Wright PE. Structural Basis for Graded Inhibition of CREB:DNA Interactions by Multisite Phosphorylation. Biochemistry. 2018;57(51):6964–72. pmid:30507144
116. De Bodt S, Raes J, Van de Peer Y, Theissen G. And then there were many: MADS goes genomic. Trends Plant Sci. 2003;8(10):475–83. pmid:14557044
117. Veron AS, Kaufmann K, Bornberg-Bauer E. Evidence of interaction network evolution by whole-genome duplications: a case study in MADS-box proteins. Mol Biol Evol. 2007;24(3):670–8. pmid:17175526
118. Strader L, Weijers D, Wagner D. Plant transcription factors - being in the right place with the right company. Curr Opin Plant Biol. 2022;65:102136. pmid:34856504
119. Zahn LM, Leebens-Mack JH, Arrington JM, Hu Y, Landherr LL, DePamphilis CW. Conservation and divergence in the AGAMOUS subfamily of MADS-box genes: evidence of independent sub- and neofunctionalization events. Evol Dev. 2006;8:30–45.
120. Bemer M, Wolters-Arts M, Grossniklaus U, Angenent GC. The MADS domain protein DIANA acts together with AGAMOUS-LIKE80 to specify the central cell in Arabidopsis ovules. Plant Cell. 2008;20(8):2088–101. pmid:18713950
121. Colombo M, Masiero S, Vanzulli S, Lardelli P, Kater MM, Colombo L. AGL23, a type I MADS-box gene that controls female gametophyte and embryo development in Arabidopsis. Plant J. 2008;54(6):1037–48. pmid:18346189
122. Kang IH, Steffen JG, Portereiko MF, Lloyd A, Drews GN. The AGL62 MADS domain protein regulates cellularization during endosperm development in Arabidopsis. Plant Cell. 2008;20:635–47.
123. Köhler C, Hennig L, Spillane C, Pien S, Gruissem W, Grossniklaus U. The Polycomb-group protein MEDEA regulates seed development by controlling expression of the MADS-box gene PHERES1. Genes Dev. 2003;17:1540–53.
124. Zhang WJ, Zhou Y, Zhang Y, Su YH, Xu T. Protein phosphorylation: A molecular switch in plant signaling. Cell Rep. 2023;42(7):112729. pmid:37405922
125. Borcherds W, Bremer A, Borgia MB, Mittag T. How do intrinsically disordered protein regions encode a driving force for liquid-liquid phase separation? Curr Opin Struct Biol. 2021;67:41–50. pmid:33069007
126. Pei G, Lyons H, Li P, Sabari BR. Transcription regulation by biomolecular condensates. Nat Rev Mol Cell Biol. 2025;26(3):213–36. pmid:39516712
127. Sabari BR. Biomolecular Condensates and Gene Activation in Development and Disease. Dev Cell. 2020;55(1):84–96. pmid:33049213
128. Wei M-T, Chang Y-C, Shimobayashi SF, Shin Y, Strom AR, Brangwynne CP. Nucleated transcriptional condensates amplify gene expression. Nat Cell Biol. 2020;22(10):1187–96. pmid:32929202
129. Zarin T, Strome B, Nguyen Ba AN, Alberti S, Forman-Kay JD, Moses AM. Proteome-wide signatures of function in highly diverged intrinsically disordered regions. Elife. 2019;8:e46883. pmid:31264965
130. Singleton MD, Eisen MB. Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. PLoS Comput Biol. 2024;20(4):e1012028. pmid:38662765
131. Mann R, Notani D. Transcription factor condensates and signaling driven transcription. Nucleus. 2023;14:2205758.
132. Boija A, Klein IA, Sabari BR, Dall’Agnese A, Coffey EL, Zamudio AV, et al. Transcription Factors Activate Genes through the Phase-Separation Capacity of Their Activation Domains. Cell. 2018;175(7):1842–1855.e16. pmid:30449618
Citation: Ramírez-Aguirre E, Nava-Ramírez TB, Covarrubias AA, Garay-Arroyo A (2025) Structural disorder and distinctive motifs in the C-terminal region of the MADS-domain transcription factors are conserved across diverse taxa. PLoS One 20(8): e0330098. https://doi.org/10.1371/journal.pone.0330098
About the Authors:
Erandi Ramírez-Aguirre
Contributed equally to this work with: Erandi Ramírez-Aguirre, Teresa Beatriz Nava-Ramírez
Roles: Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing
Affiliation: Laboratorio de Genética Molecular, Desarrollo y Evolución de Plantas, Depto. de Ecología Funcional, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, México
ORCID: https://orcid.org/0000-0002-5290-4517
Teresa Beatriz Nava-Ramírez
Contributed equally to this work with: Erandi Ramírez-Aguirre, Teresa Beatriz Nava-Ramírez
Roles: Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing
Affiliation: Departamento de Biología Molecular de Plantas, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, México
Alejandra A. Covarrubias
Roles: Conceptualization, Formal analysis, Investigation, Supervision, Writing – original draft, Writing – review & editing
E-mail: [email protected] (AG-A); [email protected] (AAC)
Affiliation: Departamento de Biología Molecular de Plantas, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, México
Adriana Garay-Arroyo
Roles: Conceptualization, Funding acquisition, Investigation, Supervision, Writing – original draft, Writing – review & editing
E-mail: [email protected] (AG-A); [email protected] (AAC)
Affiliation: Laboratorio de Genética Molecular, Desarrollo y Evolución de Plantas, Depto. de Ecología Funcional, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, México
ORCID: https://orcid.org/0000-0003-1575-6284
1. Wells CL, Pigliucci M. Adaptive phenotypic plasticity: the case of heterophylly in aquatic plants. Perspect Plant Ecol Evol Syst. 2000;3:1–18.
2. Sultan SE. Phenotypic plasticity for plant development, function and life history. Trends Plant Sci. 2000;5(12):537–42. pmid:11120476
3. Spitz F, Furlong EEM. Transcription factors: from enhancer binding to developmental control. Nat Rev Genet. 2012;13(9):613–26. pmid:22868264
4. Schwechheimer C, Bevan M. The regulation of transcription factor activity in plants. Trends Plant Sci. 1998;3:378–83.
5. Mosa KA, Ismail A, Helmy M. Plant Stress Tolerance. Springer International Publishing. 2017.
6. Hoang XLT, Nhi DNH, Thu NBA, Thao NP, Tran LSP. Transcription factors and their roles in signal transduction in plants under abiotic stresses. Curr Genomics. 2017;18:483–97.
7. Kaufmann K, Muiño JM, Jauregui R, Airoldi CA, Smaczniak C, Krajewski P, et al. Target genes of the MADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biol. 2009;7(4):e1000090. pmid:19385720
8. Yu L-H, Miao Z-Q, Qi G-F, Wu J, Cai X-T, Mao J-L, et al. MADS-box transcription factor AGL21 regulates lateral root development and responds to multiple external and physiological signals. Mol Plant. 2014;7(11):1653–69. pmid:25122697
9. Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol. 2004;14(3):283–91. pmid:15193307
10. Melzer R, Theissen G. MADS and more: transcription factors that shape the plant. Methods Mol Biol. 2011;754:3–18. pmid:21720944
11. Castelán-Muñoz N, Herrera J, Cajero-Sánchez W, Arrizubieta M, Trejo C, García-Ponce B. MADS-Box genes are key components of genetic regulatory networks involved in abiotic stress and plastic developmental responses in plants. Front Plant Sci. 2019;10:853.
12. Schwarz-Sommer Z, Huijser P, Nacken W, Saedler H, Sommer H. Genetic Control of Flower Development by Homeotic Genes in Antirrhinum majus. Science. 1990;250(4983):931–6. pmid:17746916
13. Wu W, de Folter S, Shen X, Zhang W, Tao S. Vertebrate paralogous MEF2 genes: origin, conservation, and evolution. PLoS One. 2011;6(3):e17334. pmid:21394201
14. Acton TB, Zhong H, Vershon AK. DNA-binding specificity of Mcm1: operator mutations that alter DNA-bending and transcriptional activities by a MADS box protein. Mol Cell Biol. 1997;17(4):1881–9. pmid:9121436
15. Dichoso D, Brodigan T, Chwoe KY, Lee JS, Llacer R, Park M, et al. The MADS-Box factor CeMEF2 is not essential for Caenorhabditis elegans myogenesis and development. Dev Biol. 2000;223(2):431–40. pmid:10882527
16. Lilly B, Galewsky S, Firulli AB, Schulz RA, Olson EN. D-MEF2: a MADS-box transcription factor expressed in differentiating mesoderm and muscle cell lineages during Drosophila embryogenesis. Proc Natl Acad Sci. 1994;91:5662–6.
17. Pollock R, Treisman R. Human SRF-related proteins: DNA-binding properties and potential regulatory targets. Genes Dev. 1991;5(12A):2327–41. pmid:1748287
18. Alvarez-Buylla ER, Liljegren SJ, Pelaz S, Gold SE, Burgeff C, Ditta GS, et al. MADS-box gene evolution beyond flowers: expression in pollen, endosperm, guard cells, roots and trichomes. Plant J. 2000;24(4):457–66. pmid:11115127
19. Becker A, Theissen G. The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol Phylogenet Evol. 2003;29(3):464–89. pmid:14615187
20. Meng D, Cao Y, Chen T, Abdullah M, Jin Q, Fan H, et al. Evolution and functional divergence of MADS-box genes in Pyrus. Sci Rep. 2019;9(1):1266. pmid:30718750
21. Shore P, Sharrocks AD. The MADS-box family of transcription factors. Eur J Biochem. 1995;229(1):1–13. pmid:7744019
22. Qiu Y, Li Z, Walther D, Köhler C. Updated Phylogeny and Protein Structure Predictions Revise the Hypothesis on the Origin of MADS-box Transcription Factors in Land Plants. Mol Biol Evol. 2023;40(9):msad194. pmid:37652031
23. Alvarez-Buylla ER, Pelaz S, Liljegren SJ, Gold SE, Burgeff C, Ditta GS, et al. An ancestral MADS-box gene duplication occurred before the divergence of plants and animals. Proc Natl Acad Sci U S A. 2000;97(10):5328–33. pmid:10805792
24. Smaczniak C, Immink RGH, Angenent GC, Kaufmann K. Developmental and evolutionary diversity of plant MADS-domain factors: insights from recent studies. Development. 2012;139(17):3081–98. pmid:22872082
25. Chen X, Gao B, Ponnusamy M, Lin Z, Liu J. MEF2 signaling and human diseases. Oncotarget. 2017;8(67):112152–65. pmid:29340119
26. Mead J, Bruning AR, Gill MK, Steiner AM, Acton TB, Vershon AK. Interactions of the Mcm1 MADS box protein with cofactors that regulate mating in yeast. Mol Cell Biol. 2002;22(13):4607–21. pmid:12052870
27. Messenguy F, Dubois E. Role of MADS box proteins and their cofactors in combinatorial control of gene expression and cell development. Gene. 2003;316:1–21. pmid:14563547
28. Bemer M, Heijmans K, Airoldi C, Davies B, Angenent GC. An atlas of type I MADS-box gene expression during female gametophyte and seed development in Arabidopsis. Plant Physiol. 2010;154:287–300.
29. Theißen G, Kim JT, Saedler H. Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes. J Mol Evol. 1996;43:484–516.
30. Krizek BA, Meyerowitz EM. Mapping the protein regions responsible for the functional specificities of the Arabidopsis MADS domain organ-identity proteins. Proc Natl Acad Sci U S A. 1996;93(9):4063–70. pmid:8633017
31. Riechmann JL, Krizek BA, Meyerowitz EM. Dimerization specificity of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA, and AGAMOUS. Proc Natl Acad Sci U S A. 1996;93(10):4793–8. pmid:8643482
32. Yang Y, Jack T. Defining subdomains of the K domain important for protein-protein interactions of plant MADS proteins. Plant Mol Biol. 2004;55(1):45–59. pmid:15604664
33. Lai X, Vega-Léon R, Hugouvieux V, Blanc-Mathieu R, van der Wal F, Lucas J, et al. The intervening domain is required for DNA-binding and functional identity of plant MADS transcription factors. Nat Commun. 2021;12:4760.
34. Fan HY, Hu Y, Tudor M, Ma H. Specific interactions between the K domains of AG and AGLs, members of the MADS domain family of DNA binding proteins. Plant J. 1997;12(5):999–1010. pmid:9418042
35. Riechmann JL, Meyerowitz EM. MADS domain proteins in plant development. Biol Chem. 1997;378(10):1079–101. pmid:9372178
36. Parenicová L, de Folter S, Kieffer M, Horner DS, Favalli C, Busscher J, et al. Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world. Plant Cell. 2003;15(7):1538–51. pmid:12837945
37. Salladini E, Jørgensen MLM, Theisen FF, Skriver K. Intrinsic Disorder in Plant Transcription Factor Systems: Functional Implications. Int J Mol Sci. 2020;21(24):9755. pmid:33371315
38. Dunker AK, Brown CJ, Lawson JD, Iakoucheva LM, Obradović Z. Intrinsic disorder and protein function. Biochemistry. 2002;41(21):6573–82. pmid:12022860
39. Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, et al. Intrinsically disordered protein. J Mol Graph Model. 2001;19(1):26–59. pmid:11381529
40. Radivojac P, Obradovic Z, Smith DK, Zhu G, Vucetic S, Brown CJ. Protein flexibility and intrinsic disorder. Protein Sci. 2004;13:71–80.
41. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005;6(3):197–208. pmid:15738986
42. Tompa P. Intrinsically disordered proteins: A 10-year recap. Trends Biochem Sci. 2012;37:509–16.
43. O’Shea C, Kryger M, Stender EGP, Kragelund BB, Willemoës M, Skriver K. Protein intrinsic disorder in Arabidopsis NAC transcription factors: transcriptional activation by ANAC013 and ANAC046 and their interactions with RCD1. Biochem J. 2015;465(2):281–94. pmid:25348421
44. Stender EG, O’Shea C, Skriver K. Subgroup-specific intrinsic disorder profiles of Arabidopsis NAC transcription factors: Identification of functional hotspots. Plant Signal Behav. 2015;10:e1010967.
45. Sun X, Jones WT, Rikkerink EHA. GRAS proteins: the versatile roles of intrinsically disordered proteins in plant signalling. Biochem J. 2012;442(1):1–12. pmid:22280012
46. Valsecchi I, Guittard-Crilat E, Maldiney R, Habricot Y, Lignon S, Lebrun R, et al. The intrinsically disordered C-terminal region of Arabidopsis thaliana TCP8 transcription factor acts both as a transactivation and self-assembly domain. Mol Biosyst. 2013;9(9):2282–95. pmid:23760157
47. Tarczewska A, Greb-Markiewicz B. The Significance of the Intrinsically Disordered Regions for the Functions of the bHLH Transcription Factors. Int J Mol Sci. 2019;20(21):5306. pmid:31653121
48. Oldfield CJ, Cheng Y, Cortese MS, Romero P, Uversky VN, Dunker AK. Coupled folding and binding with alpha-helix-forming molecular recognition elements. Biochemistry. 2005;44(37):12454–70. pmid:16156658
49. Yan J, Dunker AK, Uversky VN, Kurgan L. Molecular recognition features (MoRFs) in three domains of life. Mol Biosyst. 2016;12(3):697–710. pmid:26651072
50. Mohan A, Oldfield CJ, Radivojac P, Vacic V, Cortese MS, Dunker AK, et al. Analysis of molecular recognition features (MoRFs). J Mol Biol. 2006;362(5):1043–59. pmid:16935303
51. Vacic V, Oldfield CJ, Mohan A, Radivojac P, Cortese MS, Uversky VN, et al. Characterization of molecular recognition features, MoRFs, and their binding partners. J Proteome Res. 2007;6(6):2351–66. pmid:17488107
52. Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc Natl Acad Sci U S A. 2013;110(33):13392–7. pmid:23901099
53. Bhattarai A, Emerson IA. Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci. 2020;45:29. pmid:32020911
54. Dunker AK, Cortese MS, Romero P, Iakoucheva LM, Uversky VN. Flexible nets. FEBS J. 2005;272:5129–48.
55. Alberti S. The wisdom of crowds: regulating cell function through condensed states of living matter. J Cell Sci. 2017;130(17):2789–96. pmid:28808090
56. Alberti S, Hyman AA. Biomolecular condensates at the nexus of cellular stress, protein aggregation disease and ageing. Nat Rev Mol Cell Biol. 2021;22(3):196–213. pmid:33510441
57. TAIR - Home. Available from: https://www.arabidopsis.org/.
58. UniProt. Available from: https://www.uniprot.org/.
59. National Center for Biotechnology Information. Available from: https://www.ncbi.nlm.nih.gov/.
60. RAP-DB | HOME. Available from: https://rapdb.dna.affrc.go.jp/.
61. Arora R, Agarwal P, Ray S, Singh AK, Singh VP, Tyagi AK, et al. MADS-box gene family in rice: genome-wide identification, organization and expression profiling during reproductive development and stress. BMC Genomics. 2007;8:242. pmid:17640358
62. MAFFT alignment and NJ/UPGMA phylogeny. Available from: https://mafft.cbrc.jp/alignment/server/.
63. Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20(4):1160–6. pmid:28968734
64. Larsson A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics. 2014;30(22):3276–8. pmid:25095880
65. Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8(3):275–82. pmid:1633570
66. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. pmid:24451623
67. Miller MA, Pfeiffer W, Schwartz T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: 2010 Gateway Computing Environments Workshop, GCE. 2010.
68. Portal | CIPRES. Available from: https://www.phylo.org/index.php/site.
69. Meme - Submission Form. Available from: https://meme-suite.org/meme/tools/meme.
70. Bailey TL, Johnson J, Grant CE, Noble WS. The MEME Suite. Nucleic Acids Res. 2015;43:W39–49.
71. fMoRFpred - fast Molecular Recognition Feature predictor. Available from: http://biomine.cs.vcu.edu/servers/fMoRFpred/.
72. Morffy N, Van den Broeck L, Miller C, Emenecker RJ, Bryant JA Jr, Lee TM, et al. Identification of plant transcriptional activation domains. Nature. 2024;632(8023):166–73. pmid:39020176
73. Piovesan D, Del Conte A, Clementel D, Monzon AM, Bevilacqua M, Aspromonte MC, et al. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res. 2023;51:D438–44.
74. AlphaFold. Google Colab. Available from: https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb.
75. Meng EC, Goddard TD, Pettersen EF, Couch GS, Pearson ZJ, Morris JH, et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 2023;32(11):e4792. pmid:37774136
76. Annotation from 3D structure data. Available from: https://www.jalview.org/help/html/features/xsspannotation.html.
77. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. pmid:34265844
78. Wilson CJ, Choy WY, Karttunen M. AlphaFold2: A role for disordered protein/region prediction? Int J Mol Sci. 2022;23:4591.
79. RIDAO: Rapid Intrinsic Disorder Analysis Online. Available from: https://ridao.app/users/sign_in.
80. Dayhoff GW 2nd, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci. 2022;31(12):e4496. pmid:36334049
81. Shukla S, Lastorka SS, Uversky VN. Intrinsic Disorder and Phase Separation Coordinate Exocytosis, Motility, and Chromatin Remodeling in the Human Acrosomal Proteome. Proteomes. 2025;13(2):16. pmid:40407495
82. Pappu Lab. Available from: https://pappulab.wustl.edu/CIDERinfo.html.
83. Holehouse AS, Das RK, Ahad JN, Richardson MOG, Pappu RV. CIDER: Resources to Analyze Sequence-Ensemble Relationships of Intrinsically Disordered Proteins. Biophys J. 2017;112(1):16–21. pmid:28076807
84. NetPhos 3.1 - DTU Health Tech - Bioinformatic Services. Available from: https://services.healthtech.dtu.dk/services/NetPhos-3.1/.
85. Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol. 1999;294:1351–62.
86. Blom N, Sicheritz-Pontén T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004;4(6):1633–49. pmid:15174133
87. ATHENA. Available from: https://athena.proteomics.wzw.tum.de/master_arabidopsisshiny/.
88. Lin S, Wang C, Zhou J, Shi Y, Ruan C, Tu Y, et al. EPSD: a well-annotated data resource of protein phosphorylation sites in eukaryotes. Brief Bioinform. 2021;22(1):298–307. pmid:32008039
89. Hatos A, Tosatto SCE, Vendruscolo M, Fuxreiter M. FuzDrop on AlphaFold: visualizing the sequence-dependent propensity of liquid-liquid phase separation and aggregation of proteins. Nucleic Acids Res. 2022;50(W1):W337–44. pmid:35610022
90. FuzDrop. Available from: https://fuzdrop.bio.unipd.it/predictor.
91. Hothorn T, Van De Wiel MA, Hornik K, Zeileis A. Implementing a class of permutation tests: The coin package. J Stat Softw. 2008;28:1–23.
92. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. 2024.
93. GIMP Development Team. GNU Image Manipulation Program (GIMP), Version 3.0.4. Community. 2025.
94. Iakoucheva LM, Radivojac P, Brown CJ, O’Connor TR, Sikes JG, Obradovic Z, et al. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32(3):1037–49. pmid:14960716
95. Margulis L, Chapman MJ. Kingdoms and Domains: An Illustrated Guide to the Phyla of Life on Earth, Fourth Edition. Elsevier. 2009.
96. Alvarez-Buylla ER, García-Ponce B, Garay-Arroyo A. Unique and redundant functional domains of APETALA1 and CAULIFLOWER, two recently duplicated Arabidopsis thaliana floral MADS-box genes. J Exp Bot. 2006;57(12):3099–107. pmid:16893974
97. Trojanowski J, Frank L, Rademacher A, Mücke N, Grigaitis P, Rippe K. Transcription activation is enhanced by multivalent interactions independent of phase separation. Mol Cell. 2022;82(10):1878–1893.e10. pmid:35537448
98. Liu J, Perumal NB, Oldfield CJ, Su EW, Uversky VN, Dunker AK. Intrinsic disorder in transcription factors. Biochemistry. 2006;45(22):6873–88. pmid:16734424
99. Dyson HJ, Wright PE. Role of Intrinsic Protein Disorder in the Function and Interactions of the Transcriptional Coactivators CREB-binding Protein (CBP) and p300. J Biol Chem. 2016;291(13):6714–22. pmid:26851278
100. Vandenbussche M, Theissen G, Van de Peer Y, Gerats T. Structural diversification and neo-functionalization during floral MADS-box gene evolution by C-terminal frameshift mutations. Nucleic Acids Res. 2003;31(15):4401–9. pmid:12888499
101. Cho S, Jang S, Chae S, Chung KM, Moon YH, An G, et al. Analysis of the C-terminal region of Arabidopsis thaliana APETALA1 as a transcription activation domain. Plant Mol Biol. 1999;40(3):419–29. pmid:10437826
102. Kaufmann K, Melzer R, Theissen G. MIKC-type MADS-domain proteins: structural modularity, protein interactions and network evolution in land plants. Gene. 2005;347(2):183–98. pmid:15777618
103. Egea-Cortines M, Saedler H, Sommer H. Ternary complex formation between the MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. EMBO J. 1999;18(19):5370–9. pmid:10508169
104. Sridhar VV, Surendrarao A, Liu Z. APETALA1 and SEPALLATA3 interact with SEUSS to mediate transcription repression during flower development. Development. 2006;133(16):3159–66. pmid:16854969
105. Shiraishi H, Okada K, Shimura Y. Nucleotide sequences recognized by the AGAMOUS MADS domain of Arabidopsis thaliana in vitro. Plant J. 1993;4(2):385–98. pmid:8106084
106. Mizukami Y, Ma H. Determination of Arabidopsis floral meristem identity by AGAMOUS. Plant Cell. 1997;9(3):393–408. pmid:9090883
107. Puranik S, Acajjaoui S, Conn S, Costa L, Conn V, Vial A, et al. Structural basis for the oligomerization of the MADS domain transcription factor SEPALLATA3 in Arabidopsis. Plant Cell. 2014;26(9):3603–15. pmid:25228343
108. Amoutzias GD, Robertson DL, Van de Peer Y, Oliver SG. Choose your partners: dimerization in eukaryotic transcription factors. Trends Biochem Sci. 2008;33(5):220–9. pmid:18406148
109. Martin JF, Miano JM, Hustad CM, Copeland NG, Jenkins NA, Olson EN. A Mef2 gene that generates a muscle-specific isoform via alternative mRNA splicing. Mol Cell Biol. 1994;14(3):1647–56. pmid:8114702
110. Molkentin JD, Li L, Olson EN. Phosphorylation of the MADS-Box transcription factor MEF2C enhances its DNA binding activity. J Biol Chem. 1996;271(29):17199–204. pmid:8663403
111. Wang X, She H, Mao Z. Phosphorylation of neuronal survival factor MEF2D by glycogen synthase kinase 3beta in neuronal apoptosis. J Biol Chem. 2009;284(47):32619–26. pmid:19801631
112. Darling AL, Uversky VN. Intrinsic disorder and posttranslational modifications: The darker side of the biological dark matter. Front Genet. 2018;4:9.
113. Jin F, Gräter F. How multisite phosphorylation impacts the conformations of intrinsically disordered proteins. PLoS Comput Biol. 2021;17:e1008939.
114. Nosella ML, Forman-Kay JD. Phosphorylation-dependent regulation of messenger RNA transcription, processing and translation within biomolecular condensates. Curr Opin Cell Biol. 2021;69:30–40. pmid:33450720
115. Shnitkind S, Martinez-Yamout MA, Dyson HJ, Wright PE. Structural Basis for Graded Inhibition of CREB:DNA Interactions by Multisite Phosphorylation. Biochemistry. 2018;57(51):6964–72. pmid:30507144
116. De Bodt S, Raes J, Van de Peer Y, Theissen G. And then there were many: MADS goes genomic. Trends Plant Sci. 2003;8(10):475–83. pmid:14557044
117. Veron AS, Kaufmann K, Bornberg-Bauer E. Evidence of interaction network evolution by whole-genome duplications: a case study in MADS-box proteins. Mol Biol Evol. 2007;24(3):670–8. pmid:17175526
118. Strader L, Weijers D, Wagner D. Plant transcription factors - being in the right place with the right company. Curr Opin Plant Biol. 2022;65:102136. pmid:34856504
119. Zahn LM, Leebens-Mack JH, Arrington JM, Hu Y, Landherr LL, DePamphilis CW. Conservation and divergence in the AGAMOUS subfamily of MADS-box genes: evidence of independent sub- and neofunctionalization events. Evol Dev. 2006;8:30–45.
120. Bemer M, Wolters-Arts M, Grossniklaus U, Angenent GC. The MADS domain protein DIANA acts together with AGAMOUS-LIKE80 to specify the central cell in Arabidopsis ovules. Plant Cell. 2008;20(8):2088–101. pmid:18713950
121. Colombo M, Masiero S, Vanzulli S, Lardelli P, Kater MM, Colombo L. AGL23, a type I MADS-box gene that controls female gametophyte and embryo development in Arabidopsis. Plant J. 2008;54(6):1037–48. pmid:18346189
122. Kang IH, Steffen JG, Portereiko MF, Lloyd A, Drews GN. The AGL62 MADS domain protein regulates cellularization during endosperm development in Arabidopsis. Plant Cell. 2008;20:635–47.
123. Köhler C, Hennig L, Spillane C, Pien S, Gruissem W, Grossniklaus U. The Polycomb-group protein MEDEA regulates seed development by controlling expression of the MADS-box gene PHERES1. Genes Dev. 2003;17:1540–53.
124. Zhang WJ, Zhou Y, Zhang Y, Su YH, Xu T. Protein phosphorylation: A molecular switch in plant signaling. Cell Rep. 2023;42(7):112729. pmid:37405922
125. Borcherds W, Bremer A, Borgia MB, Mittag T. How do intrinsically disordered protein regions encode a driving force for liquid-liquid phase separation? Curr Opin Struct Biol. 2021;67:41–50. pmid:33069007
126. Pei G, Lyons H, Li P, Sabari BR. Transcription regulation by biomolecular condensates. Nat Rev Mol Cell Biol. 2025;26(3):213–36. pmid:39516712
127. Sabari BR. Biomolecular Condensates and Gene Activation in Development and Disease. Dev Cell. 2020;55(1):84–96. pmid:33049213
128. Wei M-T, Chang Y-C, Shimobayashi SF, Shin Y, Strom AR, Brangwynne CP. Nucleated transcriptional condensates amplify gene expression. Nat Cell Biol. 2020;22(10):1187–96. pmid:32929202
129. Zarin T, Strome B, Nguyen Ba AN, Alberti S, Forman-Kay JD, Moses AM. Proteome-wide signatures of function in highly diverged intrinsically disordered regions. Elife. 2019;8:e46883. pmid:31264965
130. Singleton MD, Eisen MB. Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. PLoS Comput Biol. 2024;20(4):e1012028. pmid:38662765
131. Mann R, Notani D. Transcription factor condensates and signaling driven transcription. Nucleus. 2023;14:2205758.
132. Boija A, Klein IA, Sabari BR, Dall’Agnese A, Coffey EL, Zamudio AV, et al. Transcription Factors Activate Genes through the Phase-Separation Capacity of Their Activation Domains. Cell. 2018;175(7):1842–1855.e16. pmid:30449618
© 2025 Ramírez-Aguirre et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.