This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
Noncoding RNAs play an important role in the development of complex diseases, and their functions can be elucidated to help us understand the complex processes of disease and develop appropriate drugs [1]. MicroRNAs (miRNAs) are important types of noncoding RNAs that are key regulators of a variety of biological pathways, both in disease and normal states of the body [2]. They mostly play a negative regulatory role in promoting the degradation of mRNAs or inhibiting translation [3]. An increasing number of studies have reported the role of miRNA-mRNA regulatory networks in disease development [4, 5], suggesting that miRNAs may systematically perform gene regulation through a number of regulatory networks.
Unstable angina (UA) is one of the acute coronary syndromes in which the frequency and duration of attacks are unstable and may lead to myocardial infarction in severe cases [6]. Unstable angina is a complex cardiovascular disease associated with multiple causative factors [7]. Current studies suggest that the disease is caused by myocardial ischemia and hypoxia following the formation of a thrombus in the coronary arteries [8], but the exact etiology and pathogenesis remain to be further elucidated. Many studies have shown that some miRNAs and some of the genes they regulate may be diagnostic markers or therapeutic targets for unstable angina [9, 10]. Exploring the role played by miRNAs is an effective means of understanding the mechanisms of the disease and developing relevant drugs.
WGCNA (weighted gene coexpression network analysis) allows the analysis of experimentally measured genes or RNA expression data [11]. After calculating the correlation of genes or RNAs using expression data, a weighted correlation matrix and a topological overlap matrix can be constructed, and then, hierarchical clustering can be performed to form modules. We can focus on key modules and key nodes, as they can not only act as markers or therapeutic targets but also perform essential functions and influence many other biomolecules, which can be useful in understanding the complex mechanisms of disease [12]. However, the WGCNA method only considers correlations between genes or RNAs, which is not sufficient, as there are not only correlations but also similarities between molecules [13]. Similarity networks, which consist of similarity relationships between molecules, reveal which molecules have similar mechanisms of action [14] in contrast to the coexpression networks constructed by the WGCNA approach. In a similarity network, the existence of edges between nodes indicates that nodes have similar mechanisms of action, and a hub node with numerous neighbor nodes contains most of the mechanisms of action of its neighbors [15]. Therefore, hub nodes in similarity networks often perform multiple functions, so interfering with or disrupting these core nodes is likely to affect multiple functions. Similar to the key modules and key nodes in the WGCNA approach, the key nodes in these similarity networks are also likely to be markers and therapeutic targets and contribute to understanding the mechanisms of disease [16, 17]. Constructing similarity networks and finding key nodes provide an alternative way of analyzing complex networks [18].
A knowledge graph is a multirelational graph consisting of entities (nodes) and relationships (edges) and is composed of a series of triples (h, r, t); it is essentially a heterogeneous network graph [19], that is, a complex network graph containing different types of nodes and edges. After modelling this complex network, it is possible to predict the existence of nodes or edges in the network, to predict the labels of nodes and edges, and to perform recommendation tasks [20]. Many miRNAs regulate many genes and in turn affect many pathways. This complex process can be represented graphically by the miRNA-target gene-pathway heterogeneous network (MTP), which we refer to as the miRNA regulatory network [21]. On the one hand, the constructed MTP networks can be used to build miRNA similarity networks and find hub miRNAs by calculating the similarity of miRNA actions, i.e., network analysis. On the other hand, MTP networks can be modelled with knowledge graph or graph neural network models, i.e., network modelling, both to predict nodes and edges and, after obtaining embedding representations, to predict labels with classifiers such as fully connected neural networks (MLP) [22]. Network modelling is a deep learning task for modelling complex heterogeneous network graphs. Both the network analysis part and the network modelling part explore miRNA regulatory networks from a network perspective, and the combination of the two parts provides new ideas to analyze and predict the role of miRNAs in complex diseases in a holistic and systematic way.
At present, the functions of many miRNAs are still not well understood, and the regulatory roles that miRNAs play in complex diseases are not systematically elucidated [23]. To find better hub miRNAs, it is also worth investigating whether correlations and similarities in complex networks can be combined to obtain a better network analysis method than WGCNA [24]. We used miRNAs as a representative of noncoding RNAs and constructed a miRNA regulatory network, the MTP network, using unstable angina as an example. For this regulatory network of noncoding RNAs, we proposed a research method that contains a network analysis part based on multiple network analysis methods and a network modelling part based on knowledge graph algorithms. Hub miRNAs were obtained by an improved network analysis method, while the MTP network was modelled using a knowledge graph algorithm, and interactions in this regulatory network were predicted as a prediction task for edges, while the class of nodes was predicted as a prediction task for node labels [25]. Finally, the potential targets and functions of hub miRNAs were predicted. The overall study flowchart is shown in Figure 1, and the flowchart of the network analysis part is shown in Figure 2. The acronym table for this study is Table S1 in Supplementary Materials.
[figure(s) omitted; refer to PDF]
2. Materials and Methods
2.1. Data Preparation
2.1.1. miRNAs and Their Target Genes
Expression data for noncoding RNAs analyzed by an array were obtained from the GEO database (GSE94605), specifically miRNA expression data in plasma from healthy subjects and patients with unstable angina, with 7 and 6 sample pools in the control and case groups, respectively [26]. Differential expression analysis by using the GEO2R tool was used to obtain differentially expressed plasma miRNAs, and then, we set |logFC| > 2 and adjusted
The GeneCards database and the DisGeNET database were used to find genes for unstable angina, and the target genes of miRNAs were intersected with disease genes. The intersecting genes were used as target genes regulated by miRNAs that were differentially expressed in the disease condition [29].
2.1.2. Tissue Localization of Genes and Protein-Protein Interactions
Different genes are expressed differently in different tissues, and gene expression is tissue specific [30]. We used the expression of genes in different tissues to indicate the tissue localization of genes. The TISSUES database was used to find the expression of miRNA target genes in different tissues, with gene expression data for up to 21 tissues [31].
Protein-protein interaction data were retrieved from the STRING database with default parameter settings [32].
2.1.3. Functional Enrichment and Pathway Categorization
KEGG enrichment analysis of target genes was performed using the clusterProfiler package in R software [33], with
2.1.4. Datasets for Validation
The dataset for miRNA expression in plasma used to construct the complex network is referred to as the original dataset. To make the results more convincing, two additional independent external datasets were used to validate miRNAs and target genes, respectively, which are referred to as new datasets.
In the new GSE49823 dataset [34], plasma miRNA expression data were recorded for the unstable angina patient group and the control group, with 13 samples for the disease group and 13 samples for the control group, making a total of 26 miRNA expression samples. This miRNA expression dataset was used to test the performance of the network analysis algorithm and the reliability of hub miRNA.
The new GSE60993 dataset [35] contains gene expression data from the peripheral blood of patients with unstable angina and normal controls, with 9 samples from the disease group and 7 samples from the control group. This gene expression dataset was used to test the performance of the network modelling algorithm and the reliability of hub miRNA target genes.
2.2. Network Analysis
The WGCNA method performs network analysis based on the expression correlation between genes or RNAs; however, using only correlation networks to find hub miRNAs does not fully exploit the information in complex networks. Because similarity differs from correlation, we proposed a method, called the SimCluster analysis method, that calculates the similarity of miRNAs from the MTP heterogeneous network, constructs similarity networks, and finds hub miRNAs. The flowchart of the network analysis part is shown in Figure 2.
2.2.1. WGCNA
We used the WGCNA method to analyze the expression data of differentially expressed miRNAs and find hub miRNAs for unstable angina for subsequent studies [12]. The WGCNA method can transform the coexpression network into a scale-free network by setting the β parameter, with fewer nodes of a high degree and more nodes of a low degree [36].
Network analysis of miRNA expression data was performed using the ImageGP website [37] based on the WGCNA method, with parameters set to the signed network and Pearson correlation and R-squared set to 0.9. After calculation, the β parameter value was finally chosen as 18.
The obtained weighted miRNA coexpression network was hierarchically clustered to classify modules, and the top 10 most important miRNAs in each module were taken as hub miRNAs.
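To illustrate the soft-thresholding step, the following is a minimal sketch of how a signed weighted adjacency is commonly derived from Pearson correlations in WGCNA, using a_ij = ((1 + cor_ij) / 2)^β with β = 18. The function names and the toy expression profiles are hypothetical and are not taken from the ImageGP pipeline used in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def signed_adjacency(expr, beta=18):
    """Signed WGCNA-style adjacency: ((1 + cor) / 2) ** beta for every miRNA pair."""
    names = sorted(expr)
    adj = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            adj[(a, b)] = ((1 + pearson(expr[a], expr[b])) / 2) ** beta
    return adj

def connectivity(adj):
    """Sum of adjacency weights per node, used here to rank candidate hub nodes."""
    total = {}
    for (a, b), w in adj.items():
        total[a] = total.get(a, 0.0) + w
        total[b] = total.get(b, 0.0) + w
    return total

# Hypothetical toy profiles (illustrative only, not data from GSE94605).
toy_expr = {"m1": [1, 2, 3, 4], "m2": [2, 4, 6, 8], "m3": [4, 3, 2, 1]}
toy_adj = signed_adjacency(toy_expr)
```

A large β pushes moderate correlations towards zero, so only strongly coexpressed pairs keep appreciable weight; this is what moves the network towards a scale-free degree distribution.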
2.2.2. SimCluster
The SimCluster algorithm consists of two main parts, similarity network construction and hub miRNA screening, and the flow of the algorithm consists of the following 3 steps:
(1) We used the MTP network described in Section 3.1 to calculate the first-order similarity and second-order similarity of miRNAs, where first-order similarity refers to the similarity of miRNAs at the target gene network level (M-T), while second-order similarity refers to the similarity of miRNAs at the enriched pathway level (M-
(2) We used similarity values as thresholds: by setting different lower limits, we obtained different similarity matrices, which were then converted into the corresponding similarity networks. Drawing on the idea of the WGCNA method to construct a scale-free network [39], we calculated the distribution of degree values and degree frequencies of all miRNAs in the similarity network to determine whether the network was scale-free. Specifically, we used Pearson’s correlation coefficient and the R² of linear regression to assess whether the logarithm of degree values (lg (K)) and the logarithm of degree frequencies (lg (pK)) were highly and linearly correlated.
(3) For scale-free similarity networks, we used several network clustering algorithms to divide modules and selected the optimal network clustering algorithm using modularity values [40]. For each divided module, miRNA with the largest degree value was selected as hub miRNA.
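The first two steps above can be sketched as follows, assuming that miRNA similarity is the Jaccard index of target gene sets (as listed in Table 1) and that the scale-free check is a log-log fit of degree against degree frequency. The function names, the miRNA and gene identifiers, and the use of the absolute correlation are illustrative assumptions; module detection and hub selection in step (3) are omitted.

```python
import math
from collections import Counter
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_edges(targets, threshold):
    """Keep miRNA pairs whose first-order (target-level) similarity exceeds the threshold."""
    return [(m1, m2) for m1, m2 in combinations(sorted(targets), 2)
            if jaccard(targets[m1], targets[m2]) > threshold]

def degrees(edges):
    """Degree of every node that has at least one retained edge."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def scale_free_fit(deg):
    """Return (Pearson r, R^2) between lg(K) and lg(p(K)).

    For simple least-squares regression with an intercept, R^2 equals r ** 2.
    """
    freq = Counter(deg.values())
    if len(freq) < 3:
        raise ValueError("need at least three distinct degree values")
    n = sum(freq.values())
    xs = [math.log10(k) for k in freq]
    ys = [math.log10(c / n) for c in freq.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    return r, r * r

# Hypothetical identifiers, for illustration only.
toy_targets = {"miR-a": {"G1", "G2", "G3"}, "miR-b": {"G2", "G3", "G4"}, "miR-c": {"G9"}}
toy_edges = similarity_edges(toy_targets, threshold=0.25)
```

In a scale-free network the frequency falls as the degree grows, so r is negative; a criterion such as "both values greater than 0.9" is applied here to |r| and R² as an assumption.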
2.2.3. WGCNA & SimCluster
The WGCNA method uses correlation between miRNAs, while the SimCluster method uses similarity between miRNAs. The WGCNA & SimCluster method combines these two methods, containing both correlation and similarity information. Specifically, the results of the WGCNA method and the SimCluster method were intersected to obtain final hub miRNAs.
We compared the performance of hub miRNAs obtained by the WGCNA method, the SimCluster method, and the WGCNA & SimCluster method on original and new datasets, respectively, to evaluate their potential as biomarkers, using the AUC (area under the ROC curve) and AUPR (area under the PR curve) as metrics. A comparison of the above three network analysis methods is shown in Table 1, including data, methods, and some results.
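As a minimal sketch of the two metrics, the AUC of a single miRNA can be computed as the probability that a disease sample scores higher than a control sample, and the AUPR can be approximated by average precision. Both functions are illustrative; the estimator actually used for the PR curve in this study is not specified here, and ties between scores are not handled specially in the average precision.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision: mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical example: 1 = disease, 0 = control; scores = expression of one miRNA.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

Averaging these per-miRNA values over all hub miRNAs of a method gives the mean AUC and mean AUPR used for comparison.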
Table 1
Comparison of the three network analysis methods.
 | | WGCNA | SimCluster | WGCNA & SimCluster
Data | Data sources | miRNA expression data | MTP network | Data used in the WGCNA method and SimCluster method
Data | Constructed matrices | Correlation matrix and the topological overlap matrix | First-order similarity matrix or the second-order similarity matrix |
Methods | Core principles | Pearson’s correlation of miRNA expression data | Jaccard similarity of miRNAs at the target gene level or pathway level | Intersection of the results of the WGCNA method and the SimCluster method
Methods | Parameters of scale-free networks | Soft threshold; R² greater than 0.9 | Similarity threshold; both R² and the correlation coefficient should be greater than 0.9 |
Methods | Module generation | Hierarchical clustering | Network clustering |
Results | Number of hub miRNAs | 40 | 63 | 11
Results | Mean AUC values for the new dataset | 0.5191 | 0.5319 | 0.6292
Results | Mean AUPR values for the new dataset | 0.6019 | 0.6069 | 0.6773
Note. In this table, the results of the SimCluster method and the WGCNA & SimCluster method are calculated based on first-order similarity.
2.3. Network Modelling
We used the data from the data preparation section to construct a miRNA-target gene-pathway heterogeneous network (MTP) containing three types of nodes and edges: M, T, and
2.3.1. Models for Knowledge Graphs
A knowledge graph is a set of triples (h, r, t) [41], each consisting of a head entity h, a tail entity t, and the relationship r between them. Knowledge graph models are advantageous in dealing with complex heterogeneous graphs consisting of different types of nodes and edges [42], and a number of models have been successively published.
We have modelled the constructed MTP network using a series of knowledge graph models or graph neural network models that have been published in recent years. After much experimentation, we selected the RotatE model [43] for the link prediction task and the RGCN model [44] for the multilabel classification task.
(1) The RotatE model maps entities and relationships to a complex vector space and defines each relationship as a rotation from the head entity to the tail entity. It allows the modelling and inference of relationship patterns such as symmetry, antisymmetry, inversion, and composition, which are difficult to accomplish with other models [43]. The core formulations of the RotatE model are shown as follows:
In the above four formulas,
(2) The RGCN model (relational graph convolutional network) is a graph neural network model for heterogeneous graphs. By following the idea of message passing network calculation [45], the formulas are shown as follows:
(3) In addition to the RotatE model and the RGCN model, other advanced models were selected for comparison experiments to find the best model in order to perform the subsequent prediction task. RotatE is essentially a distance transformation model, RGCN is a classical heterogeneous graph neural network model, and we also selected TransE [46], which is also a distance transformation model, the Gaussian embedding model KG2E [47], the semantic matching model DistMult [48], and CompGCN [49], a model that combines knowledge graph algorithms with graph neural network algorithms.
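The rotation idea in item (1) can be illustrated with a small sketch: each relationship is a vector of phase angles, the head embedding is rotated element-wise in the complex plane, and the distance to the tail embedding serves as the score (smaller means more plausible). The function name, the toy embeddings, and the choice of summing the complex moduli over dimensions are illustrative assumptions and do not reproduce the training objective of the model.

```python
import cmath
import math

def rotate_distance(head, phases, tail):
    """Distance between the rotated head and the tail.

    head, tail: lists of complex numbers (entity embeddings).
    phases:     list of angles in radians; the relation is exp(i * phase), so |r_i| = 1.
    """
    rotated = [h * cmath.exp(1j * theta) for h, theta in zip(head, phases)]
    return sum(abs(r - t) for r, t in zip(rotated, tail))

# Hypothetical 2-dimensional embeddings.
toy_head = [1 + 0j, 0 + 1j]
toy_relation = [math.pi / 2, math.pi]   # rotate by 90 and 180 degrees
matching_tail = [0 + 1j, 0 - 1j]        # exactly the rotated head
other_tail = [1 + 0j, 0 + 1j]

d_match = rotate_distance(toy_head, toy_relation, matching_tail)
d_other = rotate_distance(toy_head, toy_relation, other_tail)
```

A relation whose phases are all 0 or π is its own inverse, which is how a rotation can represent a symmetric relationship.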
2.3.2. Link Prediction Model
Based on the constructed MTP network, we used knowledge graph or graph neural network models to perform link prediction, i.e., to predict whether an edge exists between two nodes or, equivalently, whether a triple is a fact [46]. For each triple, prediction is carried out in both directions: the head entity and relationship are fixed to predict the tail entity, and the tail entity and relationship are fixed to predict the head entity. We refer to the model that performs the link prediction task as the link prediction model.
The constructed MTP network for unstable angina as a dataset contains 573 nodes and 12629 edges. We used the above model to complete ten-fold cross-validation, training on the training set while validating on the testing set [50]. The performance of the model is evaluated using three metrics, Hits@k, MR, and MRR [46]:
In Equation (6), the average of the number of correct predictions among the top
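For reference, the three ranking metrics can be computed from the rank that the model assigns to each true entity, using their standard definitions: MR is the mean rank, MRR is the mean reciprocal rank, and Hits@k is the fraction of test triples whose true entity is ranked within the top k. The sketch below is illustrative; the function name and the example ranks are hypothetical.

```python
def ranking_metrics(ranks, ks=(5, 10, 20, 50)):
    """Return (MR, MRR, {k: Hits@k}) for a list of 1-based ranks of the true entities."""
    n = len(ranks)
    mr = sum(ranks) / n                      # lower is better
    mrr = sum(1.0 / r for r in ranks) / n    # higher is better, in (0, 1]
    hits = {k: sum(1 for r in ranks if r <= k) / n for k in ks}
    return mr, mrr, hits

# Hypothetical ranks of the correct tail (or head) entity for five test triples.
toy_mr, toy_mrr, toy_hits = ranking_metrics([1, 2, 4, 10, 30])
```

Because MR is sensitive to a few badly ranked triples while MRR emphasizes the top of the ranking, the two are usually reported together.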
2.3.3. Multilabel Classification Model
We used the six models described above to obtain the embeddings of nodes and then used a two-layer fully connected neural network (MLP) to predict multiple labels for target genes or pathways [25–51], with the sigmoid function as the final activation function. We refer to the model that performs the multilabel classification task as the multilabel classification model. For the multilabel classification task, we again used the MTP network as the dataset for ten-fold cross-validation and assessed model performance using accuracy as the metric. Accuracy was calculated by Equations (8)–(10).
There are 21 labels for targets (T) and 5 labels for pathways (
2.3.4. Comparison Experiments
The aim of the comparison experiments is to find the best link prediction model and the best multilabel classification model [52]. We used RotatE, TransE, KG2E, DistMult, RGCN, and CompGCN models to perform M-T, T-T, and T-
2.3.5. Parameter Optimization Experiments
The aim of the parameter optimization experiments is to find the optimal parameters for each model [53]. Three parameters were selected for the parameter optimization experiments, namely, the embedding dimension, epoch, and learning rate. The embedding dimension was set to 32, 64, 96, and 128, the epoch was set to 25, 50, 100, and 200, and the learning rate was set to 0.0001, 0.001, 0.01, and 0.1, respectively.
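The search space above can be written down as a small configuration sketch. It assumes that the parameters are varied one at a time around a baseline configuration rather than over the full 4 × 4 × 4 grid; both this sweep strategy and the baseline values are assumptions made for illustration, and the names are hypothetical.

```python
# Candidate values taken from the text.
PARAM_GRID = {
    "embedding_dim": [32, 64, 96, 128],
    "epochs": [25, 50, 100, 200],
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
}

# Assumed baseline; any point of the grid could serve as the starting configuration.
BASELINE = {"embedding_dim": 64, "epochs": 50, "learning_rate": 0.001}

def one_at_a_time(grid, baseline):
    """List the configurations obtained by changing a single parameter at a time."""
    configs = []
    for name, values in grid.items():
        for value in values:
            config = dict(baseline)
            config[name] = value
            if config not in configs:   # the baseline itself appears only once
                configs.append(config)
    return configs

sweep = one_at_a_time(PARAM_GRID, BASELINE)
```

With three parameters of four values each, a one-at-a-time sweep needs 10 training runs instead of the 64 required by an exhaustive grid.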
2.4. Case Studies
Elucidating the function of miRNAs can help us understand a disease more accurately. There is still a lack of systematic studies on miRNAs in unstable angina [4]. Therefore, for the hub miRNAs obtained by the best network analysis method, we conducted a case study using the best link prediction model. Specifically, the potential target genes of these hub miRNAs were predicted, i.e., the M-T link prediction task. The prediction results were then checked with a three-step validation method.
2.5. Three-Step Validation Method
The importance and reliability of hub miRNAs have been validated in the network analysis section. In this section, we validate the results of the network modelling section using a designed three-step validation method.
2.5.1. Validation of Hub miRNA-Target Gene Interactions
First, we assessed the reliability of the predicted hub miRNA-target gene interactions by searching the literature or other databases [54]. The assessment was performed using the TopK metric, meaning the proportion of predictions that was correct in the top K rankings [55]. The TopK results of all hub miRNAs were then averaged.
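The TopK metric described above can be sketched as follows, assuming that each hub miRNA has a ranked list of predicted target genes and a set of targets confirmed in the literature or in other databases. The function names and all identifiers are hypothetical.

```python
def top_k_precision(ranked_predictions, validated, k):
    """Proportion of the top-k predicted targets that are confirmed."""
    top = ranked_predictions[:k]
    return sum(1 for gene in top if gene in validated) / k

def mean_top_k(predictions, validated, k):
    """Average the TopK value over all hub miRNAs."""
    values = [top_k_precision(predictions[m], validated[m], k) for m in predictions]
    return sum(values) / len(values)

# Hypothetical ranked predictions and confirmed targets.
toy_predictions = {
    "hub-1": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "hub-2": ["GENE_C", "GENE_E"],
}
toy_validated = {"hub-1": {"GENE_A", "GENE_C"}, "hub-2": {"GENE_C", "GENE_E"}}
toy_top2 = mean_top_k(toy_predictions, toy_validated, k=2)
```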
2.5.2. Validation of the Potential of Target Genes as Biomarkers
In the second step, we validated the predicted new and existing target genes for hub miRNAs, which comprised two validation methods.
(1) The target genes of hub miRNAs were validated for a new dataset (GSE60993) containing gene expression from unstable angina and healthy controls, using AUC and AUPR as evaluation metrics.
(2) Transcription factors (TFs) are important proteins that regulate gene expression [56], and their dysregulation will cause abnormal gene expression, which is closely associated with the development and progression of complex diseases [57]. The TF-Marker database [58] provides cell- and tissue-specific TFs and related markers. We searched the TF-Marker database to verify whether the target genes of hub miRNAs are TFs or related markers in tissues such as the heart, blood vessels, and arteries, which are closely associated with the development of unstable angina.
2.5.3. Validation of the Function of Target Genes
Finally, KEGG functional enrichment analysis was performed on these target genes [59]. The reliability of the predictions was further assessed by checking whether the enriched pathways were classical and critical pathways in unstable angina.
Through these three steps, the reliability of miRNA target genes, the reliability of target genes as biomarkers, and the reliability of the functions performed by the target genes were successively validated.
3. Results
3.1. MTP Network
After differential expression analysis and screening, 386 differentially expressed miRNAs were obtained for unstable angina. Intersecting the target genes of these miRNAs with the genes of unstable angina gave 232 intersecting genes, corresponding to 238 miRNAs, with a total of 2706 miRNA-target gene interactions.
For these intersecting genes, after setting the species to Homo sapiens and the minimum interaction score to 0.4, a total of 8696 protein-protein interactions were found. Next, KEGG functional enrichment was performed, and a total of 103 pathways were screened, resulting in a total of 1361 gene-pathway interactions. The MTP network is summarized in Table 2, and the detailed data are available in Table S2 in Supplementary Materials. M, T, and
Table 2
Nodes and edges of the MTP network.
Nodes or edges | M | T | P | M-T | T-T | T-P
Number | 238 | 232 | 103 | 2706 | 8562 | 1361
3.2. Network Analysis of Several Methods
3.2.1. Modules and Hub miRNAs Based on the WGCNA Method
Differentially expressed miRNAs may play an important role in disease states [60], and either overexpressed or underexpressed miRNAs were included in our study. We used the expression data from these miRNAs to construct a weighted miRNA coexpression network, divided modules using the dynamic tree cutting method, and merged the modules to finally obtain four modules. The results are shown in Figure 3(a).
[figure(s) omitted; refer to PDF]
A total of four colored modules, yellow, turquoise, brown, and blue, were generated, from which hub miRNAs were searched, respectively. The correlation between miRNAs was used as the weight of edges, and the importance of the nodes in each module was ranked according to the connectivity [61]. The top 10 nodes in each module are shown in Figure 3(b), and a total of 40 hub miRNAs were filtered out. The thickness of edges is proportional to the correlation value.
3.2.2. Similarity Network and the Scale-Free Network Based on the SimCluster Method
As can be seen from Table 3, the results of first-order similarity are better than those of second-order similarity. Figure 4(a) shows how Pearson’s correlation coefficient (Cor) and the R² of linear regression (ordinary least squares) vary with the first-order similarity threshold. The similarity value at which both Cor and R² exceed 0.9 was used as the threshold (0.25); links with similarity above this threshold were retained, and links below it were excluded.
Table 3
Comparison of the performance of hub miRNAs obtained by several network analysis methods.
Dataset | Metric | WGCNA | FC_hub | SimCluster_1 | WGCNA & SimCluster_1 | SimCluster_2 | WGCNA & SimCluster_2
Original dataset | Mean_AUC | 0.5071 | 0.9083 | 0.5771 | 0.6190 | 0.5884 | 0.9905
Original dataset | Mean_AUPR | 0.6689 | 0.9118 | 0.6620 | 0.7112 | 0.6679 | 0.9911
New dataset | Mean_AUC | 0.5191 | 0.5197 | 0.5319 | 0.6292 | 0.5166 | 0.5296
New dataset | Mean_AUPR | 0.6019 | 0.5862 | 0.6069 | 0.6773 | 0.5960 | 0.6026
Note. FC_hub refers to the top 20 miRNAs ranked by |logFC|. SimCluster_1 and SimCluster_2 refer to the results obtained based on first-order similarity and second-order similarity, respectively. “&” refers to the results of the intersection of the two methods. Bold values indicate the best results on the new dataset.
[figure(s) omitted; refer to PDF]
Figures 4(b) and 4(c) show the evaluation of whether a similarity network greater than the threshold is a scale-free network. It can be seen that lg (K) is highly correlated with lg (pK) and that the distribution of the two is linear. There are more nodes with a small degree K and fewer nodes with a large degree K. Fewer nodes connect most of the nodes, which is a characteristic of scale-free networks. The scale-free first-order similarity network is shown in Figure 5.
[figure(s) omitted; refer to PDF]
The linear regression equations for the SimCluster_1 method (SimCluster based on first-order similarity) and the SimCluster_2 method (SimCluster based on second-order similarity) can be found in Table S3 in Supplementary Materials.
3.2.3. Hub miRNAs Obtained Based on the SimCluster Method
Modules can be obtained after network clustering of the scale-free similarity network, and thus, hub miRNAs affecting each module can be obtained. We have demonstrated in a previous study that the fast greedy algorithm is a good network clustering algorithm [29], and in this study, we have also demonstrated that the fast greedy algorithm is the best among the four network clustering algorithms by calculating modularity values. The results of the comparison of network clustering algorithms are presented in Supplementary Materials.
We used the fast greedy algorithm to perform network clustering of the scale-free similarity network and took the nodes with the largest degree value in each module as hub miRNAs. In some modules, multiple nodes shared the largest degree value, and all of them were included in subsequent studies.
Final hub miRNAs were obtained by taking the intersection of the results of the SimCluster method and the WGCNA method (see the nodes marked with red borders in Figure 3(b) or the red nodes in Figure 5). Hub miRNAs obtained by the six network analysis methods, including the SimCluster method, are listed in Supplementary Materials.
3.2.4. Performance Comparison of Network Analysis Methods
We compared the performance of these network analysis methods using the mean AUC and mean AUPR values of the hub miRNAs obtained from each method when distinguishing between the disease and control groups. Table 3 shows the performance of hub miRNAs from the six methods. SimCluster_1 has better mean AUC values than the WGCNA method for both datasets, and the WGCNA & SimCluster_1 method clearly improves on the WGCNA method in terms of both mean AUC and mean AUPR values. Importantly, these results indicate that the network analysis approach combining correlation and similarity (WGCNA & SimCluster) gives better results than the approach using correlation (WGCNA) or similarity (SimCluster) alone, which offers a new way of finding hub nodes in a network. In addition, the WGCNA & SimCluster_2 method and the FC_hub method performed well for the original dataset but only moderately for the new dataset, suggesting that neither the second-order similarity results nor the FoldChange results generalize well.
Hub miRNAs obtained by the WGCNA & SimCluster_1 method achieved the best results for the new dataset. We next used hub miRNAs obtained based on the WGCNA & SimCluster_1 method for our subsequent study. Figure 6 shows two of these hub miRNAs, hsa-miR-30a-5p and hsa-miR-502-3p. Both of these hub miRNAs had an AUC and AUPR above 0.97 for the original dataset, and both had an AUC above 0.67 and an AUPR above 0.71 for the new dataset. This shows the potential of these hub miRNAs as biomarkers, and it would be meaningful to conduct in-depth studies.
[figure(s) omitted; refer to PDF]
3.3. Link Prediction
The MTP network constructed is a small complex heterogeneous network, as shown in Table 2, containing three types of nodes and edges. A knowledge graph model or a graph neural network model can complete node-level, edge-level, or even graph-level modelling of complex heterogeneous networks, using existing data to predict unknown data [42].
We first performed link prediction for M-T, T-T, and T-
3.3.1. Comparison Experiments
Comparison experiments were conducted using six models, RotatE, TransE, KG2E, DistMult, RGCN, and CompGCN, to find the best link prediction model. Figures 7(a)–7(c) show the link prediction results for M-T, T-T, and T-
[figure(s) omitted; refer to PDF]
Table 4
Link prediction results for M-T, T-T, and T-
Edge type | Metric | RotatE | TransE | KG2E | DistMult | RGCN | CompGCN
M-T | MR | 7.9155 ± 0.7216 | 12.7434 ± 1.581 | 16.6473 ± 1.9285 | 29.1253 ± 31.6744 | 9.9069 ± 1.3969 | 12.7025 ± 1.3974
M-T | MRR | 0.1274 ± 0.0134 | 0.0795 ± 0.0090 | 0.0608 ± 0.0071 | 0.0638 ± 0.0343 | 0.1031 ± 0.0171 | 0.0796 ± 0.0087
T-T | MR | 3.3667 ± 0.1477 | 8.9800 ± 0.2236 | 4.2304 ± 0.2909 | 16.3310 ± 24.3477 | 3.3050 ± 0.1418 | 2.9691 ± 0.1183
T-T | MRR | 0.2976 ± 0.0135 | 0.1402 ± 0.0099 | 0.2373 ± 0.0154 | 0.1675 ± 0.0847 | 0.3031 ± 0.0132 | 0.3373 ± 0.0134
T-P | MR | 4.4050 ± 0.4242 | 7.8392 ± 1.0865 | 12.4661 ± 2.2471 | 24.3245 ± 31.8611 | 4.4954 ± 0.3793 | 5.2599 ± 0.6200
T-P | MRR | 0.2288 ± 0.0210 | 0.1297 ± 0.0174 | 0.0826 ± 0.0151 | 0.0941 ± 0.0501 | 0.2239 ± 0.0189 | 0.1926 ± 0.0234
Bold values indicate the best results on that task.
It can be seen that for M-T link prediction, the RotatE model achieved the best results for all six metrics. For T-
In addition, we can find that all these knowledge graph models or graph neural network models performed better for T-T link prediction and T-
We finally selected the RotatE model as the link prediction model to perform subsequent link prediction tasks. The results of this section are also presented in Table S4 in Supplementary Materials.
3.3.2. Parameter Optimization Experiments
We demonstrated through comparison experiments that the RotatE model has good performance on all three types of edge prediction and is therefore a good link prediction model. Next, we chose the RotatE model for parameter optimization experiments to find the most suitable parameters. The three parameters optimized were the embedding dimension, learning rate, and epoch. In this study, we focused more on the function played by miRNAs, so we only performed parameter optimization experiments for M-T link prediction.
Figures 8(a)–8(c) show the changes in the results of Hits@k metrics after the parameters, the epoch, embedding dimension, or learning rate, were changed, respectively. When the parameters were changed, the RotatE model had the best results for Hits@5 and Hits@10 at an epoch of 50, for Hits@5, Hits@10, Hits@20, and Hits@50 at a learning rate of 0.001, and for Hits@5 at an embedding dimension of 64.
[figure(s) omitted; refer to PDF]
Table 5 shows the results calculated by MR and MRR. In the range of parameter variations, the RotatE model gave the best results for the metrics at an epoch of 50 and a learning rate of 0.001. The results were very close for both MR and MRR at 64 and 128 embedding dimensions, and again, since the best results were obtained for Hits@5 at an embedding dimension of 64, we still set the embedding dimension to 64.
Table 5
Results of parameter optimization experiments calculated by MR and MRR.
Epoch | 25 | 50 | 100 | 200 |
MR | 10.3123 ± 1.9440 | 7.9155 ± 0.7216 | 8.3054 ± 0.8194 | 8.155 ± 0.7939 |
MRR | 0.1000 ± 0.0177 | 0.1274 ± 0.0134 | 0.1215 ± 0.0123 | 0.1236 ± 0.0115 |
Embedding | 32 | 64 | 96 | 128 |
MR | 9.0563 ± 0.7624 | 7.9155 ± 0.7216 | 8.1209 ± 0.8165 | 7.9137 ± 0.8641 |
MRR | 0.1111 ± 0.0091 | 0.1274 ± 0.0134 | 0.1243 ± 0.0131 | 0.1276 ± 0.0131 |
Learning rate | 0.0001 | 0.001 | 0.01 | 0.1 |
MR | 44.1148 ± 7.67 | 7.9155 ± 0.7216 | 9.0002 ± 0.8692 | 10.4498 ± 1.0269 |
MRR | 0.0234 ± 0.0046 | 0.1274 ± 0.0134 | 0.1120 ± 0.0107 | 0.0966 ± 0.0101 |
The bold values indicate the best parameters and the resulting best results.
Therefore, after optimizing the three parameters, we chose 50 epochs, an embedding dimension of 64, and a learning rate of 0.001 as the final settings for the M-T link prediction task.
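For readers less familiar with the ranking metrics used in this subsection, the following minimal sketch shows how MR, MRR, and Hits@k are conventionally computed from the rank of the true entity in each test triple. The function name `ranking_metrics` and the example ranks are hypothetical and are not taken from this study.

```python
from statistics import mean


def ranking_metrics(ranks, ks=(5, 10, 20, 50)):
    """Compute MR, MRR and Hits@k from 1-based ranks of the true entities."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    metrics = {
        "MR": mean(ranks),                    # mean rank: lower is better
        "MRR": mean(1.0 / r for r in ranks),  # mean reciprocal rank: higher is better
    }
    for k in ks:
        # Hits@k: share of test triples whose true entity is ranked in the top k
        metrics[f"Hits@{k}"] = sum(r <= k for r in ranks) / len(ranks)
    return metrics


# Hypothetical ranks of the true target gene for five test M-T triples.
example_ranks = [1, 4, 8, 25, 60]
example_metrics = ranking_metrics(example_ranks)
```

A lower MR and a higher MRR or Hits@k all indicate that true entities are ranked nearer the top, which is why the three metrics are read together when comparing parameter settings.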
3.4. Multilabel Classification
The multilabel classification task assigns targets and pathways to multiple categories [25]. For targets, the labels are tissue localizations. In the data obtained, targets are distributed across as many as 21 tissues, the top five being the blood, liver, nervous system, lungs, and heart. Figure 9(a) shows the distribution of targets in different tissues, with the thickness of the lines proportional to the amount of gene expression.
[figure(s) omitted; refer to PDF]
The multilabel classification of pathways refers to the division of pathways into different broad categories. The KEGG database classifies all human pathways into 7 broad categories, and the pathways enriched by the target genes of these miRNAs can be classified into 5 of these 7 broad categories. Figure 9(b) shows the classification of the enriched pathways into the five categories of organismal systems, cellular processes, human diseases, environmental information processing, and metabolism.
We first binarized the labels: if a target or pathway belonged to a category, its label for that category was set to 1; otherwise, it was set to 0. We then used a knowledge graph or graph neural network model to obtain the embedding representation of each node and a 2-layer MLP to predict the labels of each node. The means and standard deviations of the results after ten-fold cross-validation are shown in Table 6.
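The label binarization described above can be sketched as follows. The gene names, the three-tissue label set, and the choice of per-label accuracy (the fraction of node-label entries predicted correctly) are assumptions made for illustration; the accuracy variant actually used in the experiments is not specified here.

```python
def to_multi_hot(memberships, categories):
    """Encode each node's set of categories as a 0/1 vector ordered like `categories`."""
    return {
        node: [1 if c in cats else 0 for c in categories]
        for node, cats in memberships.items()
    }


def per_label_accuracy(y_true, y_pred):
    """Fraction of (node, label) entries where the prediction equals the truth."""
    total = 0
    correct = 0
    for node, truth in y_true.items():
        for t, p in zip(truth, y_pred[node]):
            total += 1
            correct += int(t == p)
    return correct / total


# Hypothetical tissue annotations for three targets (names are illustrative only).
tissues = ["blood", "liver", "heart"]
memberships = {"GENE_A": {"blood", "heart"}, "GENE_B": {"liver"}, "GENE_C": set()}
labels = to_multi_hot(memberships, tissues)

# Hypothetical classifier output: one wrong entry (GENE_A, heart).
predictions = {"GENE_A": [1, 0, 0], "GENE_B": [0, 1, 0], "GENE_C": [0, 0, 0]}
accuracy = per_label_accuracy(labels, predictions)
```

In this toy case eight of the nine node-label entries match, so the per-label accuracy is 8/9.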
Table 6
Accuracy of multilabel classification of targets and pathways.
Metric | Task | RotatE | TransE | KG2E | DistMult | RGCN | CompGCN |
Accuracy | T | 0.7902 ± 0.0115 | 0.7766 ± 0.0188 | 0.7704 ± 0.0235 | 0.8200 ± 0.0201 | 0.8312 ± 0.0220 | 0.8205 ± 0.0248 |
Accuracy | P | 0.8245 ± 0.0400 | 0.8047 ± 0.0630 | 0.8127 ± 0.0459 | 0.8167 ± 0.0634 | 0.8356 ± 0.0404 | 0.8307 ± 0.0336 |
Bold values indicate the best results on that task.
It can be seen that the RGCN model achieved the best performance in the multi-label classification prediction task for both T and
3.5. Case Studies
The comparison experiments showed that the RotatE model is the best link prediction model, and the parameter optimization experiments identified its optimal parameters. In this section, we present a case study of the hub miRNAs obtained by the WGCNA & SimCluster_1 method.
The role of miRNAs in regulating gene expression merits in-depth investigation. Unstable angina is a relatively complex disease, and the functions of miRNAs in this disease are still not fully elucidated. Studying the function of hub miRNAs is a convenient way to understand the function of all miRNAs [62]. We used the RotatE model to predict potential target genes of hub miRNAs, that is, to perform link prediction for unknown M-T links. This task is also referred to as knowledge graph completion.
Eleven hub miRNAs were obtained by the WGCNA & SimCluster_1 method. We performed M-T link prediction for these 11 hub miRNAs, and the detailed prediction results are shown in Table S5 in Supplementary Materials. For each hub miRNA, the 10 target genes with the highest predicted scores were taken, giving 110 hub miRNA-target gene interactions in total. After de-duplication, 38 distinct predicted target genes remained.
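The selection step described above (top-scoring targets per hub miRNA, followed by de-duplication of genes) can be sketched as below. The miRNA names, gene names, and scores are hypothetical placeholders, and `top_k_targets` is an illustrative helper rather than code from this study.

```python
def top_k_targets(scores, k=10):
    """Return, for each miRNA, the k candidate genes with the highest predicted score."""
    return {
        mirna: [
            gene
            for gene, _ in sorted(
                gene_scores.items(), key=lambda item: item[1], reverse=True
            )[:k]
        ]
        for mirna, gene_scores in scores.items()
    }


# Hypothetical predicted scores for two hub miRNAs (placeholders, not real data).
scores = {
    "miR-x": {"G1": 0.9, "G2": 0.7, "G3": 0.1},
    "miR-y": {"G2": 0.8, "G4": 0.6, "G1": 0.2},
}
top = top_k_targets(scores, k=2)

# One interaction per (miRNA, gene) pair; genes shared by several miRNAs
# are counted once after de-duplication.
interactions = [(mirna, gene) for mirna, genes in top.items() for gene in genes]
unique_genes = sorted({gene for _, gene in interactions})
```

With 11 hub miRNAs and k = 10, the same procedure yields 110 interactions, and the number of distinct genes is smaller whenever several hub miRNAs share predicted targets.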
3.6. Validation of the Results
In the network analysis section, we used two datasets to validate the best network analysis method and the hub miRNAs it produced. In the network modelling section, the predicted target genes of hub miRNAs also need to be validated to demonstrate the reliability of the model predictions [63]. Using the designed three-step validation method, we first validated the reliability of the hub miRNA-target gene interactions. As shown in Figure 10(a), 50%, 53%, 48%, and 40% of the top 1, 3, 5, and 10 predictions, respectively, were validated by the literature or other databases. That is, after discarding interactions that already appeared in the dataset, a substantial proportion of the predicted unknown hub miRNA-target gene interactions were validated.
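As an illustration of how a "validated fraction among the top k predictions" can be computed, the sketch below pools the top-k predictions of every miRNA and counts how many appear in a set of externally supported pairs. The data are hypothetical, and pooling over miRNAs is an assumption; the exact denominator used for Figure 10(a) is not restated here.

```python
def validated_fraction(ranked_predictions, validated_pairs, k):
    """Share of the pooled top-k (miRNA, gene) predictions found in validated_pairs."""
    pooled = [
        (mirna, gene)
        for mirna, genes in ranked_predictions.items()
        for gene in genes[:k]
    ]
    if not pooled:
        raise ValueError("no predictions to evaluate")
    return sum(pair in validated_pairs for pair in pooled) / len(pooled)


# Hypothetical ranked predictions (best first) and externally supported pairs.
ranked = {"miR-x": ["G1", "G2", "G3"], "miR-y": ["G2", "G4", "G1"]}
supported = {("miR-x", "G1"), ("miR-y", "G4")}

top1 = validated_fraction(ranked, supported, k=1)  # 1 of 2 pooled pairs
top3 = validated_fraction(ranked, supported, k=3)  # 2 of 6 pooled pairs
```

A fraction that falls as k grows, as in this toy example, is the expected pattern when the highest-scoring predictions are the most reliable.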
[figure(s) omitted; refer to PDF]
Second, the predicted target genes of hub miRNAs were validated using two different methods. On the one hand, using a new gene expression dataset, we tested whether these target genes could distinguish unstable angina samples from control samples, i.e., whether they had potential as biomarkers. Figures 10(b) and 10(c) show the ROC curves and PR curves for 5 of the 38 target genes. These target genes distinguish well between disease and control samples, and their differential expression between samples suggests that they may serve as biomarkers. On the other hand, we searched the TF-Marker database to determine whether these target genes were transcription factors or related markers and, if so, whether they were closely associated with unstable angina. Table 7 shows the search results. Six of the ten newly predicted target genes are transcription factors or related markers, and three of them are directly associated with unstable angina; these proportions (60% and 30%) are larger than those for the target genes already present in the training or validation set (50% and 25%). We thus validated the potential of these target genes as biomarkers in two ways.
Table 7
Number of target genes that are transcription factors or related markers.
Target genes of hub miRNAs | Number of target genes | Number of transcription factors or related markers | Number of transcription factors or related markers directly associated with UA |
Newly predicted (not present in the training or validation set) | 10 | 6 (0.60) | 3 (0.30) |
Present in the training or validation set | 28 | 14 (0.50) | 7 (0.25) |
Total | 38 | 20 (0.53) | 10 (0.26) |
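To make the ROC-based biomarker test concrete, the sketch below computes the area under the ROC curve for a single gene as the probability that a randomly chosen disease sample has a higher expression value than a randomly chosen control sample (ties count one half). The expression values are hypothetical, and the pairwise formulation is a standard equivalent of the AUC rather than the specific software used in this study.

```python
def roc_auc(case_values, control_values):
    """AUC as P(case value > control value), counting ties as 0.5."""
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical expression values of one target gene (not real measurements).
ua_expression = [7.1, 6.8, 7.4, 6.2]
control_expression = [5.9, 6.2, 6.0, 5.5]
auc = roc_auc(ua_expression, control_expression)
```

An AUC near 1 means the gene is consistently higher in disease samples; for a gene that is lower in disease, the value falls below 0.5 and 1 minus the AUC measures the same separation.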
Finally, we performed KEGG functional enrichment on the predicted target genes of hub miRNAs, and the top 20 pathways at a
4. Discussion
Many noncoding RNAs act as regulators of gene expression [67]. miRNAs are an important type of noncoding RNA and mostly regulate gene expression negatively [3]. The expression of some miRNAs and their regulated target genes (mRNAs) varies between diseases and is disease-specific [68]. Therefore, it is theoretically feasible to find miRNAs or genes that are specific to a disease and can act as biomarkers or diagnostic markers [69]. Unstable angina is a complex disease with multifactorial and multisystemic involvement, and its pathogenesis and treatment still require in-depth investigation. Previous studies on unstable angina have mostly focused on genes and their functions while neglecting the regulatory functions of factors such as miRNAs and are therefore incomplete [70, 71]. In this study, we designed a research strategy based on miRNA expression data and miRNA regulatory networks. We first compared six network analysis methods on the original and new datasets and found that the WGCNA & SimCluster_1 method performed best; we then used this method to obtain hub miRNAs that could be used as biomarkers. The best model from the network modelling section was then used to predict unknown functions based on the existing functions of hub miRNAs.
The miRNA regulatory network is also a complex network, but the WGCNA approach analyzes only the correlation network constructed from miRNA expression levels and does not take full advantage of the other information in the network. The similarity of nodes in a network is also important information that can be exploited [72]. We constructed a first-order similarity network and a second-order similarity network based on the similarity of miRNA actions and designed a network analysis algorithm to find hub miRNAs using the similarity network. Notably, we used the formation of a scale-free network as the criterion when screening similarity thresholds, which is consistent with how the WGCNA method screens soft thresholds [61]. A comparison of multiple network analysis methods showed that the WGCNA & SimCluster_1 method, which combines correlation and similarity, achieved the best results for the new dataset. This suggests that it is feasible to use both correlation and similarity between miRNAs to screen for hub miRNAs, with better results than using similarity or correlation alone.
Knowledge graph models are mostly used for large, complex heterogeneous graphs, often containing hundreds or thousands of types of nodes and edges [73]. Because various heterogeneous graphs are so different from each other, researchers have developed a variety of dedicated knowledge graph models, so that each model has its own advantages [74]. In fact, the MTP network that we have constructed is a small, complex heterogeneous network, and the knowledge graph model performs equally well on small heterogeneous networks. Of the six models we used, RotatE, TransE, KG2E, and DistMult are knowledge graph models, RGCN is a classical graph neural network model, and CompGCN is a model combining graph neural networks with knowledge graphs, all of which are trained in the same way as knowledge graph modelling. The knowledge graph model uses a set of many triples as a dataset and evaluates the performance of the model by predicting missing head or tail entities and using a scoring function. The task of predicting missing head or tail entities can also be considered a link prediction task, and the discovery of unknown links is of practical importance [46].
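As a concrete example of a scoring function over triples, the sketch below follows the published RotatE formulation, in which each relation is an element-wise rotation in the complex plane and a triple is scored by the distance between the rotated head and the tail. The two-dimensional embeddings, the candidate names, and the omission of the margin term are assumptions made for illustration; trained embeddings from this study are not reproduced.

```python
import cmath
import math


def rotate_score(head, phases, tail):
    """Negative sum of moduli of (h * r - t), where r_i = exp(i * phase_i)."""
    if not (len(head) == len(phases) == len(tail)):
        raise ValueError("head, phases and tail must have the same dimension")
    return -sum(
        abs(h * cmath.exp(1j * phase) - t)
        for h, phase, t in zip(head, phases, tail)
    )


# Hypothetical 2-dimensional complex embeddings.
head = [1 + 0j, 0 + 1j]            # embedding of a head entity (e.g. a miRNA)
phases = [math.pi / 2, math.pi]    # relation as one rotation angle per dimension
candidates = {
    "tail_a": [0 + 1j, 0 - 1j],    # equals the rotated head, so it scores near 0
    "tail_b": [1 + 0j, 1 + 0j],
}

# Link prediction: rank candidate tails by score, highest first.
best_tail = max(candidates, key=lambda name: rotate_score(head, phases, candidates[name]))
```

Ranking all candidate tails (or heads) by this score is what turns the embedding model into a link predictor, and the rank of the true entity feeds the Hits@k, MR, and MRR metrics.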
In this study, we performed node-level and edge-level predictions for the MTP network using a knowledge graph or graph neural network model. In edge-level modelling, we made predictions for all three types of edges in the network. For the Hits@5 metric, the RotatE model yielded results of 0.1789 ± 0.0167, 0.5600 ± 0.0185, and 0.3220 ± 0.0377 for M-T, T-T, and T-
The hub miRNAs that we selected as key nodes in each module may play a vital regulatory role in the disease [76], and their functions merit in-depth investigation. We performed M-T link prediction on the 11 hub miRNAs using the best link prediction model with optimal parameters to predict their target genes and then performed functional enrichment. The reliability of the results can only be demonstrated after the model has been validated [77], so we developed a three-step validation method. In the first step, we validated the 110 predicted hub miRNA-target gene interactions by searching the literature and other databases. The percentage of validated predictions in the top 1, top 3, and top 5 was 50%, 53%, and 48%, respectively. In the second step, the target genes of hub miRNAs distinguished well between different groups of samples in the gene expression dataset. Moreover, 60% of the newly predicted target genes were transcription factors or related markers, and 30% were directly associated with the development of unstable angina. In the third step, many of the enriched pathways were associated with unstable angina, which further supported the reliability of the predicted target genes.
In terms of specific mechanisms, lipid levels in the blood and the fluid shear stress of local blood flow are important for the pathogenesis of coronary atherosclerosis [65, 66], which is an important pathological feature of unstable angina. NF-kappa B is a key transcription factor involved in many physiological and pathological processes, including immune response, apoptosis, and inflammation [64]. Studies [78, 79] have shown that in the pathology of atherosclerosis, NF-kappa B is critical for the crosstalk between cytokines, adhesion molecules, and growth factors, leading to the formation, growth, and eventual rupture of atherosclerotic plaques.
The three steps are applied in sequence and validate the results of our model in a cascading manner by verifying miRNA, target gene, and function in succession.
Although the WGCNA & SimCluster_1 method performed best on the new dataset, its AUC and AUPR are still low, and further improvement is necessary. In addition, none of the six knowledge graph or graph neural network models that we used performed very well for M-T link prediction, which may be due to limitations in the models themselves or insufficient information in the MTP network used. In the future, we will seek to develop superior models or use larger heterogeneous networks, and we will also conduct generalization experiments in order to extend the present modelling strategy to other diseases.
5. Conclusions
We constructed a complex heterogeneous network regulated by miRNAs in unstable angina and then analyzed and modelled it, exploring the functions played by noncoding RNAs in complex diseases from the miRNA perspective. Among the six network analysis methods for finding hub miRNAs, the WGCNA & SimCluster_1 method yielded the best results for the new dataset, identifying hub miRNAs that could act as biomarkers. Comparative experiments with six knowledge graph or graph neural network models demonstrated that RotatE is a good link prediction model and that RGCN is the best multilabel classification model for the miRNA regulatory network. Optimal parameters for the M-T link prediction task were obtained by parameter optimization experiments. The target genes of hub miRNAs predicted with the best model and the best parameters were validated by three methods. Our modelling strategy can be used as a reference for other disease and noncoding RNA studies.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (81303315) and the Natural Science Foundation of Liaoning Province of China (20180550342).
[1] L. Wang, F. Lu, J. Xu, "Identification of potential miRNA-mRNA regulatory network contributing to hypertrophic cardiomyopathy (HCM)," Frontiers in Cardiovascular Medicine, vol. 8,DOI: 10.3389/fcvm.2021.660372, 2021.
[2] S. S. Das, P. Saha, N. Chakravorty, "miRwayDB: a database for experimentally validated microRNA-pathway associations in pathophysiological conditions," Database, vol. 2018,DOI: 10.1093/database/bay023, 2018.
[3] J. Liao, Y. Liu, J. Wang, "Identification of more objective biomarkers for Blood-Stasis syndrome diagnosis," BMC Complementary and Alternative Medicine, vol. 16 no. 1,DOI: 10.1186/s12906-016-1349-9, 2016.
[4] R. Zhang, Z. Ji, Y. Yao, W. Zuo, M. Yang, Y. Qu, Y. Su, G. Ma, Y. Li, "Identification of hub genes in unstable atherosclerotic plaque by conjoint analysis of bioinformatics," Life Sciences, vol. 262,DOI: 10.1016/j.lfs.2020.118517, 2020.
[5] L. Yu, T. Yao, Z. Jiang, T. Xu, "Integrated analysis of miRNA-mRNA regulatory networks associated with osteonecrosis of the femoral head," Evidence-based Complementary and Alternative Medicine, vol. 2021, pp. 2021-2111, DOI: 10.1155/2021/8076598, 2021.
[6] L. Yang, L. Song, D. Ma, J. Zhang, H. Xie, H. Wu, H. Liu, S. Yu, H. Liang, P. Zhang, L. Cui, H. Yuan, L. Chen, "Plasma S100A4 level and cardiovascular risk in patients with unstable angina pectoris," Biomarkers in Medicine, vol. 13 no. 17, pp. 1459-1467, DOI: 10.2217/bmm-2019-0137, 2019.
[7] Z.-X. Dong, J. Zhang, Y.-C. Luo, M. M. Zhao, J. G. Cai, S. Cheng, L. M. Zheng, X. Hai, "The correlation between trimethylamine N-oxide, lipoprotein ratios, and conventional lipid parameters in patients with unstable angina pectoris," Bioscience Reports, vol. 40 no. 1,DOI: 10.1042/BSR20192657, 2020.
[8] Y. Hu, H. Cong, L. Zheng, D. Jin, "Flow reserve fraction: the optimal choice in lesion assessing and interventional guiding for patient with unstable angina pectoris and intermediate lesion wrapped with myocardial bridge: a case report," Journal of Cardiothoracic Surgery, vol. 16 no. 1,DOI: 10.1186/s13019-021-01720-7, 2021.
[9] F. Lin, Y. Yang, Q. Guo, M. Xie, S. Sun, X. Wang, D. Li, G. Zhang, M. Li, J. Wang, G. Zhao, "Analysis of the molecular mechanism of acute coronary syndrome based on circRNA-miRNA network regulation," Evidence-based Complementary and Alternative Medicine, vol. 2020,DOI: 10.1155/2020/1584052, 2020.
[10] Z. H. Wang, X. Y. Sun, C. L. Li, Y. M. Sun, J. Li, L. F. Wang, Z. Q. Li, "miRNA-21 expression in the serum of elderly patients with acute myocardial infarction," Medical Science Monitor, vol. 23, pp. 5728-5734, DOI: 10.12659/msm.904933, 2017.
[11] M. Yang, H. He, T. Peng, Y. Lu, J. Yu, "Identification of 9 gene signatures by WGCNA to predict prognosis for colon adenocarcinoma," Computational Intelligence and Neuroscience, vol. 2022,DOI: 10.1155/2022/8598046, 2022.
[12] L. Li, X. Ayiding, R. Han, "miRNA-gene interaction network construction strategy to discern promising traditional Chinese medicine against osteoporosis," BioMed Research International, vol. 2022,DOI: 10.1155/2022/9093614, 2022.
[13] X. Wan, S. Hao, C. Hu, R. Qu, "Identification of a novel lncRNA‐miRNA‐mRNA competing endogenous RNA network associated with prognosis of breast cancer," Journal of Biochemical and Molecular Toxicology, vol. 36 no. 8,DOI: 10.1002/jbt.23089, 2022.
[14] J. Li, Y. Zhao, S. Zhou, Y. Zhou, L. Lang, "Inferring lncRNA functional similarity based on integrating heterogeneous network data," Frontiers in Bioengineering and Biotechnology, vol. 8,DOI: 10.3389/fbioe.2020.00027, 2020.
[15] E. A. Leicht, P. Holme, M. E. J. Newman, "Vertex similarity in networks," Physical Review E, vol. 73 no. 2, DOI: 10.1103/PhysRevE.73.026120, 2006.
[16] G. Qi, A. Zhang, Z. Xu, Z. Li, W. Zeng, X. Liu, J. Ma, X. Zheng, Z. Li, "Application of a complex network modeling approach to explore the material basis and mechanisms of traditional Chinese medicine: a case study of xuefu zhuyu decoction for the treatment of two types of angina pectoris," IEEE Access, vol. 10,DOI: 10.1109/ACCESS.2022.3217926, 2022.
[17] Z. Su, Z. Lin, J. Ai, H. Li, "Rating prediction in recommender systems based on user behavior probability and complex network modeling," IEEE Access, vol. 9,DOI: 10.1109/ACCESS.2021.3060016, 2021.
[18] L. Hua, P. Zhou, L. Li, H. Liu, Z. Yang, "Prioritizing breast cancer subtype related miRNAs using miRNA–mRNA dysregulated relationships extracted from their dual expression profiling," Journal of Theoretical Biology, vol. 331,DOI: 10.1016/j.jtbi.2013.04.008, 2013.
[19] J. Ma, D. Li, Y. Chen, Y. Qiao, H. Zhu, X. Zhang, "A knowledge graph entity disambiguation method based on entity-relationship embedding and graph structure embedding," Computational Intelligence and Neuroscience, vol. 2021,DOI: 10.1155/2021/2878189, 2021.
[20] Q. Ye, C.-Y. Hsieh, Z. Yang, Y. Kang, J. Chen, D. Cao, S. He, T. Hou, "A unified drug–target interaction prediction framework based on knowledge graph and recommendation system," Nature Communications, vol. 12 no. 1,DOI: 10.1038/s41467-021-27137-3, 2021.
[21] X. Mei, X. Cai, L. Yang, N. Wang, "Relation-aware heterogeneous graph transformer based drug repurposing," Expert Systems with Applications, vol. 190,DOI: 10.1016/j.eswa.2021.116165, 2022.
[22] W. Zhao, W. Lu, Z. Li, C. Zhou, H. Fan, Z. Yang, X. Lin, C. Li, "TCM herbal prescription recommendation model based on multi-graph convolutional network," Journal of Ethnopharmacology, vol. 297,DOI: 10.1016/j.jep.2022.115109, 2022.
[23] E. A. Ahmed, P. Rajendran, H. Scherthan, "The microRNA-202 as a diagnostic biomarker and a potential tumor suppressor," International Journal of Molecular Sciences, vol. 23 no. 11,DOI: 10.3390/ijms23115870, 2022.
[24] D. Cao, N. Xu, Y. Chen, H. Zhang, Y. Li, Z. Yuan, "Construction of a Pearson- and MIC-based Co-expression network to identify potential cancer genes," Interdisciplinary Sciences: Computational Life Sciences, vol. 14 no. 1, pp. 245-257, DOI: 10.1007/s12539-021-00485-w, 2022.
[25] J. Wang, J. Zhang, Y. Cai, L. Deng, "DeepMiR2GO: inferring functions of human MicroRNAs using a deep multi-label classification model," International Journal of Molecular Sciences, vol. 20 no. 23,DOI: 10.3390/ijms20236046, 2019.
[26] Y. Zhou, M. Liu, J. Li, B. Wu, W. Tian, L. Shi, J. Zhang, Z. Sun, "The inverted pattern of circulating miR-221-3p and miR-222-3p associated with isolated low HDL-C phenotype," Lipids in Health and Disease, vol. 17 no. 1,DOI: 10.1186/s12944-018-0842-1, 2018.
[27] Y.-Y. Shi, Y.-Q. Li, X. Xie, Y. T. Zhou, Q. Zhang, J. L. Yu, P. Li, N. Mi, F. Li, "Homotherapy for heteropathy active components and mechanisms of Qiang-Huo-Sheng-Shi decoction for treatment of rheumatoid arthritis and osteoarthritis," Computational Biology and Chemistry, vol. 89,DOI: 10.1016/j.compbiolchem.2020.107397, 2020.
[28] Y. Ji Diana Lee, V. Kim, D. C. Muth, K. W. Witwer, "Validated MicroRNA target databases: an evaluation," Drug Development Research, vol. 76 no. 7, pp. 389-396, DOI: 10.1002/ddr.21278, 2015.
[29] G. Qi, K. Jiang, J. Qu, A. Zhang, Z. Xu, Z. Li, X. Zheng, Z. Li, "The material basis and mechanism of xuefu zhuyu decoction in treating stable angina pectoris and unstable angina pectoris," Evidence-based Complementary and Alternative Medicine, vol. 2022,DOI: 10.1155/2022/3741027, 2022.
[30] H. Wang, H. Zhu, W. Zhu, Y. Xu, N. Wang, B. Han, H. Song, J. Qiao, "Bioinformatic analysis identifies potential key genes in the pathogenesis of turner syndrome," Frontiers in Endocrinology, vol. 11,DOI: 10.3389/fendo.2020.00104, 2020.
[31] O. Palasca, A. Santos, C. Stolte, J. Gorodkin, L. J. Jensen, "Tissues 2.0: an integrative web resource on mammalian tissue expression," Database, vol. 2018,DOI: 10.1093/database/bay003, 2018.
[32] D. Szklarczyk, A. L. Gable, D. Lyon, A. Junge, S. Wyder, J. Huerta-Cepas, M. Simonovic, N. T. Doncheva, J. H. Morris, P. Bork, L. J. Jensen, C. Mering, "STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets," Nucleic Acids Research, vol. 47 no. D1, pp. D607-D613, DOI: 10.1093/nar/gky1131, 2019.
[33] G. Yu, L. G. Wang, Y. Han, Q. Y. He, "clusterProfiler: an R package for comparing biological themes among gene clusters," OMICS: A Journal of Integrative Biology, vol. 16 no. 5, pp. 284-287, DOI: 10.1089/omi.2011.0118, 2012.
[34] J. Ren, J. Zhang, N. Xu, G. Han, Q. Geng, J. Song, S. Li, J. Zhao, H. Chen, "Signature of circulating MicroRNAs as potential biomarkers in vulnerable coronary artery disease," PLoS One, vol. 8 no. 12,DOI: 10.1371/journal.pone.0080738, 2013.
[35] H. J. Park, J. H. Noh, J. W. Eun, Y. S. Koh, S. M. Seo, W. S. Park, J. Y. Lee, K. Chang, K. B. Seung, P. J. Kim, S. W. Nam, "Assessment and diagnostic relevance of novel serum biomarkers for early decision of ST-elevation myocardial infarction," Oncotarget, vol. 6 no. 15,DOI: 10.18632/oncotarget.4001, 2015.
[36] R. Tan, G. Zhang, R. Liu, J. Hou, Z. Dong, C. Deng, S. Wan, X. Lai, H. Cui, "Identification of early diagnostic and prognostic biomarkers via WGCNA in stomach adenocarcinoma," Frontiers in Oncology, vol. 11,DOI: 10.3389/fonc.2021.636461, 2021.
[37] T. Chen, Y. X. Liu, L. Huang, "ImageGP: an easy‐to‐use data visualization web server for scientific researchers," iMeta, vol. 1 no. 1,DOI: 10.1002/imt2.5, 2022.
[38] J. I. F. Bass, A. Diallo, J. Nelson, J. M. Soto, C. L. Myers, A. J. M. Walhout, "Using networks to measure similarity between genes: association index selection," Nature Methods, vol. 10 no. 12, pp. 1169-1176, DOI: 10.1038/nmeth.2728, 2013.
[39] T. Mohr, S. Katz, V. Paulitschke, N. Aizarani, A. Tolios, "Systematic analysis of the transcriptome profiles and Co-expression networks of tumour endothelial cells identifies several tumour-associated modules and potential therapeutic targets in hepatocellular carcinoma," Cancers, vol. 13 no. 8,DOI: 10.3390/cancers13081768, 2021.
[40] M. E. J. Newman, "Modularity and community structure in networks," Proceedings of the National Academy of Sciences, vol. 103 no. 23, pp. 8577-8582, DOI: 10.1073/pnas.0601602103, 2006.
[41] W. J. Vlietstra, R. Vos, A. M. Sijbers, E. M. van Mulligen, J. A. Kors, "Using predicate and provenance information from a knowledge graph for drug efficacy screening," Journal of Biomedical Semantics, vol. 9 no. 1,DOI: 10.1186/s13326-018-0189-6, 2018.
[42] X. Su, L. Hu, Z. You, P. Hu, B. Zhao, "Attention-based knowledge graph representation learning for predicting drug-drug interactions," Briefings in Bioinformatics, vol. 23 no. 3,DOI: 10.1093/bib/bbac140, 2022.
[43] Z. Sun, Z.-H. Deng, J.-Y. Nie, J. Tang, "RotatE: knowledge graph embedding by relational rotation in complex space," Proceedings of the ICLR, DOI: 10.48550/arXiv.1902.10197.
[44] Y. Ding, X. Jiang, Y. Kim, "Relational graph convolutional networks for predicting blood-brain barrier penetration of drug molecules," Bioinformatics, vol. 38 no. 10, pp. 2826-2831, DOI: 10.1093/bioinformatics/btac211, 2022.
[45] Y. Xiong, H. Peng, Y. Xiang, K. C. Wong, Q. Chen, J. Yan, B. Tang, "Leveraging Multi-source knowledge for Chinese clinical named entity recognition via relational graph convolutional network," Journal of Biomedical Informatics, vol. 128,DOI: 10.1016/j.jbi.2022.104035, 2022.
[46] R. Zhang, D. Hristovski, D. Schutte, A. Kastrin, M. Fiszman, H. Kilicoglu, "Drug repurposing for COVID-19 via knowledge graph completion," Journal of Biomedical Informatics, vol. 115,DOI: 10.1016/j.jbi.2021.103696, 2021.
[47] S. He, K. Liu, G. Ji, J. Zhao, "Learning to represent knowledge graphs with Gaussian embedding," Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, DOI: 10.1145/2806416.2806502.
[48] B. Yang, W.-T. Yih, X. He, J. Gao, L. Deng, "Embedding entities and relations for learning and inference in knowledge bases," 2014. https://arxiv.org/abs/1412.6575
[49] S. Vashishth, S. Sanyal, V. Nitin, P. Talukdar, "Composition-based multi-relational graph convolutional networks," Proceedings of the ICLR, DOI: 10.48550/arXiv.1911.03082.
[50] M.-Y. C. Polley, R. A. Leon-Ferre, S. Leung, A. Cheng, D. Gao, J. Sinnwell, H. Liu, D. W. Hillman, A. Eyman-Casey, J. A. Gilbert, V. Negron, J. C. Boughey, M. C. Liu, J. N. Ingle, K. Kalari, F. Couch, J. M. Carter, D. W. Visscher, T. O. Nielsen, M. P. Goetz, "A clinical calculator to predict disease outcomes in women with triple-negative breast cancer," Breast Cancer Research and Treatment, vol. 185 no. 3, pp. 557-566, DOI: 10.1007/s10549-020-06030-5, 2021.
[51] S. Pang, Y. Zhuang, X. Wang, F. Wang, S. Qiao, "EOESGC: predicting miRNA-disease associations based on embedding of embedding and simplified graph convolutional network," BMC Medical Informatics and Decision Making, vol. 21 no. 1,DOI: 10.1186/s12911-021-01671-y, 2021.
[52] Z. Yu, F. Huang, X. Zhao, W. Xiao, W. Zhang, "Predicting drug–disease associations through layer attention graph convolutional network," Briefings in Bioinformatics, vol. 22 no. 4,DOI: 10.1093/bib/bbaa243, 2021.
[53] H. Fu, F. Huang, X. Liu, Y. Qiu, W. Zhang, "MVGCN: data integration through multi-view graph convolutional network for predicting links in biomedical bipartite networks," Bioinformatics, vol. 38 no. 2, pp. 426-434, DOI: 10.1093/bioinformatics/btab651, 2022.
[54] Y. Kalakoti, S. Yadav, D. Sundar, "Deep neural network-assisted drug recommendation systems for identifying potential drug-target interactions," ACS Omega, vol. 7 no. 14,DOI: 10.1021/acsomega.2c00424, 2022.
[55] P. Xuan, L. Gao, N. Sheng, T. Zhang, T. Nakaguchi, "Graph convolutional autoencoder and fully-connected autoencoder with attention mechanism based method for predicting drug-disease associations," IEEE Journal of Biomedical and Health Informatics, vol. 25 no. 5, pp. 1793-1804, DOI: 10.1109/JBHI.2020.3039502, 2021.
[56] Y. Zhang, I. Beketaev, A. M. Segura, W. Yu, Y. Xi, J. Chang, Y. Ma, J. Wang, "Contribution of increased expression of yin yang 2 to development of cardiomyopathy," Frontiers in Molecular Biosciences, vol. 7,DOI: 10.3389/fmolb.2020.00035, 2020.
[57] A. Sayad, M. Taheri, I. Azari, V. K. Oskoei, S. Ghafouri-Fard, "PIAS genes as disease markers in bipolar disorder," Journal of Cellular Biochemistry, vol. 120 no. 8,DOI: 10.1002/jcb.28564, 2019.
[58] M. Xu, X. Bai, B. Ai, G. Zhang, C. Song, J. Zhao, Y. Wang, L. Wei, F. Qian, Y. Li, X. Zhou, L. Zhou, Y. Yang, J. Chen, J. Liu, D. Shang, X. Wang, Y. Zhao, X. Huang, Y. Zheng, J. Zhang, Q. Wang, C. Li, "TF-Marker: a comprehensive manually curated database for transcription factors and related markers in specific cell and tissue types in human," Nucleic Acids Research, vol. 50 no. D1, pp. D402-D412, DOI: 10.1093/nar/gkab1114, 2022.
[59] K. Zhu, M. Zhang, J. Long, S. Zhang, H. Luo, "Elucidating the mechanism of action of salvia miltiorrhiza for the treatment of acute pancreatitis based on network pharmacology and molecular docking technology," Computational and Mathematical Methods in Medicine, vol. 2021,DOI: 10.1155/2021/8323661, 2021.
[60] M. Adhami, B. Sadeghi, A. Rezapour, A. A. Haghdoost, H. MotieGhader, "Repurposing novel therapeutic candidate drugs for coronavirus disease-19 based on protein-protein interaction network analysis," BMC Biotechnology, vol. 21 no. 1,DOI: 10.1186/s12896-021-00680-z, 2021.
[61] P. Langfelder, S. Horvath, "WGCNA: an R package for weighted correlation network analysis," BMC Bioinformatics, vol. 9 no. 1,DOI: 10.1186/1471-2105-9-559, 2008.
[62] T. T. T. Truong, C. C. Bortolasci, B. Spolding, B. Panizzutti, Z. S. Liu, S. Kidnapillai, M. Richardson, L. Gray, C. M. Smith, O. M. Dean, J. H. Kim, M. Berk, K. Walder, "Co-expression networks unveiled long non-coding RNAs as molecular targets of drugs used to treat bipolar disorder," Frontiers in Pharmacology, vol. 13,DOI: 10.3389/fphar.2022.873271, 2022.
[63] S. Pang, Y. Zhang, T. Song, X. Zhang, X. Wang, A. Rodriguez-Paton, "AMDE: a novel attention-mechanism-based multidimensional feature encoder for drug–drug interaction prediction," Briefings in Bioinformatics, vol. 23 no. 1,DOI: 10.1093/bib/bbab545, 2022.
[64] S. Frantz, K. Hu, B. Bayer, S. Gerondakis, J. Strotmann, A. Adamek, G. Ertl, J. Bauersachs, "Absence of NF-κB subunit p50 improves heart failure after myocardial infarction," The FASEB Journal, vol. 20 no. 11, pp. 1918-1920, DOI: 10.1096/fj.05-5133fje, 2006.
[65] J. H. Haga, L. K. Jennings, S. M. Slack, "Inhibition of shear-stress-induced platelet aggregation and phosphotyrosine signaling by GPIIb–IIIa antagonists," Annals of Biomedical Engineering, vol. 30 no. 10, pp. 1262-1272, DOI: 10.1114/1.1528613, 2002.
[66] M. Kidawa, A. Gluba-Brzózka, M. Zielinska, B. Franczyk, M. Banach, J. Rysz, "Cholesterol subfraction analysis in patients with acute coronary syndrome," Current Vascular Pharmacology, vol. 17 no. 4, pp. 365-375, DOI: 10.2174/1570161116666180601083225, 2019.
[67] Y.-Y. Wu, H.-C. Kuo, "Functional roles and networks of non-coding RNAs in the pathogenesis of neurodegenerative diseases," Journal of Biomedical Science, vol. 27 no. 1,DOI: 10.1186/s12929-020-00636-z, 2020.
[68] S. Bjorkman, H. S. Taylor, "MicroRNAs in endometriosis: biological function and emerging biomarker candidates," Biology of Reproduction, vol. 101 no. 6, pp. 1167-1178, DOI: 10.1093/biolre/ioz014, 2019.
[69] G. Qin, L. Yang, Y. Ma, J. Liu, Q. Huo, "The exploration of disease-specific gene regulatory networks in esophageal carcinoma and stomach adenocarcinoma," BMC Bioinformatics, vol. 20 no. S22, DOI: 10.1186/s12859-019-3230-6, 2019.
[70] Y. D’Alessandra, M. C. Carena, L. Spazzafumo, F. Martinelli, B. Bassetti, P. Devanna, M. Rubino, G. Marenzi, G. I. Colombo, F. Achilli, S. Maggiolini, M. C. Capogrossi, G. Pompilio, "Diagnostic potential of plasmatic MicroRNA signatures in stable and unstable angina," PLoS One, vol. 8 no. 11, DOI: 10.1371/journal.pone.0080345, 2013.
[71] L. Wang, Y. Jin, "Noncoding RNAs as biomarkers for acute coronary syndrome," BioMed Research International, vol. 2020, DOI: 10.1155/2020/3298696, 2020.
[72] J. Liu, Z. Wu, R. Sun, S. Nie, H. Meng, Y. Zhong, X. Nie, W. Cheng, "Using mRNAsi to identify prognostic-related genes in endometrial carcinoma based on WGCNA," Life Sciences, vol. 258, DOI: 10.1016/j.lfs.2020.118231, 2020.
[73] B. de Bono, T. Gillespie, M. C. Surles-Zeigler, N. Kokash, J. S. Grethe, M. Martone, "Representing normal and abnormal physiology as routes of flow in ApiNATOMY," Frontiers in Physiology, vol. 13, DOI: 10.3389/fphys.2022.795303, 2022.
[74] B. Cheng, J. Zhang, H. Liu, M. Cai, Y. Wang, "Research on medical knowledge graph for stroke," Journal of Healthcare Engineering, vol. 2021, DOI: 10.1155/2021/5531327, 2021.
[75] W. Gu, F. Gao, X. Lou, J. Zhang, "Discovering latent node Information by graph attention network," Scientific Reports, vol. 11 no. 1, DOI: 10.1038/s41598-021-85826-x, 2021.
[76] T. Zhang, L. Chen, H. Ding, P. Wu, G. Zhang, Z. Pan, K. Xie, G. Dai, J. Wang, "Construction of miRNA-mRNA network in the differentiation of chicken preadipocytes," British Poultry Science, vol. 63 no. 3, pp. 298-306, DOI: 10.1080/00071668.2021.2000585, 2022.
[77] Y. Chu, A. C. Kaushik, X. Wang, W. Wang, Y. Zhang, X. Shan, D. R. Salahub, Y. Xiong, D. Q. Wei, "DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features," Briefings in Bioinformatics, vol. 22 no. 1, pp. 451-462, DOI: 10.1093/bib/bbz152, 2021.
[78] J. Dabek, A. Kulach, Z. Gasior, "Nuclear factor kappa-light-chain-enhancer of activated B cells (NF-kappaB): a new potential therapeutic target in atherosclerosis?," Pharmacological Reports, vol. 62 no. 5, pp. 778-783, DOI: 10.1016/s1734-1140(10)70338-8, 2010.
[79] Q. Su, L. Zhang, Z. Qi, B. Huang, "The mechanism of inflammatory factors and soluble vascular cell adhesion molecule-1 regulated by nuclear transcription factor NF-κB in unstable angina pectoris," Journal of Immunology Research, vol. 2022, DOI: 10.1155/2022/6137219, 2022.
Copyright © 2022 Guanpeng Qi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0/
Abstract
MicroRNAs (miRNAs) are an important class of noncoding RNAs, yet a holistic and systematic understanding of their functions in disease is lacking. We proposed a research strategy comprising two parts, network analysis and network modelling, to analyze, model, and predict the regulatory network of miRNAs from a network perspective, using unstable angina pectoris as an example. In the network analysis part, we proposed the WGCNA & SimCluster method, which uses both correlation and similarity to find hub miRNAs; validation on two datasets showed better results than methods using correlation or similarity alone. In the network modelling part, we used six knowledge graph or graph neural network models for link prediction on three types of edges and multilabel classification on two types of nodes. Comparative experiments showed that the RotatE model performed well for link prediction, while the RGCN model was the best model for multilabel classification. Potential target genes were predicted for the hub miRNAs, and a three-step validation approach was used to validate the hub miRNA-target gene interactions, the target genes as biomarkers, and the target gene functions. In conclusion, our study provides a new strategy for analyzing and modelling miRNA regulatory networks.
1 School of Pharmacy, Shenyang Pharmaceutical University, Shenyang 110016, China
2 School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, China
3 School of Life Sciences and Biopharmaceuticals, Shenyang Pharmaceutical University, Shenyang 110016, China