Abstract
Background
The Dirichlet Bayesian network model employs score-based structural learning, which supports a more precise understanding of the Bayesian network's structure. This study aimed to classify patients with bipolar disorder who are receiving lithium treatment based on their gene expression profiles using a Dirichlet Bayesian network model, and to compare its performance with the Support Vector Machine (SVM) and Random Forest algorithms.
Methods
Gene expression data covering 47,323 genes from patients with bipolar disorder were analyzed; 30 patients received standard treatment and 30 received lithium treatment. Essential variables for analysis and classification were selected using partial least squares regression. The plaid algorithm was employed to identify biclusters with homogeneous patterns within the gene expression data. Principal component analysis was then performed to derive a representative component for each bicluster. A Dirichlet Bayesian network model was developed to classify the gene expression network, and its accuracy was assessed using Receiver Operating Characteristic (ROC) curve analysis. All analyses were performed in R 3.6.2.
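To illustrate the ROC-based assessment mentioned above, the following is a minimal sketch of how ROC points and the area under the curve (AUC) can be computed from classifier scores. It is written in Python for illustration only and is not the authors' R pipeline; the function names `roc_points` and `auc`, the coding of lithium-treated patients as class 1, and the toy labels and scores are all assumptions, not data from the study.

```python
# Minimal sketch: ROC curve points and AUC from classifier scores.
# All values below are made-up toy data, not results from the study.

def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold.

    labels: 1 for the positive class (assumed here: lithium-treated), 0 otherwise.
    scores: higher score means the classifier favours the positive class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Visit samples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_curve = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_curve)
print(f"AUC on toy data: {toy_auc:.3f}")
```

An AUC of 1 corresponds to perfect separation of the two treatment groups, while 0.5 corresponds to chance-level ranking.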
Results
Partial least squares regression identified 10,788 essential genes, and the plaid algorithm revealed nine homogeneous biclusters. For each bicluster, a representative component capturing at least 75% of the variance in the data was selected using principal component analysis. The Dirichlet Bayesian network classifier achieved an accuracy of 0.86 and a precision of 0.91 when evaluated alongside the Random Forest and SVM algorithms.
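For readers less familiar with the two reported metrics, the sketch below shows how accuracy and precision are conventionally derived from predicted and true class labels. It is an illustrative Python example rather than the authors' code; the helper name `classification_metrics`, the coding of lithium-treated patients as class 1, and the toy label vectors are assumptions and do not reproduce the study's predictions.

```python
# Minimal sketch: accuracy and precision from true and predicted labels.
# The label vectors are made-up toy data, not the study's results.

def classification_metrics(y_true, y_pred):
    """Return accuracy and precision for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Accuracy: share of all samples classified correctly.
    accuracy = (tp + tn) / len(pairs)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    return {"accuracy": accuracy, "precision": precision}


toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Precision can exceed accuracy, as in the reported 0.91 versus 0.86, when the classifier makes few false-positive calls relative to its other errors.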
Conclusions
This study demonstrates the potential of ensemble approaches for gene network analysis, providing more accurate and robust results than single models. Network analysis effectively detects coordinated changes in the expression of mutually related genes and can be applied to other diseases using existing datasets.





