Abstract: Reconstructing large-scale gene regulatory networks (GRNs) from high-dimensional data is a major problem in fields such as machine learning and bioinformatics. However, classical dynamic Bayesian networks (DBNs) are based on the homogeneous Markov assumption and cannot handle non-homogeneous temporal processes. To resolve this problem and reconstruct network structures effectively, we propose a latent-variable-sampling-based heuristic search algorithm for inferring dynamic GRNs, which predicts the structure of a GRN from time series microarray data. We apply the proposed method to identify dynamic GRNs from a synthetic dataset and from experimental data in Escherichia coli. The approach first selects variables through l1-norm penalty regularization, in the style of lasso methods, and then evaluates network scores to identify an optimal network structure. Both the computational time and the accuracy of the network structures estimated by the proposed method are compared with those of LARS, GeneNet, and GLASSO on simulated and real datasets. The results reveal that the proposed method outperforms the other three algorithms in terms of accuracy and efficiency and can infer large-scale GRNs.
Keywords: Learning algorithms, gene expression regulation, network modeling, variable selection, heuristics
1. Introduction
Estimating the true gene regulatory network (GRN) when the number of genes is much greater than the number of samples has aroused considerable interest in the computational biology community. Several scientists and researchers have presented different approaches to this difficult problem and have advanced the field. However, many unsolved tasks remain, including handling highly correlated covariates, noisy data, and the reasonable prior knowledge necessary for GRN inference and model estimation.
1.1. Related Works
The past 30 years have seen several developments concerning structure learning of GRNs. The complex relationships between network components motivate a multivariate approach. Inferring gene networks is usually understood as the process of identifying gene interactions from gene expression data through computational analysis. The entire inference process can be summarized as a task of predicting connectivity among genes; essentially, it involves learning the structure of a graph. However, in many domains, problems such as large numbers of variables, small sample sizes, and the possible presence of unmeasured causes remain major impediments to applying these developments in practice.
To accomplish this task and infer the structure of GRNs from high-throughput microarray data, several techniques have been developed for the mathematical modeling of GRNs from expression data, notably differential equations (Chen, Wang, Tseng, Huang, & Kao, 2005; Zidong Wang, 2009), the vector autoregressive (VAR) model (Zidong Wang et al., 2008), dynamic Bayesian networks (Grzegorczyk & Husmeier, 2011; Li, Li, Krishnan, & Liu, 2011; Nguyen, Chetty, Coppel, & Wangikar, 2012), Boolean networks (Kim, Lee, & Park, 2007), and information-theoretic methods (W. Zhao, Serpedin, & Dougherty, 2006). In addition, other methods based on regression, such as the least absolute shrinkage and selection operator (lasso) (Tibshirani, 1996), partial least squares (PLS) (Datta, Le-Rademacher, & Datta, 2007), and supervised learning (Bleakley, Biau, & Vert, 2007; Yamanishi, Vert, & Kanehisa, 2005), have shown positive results.
Although standard linear modeling approaches enable analysis of a modeled system, they are not effective in large-scale network discovery: the number of candidate parameters and models is extremely high, and thus searching efficiently and reliably while tightly controlling false positives is difficult. By contrast, by using ordinary differential equations to model transcriptional changes in terms of environmental and transcription factor influence, time-series network identification (TSNI) (Bansal, Gatta, & di Bernardo, 2006) constructs a local regulatory network of genes that are affected by an external perturbation. The dynamic regulatory events miner (DREM) (Schulz et al., 2012) uses a hidden Markov model (HMM) based algorithm to identify transcription factors that control divergence points in gene expression profiles in order to reconstruct dynamic regulatory networks. Friedman et al. (Friedman, 2004) were among the first to suggest using dynamic Bayesian networks (DBNs) to model regulatory networks that change over time, as such models can capture time-dependent structures such as feedback loops that are impossible to express using traditional probabilistic networks.
A Boolean network is useful for monitoring dynamic behavior in complex systems based on massive expression profiling data (Kim et al., 2007). However, it is limited in modeling the complex Boolean function of a gene; thus, most Boolean network algorithms can be used only with a small number of variables and a low degree value. In addition, Robinson et al. (Robinson & Hartemink, 2010) developed a Markov chain Monte Carlo (MCMC) sampler to infer non-stationary dynamic Bayesian networks; it has the attractive feature that the network structure within a temporal phase depends on the structure of the contiguous phases. Liao et al. used network component analysis (NCA) (Liao et al., 2003) to build a series of large-scale transcriptional networks by decomposing a dynamic gene expression matrix to learn transcription factor activities over time.
Wang et al. (Zidong Wang, 2009) proposed an extended Kalman filter (EKF) algorithm to model nonlinear dynamic GRNs from gene time series data. In this method, the authors assume that the dynamics of the GRN under investigation are governed by a class of nonlinear stochastic differential equations. Another model for identifying evolving networks is the temporally smoothed l1-regularized logistic regression method, as described by Kolar et al. (Kolar, Song, Ahmed, & Xing, 2010) and Ahmed and Xing (Ahmed & Xing, 2009). These works used dynamic modeling to identify transient co-expression relationships that capture static but temporally related snapshots of a dynamic network.
1.2. Main Content
The main purpose of this study is to develop a variable-selection-based l1-norm penalty regularization method for high-dimensional data. Our goal is to estimate the true "sparse pattern," that is, the set of covariates with nonzero regression coefficients. We propose a latent-variable-sampling-based heuristic search (LVSHS) algorithm, a novel machine learning algorithm for recovering the structure of GRNs from microarray data, thereby extending the capability of inference algorithms. In contrast to existing algorithms, our method can identify GRNs at various scales. We note that other search spaces have been proposed, such as the space of network equivalence classes or the space of skeletons; however, these involve fairly complex operators that are computationally expensive and difficult to implement. By contrast, the proposed algorithm searches over the space of network structures. One novelty of LVSHS when inferring connectivity in GRNs is that it allows inferring causation and considers the period at which this causation occurs.
Through numerous experiments, the applicability and utility of the LVSHS algorithm are demonstrated using both simulated and real data. Our results show that our method outperforms the state-of-the-art methods, both in finding the correct network structure and in computation time. For the E. coli gene expression dataset, we show that LVSHS can obtain a biologically stable GRN.
The remainder of the paper is organized as follows. In Section 2, we provide the mathematical and statistical background on the regression model, followed by the details of the proposed LVSHS algorithm. Section 3 presents experimental results and analysis, and Section 4 offers a conclusion and discusses future research.
2. Methods
2.1. Graphical Gaussian models
In this section, we formally define graphical Gaussian models (GGMs) and establish some notation used throughout. Consider the problem of estimating a GRN underlying a fixed set of genes. Given an observed data matrix X = {X_1, X_2, ..., X_p} with N rows (samples) and p columns (variables), we assume X follows a multivariate normal distribution N(μ, Σ) with mean vector μ = (μ_1, ..., μ_p) and positive definite covariance matrix Σ = (σ_ij), where 1 ≤ i, j ≤ p. A GGM for X, also known as a covariance selection model, can be represented by an undirected graph G = (V, E), where V contains p nodes corresponding to the p variables and the edges in the set E represent the conditional dependence relationships among the p variables.
Assuming that the covariance matrix Σ is invertible, we denote the precision matrix by Ω = Σ^(-1) = (ω_ij). It is well known that for GGMs, edges in the graph correspond to non-zero elements of the precision matrix. Although we cannot use the precision matrix to select edges directly, we can estimate the partial correlation coefficient ρ_ij between X_i and X_j:

ρ_ij = −ω_ij / √(ω_ii ω_jj) = sign(β_j^i) √(β_j^i β_i^j), where β_j^i = −ω_ij / ω_ii. (1)

As shown in (1), ρ_ij is zero if and only if β_j^i is zero. Consequently, inferring the gene network structure is equivalent to evaluating the partial correlation between genes i and j conditioned on the values of all other variables, and inferring the edges of a gene network from ρ_ij is equivalent to inferring them from the non-zero elements β_j^i. Mathematically, the edges of the network are determined in this study as follows:

(i, j) ∈ E ⟺ β̂_j^i ≠ 0 or β̂_i^j ≠ 0. (2)
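To make (1) and (2) concrete, the following minimal Python sketch, with function names of our own choosing, computes partial correlations from a given precision matrix and thresholds them to an edge set. It is an illustration only; in practice Ω must be estimated from data rather than supplied.

```python
import numpy as np

def partial_correlations(omega):
    """Eq. (1): rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)."""
    d = np.sqrt(np.diag(omega))
    rho = -omega / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)  # convention: rho_ii = 1
    return rho

def edge_set(rho, tol=1e-8):
    """Eq. (2): keep (i, j) whenever the partial correlation is non-zero."""
    p = rho.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(rho[i, j]) > tol]

# Toy precision matrix of a 3-gene chain 0 - 1 - 2.
omega = np.array([[ 2.0, -0.8,  0.0],
                  [-0.8,  2.0, -0.8],
                  [ 0.0, -0.8,  2.0]])
print(edge_set(partial_correlations(omega)))  # -> [(0, 1), (1, 2)]
```

The recovered edge set reproduces the zero pattern of Ω, as the theory above predicts.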
2.2. Neighborhood variable selection
Although variable selection is a fundamental tool in analyzing high-dimensional data, its ability to select the important predictor or explanatory variables/models, especially when the number of predictors greatly exceeds the number of observations, is often overlooked in computational network analysis. The basic assumption of variable selection is that the available data constitute a random sample from a multivariate distribution belonging to a GGM. In the context of microarray data, the number of variables exceeds the sample size, which precludes traditional structure learning procedures from being applied.
The neighborhood of each gene X_i is the subset of X that represents the pair-wise interactions with X_i; conditioned on its neighborhood, X_i is independent of all remaining variables. The lasso, also known as basis pursuit, possesses a sparsity property, and the lasso regression coefficients can be used to identify the neighborhood of node X_i in the graph. For gene i, our model can be formulated as follows.
Let X_{−i} be the matrix of all p − 1 variables in the network except i. We solve for the regression coefficients β_{−i} using a linear regression, adopting the neighborhood selection idea described previously. In addition, we penalize the difference between the correlated variable and within-group variables. More specifically, to recover the structure of a gene network, we propose the following convex optimization problem for estimating networks from high-dimensional and correlated data.
L(β) = ||X_i − X_{−i} β||_2^2 + λ_1 ||β||_1 + λ_2 ||β||_2^2,

β̂^i = argmin_β L(β). (3)
As shown in the two parts of (3) above, we mainly address the inability of the lasso to discern between correlated variables when the pair-wise correlations between them are extremely high. For variables sharing the same biological "pathway," the correlations between them can be high.
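The backbone of this procedure is plain lasso neighborhood selection; the full estimator (3) adds the second, correlation-aware penalty, which we omit here. A minimal sketch, assuming scikit-learn is available and using function names of our own:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import scale

def neighborhood_select(X, lam=0.1):
    """Lasso neighborhood selection: regress each gene on all others and
    keep the non-zero coefficients as its neighborhood.
    X: (N, p) expression matrix. Returns B with B[i, j] = coefficient of
    gene j in the regression for gene i, and the symmetrized adjacency."""
    Xs = scale(X)                         # zero-mean, unit-variance columns
    p = Xs.shape[1]
    B = np.zeros((p, p))
    for i in range(p):
        others = np.delete(np.arange(p), i)
        fit = Lasso(alpha=lam).fit(Xs[:, others], Xs[:, i])
        B[i, others] = fit.coef_
    adjacency = (B != 0) | (B != 0).T     # OR-rule of Eq. (2)
    return B, adjacency
```

The last step applies the OR-rule of (2); an AND-rule variant simply replaces the union with an intersection.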
The following example of indirect correlation has the same settings as Zhao and Yu (P. Zhao & Yu, 2006). Given a model with p = 6, X_1, X_2, X_3, X_4, X_5, and the noise ε are first generated from the standard normal distribution. X_6, which also derives from the standard normal distribution, is generated to be correlated with X_1, X_2, X_3, X_4, and X_5 by:
...
The true regression model is:
Y = β_1 X_1 + β_2 X_2 + β_3 X_3 + β_4 X_4 + β_5 X_5 + ε,

supposing that β_1 < 0, β_2 < 0, β_3 > 0, β_4 > 0, β_5 > 0.

Group effect: X_1, X_2, and ε are first generated from the standard normal distribution with mean 0 and variance 1. Let X_3 = X_2, and let X_4 be generated to be correlated with ..., which also has a standard normal distribution. The true linear model is Y = −2X_1 + X_2 + X_3 + ε, as reported in Zou and Hastie (Zou & Hastie, 2005). A larger grouped example from the same work takes

Y = Xβ + σε, ε ~ N(0, 1), σ = 15,

with p = 40 and β = (3, ..., 3, 0, ..., 0), where 3 is repeated 15 times and 0 is repeated 25 times. The predictors X were generated as follows:
X_i = Z_1 + ε_i, Z_1 ~ N(0, 1), i = 1, ..., 5,
X_i = Z_2 + ε_i, Z_2 ~ N(0, 1), i = 6, ..., 10,
X_i = Z_3 + ε_i, Z_3 ~ N(0, 1), i = 11, ..., 15,
X_i ~ N(0, 1), i = 16, ..., 40,
ε_i ~ N(0, 0.01), i = 1, ..., 15.
In this model, we have three equally important groups, each containing five members, together with 25 pure noise features. An ideal method would select only the 15 true features and set the coefficients of the 25 noise features to 0.
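The grouped design just described is straightforward to simulate. The sketch below generates it under the stated assumptions (non-zero coefficients equal to 3, following Zou and Hastie's grouped example; the function name is ours):

```python
import numpy as np

def grouped_simulation(n=100, sigma=15.0, seed=0):
    """Three latent factors Z1, Z2, Z3, each shared by five predictors,
    plus 25 pure-noise predictors (p = 40); Y = X beta + sigma * eps."""
    rng = np.random.default_rng(seed)
    X = np.empty((n, 40))
    beta = np.zeros(40)
    for lo in (0, 5, 10):                # groups {1..5}, {6..10}, {11..15}
        Z = rng.standard_normal((n, 1))
        X[:, lo:lo + 5] = Z + 0.1 * rng.standard_normal((n, 5))  # eps_i ~ N(0, 0.01)
        beta[lo:lo + 5] = 3.0            # assumed value, per Zou & Hastie's example
    X[:, 15:] = rng.standard_normal((n, 25))   # 25 pure noise features
    y = X @ beta + sigma * rng.standard_normal(n)
    return X, y, beta
```

Running any candidate selector on (X, y) and counting how many of the first 15 coordinates survive gives a direct check of the group-effect behavior discussed above.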
2.3. l1-norm regularized regression and penalty parameter λ selection
Our method's goal is to learn a graph model from high-dimensional data such that, given an ordered pair of variables (i, j), it can predict whether i is a regulator of j. Our approach attempts to solve this problem through neighborhood selection as in (Meinshausen & Bühlmann, 2006), based on l1-norm penalized least squares estimation. In this approach, the neighborhood of each gene i is estimated independently using a penalized linear regression with a lasso-style (i.e., l1-norm) regularization over edge weights. The regression visits every gene in the network, which completes the network. In every neighborhood estimation step, gene i is treated as the response variable, all other variables are the covariates, and the weights are the correlations between the other variables and i.
An l1-norm penalty is applied to encourage a sparse solution, as in the lasso of (Tibshirani, 1996). This surprisingly simple method, when applied over i.i.d. nodal samples (e.g., microarray measurements), has strong theoretical guarantees with respect to recovering the correct network structure. It has been shown that, under certain conditions on the variables, one can obtain an estimator of the edge set E that achieves a property known as sparseness (Meinshausen & Bühlmann, 2006): a consistent estimator of E (i.e., the network structure) can be attained when the true degree (i.e., the number of neighbors) of each node is much smaller than the size of the graph. This is possible even when the sample size is considerably smaller than the number of variables.
We must deal with a much more difficult problem because our samples are no longer i.i.d., and our networks are no longer independent of each other. For this purpose, we must extend the basic neighborhood selection lasso algorithm as shown in the following subsections.
It is known that the elastic net approach can eliminate co-linearity and exploit group effects efficiently. The naive estimate β̂ is defined as follows:
β̂ = argmin_β { ||y − Xβ||_2^2 + λ_2 ||β||_2^2 + λ_1 ||β||_1 }. (4)
We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Our proposed algorithm for inferring a large GGM from the N × p data matrix X consists of the following steps.
Suppose that a dataset has an i.i.d. sample matrix X (N × p dimensional). Let X_i ∈ R^N be the response vector and X_{−i} ∈ R^{N×(p−1)} be the matrix consisting of the remaining p − 1 predictors. We assume that the response vector derives from the regression model

X_i = X_{−i} β + ε,

with Gaussian noise ε ~ N(0, σ² I_N). We perform the following steps:
1. Standardize X_{−i} to have zero mean and unit standard deviation, and center X_i.
2. For any fixed non-negative λ_1 and λ_2, consider the penalized least squares criterion

L(λ_1, λ_2, β) = ||X_i − X_{−i} β||_2^2 + λ_2 ||β||_2^2 + λ_1 ||β||_1, (5)

β̂ = argmin_β L(λ_1, λ_2, β). (6)
3. The estimate of the regression coefficients is the minimizer β̂ in (6).
4. Let α = λ_2 / (λ_1 + λ_2). (7)

We can then formulate the penalized least squares problem as the following optimization problem:

β̂ = argmin_β ||X_i − X_{−i} β||_2^2 + λ Σ_j ( α β_j^2 + (1 − α) |β_j| ). (8)
5. The penalty λ Σ_j ( α β_j^2 + (1 − α) |β_j| ) is a convex combination of the lasso and ridge penalties. When α = 1, the penalized least squares reduces to simple ridge regression; in this study, we consider only α < 1. When α = 0, it becomes lasso regression. For all 0 < α < 1, the penalty has the characteristics of both lasso and ridge regression.
6. Choose the regularization parameter λ in a data-dependent manner by using the standard technique of the Bayesian information criterion (BIC) (Liu, Chen, & Jothi, 2009). The BIC-type criterion is as follows, where rss denotes the residual sum of squares and df_λ the number of non-zero coefficients:
BIC(λ) = N log( rss_λ / N ) + df_λ log N. (9)
By selecting the penalty parameter λ in this way, (5) can be estimated. We use a modified version of the BIC criterion because the ordinary BIC tends to include many spurious variables when the complexity of the model space is large. λ is then chosen by minimization (a code sketch of steps 2-6 follows this list):

λ̂ = argmin_λ BIC(λ). (10)
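The following is a compact sketch of steps 2-6 for a single response gene, assuming scikit-learn's ElasticNet as the solver and using the ordinary BIC of (9) in place of the modified criterion, whose exact form is not reproduced here. Note that scikit-learn's l1_ratio weights the l1 term, so it corresponds to 1 − α in the notation of (8), up to scikit-learn's internal scaling factors.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def fit_gene_bic(X_others, y, lambdas=np.logspace(-3, 0, 20), a=0.5):
    """Steps 2-6 for one response gene: solve the penalized least squares
    of (8) on a grid of lambda values and keep the fit minimizing the
    BIC-type score of (9)-(10)."""
    n = len(y)
    best_bic, best_lam, best_coef = np.inf, None, None
    for lam in lambdas:
        fit = ElasticNet(alpha=lam, l1_ratio=1.0 - a).fit(X_others, y)
        rss = np.sum((y - fit.predict(X_others)) ** 2)
        df = np.count_nonzero(fit.coef_)            # df taken as the support size
        bic = n * np.log(rss / n) + df * np.log(n)  # Eq. (9)
        if bic < best_bic:
            best_bic, best_lam, best_coef = bic, lam, fit.coef_
    return best_lam, best_coef                      # Eq. (10)
```

Looping this routine over all p genes and applying the rule in (2) to the resulting coefficients yields the full network.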
In summary, the LVSHS algorithm obtains the penalty parameter by means of the modified BIC criterion described above, then calculates the gene regulatory coefficients, and finally generates the network. The details of the algorithm are discussed in the section that follows.
Our ongoing theoretical analysis suggests that LVSHS can detect causal relationships in graph evolution. In addition, our method represents an extension of the well-known lasso-style sparse structure recovery technique and is based on a major assumption that temporally adjacent networks are unlikely to be dramatically different from each other in topology and thus are more likely to share common edges than are temporally distant networks.
2.4. Network reconstruction algorithm
To improve computing efficiency, we propose an algorithm based on the l1-norm penalty regularization optimization problem, minimizing the penalized objective with a limited-memory quasi-Newton update method. The quasi-Newton method is well suited to learning sparse or parsimonious models, especially when the cost of evaluating the objective function is considerably high, for instance, in structure learning for GGMs or Markov random fields (MRFs).
l1-norm regularization can be converted to a constrained optimization problem. Although many conventional algorithms, such as gradient descent or interior point methods, offer theoretical guarantees, these approaches, which must solve large ill-conditioned linear systems or consider only first-order gradient information, tend to become trapped in local optima and incur substantial computational cost. In our algorithm, we introduce an efficient limited-memory quasi-Newton method, L-BFGS, to promote efficiency and reduce computational complexity for unconstrained optimization problems with many variables; it approximates second-order curvature information from observed changes in the gradient.
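The text does not spell out how the non-smooth l1 term is handled by the quasi-Newton solver; one standard device compatible with a box-constrained L-BFGS solver is the split β = u − v with u, v ≥ 0, which makes the objective smooth. A sketch using SciPy's L-BFGS-B (the function name is ours):

```python
import numpy as np
from scipy.optimize import minimize

def l1_least_squares_lbfgs(X, y, lam):
    """Minimize ||y - X b||^2 + lam * ||b||_1 with L-BFGS-B by splitting
    b = u - v with u, v >= 0, so the l1 term becomes the smooth linear
    term lam * sum(u + v) under simple box constraints."""
    n, p = X.shape

    def fun(w):
        u, v = w[:p], w[p:]
        r = y - X @ (u - v)
        f = r @ r + lam * w.sum()
        g = -2.0 * X.T @ r                  # gradient w.r.t. b = u - v
        return f, np.concatenate([g + lam, -g + lam])

    res = minimize(fun, np.zeros(2 * p), jac=True,
                   method="L-BFGS-B", bounds=[(0.0, None)] * (2 * p))
    return res.x[:p] - res.x[p:]
```

The doubling of the variable count is the price paid for smoothness; the limited-memory updates keep the per-iteration cost linear in 2p, which is what makes the approach practical at GRN scale.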
3. Results and Discussion
In this section, we report results of experiments with synthetic and real-world datasets. We present a numerical comparison with three existing algorithms, including an analysis using E. coli data, and show that our algorithm is considerably more computationally efficient. Although the other three methods provide efficient inference for medium-scale data, they typically cannot fully capture the relational complexity of large-scale datasets. The experiments demonstrate the viability and effectiveness of our LVSHS algorithm; furthermore, LVSHS achieves a better accuracy/efficiency trade-off.
3.1. Simulated data experiments
To estimate the accuracy and efficiency of the LVSHS algorithm and compare its performance with three commonly used competing algorithms (i.e., LARS, GeneNet, and GLASSO), we generated synthetic datasets by using SynTReN (Van den Bulcke et al., 2006), which simulates benchmark microarray datasets with known underlying biological networks for the purpose of developing and testing new network inference algorithms. Through SynTReN, we simulated a biological network with a known topological structure as well as the corresponding time series gene expression data. We retained the default tuning parameters that control the complexity aspects and changed only those that control noise and the size of the dataset in order to generate datasets of different sizes and complexity.
We generated 100 microarray datasets consisting of 200 genes and 100 sample points (noise σ = 0.5). The resulting graphs had approximately 500 connections. For each generated dataset, the network structure learned by each method was compared with the true underlying structure. We ran each experiment at least four times and averaged the results. Receiver operating characteristic (ROC), positive predictive value (PPV), and false discovery rate (FDR) curves of LVSHS in comparison with those of LARS, GeneNet, and GLASSO on the simulated datasets are shown in Fig. 1, which presents the performance of the four algorithms averaged over the 100 datasets. Fig. 1 clearly shows that LVSHS outperforms the other three algorithms on the simulated datasets. The ROC curves in Fig. 1 reveal that LVSHS has an advantage over the other approaches when the microarray data are "large p, small n."
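For reference, ROC/AUC numbers of this kind can be computed from any method's edge-confidence matrix and the known SynTReN adjacency. A small sketch (names ours), assuming scikit-learn:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def evaluate_network(scores, truth):
    """Compare edge-confidence scores against the known adjacency.
    scores: (p, p) matrix, e.g. absolute partial correlations;
    truth: (p, p) binary adjacency of the SynTReN generating network."""
    iu = np.triu_indices_from(truth, k=1)    # undirected: upper triangle only
    y_true, y_score = truth[iu], np.abs(scores[iu])
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return roc_auc_score(y_true, y_score), fpr, tpr
```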
Fig. 2 shows boxplots of the area under the curve (AUC) values, based on the aforementioned simulated datasets, to visualize accuracy for all four algorithms under investigation. The boxplots in Fig. 2 show that, among all inference methods, the highest AUC value, 0.89, is obtained by LVSHS.
From Fig. 2, we can see that LVSHS yields much higher AUC values than the other algorithms. The figure shows that, statistically, at a sample size of 100, LARS with a median AUC of approximately 0.5 is considerably worse than the other methods: GeneNet has a median AUC of approximately 0.65, GLASSO approximately 0.66, and LVSHS approximately 0.89. It is also worth mentioning that GeneNet is remarkably similar to GLASSO in terms of performance.
In Tables 1 and 2, we provide additional details of the results over 100 simulated datasets. These tables reveal that LVSHS outperforms the other three methods.
As Table 2 shows, the LVSHS algorithm obtains a comparatively lower false positive rate (FPR) and a higher true positive rate (TPR) for more edges than the other three methods. By contrast, LARS shows the worst performance among the four algorithms. It should be emphasized that these values show the distinct inference abilities of the algorithms on the same underlying network.
For our simulations with 100, 200, 300, 500, and 1000 genes, in which the sample size is fixed at 100, the time costs of the four algorithms are shown in Table 2, which reports the runtime in minutes over the 100 simulated datasets; the simulation is repeated 100 times for each algorithm. Table 2 shows that LVSHS is comparatively fast, especially when the number of nodes is high (i.e., >300), and is thus superior to the other methods. In addition, LVSHS can infer the network structure for large datasets with 500 genes within a few minutes on an Intel dual 2.93 GHz processor with 4 GB RAM.
3.2. Real-data experiments
We applied our LVSHS algorithm to reconstruct a GRN from real-world time course microarray data and compared the performance of our approach with that of the other three methods. To evaluate the performance of the presented algorithm, we utilized the SOS DNA repair network dataset of Escherichia coli collected by Alon et al. (Friedman, 2004). The SOS network has been well studied and used as a gold standard network in many studies. It is a highly conserved system responsible for repairing DNA after damage. It comprises approximately 30 genes, the master regulator being lexA. Typically, recA acts as a sensor of DNA damage by binding to single-stranded DNA and mediates lexA destruction; the drop in lexA levels activates the SOS genes.
Experimental data can be downloaded from the homepage of Uri Alon et al. (http:// www.weizmann.ac.il/mcb/UriAlon/). The dataset comprises four samples associated with two irradiation levels and each sample has 50 evenly spaced time points.
Fig. 3 shows the expression profiles of eight genes in the SOS dataset. To validate our method on real data, we plotted the expression profiles of all the genes with concise descriptions. The results of these experiments allow us to conclude that our network inference technique is able to capture the true patterns of the network.
As depicted in Fig. 4a, the SOS network consists of eight essential genes. Predictions of the gene expression data in the SOS DNA repair network by LVSHS are plotted in Fig. 4b. The results show that the estimated model deviates only slightly from the measurements, which demonstrates the merits of LVSHS.
4. Conclusion
Determining the relationships and interactions between groups of variables or sets of proteins is of great interest to biologists and computational scientists. In particular, reconstruction of gene and protein association networks from microarray data has been an active field in bioinformatics over the last few years. Identifying the properties of biological networks is necessary for successfully developing efficient inference techniques.
This study proposed a novel l1-norm regularized sparse correlated graph (LVSHS) reconstruction approach for recovering the structure of networks from microarray data. To examine the consistency of the LVSHS approach and to test its biological applicability, we applied the new method to both synthetic datasets and the SOS DNA repair network dataset. The new algorithm combines l1-norm penalty parameter learning with model selection via the Bayesian information criterion to improve computational efficiency. Experiments show that the proposed LVSHS method considerably improves computation time and can efficiently infer large-scale GRNs. LVSHS outperforms three existing algorithms both in finding the true edges of networks and in computational time. With its better predictive power, LVSHS could be used in the analysis of genetic networks.
Considerable work remains. Extending the LVSHS approach to non-linear network models would be interesting. Because our model is only a linear approximation of gene regulation systems, care must be taken in model validation and in extrapolating results. In addition, we will investigate different options for the penalty trade-off parameter λ and improve the computational efficiency of generic solvers.
References
Ahmed, A., Xing, E. (2009). Recovering time-varying networks of dependencies in social and biological studies. Proceedings of the National Academy of Sciences, 106(29), 11878-11883.
Bansal, M., Gatta, G., di Bernardo, D. (2006). Inference of gene regulatory networks and compound mode of action from time course gene expression profiles. Bioinformatics, 22(7), 815-822.
Bleakley, K., Biau, G., Vert, J. (2007). Supervised reconstruction of biological networks with local models. Bioinformatics, 23(13), i57-i65.
Chen, K., Wang, T., Tseng, H., Huang, C., Kao, C. (2005). A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae. Bioinformatics, 21(12), 2883-2890.
Datta, S., Le-Rademacher, J., Datta, S. (2007). Predicting Patient Survival from Microarray Data by Accelerated Failure Time Modeling Using Partial Least Squares and LASSO. Biometrics, 63(1), 259-271. doi: 10.1111/j.1541-0420.2006.00660.x
Friedman, N. (2004). Inferring cellular networks using probabilistic graphical models. Science, 303(5659), 799-805.
Grzegorczyk, M., Husmeier, D. (2011). Improvements in the reconstruction of time-varying gene regulatory networks: dynamic programming and regularization by information sharing among genes. Bioinformatics, 27(5), 693-699. doi: 10.1093/bioinformatics/btq711
Kim, H., Lee, J., Park, T. (2007). Boolean networks using the chi-square test for inferring large-scale gene regulatory networks. BMC Bioinformatics, 8(1), 37.
Kolar, M., Song, L., Ahmed, A., Xing, E. (2010). Estimating time-varying networks. The Annals of Applied Statistics, 4(1), 94-123.
Li, Z., Li, P., Krishnan, A., Liu, J. (2011). Large-scale dynamic gene regulatory network inference combining differential equation models with local dynamic Bayesian network analysis. Bioinformatics, 27(19), 2686-2691. doi: 10.1093/bioinformatics/btr454
Liao, C., Boscolo, R., Yang, Y. (2003). Network component analysis: Reconstruction of regulatory signals in biological systems. Proceedings of the National Academy of Sciences, 100(26), 15522-15527. doi: 10.1073/pnas.2136632100
Liu, M., Chen, X., Jothi, R. (2009). Knowledge-guided inference of domain-domain interactions from incomplete protein-protein interaction networks. Bioinformatics, 25(19), 2492-2499. doi: 10.1093/bioinformatics/btp480
Meinshausen, N., Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3), 1436-1462.
Nguyen, X., Chetty, M., Coppel, R., Wangikar, P. (2012). Gene regulatory network modeling via global optimization of high-order dynamic Bayesian network. BMC Bioinformatics, 13(1), 131.
Pereira, C., Ferreira, C. (2015). Identification of IT Value Management Practices and Resources in COBIT 5. RISTI-Revista Ibérica de Sistemas e Tecnologias de Informação, (15), 17-33.
Robinson, W., Hartemink, A. (2010). Learning Non-Stationary Dynamic Bayesian Networks. J. Mach. Learn. Res., 11, 3647-3680.
Schulz, M., Devanny, W., Gitter, A. (2012). DREM 2.0: Improved reconstruction of dynamic regulatory networks from time-series expression data. BMC Systems Biology, 6(1), 104.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267-288.
Van den Bulcke, T., Van Leemput, K., Naudts, B. (2006). SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinformatics, 7(1), 43-45.
Wang, Z. (2009). An Extended Kalman Filtering Approach to Modeling Nonlinear Dynamic Gene Regulatory Networks via Short Gene Expression Time Series. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(3), 410-419.
Wang, Z., Yang, F., Ho, D. (2008). Stochastic Dynamic Modeling of Short Gene Expression Time-Series Data. IEEE Transactions on NanoBioscience, 7(1), 44-55.
Yamanishi, Y., Vert, J., Kanehisa, M. (2005). Supervised enzyme network inference from the integration of genomic data and chemical information. Bioinformatics, 21, 468-477.
Zhao, P., Yu, B. (2006). On model selection consistency of Lasso. The Journal of Machine Learning Research, 7, 2541-2563.
Zhao, W., Serpedin, E., Dougherty, E. (2006). Inferring gene regulatory networks from time series data using the minimum description length principle. Bioinformatics, 22(17), 2129-2135.
Zou, H., Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320.
Yajun Liu1, Bo Yang2, JunYing Zhang1
1 School of Computer Science and Technology, Xidian University, Xi'an 710071, China
2 School of Computer Science and Engineering, Xi'an Technological University, Xi'an 710021, China