CGBayesNets: MATLAB Software Package for building and predicting with Conditional Gaussian Bayesian Networks

>> Download CGBayesNets <<

CGBayesNets

by Michael McGeachie and Hsun-Hsien Chang.  

CGBayesNets builds and predicts with conditional Gaussian Bayesian networks (CGBNs), enabling biological researchers to infer predictive networks based on multimodal genomic datasets. The package provides many other functions for supporting all phases of model exploration and verification, including cross validation, bootstrapping, and AUC manipulation.

CGBayesNets now comes integrated with three useful network learning algorithms: K2, Pheno-Centric, and a Full-Exhaustive greedy search. K2 is a traditional Bayesian network learning algorithm that is appropriate for building networks that prioritize a particular phenotype for prediction, but it is not guaranteed to maximize predictive accuracy.  A Pheno-Centric network can be built around a discrete phenotype variable to maximize the network's accuracy in predicting that phenotype node.  The Full-Exhaustive method can be used on domains of limited size to provide a complete picture of the strongest statistical interactions among variables without biasing the network toward phenotype prediction. In all cases, we use true Bayesian inference: in network building, we maximize the posterior likelihood of the data given the model using the Bayesian likelihood formulation of Ramoni and Chang[1]; in inference, we implement the algorithm of Cowell[2] to compute probabilities of assignments to the prediction variable.

Conditional Gaussian Bayesian Networks, first described by Heckerman and Geiger[3], are a modeling technique that combines discrete and continuous variables in a single Bayesian network, whereas typical Bayesian networks are limited to discrete variables only.  This represents an important distinction between CGBayesNets and other free Bayesian network software.
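
As a rough sketch of what "conditional Gaussian" means in the standard formulation (the symbols below are illustrative and are not taken from the package): a continuous node Y with discrete parents in configuration i and continuous parents z is modeled as a linear Gaussian whose parameters depend on i.

```latex
Y \mid (I = i,\; Z = z) \;\sim\; \mathcal{N}\!\left(\alpha_i + \beta_i^{\top} z,\; \sigma_i^{2}\right)
```

Here \alpha_i is an intercept, \beta_i a vector of regression weights on the continuous parents, and \sigma_i^2 a variance, each indexed by the discrete parent configuration. In this formulation, discrete nodes have only discrete parents and carry ordinary conditional probability tables.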

This package is available as Open Source software.  Some familiarity with MATLAB is required. For more information, email mmcgeach (at) csail (dot) mit (dot) edu, or fill in the form below. 

If you use the CGBayesNets Package in your research, please cite the following reference:

McGeachie MJ, Chang HH, Weiss ST. CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data. PLoS Computational Biology. 2014 Jun 12;10(6):e1003676. doi: 10.1371/journal.pcbi.1003676. eCollection 2014.

Refs:
[1] Chang HH, Ramoni MF, "Robust cross-race gene expression analysis," Proc. IEEE International Conference Acoustics, Speech, and Signal Processing 2009 (ICASSP '09), Taipei, Taiwan, April 19-24, 2009, pp. 505-508.
[2] Cowell RG. Local Propagation in Conditional Gaussian Bayesian Networks. Journal of Machine Learning Research 2005;6:1517-50.
[3] Heckerman D, Geiger D. Learning Bayesian Networks: A unification for discrete and Gaussian domains. In: Uncertainty in Artificial Intelligence: Morgan Kaufmann; 1995.

    Contact M. McGeachie and H.H. Chang


CGBayesNets Software Article in PLoS Computational Biology:
CGBayesNets: McGeachie, Chang & Weiss.

Papers using CGBayesNets:
  • Boudewijn IM, Roffel MP, Vermeulen CJ, Nawijn MC, Koc K, Terpstra MM, Koppelman GH, Guryev V, van den Berge M.  A Novel Role for Bronchial MicroRNAs and Long Noncoding RNAs in Asthma Remission. Am J Resp Crit Care Med.  Aug 15;202(4). 2020.
  • Lugo-Martinez J, Ruiz-Perez D, Narasimhan G, Bar-Joseph Z. Dynamic interaction network inference from longitudinal microbiome data.  Microbiome, Apr 2;7(1). 2019.
  • Kelly RS*, McGeachie MJ*, Lee-Sarwar KA, Kachroo P, Chu SH, Virkud YV, Huang M, Litonjua AA, Weiss ST, Lasky-Su J. Partial Least Squares Discriminant Analysis and Bayesian Networks for metabolomic prediction of childhood asthma. Metabolites, Oct 23;8(4). 2018.
  • McGeachie MJ, Davis JS, Kho AT, Dahlin A, Sordillo JE, Sun M, Lu Q, Weiss ST, Tantisira KG. Asthma Remission: Predicting Future Airways Responsiveness using a miRNA Network. J Allergy Clin Immunol. 2017 Feb 23. 
  • McGeachie MJ*, Sordillo JE*, Gibson T, Weinstock GM, Liu YY, Gold DR, Litonjua A. Longitudinal prediction of the infant gut microbiome with dynamic Bayesian networks.  Scientific Reports, 2016, Feb 8;6:20359. 
  • McGeachie M*, Dahlin A*, Qiu W, Croteau-Chonka D, Savage J, Wu AC, Wan E, Sordillo J, Martinez F, Strunk R, Lemanske R, Liu A, Raby B, Clish C, Lasky-Su J. The metabolomics of asthma control: a promising link between genetics and disease. Immunity Inflammation and Disease. 2015 Sep;3(3):224-38.
  • McGeachie MJ*, Chang H-H*, Weiss ST. CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data. PLoS Computational Biology, 2014; 10(6): e1003676. doi:10.1371/journal.pcbi.1003676.
  • Rogers AJ*, McGeachie M*, Baron RM, Gazourian L, Haspel JA,  Nakahira K, Fredenburgh L, Hunninghake GM, Raby BA, Matthay MA, Otero RM, Fowler VG, Rivers EP, Woods CW, Kingsmore S,  Langley R, BWH MICU Registry, Choi AMK. Metabolomic Derangements are associated with Higher Mortality in Critically Ill Adult Patients. PLoS One, 2014 Jan 30;9(1):e87538.
  • Porth, I., Klápště, J., Skyba, O., Friedmann, M. C., Hannemann, J., Ehlting, J., El-Kassaby, Y. A., Mansfield, S. D. and Douglas, C. J. (2013), Network analysis reveals the relationship among wood properties, gene expression levels and genotypes of natural Populus trichocarpa accessions. New Phytol, 200: 727–742. 

FAQ

1. What licensing is required to use CGBayesNets?
CGBayesNets is now Open Source software.  It is distributed under the MIT license.
2. How is CGBayesNets distributed? 
It is implemented in MATLAB and distributed as MATLAB source code. It runs on both Windows and Linux systems.
3. Does this work with MATLAB version 2011b?
No, there is a bug with 2011b.  We suggest upgrading to the latest version of MATLAB.
4. What about older versions of MATLAB?
Versions prior to 2008 or so may not understand the "~" placeholder for ignored function return values.  To run on those versions, replace each ~ that is used as a function return value with a dummy variable.
5. Does this use any of the MATLAB toolboxes?
There are a couple of optional calls to the Bioinformatics Toolbox.  In the files ElimOrdering.m and PropagateCGRegression.m, a constant "useBioinfToolbox" is declared at the beginning of the file; set it to false to disable the calls to the Bioinformatics Toolbox.
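
For FAQ item 4, the replacement of "~" with a dummy variable might look like the minimal sketch below. The variable names and the data are hypothetical and do not come from the CGBayesNets source; the built-in max function is used only as a convenient example of a function with two return values.

```matlab
% Hypothetical data, for illustration only (not from CGBayesNets).
scores = [0.2, 0.9, 0.4];

% Newer syntax: "~" discards the first return value of max.
[~, bestIdx] = max(scores);

% Equivalent for older MATLAB versions: capture the unwanted
% return value in a dummy variable and simply never use it.
[unusedMax, bestIdxOld] = max(scores);
```

Both calls yield the same index; the only difference is that the second form leaves an extra, unused variable in the workspace.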