| Title: | Algorithm for Searching the Space of Gaussian Directed Acyclic Graph Models Through Moment Fractional Bayes Factors |
|---|---|
| Description: | We propose an objective Bayesian algorithm for searching the space of Gaussian directed acyclic graph (DAG) models. The algorithm uses moment fractional Bayes factors (MFBF) and is suitable for learning sparse graphs. The algorithm is implemented using Armadillo, an open-source C++ linear algebra library. |
| Authors: | Davide Altomare [aut, cre], Guido Consonni [aut], Luca La Rocca [aut] |
| Maintainer: | Davide Altomare <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.3 |
| Built: | 2026-05-25 07:53:48 UTC |
| Source: | https://github.com/cran/FBFsearch |
Data on a set of flow cytometry experiments on signaling networks of human immune system cells. The dataset includes p=11 proteins and n=7466 samples.
data(HumanPw)
dataHuman contains the following objects:
Obs: Matrix (7466x11) with the observations.
Perms: List of 5 matrices (1x11), each with a permutation of the nodes.
TDag: Matrix (11x11) with the adjacency matrix of the known regulatory network.
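The search functions documented further down take a correlation matrix and a number of observations rather than raw data. The following is a minimal sketch of that preparation step in base R. It uses a small simulated matrix in place of `dataHuman$Obs`, and it assumes that a permutation from `Perms` can be used directly as 1-based column indices; `perm` here is a hypothetical ordering, not one taken from the dataset.

```r
# Toy stand-in for an observation matrix such as dataHuman$Obs (n rows, p columns).
set.seed(1)
Obs <- matrix(rnorm(200 * 4), nrow = 200, ncol = 4)
Obs[, 3] <- Obs[, 1] + 0.5 * Obs[, 3]  # make column 3 depend on column 1

# Hypothetical node ordering, playing the role of one entry of Perms
# (assumption: entries are 1-based column indices).
perm <- c(2, 1, 4, 3)

# Reorder the columns, then compute the two inputs the search functions take.
Obs_perm <- Obs[, perm]
Corr <- cor(Obs_perm)   # q x q correlation matrix
nobs <- nrow(Obs_perm)  # number of observations
```

Permuting the columns before calling `cor()` is equivalent to permuting the rows and columns of the correlation matrix of the original data.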
Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D., and Nolan, G. (2005). Causal protein-signaling networks derived from multiparameter single-cell data. Science 308, 523-529.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
Data on publishing productivity among academics.
data(PubProd)
dataPub contains the following objects:
Corr: Matrix (7x7) with the correlation matrix of the variables.
nobs: Scalar with the number of observations.
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, prediction and search (2nd edition). Cambridge, MA: The MIT Press. pages 1-16.
Drton, M. and Perlman, M. D. (2008). A SINful approach to Gaussian graphical model selection. J. Statist. Plann. Inference 138, 1179-1200.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
dataSim100 is a list with the adjacency matrix of a randomly generated DAG with 100 nodes and 100 edges, 10 samples generated from the DAG and 5 permutations of the nodes.
data(SimDag100)
dataSim100 contains the following objects:
Obs: List of 10 matrices (100x100), each with 100 observations generated from the DAG.
Perms: List of 5 matrices (1x100), each with a permutation of the nodes.
TDag: Matrix (100x100) with the adjacency matrix of the DAG.
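For readers who want to see how observations can be drawn from a Gaussian DAG given its adjacency matrix, here is a minimal base-R sketch. It is not claimed to be the procedure used to build this dataset. It assumes that `A[i, j] = 1` encodes the edge i -> j, that the nodes are already in topological order (so `A` is strictly upper triangular), and that all edge weights equal 0.8, an arbitrary illustrative value.

```r
set.seed(42)
q <- 5    # number of nodes
n <- 100  # number of observations

# Strictly upper-triangular adjacency matrix (assumption: A[i, j] = 1 means i -> j).
A <- matrix(0, q, q)
A[1, 2] <- 1; A[1, 3] <- 1; A[2, 4] <- 1; A[3, 5] <- 1

# Edge weights: an arbitrary illustrative choice.
B <- 0.8 * A

# Structural equations X = X %*% B + E, solved as X = E %*% (I - B)^{-1}.
E <- matrix(rnorm(n * q), n, q)
X <- E %*% solve(diag(q) - B)

Corr <- cor(X)  # correlation matrix of the simulated sample
```

Each column of `X` is a linear combination of its parents plus independent standard normal noise, which is the recursive structure a Gaussian DAG model describes.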
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
dataSim200 is a list with the adjacency matrix of a randomly generated DAG with 200 nodes and 100 edges, 10 samples generated from the DAG and 5 permutations of the nodes.
data(SimDag200)
dataSim200 contains the following objects:
Obs: List of 10 matrices (100x200), each with 100 observations simulated from the DAG.
Perms: List of 5 matrices (1x200), each with a permutation of the nodes.
TDag: Matrix (200x200) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
dataSim50 is a list with the adjacency matrix of a randomly generated DAG with 50 nodes and 100 edges, 10 samples generated from the DAG and 5 permutations of the nodes.
data(SimDag50)
dataSim50 contains the following objects:
Obs: List of 10 matrices (100x50), each with 100 observations simulated from the DAG.
Perms: List of 5 matrices (1x50), each with a permutation of the nodes.
TDag: Matrix (50x50) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
dataSim6 is a list with the adjacency matrix of a randomly generated DAG with 6 nodes and 5 edges and 100 correlation matrices generated from the DAG.
data(SimDag6)
dataSim6 contains the following objects:
Corr: List of 100 matrices (6x6), each with a correlation matrix generated from the DAG.
TDag: Matrix (6x6) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Data generated from the known regulatory network of human cell signalling data.
data(SimHumanPw)
dataSimHuman contains the following objects:
Obs: List of 100 matrices (100x11), each with 100 observations simulated from the known regulatory network.
Perms: List of 5 matrices (1x11), each with a permutation of the nodes.
TDag: Matrix (11x11) with the adjacency matrix of the known regulatory network.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D., and Nolan, G. (2005). Causal protein-signaling networks derived from multiparameter single-cell data. Science 308, 523-529.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
Estimate the edge inclusion probabilities for a Gaussian DAG with q nodes from observational data, using the moment fractional Bayes factor approach with global prior.
FBF_GS(Corr, nobs, G_base, h, C, n_tot_mod, n_hpp)
Corr |
qxq correlation matrix. |
nobs |
Number of observations. |
G_base |
Base DAG. |
h |
Prior parameter. |
C |
Constant that keeps the probability of all local moves bounded away from 0 and 1. |
n_tot_mod |
Maximum number of different models which will be visited by the algorithm, for each equation. |
n_hpp |
Number of the highest posterior probability models which will be returned by the procedure. |
An object of class list with:
M_q: Matrix (qxq) with the estimated edge inclusion probabilities.
M_G: Matrix ((q*n_hpp)xq) with the n_hpp highest posterior probability models returned by the procedure.
M_P: Vector (n_hpp) with the n_hpp posterior probabilities of the models in M_G.
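The package example takes the first q rows of M_G as the highest posterior probability DAG, which suggests that the returned models are stacked by rows, q rows per model. Under that assumption, the k-th model can be pulled out as sketched below. The helper `get_model` is hypothetical and not part of the package, and the matrices are toy stand-ins for real output.

```r
q <- 3
n_hpp <- 2

# Toy stand-in for M_G: two 3 x 3 adjacency matrices stacked by rows.
G1 <- matrix(c(0, 1, 1,
               0, 0, 1,
               0, 0, 0), q, q, byrow = TRUE)
G2 <- matrix(c(0, 1, 0,
               0, 0, 1,
               0, 0, 0), q, q, byrow = TRUE)
M_G <- rbind(G1, G2)
M_P <- c(0.6, 0.3)  # toy posterior probabilities, one per model

# Hypothetical helper: return the k-th q x q block of the stacked matrix
# (assumption: models are stacked by rows, in the same order as M_P).
get_model <- function(M_G, k, q) {
  rows <- ((k - 1) * q + 1):(k * q)
  M_G[rows, , drop = FALSE]
}

G_second <- get_model(M_G, 2, q)  # second-ranked model
```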
Davide Altomare ([email protected]).
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
data(SimDag6)
Corr=dataSim6$SimCorr[[1]]
nobs=50
q=ncol(Corr)
Gt=dataSim6$TDag
Res_search=FBF_GS(Corr, nobs, matrix(0,q,q), 1, 0.01, 1000, 10)
M_q=Res_search$M_q
M_G=Res_search$M_G
M_P=Res_search$M_P
G_med=M_q
G_med[M_q>=0.5]=1
G_med[M_q<0.5]=0 #median probability DAG
G_high=M_G[1:q,1:q] #Highest Posterior Probability DAG (HPP)
pp_high=M_P[1] #Posterior Probability of the HPP
#Structural Hamming Distance between the true DAG and the median probability DAG
sum(sum(abs(G_med-Gt)))
#Structural Hamming Distance between the true DAG and the highest probability DAG
sum(sum(abs(G_high-Gt)))
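The post-processing in the example above (thresholding the inclusion probabilities at 0.5 and counting mismatched entries) can be wrapped in two small helpers. This is a sketch with hypothetical function names, `median_dag` and `shd`, run on toy matrices rather than on package output.

```r
# Hypothetical helper: keep edges whose inclusion probability reaches the threshold.
median_dag <- function(M_q, threshold = 0.5) {
  (M_q >= threshold) * 1
}

# Hypothetical helper: number of entries in which two adjacency matrices differ.
shd <- function(G1, G2) {
  sum(abs(G1 - G2))
}

# Toy inclusion probabilities and toy true DAG.
M_q <- matrix(c(0, 0.9, 0.2,
                0, 0,   0.5,
                0, 0,   0), 3, 3, byrow = TRUE)
Gt <- matrix(c(0, 1, 1,
               0, 0, 1,
               0, 0, 0), 3, 3, byrow = TRUE)

G_med <- median_dag(M_q)
shd(G_med, Gt)  # 1: only the edge 1 -> 3 is missed
```

Because the distance counts differing matrix entries, a reversed edge contributes 2 under this definition.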
Estimate the edge inclusion probabilities for a directed acyclic graph (DAG) from observational data, using the moment fractional Bayes factor approach with local prior.
FBF_LS(Corr, nobs, G_base, h, C, n_tot_mod)
Corr |
qxq correlation matrix. |
nobs |
Number of observations. |
G_base |
Base DAG. |
h |
Prior parameter. |
C |
Constant that keeps the probability of all local moves bounded away from 0 and 1. |
n_tot_mod |
Maximum number of different models which will be visited by the algorithm, for each equation. |
An object of class matrix with the estimated edge inclusion probabilities.
Davide Altomare ([email protected]).
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
data(SimDag6)
Corr=dataSim6$SimCorr[[1]]
nobs=50
q=ncol(Corr)
Gt=dataSim6$TDag
M_q=FBF_LS(Corr, nobs, matrix(0,q,q), 0, 0.01, 1000)
G_med=M_q
G_med[M_q>=0.5]=1
G_med[M_q<0.5]=0 #median probability DAG
#Structural Hamming Distance between the true DAG and the median probability DAG
sum(sum(abs(G_med-Gt)))
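A matrix of edge inclusion probabilities is often easier to read as an edge list. The sketch below builds one in base R from a toy matrix. The `from = row`, `to = column` reading of the matrix is an assumption, and the column names of the data frame are illustrative.

```r
# Toy edge inclusion probabilities (stand-in for the matrix returned by the search).
M_q <- matrix(c(0, 0.9, 0.2,
                0, 0,   0.7,
                0, 0,   0), 3, 3, byrow = TRUE)

# Positions of the entries that reach the 0.5 threshold.
idx <- which(M_q >= 0.5, arr.ind = TRUE)

# One row per selected edge (assumption: row index = parent, column index = child).
edges <- data.frame(from = idx[, "row"],
                    to   = idx[, "col"],
                    prob = M_q[idx])

# Sort by decreasing inclusion probability.
edges <- edges[order(-edges$prob), ]
```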
Estimate the edge inclusion probabilities for a regression model (Y(q) on Y(q-1),...,Y(1)) with q variables from observational data, using the moment fractional Bayes factor approach.
FBF_RS(Corr, nobs, G_base, h, C, n_tot_mod, n_hpp)
Corr |
qxq correlation matrix. |
nobs |
Number of observations. |
G_base |
Base model. |
h |
Prior parameter. |
C |
Constant that keeps the probability of all local moves bounded away from 0 and 1. |
n_tot_mod |
Maximum number of different models which will be visited by the algorithm, for each equation. |
n_hpp |
Number of the highest posterior probability models which will be returned by the procedure. |
An object of class list with:
M_q: Matrix (qxq) with the estimated edge inclusion probabilities.
M_G: Matrix (n*n_hpp)xq with the n_hpp highest posterior probability models returned by the procedure.
M_P: Vector (n_hpp) with the n_hpp posterior probabilities of the models in M_G.
Davide Altomare ([email protected]).
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
data(SimDag6)
Corr=dataSim6$SimCorr[[1]]
nobs=50
q=ncol(Corr)
Gt=dataSim6$TDag
Res_search=FBF_RS(Corr, nobs, matrix(0,1,(q-1)), 1, 0.01, 1000, 10)
M_q=Res_search$M_q
M_G=Res_search$M_G
M_P=Res_search$M_P
Mt=rev(matrix(Gt[1:(q-1),q],1,(q-1))) #True Model
M_med=M_q
M_med[M_q>=0.5]=1
M_med[M_q<0.5]=0 #median probability model
#Structural Hamming Distance between the true model and the median probability model
sum(sum(abs(M_med-Mt)))
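The `rev()` call in the example above lines up the true model with the regressor order Y(q-1),...,Y(1) stated in the description. The toy sketch below shows what that step does. It assumes that column q of the adjacency matrix holds the parents of node q, with rows indexed from node 1 to node q-1.

```r
q <- 4

# Toy true DAG: nodes 1 and 2 are parents of node 4, node 3 is not.
Gt <- matrix(0, q, q)
Gt[1, 4] <- 1
Gt[2, 4] <- 1

# Parents of node q in the order Y(1), ..., Y(q-1).
parents_of_q <- Gt[1:(q - 1), q]  # 1 1 0

# Reverse to match the regressor order Y(q-1), ..., Y(1).
Mt <- rev(parents_of_q)           # 0 1 1
```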