bayesian-network-scoring

This is basically model selection for [[bayesian-network]] models.

The idea is to infer the probability $P(G \vert D)$ for model graph $G$ and data $D$. Apparently this is [[NP-hard]].

One computes the Bayesian score:

$$\log P(G \vert D) = \log P(G) + \sum_{i=1}^{n} \sum_{j=1}^{q_i} \left( \log\frac{\Gamma(\alpha_{ij0})}{\Gamma(\alpha_{ij0} + m_{ij0})} + \sum_{k=1}^{r_i} \log\frac{\Gamma(\alpha_{ijk} + m_{ijk})}{\Gamma(\alpha_{ijk})} \right)$$

Each graph can then be ascribed a score, indicating how well it represents the data.
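A minimal sketch of computing this score in Python, assuming the counts $m_{ijk}$ and pseudocounts $\alpha_{ijk}$ have already been tabulated per node as $(q_i \times r_i)$ arrays, and taking $\alpha_{ij0}$ and $m_{ij0}$ to be the sums over $k$ (the usual convention):

```python
import numpy as np
from scipy.special import gammaln  # log Gamma, numerically stable

def node_score(alpha, m):
    """Score contribution of one node.

    alpha, m: (q_i, r_i) arrays of Dirichlet pseudocounts and observed counts,
    with rows indexing parent instantiations j and columns the node's states k.
    """
    alpha0 = alpha.sum(axis=1)  # alpha_{ij0} = sum_k alpha_{ijk}
    m0 = m.sum(axis=1)          # m_{ij0}     = sum_k m_{ijk}
    return (np.sum(gammaln(alpha0) - gammaln(alpha0 + m0))
            + np.sum(gammaln(alpha + m) - gammaln(alpha)))

def bayesian_score(per_node_alpha, per_node_m, log_graph_prior=0.0):
    """log P(G | D): log P(G) plus the sum of per-node terms."""
    return log_graph_prior + sum(
        node_score(a, m) for a, m in zip(per_node_alpha, per_node_m)
    )
```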

With a scoring function, you can then perform searches for optimal graphs. It seems that finding a globally optimal graph is NP-hard, and the number of possible directed acyclic graphs scales superexponentially with the number of nodes.
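To get a feel for that scaling, the number of labeled DAGs can be computed with Robinson's recurrence (a standard result, not from this note); a small sketch:

```python
from math import comb

def num_dags(n):
    """Count labeled DAGs on n nodes via Robinson's recurrence."""
    a = [1]  # one DAG on zero nodes (the empty graph)
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k - 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]

print([num_dags(n) for n in range(1, 8)])
# [1, 3, 25, 543, 29281, 3781503, 1138779265]
```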

To counter this, a number of approximations can be made, for example exploiting structural symmetries like Markov equivalence¹, or searching local neighborhoods (i.e. local graph operations like reversing an edge's direction, or edge removal/creation).
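A hedged sketch of what such a local-neighborhood search might look like: a plain greedy hill climb over edge additions, removals, and reversals. Here graphs are represented as dicts mapping each node to its parent set, and `score_fn` is assumed to tabulate counts for a candidate structure from the data and return the Bayesian score above.

```python
import itertools

def is_acyclic(parents):
    """Check that a graph (node -> set of parents) has no directed cycle."""
    done, on_path = set(), set()
    def visit(v):
        if v in on_path:
            return False          # found a cycle
        if v in done:
            return True
        on_path.add(v)
        ok = all(visit(p) for p in parents[v])
        on_path.discard(v)
        done.add(v)
        return ok
    return all(visit(v) for v in parents)

def neighbors(parents):
    """Yield graphs one local operation away: add, remove, or reverse an edge."""
    for u, v in itertools.permutations(parents, 2):
        toggled = {n: set(ps) for n, ps in parents.items()}
        if u in parents[v]:
            toggled[v].discard(u)        # remove edge u -> v
        else:
            toggled[v].add(u)            # add edge u -> v
        if is_acyclic(toggled):
            yield toggled
        if u in parents[v]:              # also try reversing u -> v
            flipped = {n: set(ps) for n, ps in parents.items()}
            flipped[v].discard(u)
            flipped[u].add(v)
            if is_acyclic(flipped):
                yield flipped

def hill_climb(parents, score_fn, max_steps=1000):
    """Greedy search: move to the best-scoring neighbor until no improvement."""
    best = score_fn(parents)
    for _ in range(max_steps):
        improved = False
        for cand in neighbors(parents):
            s = score_fn(cand)
            if s > best:
                parents, best, improved = cand, s, True
        if not improved:
            break
    return parents, best
```

This only finds a local optimum; random restarts or tabu lists are common additions.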


  1. Two graphs are Markov equivalent if they have the same skeleton (the same edges when directions are dropped) and the same set of "immoral" v-structures (e.g. X -> Z <- Y with X and Y not directly connected).