Reconstruction of Markov Random Fields from Samples: Some Observations and Algorithms

  • Guy Bresler (1)
  • Elchanan Mossel (2)
  • Allan Sly (3)
  1. Dept. of Electrical Engineering and Computer Sciences, U.C. Berkeley
  2. Dept. of Statistics and Dept. of Electrical Engineering and Computer Sciences, U.C. Berkeley
  3. Dept. of Statistics, U.C. Berkeley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5171)

Abstract

Markov random fields are used to model high dimensional distributions in a number of applied areas. Much recent interest has been devoted to the reconstruction of the dependency structure from independent samples from the Markov random fields. We analyze a simple algorithm for reconstructing the underlying graph defining a Markov random field on n nodes and maximum degree d given observations. We show that under mild non-degeneracy conditions it reconstructs the generating graph with high probability using Θ(d log n) samples which is optimal up to a multiplicative constant. Our results seem to be the first results for general models that guarantee that the generating model is reconstructed. Furthermore, we provide an explicit O(d n^{d+2} log n) running time bound. In cases where the measure on the graph has correlation decay, the running time is O(n^2 log n) for all fixed d. In the full-length version we also discuss the effect of observing noisy samples. There we show that as long as the noise level is low, our algorithm is effective. On the other hand, we construct an example where large noise implies non-identifiability even for generic noise and interactions. Finally, we briefly show that in some cases, models with hidden nodes can also be recovered.
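
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one way a neighborhood search of this kind can work, not the authors' procedure. It assumes a toy Ising chain, works from the exact joint distribution instead of an empirical one built from samples, and uses hypothetical names throughout (`ising_chain_distribution`, `cond_dependence`, `find_neighborhood`, the threshold `eps`). For each node it tries candidate neighborhoods of size at most d, smallest first, and accepts the first one that makes the node conditionally independent of every remaining node.

```python
from itertools import combinations, product
from math import exp


def ising_chain_distribution(n, beta):
    """Exact joint distribution of an Ising chain 0-1-...-(n-1) (toy assumption)."""
    weights = {}
    for x in product((-1, 1), repeat=n):
        energy = sum(x[i] * x[i + 1] for i in range(n - 1))
        weights[x] = exp(beta * energy)
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}


def marginal(p, idx):
    """Marginal distribution of the coordinates listed in idx."""
    out = {}
    for x, px in p.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + px
    return out


def cond_dependence(p, v, w, U):
    """Largest change in P(x_v = 1 | x_U, x_w) when x_w is flipped."""
    U = tuple(U)
    joint = marginal(p, (v, w) + U)   # keys: (x_v, x_w, x_U...)
    cond = marginal(p, (w,) + U)      # keys: (x_w, x_U...)
    worst = 0.0
    for u in product((-1, 1), repeat=len(U)):
        probs = [joint[(1, a) + u] / cond[(a,) + u] for a in (-1, 1)]
        worst = max(worst, abs(probs[0] - probs[1]))
    return worst


def find_neighborhood(p, n, v, d, eps=1e-9):
    """Smallest candidate set of size <= d that separates v from all other nodes."""
    others = [w for w in range(n) if w != v]
    for size in range(d + 1):
        for U in combinations(others, size):
            rest = [w for w in others if w not in U]
            if all(cond_dependence(p, v, w, U) < eps for w in rest):
                return set(U)
    return None  # no separating set of size <= d was found


def reconstruct_graph(p, n, d):
    """Run the neighborhood search at every node and collect undirected edges."""
    edges = set()
    for v in range(n):
        for w in find_neighborhood(p, n, v, d) or ():
            edges.add((min(v, w), max(v, w)))
    return edges


if __name__ == "__main__":
    p = ising_chain_distribution(4, beta=0.5)
    print(sorted(reconstruct_graph(p, n=4, d=2)))
```

With samples, the exact table `p` would be replaced by empirical frequencies and `eps` by a threshold tied to the strength of the interactions; both choices are assumptions of this sketch. The loop over all candidate sets of size at most d at each of the n nodes is where a cost growing like a power of n with exponent depending on d would come from in a search of this shape.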

Keywords

Ising Model; Hide Node; Markov Random Field; Soft Constraint; Candidate Neighborhood
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2008
