Journal Article (Open Access) | Published June 2011 | Submitted + Published versions available

Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates

Abstract

The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning the Chow-Liu tree through adaptive thresholding is proposed. This algorithm is shown to be both structurally consistent and risk consistent; moreover, under a fixed model size, the error probability of structure learning decays faster than any polynomial in the number of samples. For the high-dimensional scenario where the model size d and the number of edges k scale with the number of samples n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy structural and risk consistency. In addition, the extremal structures for learning are identified; we prove that the independent (resp., tree) model is the hardest (resp., easiest) to learn using the proposed algorithm, in terms of error rates for structure learning.
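The procedure the abstract describes — build the Chow-Liu tree over empirical mutual informations, then prune weak edges — can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: `learn_forest` and `empirical_mi` are hypothetical names, and the fixed `threshold` argument stands in for the paper's adaptive threshold ε_n, which in the actual algorithm depends on the sample size n.

```python
import numpy as np

def empirical_mi(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    mi = 0.0
    for a in np.unique(x):
        px = np.mean(x == a)
        for b in np.unique(y):
            py = np.mean(y == b)
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def learn_forest(samples, threshold):
    """Chow-Liu forest sketch: max-weight spanning tree on empirical
    mutual informations (Kruskal), pruning tree edges whose MI falls
    below `threshold` (a stand-in for the adaptive threshold eps_n)."""
    n, d = samples.shape
    # All candidate edges, weighted by empirical mutual information.
    edges = [(empirical_mi(samples[:, i], samples[:, j]), i, j)
             for i in range(d) for j in range(i + 1, d)]
    edges.sort(reverse=True)  # consider strongest edges first

    parent = list(range(d))   # union-find for Kruskal's algorithm
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    forest = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # edge (i, j) would join two components
            parent[ri] = rj
            if w > threshold:     # pruning: keep only sufficiently strong edges
                forest.append((i, j))
    return forest
```

Pruning during the Kruskal pass is equivalent to first building the full Chow-Liu tree and then deleting its low-MI edges: the edges that join components are exactly the tree edges, and dropping the weak ones leaves a forest.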

Additional Information

© 2011 Vincent Tan, Animashree Anandkumar and Alan Willsky. This work was supported by AFOSR Grant FA9559-08-1-1080, a MURI funded through ARO Grant W911NF-06-1-0076, and a MURI funded through AFOSR Grant FA9550-06-1-0324. V. Tan is also funded by A*STAR, Singapore. The authors would like to thank Sanjoy Mitter, Lav Varshney, Matt Johnson and James Saunderson for discussions. The authors would also like to thank Rui Wu (UIUC) for pointing out an error in the proof of Theorem 3.

Attached Files

Published - tan11a.pdf

Submitted - 1005.0766.pdf

Files (836.0 kB total)

md5:10534ca18d9843e1788b8928dcb3006f - 504.6 kB
md5:2ce9ba1977d49e9e1ad1ca970ce09c8b - 331.4 kB

Additional details

Created: August 19, 2023
Modified: October 17, 2023