Speaker: Genevera Allen
A General Framework for Mixed Graphical Models
Mixed big data, or data with heterogeneous variables, is prevalent in areas such as genomics and proteomics, imaging genetics, national security, social networking, and Internet advertising. Few statistical techniques exist to jointly analyze mixed data, and even fewer multivariate distributions exist that can capture direct dependencies between variables of different types. In this paper, we introduce a novel class of Markov Random Field (MRF) distributions that can be used to model both directed and undirected conditional dependencies between variables of different types, including count, binary, continuous, and skewed continuous variables. Using the basic building block of node-conditional univariate exponential families, first introduced for homogeneous graphs in Yang et al. (2012), we introduce a series of mixed MRF and mixed conditional MRF distributions that can then be put together recursively via chained conditional distributions to yield a general and flexible class of what we term Recursively Chained Mixed Graphical Models. We introduce conditions under which these distributions exist and are normalizable, study several instances of our models, and propose scalable penalized conditional likelihood estimators with statistical guarantees for recovering the underlying network structure. Simulations, as well as an application to learning mixed genomic networks from next-generation sequencing expression data and mutation data, demonstrate the versatility of our methods. Joint work with Eunho Yang, Yulia Baker, Pradeep Ravikumar, and Zhandong Liu.
Bio: Genevera Allen is the Dobelman Family Junior Chair and an Assistant Professor of Statistics and Electrical and Computer Engineering at Rice University. She is also a member of the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital and Baylor College of Medicine, where she holds a joint appointment. Dr. Allen’s research focuses on developing statistical methods to help scientists make sense of their big data in applications such as high-throughput genomics and neuroimaging. She is the recipient of several grant awards and honors, including being named to the Forbes “30 Under 30: Science and Healthcare” list.