Short Bio

I am currently a Ph.D. student in the Department of Computer Science at Brown University, working with Erik Sudderth. My interests lie in developing scalable Bayesian nonparametric models that help discover hidden structure in complex datasets such as documents, images, and biological data. The unique problems associated with each dataset motivate the need for new models that can best represent these latent structures. I also have a particular fondness for data-driven journalism and hope to extend my work to problems in that area.

I'm also quite interested in bringing the beauty and usefulness of these tools to the wider public. I've been developing intuitive visualizations, usable by anyone with a little dedication, that better express the results of applying these models to interesting datasets. Much of this work has been with large document corpora, and examples of these visualizations can be seen here.

Current Work

My current research focuses on developing novel Bayesian nonparametric models that are also scalable. This requires careful formulation of the underlying model assumptions along with principled inference techniques. For scalability, my work relies primarily on variational inference. If you'd like to chat about these ideas, please feel free to drop me an e-mail. Thanks!

News

12/4/2013: Efficient Online Inference for Bayesian Nonparametric Relational Models has been accepted to NIPS 2013 and will be presented as a poster. We develop the Hierarchical Dirichlet Process Relational model for networks and show that, for undirected networks, we can perform a variational optimization that incorporates an efficient structured mean-field approach. This result allows the model to scale linearly, rather than quadratically, in the number of hidden communities. We then implement stochastic variational inference to further enhance scalability, along with pruning moves that remove unused communities in the online setting.

6/1/2012: The Nonparametric Metadata Dependent Relational Model has been accepted to ICML 2012! The paper will be presented as a poster and in a 20-minute spotlight session. It introduces a new Bayesian nonparametric model for networks that incorporates metadata to influence the latent community space, along with a principled approach for growing the number of clusters. We find metadata to be highly useful both in improving AUC scores and in aiding the interpretation of the latent structures.

12/1/2011: The Doubly Correlated Nonparametric Topic Model will be presented at NIPS 2011! The paper extends current topic models to allow for correlated topics, document metadata, and a nonparametric prior that permits a potentially infinite number of topics.