Stephen Bach

Assistant Professor
Computer Science Department
Brown University, Providence, RI
CIT 335


My latest research is on weakly supervised machine learning, in which the goal is to train models without hand-labeled data. With the advent of data-hungry representation learning techniques like deep neural networks, curating labeled training data has replaced feature engineering as the most expensive and time-consuming task in machine learning. Weak supervision aims to overcome this bottleneck. I also work on statistical relational learning and information extraction.


  • Our work on weakly supervised sequence tagging, e.g., named entity recognition, has been accepted to AAAI 2020!
  • Snorkel is now in production at Google. Our paper at SIGMOD 2019 has the technical details and is featured on the Google AI Blog.
  • Our paper on Snorkel was selected as a "Best of VLDB 2018" paper!


I lead the BATS machine learning research group. In the tradition of groups like LINQS and DAGS, BATS stands for "Bach's Awesome Team of Students."

Ph.D. Students | Post-Doc | Master's and Undergrad Students
  • Trisha Ballakur
  • Tiffany Ding
  • Top Piriyakulkij
  • Dylan Sam
  • Jeffrey Zhu
Alumni (Role, Year, Next Position)
  • Berkan Hiziroglu (Master's, 2020, Amazon)
  • Angie Kim (Undergrad, 2020, The New York Times)
  • Esteban Safranchik (Undergrad, 2020, Ph.D. at U. Washington)

Snorkel is a framework for creating noisy training labels for machine learning. It uses statistical methods to combine weak supervision sources, such as heuristic rules and task-related data sets (i.e., distant supervision), which are far less expensive to use than hand-labeling data. With the resulting estimated labels, users can train many kinds of state-of-the-art models. Snorkel is used at numerous technology companies like Google, at research labs, and at agencies like the FDA.
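
To make the idea of combining weak supervision sources concrete, here is a minimal sketch in plain Python. It does not use Snorkel's API, and it replaces Snorkel's statistical combination with a simple majority vote, which is only a rough stand-in. The spam/ham task, the three heuristic rules, and all names below are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical label values for an illustrative spam-detection task.
ABSTAIN = -1
HAM = 0
SPAM = 1


def lf_contains_link(text):
    """Heuristic rule: messages containing a link are probably spam."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_short_message(text):
    """Heuristic rule: very short messages are probably legitimate."""
    return HAM if len(text.split()) < 5 else ABSTAIN


def lf_mentions_prize(text):
    """Heuristic rule: messages mentioning a prize are probably spam."""
    return SPAM if "prize" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_short_message, lf_mentions_prize]


def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Combine noisy votes into one estimated label.

    Returns ABSTAIN when no rule fires or when the top two labels tie.
    """
    votes = [vote for vote in (lf(text) for lf in lfs) if vote != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


if __name__ == "__main__":
    for message in [
        "Claim your prize at http://example.com now friends",
        "see you soon",
        "win prize now",
    ]:
        print(repr(message), "->", majority_vote(message))
```

A majority vote treats every rule as equally reliable. The statistical methods described above instead estimate how accurate each source is, so a learned combination can weight trustworthy sources more heavily than noisy ones.
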
Probabilistic soft logic is a formalism for building statistical models over relational data like knowledge bases and social networks. PSL programs define hinge-loss MRFs, a type of probabilistic graphical model that admits fast, convex optimization for MAP inference, which makes them very scalable. Researchers around the world have used PSL for bioinformatics, computational social science, natural language processing, information extraction, and computer vision.
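
For readers who want the formula behind that description, the following is a sketch of the hinge-loss MRF density as it is usually written in the PSL literature. The symbols are notation introduced here for illustration: y are the unobserved variables relaxed to the unit interval, x are the observed variables, each \ell_j is a linear function, and each w_j is a nonnegative weight.

```latex
P(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{w}, \mathbf{x})}
    \exp\!\Big( -\sum_{j=1}^{m} w_j \, \phi_j(\mathbf{y}, \mathbf{x}) \Big),
\qquad
\phi_j(\mathbf{y}, \mathbf{x})
  = \big( \max\{ \ell_j(\mathbf{y}, \mathbf{x}),\, 0 \} \big)^{p_j},
\quad p_j \in \{1, 2\},
\quad \mathbf{y} \in [0, 1]^n .

% MAP inference minimizes the weighted sum of hinge-loss potentials:
\hat{\mathbf{y}}
  = \arg\min_{\mathbf{y} \in [0, 1]^n}
    \sum_{j=1}^{m} w_j \, \phi_j(\mathbf{y}, \mathbf{x}) .
```

Each potential is a hinge (or squared hinge) of a linear function and is therefore convex, and the weights are nonnegative, so the MAP objective is a convex function over a box. That is the property behind the fast, convex optimization mentioned above.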


In spring semesters, I teach machine learning (CSCI 1420).

In Fall 2018, I taught a seminar on learning with limited labeled data (CSCI 2952-C).