Chenggang Wu

Email: cw75 AT cs DOT brown DOT edu

Office: CIT344

I am a senior undergraduate student working with Stan Zdonik, Tim Kraska, and Ugur Cetintemel in the Data Management Research Group at Brown University. Before coming to Brown, I spent one year working with Xiangyang Li on Bayes classifiers in the Wireless Networking Research Group at the Illinois Institute of Technology.

I will join UC Berkeley as a PhD student in Fall 2015.

Research Interests

Broadly, I am interested in Database Management Systems (DBMSs) and Data Science. Specifically, my interests lie in Big Data Visualization, Query Optimization, Transaction Processing, Stream Processing, and how the advent of new computer hardware shifts computer system architecture and affects the performance of DBMSs.

Research Projects

S-Store: Real-Time Analytics Meets Transaction Processing
The goal is to build a stream processing system that can simultaneously accommodate OLTP and streaming applications.

  • Investigated how nested transactions can help preserve data integrity in a streaming context.
  • Developed an efficient nested transaction facility in S-Store to guarantee the consistency of the stored state.
  • Collaborated with Carnegie Mellon University, Intel Labs (ISTC Big Data), and the Massachusetts Institute of Technology.

DBNav: Query Optimization on Analytical Visualization Systems
The goal is to develop novel query optimization techniques to improve the scalability of visualization systems.

  • Developed predictive checkpointing techniques for multi-query optimization via automatic view materialization and pre-aggregation for bar chart, pie chart, histogram, and interactive k-means clustering visualizations.
  • Developed a visual approximate sampling algorithm for optimizing scatter plot visualization. The algorithm significantly reduced back-end data-fetching latency while preserving the visual correctness of the original dataset.
  • Conducted a user study with over 200 participants on Amazon Mechanical Turk to compare my algorithm against stratified sampling. At the same 10% sampling rate, over 99% of the participants reported that my algorithm offered a much more accurate visual representation of the original dataset.

Seer: Predictive Middleware for Big Data Visualization
The goal is to build a predictive prefetching and caching middleware to aid the exploratory visualization of big data.

  • Co-developed a profile-driven hierarchical prediction algorithm using a Markov model and sequential rule mining.
  • Co-developed a multidimensional predictive cache that employs a predictive LRU eviction policy.
  • Conducted a user study to evaluate the effectiveness of the prediction algorithm on a real-world dataset. The results suggested that our learning-based algorithm significantly outperformed locality-based prefetching techniques.

Publications & Presentations


Teaching

I love teaching. I believe that a successful researcher not only knows how to produce new knowledge, but also knows how to share it and communicate discoveries to others. Below are the courses in which I have served as a teaching assistant: