Yingjun Wu

Amazon Web Services
yjwu@amazon.com


About Me

I am a software engineer on the Redshift team at Amazon Web Services. My expertise lies in database management systems, specifically ML-enhanced database components, query execution, query optimization, indexing mechanisms, storage management, and transaction processing. Before joining Amazon, I was a researcher in the Database Group at IBM Almaden Research Center.

I received my Ph.D. in 2017 from the National University of Singapore, where I was affiliated with the Database Group (advisor: Kian-Lee Tan). Previously, I was a visiting Ph.D. student in the Database Group at Carnegie Mellon University (host advisor: Andrew Pavlo), a research intern in the System Group at Microsoft Research Asia, and a research intern in the Cloud Infrastructure Group at EMC Labs China. I earned my bachelor's degree from South China University of Technology in 2012.

I am enthusiastic about integrating research into real-world systems. Right now, I am responsible for boosting the performance of a commercial cloud data warehouse, Amazon Redshift. Before that, I was a developer of a commercial cloud-native DBMS, IBM Db2 Event Store, and a key developer of two main-memory DBMS prototypes, namely Peloton and Cavalia. I was also an early contributor to Apache Flink.

I am glad to review database/systems-related research papers!

Ph.D. Thesis

Title: Transaction Management In Multi-Core Main-Memory Database Systems. [thesis]
Thesis Committee: Bingsheng He, Yong Meng Teo, Alan Fekete.

Publications

WiSer: A Highly Available HTAP DBMS for IoT Applications. [paper]
Ronald Barber, Christian Garcia-Arellano, Ronen Grosman, Guy Lohman, C. Mohan, Rene Muller, Hamid Pirahesh, Vijayshankar Raman, Richard Sidle, Adam Storm, Yuanyuan Tian, Pinar Tozun, and Yingjun Wu.
IEEE BigData 2019.

HERMIT in Action: Succinct Secondary Indexing Mechanism via Correlation Exploration. [paper]
Yingjun Wu, Jia Yu, Yuanyuan Tian, Richard Sidle, and Ronald Barber.
VLDB 2019. (Demo Track)

Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations. [paper]
Yingjun Wu, Jia Yu, Yuanyuan Tian, Richard Sidle, and Ronald Barber.
SIGMOD 2019.

Fast Failure Recovery for Main-Memory DBMSs on Multicores. [paper]
Yingjun Wu, Wentian Guo, Chee-Yong Chan, and Kian-Lee Tan.
SIGMOD 2017.

An Empirical Evaluation of In-Memory Multi-Version Concurrency Control. [paper]
Yingjun Wu, Joy Arulraj, Jiexi Lin, Ran Xian, and Andrew Pavlo.
VLDB 2017.

Self-Driving Database Management Systems. [paper]
Andrew Pavlo, Gustavo Angulo, Joy Arulraj, Haibin Lin, Jiexi Lin, Lin Ma, Prashanth Menon, Todd Mowry, Matthew Perron, Ian Quah, Siddharth Santurkar, Anthony Tomasic, Skye Toor, Dana Van Aken, Ziqi Wang, Yingjun Wu, Ran Xian, and Tieying Zhang.
CIDR 2017.

Transaction Healing: Scaling Optimistic Concurrency Control on Multicores. [paper]
Yingjun Wu, Chee-Yong Chan, and Kian-Lee Tan.
SIGMOD 2016.

Scalable In-Memory Transaction Processing with HTM. [paper] [website]
Yingjun Wu and Kian-Lee Tan.
USENIX ATC 2016.

ChronoStream: Elastic Stateful Stream Computation in the Cloud. [paper]
Yingjun Wu and Kian-Lee Tan.
ICDE 2015.

SocialTransfer: Transferring Social Knowledge for Cold-Start Crowdsourcing. [paper]
Zhou Zhao, James Cheng, Furu Wei, Ming Zhou, Wilfred Ng, and Yingjun Wu.
CIKM 2014.

Grand challenge: SPRINT Stream Processing Engine as a Solution. [paper]
Yingjun Wu, David Maier, and Kian-Lee Tan.
DEBS 2013. (Best Paper Award)

Understanding the Effects of Hypervisor I/O Scheduling for Virtual Machine Performance Interference. [paper]
Ziye Yang, Haifeng Fang, Yingjun Wu, Chunqi Li, Bin Zhao, and H. Howie Huang.
CloudCom 2012.

Invited Talks

A Deep Dive into the Compaction for Log-Structured Storage.
IBM Almaden Research Center, San Jose, CA, USA, July 2018.

Building an Efficient Index Structure for Modern Database Systems.
IBM Almaden Research Center, San Jose, CA, USA, June 2018.

Optimization Of OLTP Database Systems Through Program Analysis.
Carnegie Mellon University, Pittsburgh, PA, USA, May 2017. [link]
Brown University, Providence, RI, USA, May 2017.

Building Faster Main-Memory Database Management Systems on Multicores.
National University of Singapore, Singapore, October 2016. [link]

This is the Best Paper Ever on In-Memory Multi-Version Concurrency Control.
Carnegie Mellon University, Pittsburgh, PA, USA, September 2016. [link]

Scalable In-Memory Transaction Processing with HTM.
Carnegie Mellon University, Pittsburgh, PA, USA, June 2016.

Transaction Healing: Scaling Optimistic Concurrency Control on Multicores.
Carnegie Mellon University, Pittsburgh, PA, USA, March 2016. [link]

ChronoStream: Elastic Stateful Stream Computation in the Cloud.
National University of Singapore, Singapore, May 2015.

Awards

Manager's Choice Award, IBM Almaden Research Center, 2018, 2019.

Dean's Graduate Research Award, National University of Singapore, 2017.

Excellent Graduate Thesis Award, South China University of Technology, 2012.

Services

Program Chair: AIDB Workshop@VLDB 2019, SMDB Workshop@ICDE 2020.

Program Committee: SIGMOD 2018 (Demo), ICDE 2018 (Demo), EuroSys 2018 (Shadow), VLDB 2019, VLDB 2019 (PhD Workshop), IEEE Big Data 2019 (Industrial), VLDB 2020.

Journal Reviewer: JCST, KAIS, TCC, TKDE, TPDS, VLDBJ, TODS.

Teaching

CS3103: Computer Networks and Protocols.
National University of Singapore, 2014-2015.





Last update: Sep. 2019