Yipei Wang Email: wangyipei01@gmail.com
yipeiw@alumni.cmu.edu
Phone: +1-412-613-1984
Education
• Carnegie Mellon University, School of Computer Science, Pittsburgh, PA
Master’s, major in Artificial Intelligence May 2015
• Tsinghua University, Beijing, China
Bachelor of Electrical Engineering July 2012
Experience
• Particle Media, Inc., Santa Clara, CA
Machine Learning Scientist Sep 2015 - Present
Responsibilities: Improve the recommendation system using statistical methods at a startup that
provides personalized article reading.
◦ Developed an offline training pipeline (Spark) and a static feature extraction pipeline that support
organizing data by time range, pre-sampling, normalization, model selection, and a position-bias feature.
◦ Developed a user profile pipeline based on the semantics of the articles each user viewed.
◦ Built a dashboard for monitoring daily click-through rate across various dimensions. Applied data analysis
to improve recommendation strategies, including user behavior analysis and automatic selection of popular
articles.
◦ Set up and maintain the workflow for offline pipelines.
• Carnegie Mellon University, Pittsburgh, PA
Research Assistant Oct 2012 - Jun 2015
Project: Communication Filters for Distributed Optimization Algorithm (thesis), advised by Alex
Smola, Sep 2014 - Jun 2015
◦ Proposed various compression strategies and derived the convergence conditions for applying filters in
first-order distributed optimization methods.
◦ Designed efficient filtering algorithms using priority sampling and randomized rounding techniques.
Deployed them on an open-source parameter server (C++11).
◦ Evaluated the filters' efficacy on logistic regression (batch and online settings) with L1 or L2
regularization over datasets of different scales.
◦ Explored stacking filters for maximal communication reduction. Examined scalability using AWS.
In an advertisement click prediction task, achieved up to 90% communication reduction without a
significant effect on prediction accuracy.
Project: TRECVID Multimedia Event Detection (government funded), advised by Florian Metze,
Sep 2012 - Oct 2013
◦ Designed a random-forest-based pipeline (Python) for semantic concept extraction. MAP improved
by more than 10% over the GMM-HMM baseline. [paper 1]
◦ Proposed an unsupervised method (LDA, hierarchical clustering) to expand the semantic concept
vocabulary, which was shown to capture richer semantics and improve video retrieval. [paper 2]
◦ Explored the fusion of various features and systems. The CMU team ranked 1st in the 2013 NIST
evaluation. [papers 3, 4]
• Carnegie Mellon University, Pittsburgh, PA
Teaching Assistant Summer 2011
◦ Machine Learning 10601, 2015 fall
◦ Machine Learning on Large Scale Data, 2015 spring. Topics covered: MapReduce framework and Hadoop,
parameter server, streaming algorithms, randomized algorithms, parallel algorithms for LDA, PageRank,
matrix factorization, etc.
◦ Responsibilities: Prepared assignments and exam questions, set up the grading platform, gave
recitations, and held office hours.
• Microsoft, Beijing, China
Intern Feb 2012 - July 2012
◦ Prototyped (C#, Java) supervised (SVM, decision tree, neural network) and semi-supervised
(multi-view learning) prosody prediction tools.
◦ Researched effective heuristic strategies for sample selection during iterative training.
Publications
• [1]: Yipei Wang, Shourabh Rawat, Florian Metze, “Exploring Audio Semantic Concepts for Event-based Video
Retrieval”, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy
• [2]: Yipei Wang, Shourabh Rawat, Florian Metze, “Semi-automatic Audio Semantic Concept Discovery for Multimedia
Retrieval”, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy
• [3]: Florian Metze, Shourabh Rawat and Yipei Wang, “Improved audio features for large-scale multimedia event
detection”, 2014 IEEE International Conference on Multimedia and Expo (ICME), Chengdu, China
• [4]: Zhen-Zhong Lan, Lu Jiang, Shoou-I Yu, Shourabh Rawat, Yang Cai, Chengqiang Gao, Shicheng Xu, Haoquan
Shen, Xuanchong Li, Yipei Wang, Wei Tong, Yi Yang, Waito Sze, Susanne Burger, Florian Metze, Rita Singh, Bhiksha
Raj, Richard Stern, Teruko Mitamura, Eric Nyberg, and Alex Hauptmann, “Informedia E-Lamp@TRECVID 2013
Multimedia Event Detection and Recounting”, TRECVID 2013 Video Retrieval Evaluation workshop
Programming Skills
• Languages: C++, Python, Java, Scala, SQL, Pig, MATLAB
• Technologies: Weka, scikit-learn, HDFS, Spark, Kafka, MongoDB