Yipei Wang Email : wangyipei01@gmail.com
yipeiw@alumni.cmu.edu
Phone: +1-412-613-1984
Education
• Carnegie Mellon University, School of Computer Science Pittsburgh, PA
Master’s, major in Artificial Intelligence May 2015
• Tsinghua University Beijing, China
Bachelor of Electrical Engineering July 2012
Experience
• Particle Media, Inc. Santa Clara, CA
Machine Learning Scientist Sep 2015 - Present
Responsibilities: Improve the recommendation system using statistical methods at a startup that
provides personalized article reading.
◦ Developed an offline training pipeline (Spark) and a static feature extraction pipeline that support
organizing data by time range, pre-sampling, normalization, model selection, and a position bias feature.
◦ Developed a user profile pipeline based on the semantics of the articles each user viewed.
◦ Built a dashboard for monitoring daily click-through rate across various dimensions. Applied data analysis
to improve recommendation strategies, including user behavior analysis and automatic selection of popular
articles.
◦ Set up and maintain the workflow for offline pipelines.
• Carnegie Mellon University Pittsburgh, PA
Research Assistant Oct 2012 - Jun 2015
Project: Communication Filters for Distributed Optimization Algorithm (thesis), advised by Alex
Smola, Sep 2014 - Jun 2015
◦ Proposed various compression strategies and derived the convergence conditions for applying filters in
first-order distributed optimization methods.
◦ Designed efficient filtering algorithms using priority sampling and randomized rounding techniques.
Deployed them on an open-source parameter server (C++11).
◦ Evaluated their efficacy on logistic regression (batch and online settings) with L1 or L2
regularization across datasets of different scales.
◦ Explored stacking filters for maximal communication reduction. Examined scalability using AWS
services. In an advertisement click prediction task, we achieved up to 90% communication reduction
without a significant effect on prediction accuracy.
Project: TRECVID Multimedia Event Detection (government funded), advised by Florian Metze,
Sep 2012 - Oct 2013
◦ Designed a random-forest-based pipeline (Python) for semantic concept extraction. MAP improved
by more than 10% over the GMM-HMM baseline. [paper 1]
◦ Proposed an unsupervised method (LDA, hierarchical clustering) to expand the semantic concept
vocabulary, which was shown to capture richer semantics and improve video retrieval. [paper 2]
◦ Explored the fusion of various features and systems. The CMU team ranked 1st in the 2013 NIST evaluation.
[paper 3,4]
• Carnegie Mellon University Pittsburgh, PA
Teaching Assistant Summer 2011
◦ Machine Learning 10601, Fall 2015
◦ Machine Learning on Large Scale Data, Spring 2015. Topics covered: MapReduce framework & Hadoop,
parameter server, streaming algorithms, randomized algorithms, parallel algorithms for LDA, PageRank,
matrix factorization, etc.
◦ Responsibilities: Prepared assignments and exam questions, set up the grading platform, gave
recitations, and held office hours.
• Microsoft Beijing, China
Intern Feb 2012 - July 2012
◦ Prototyped (C#, Java) supervised (SVM, decision tree, neural network) and semi-supervised
(multi-view learning) prosody prediction tools.
◦ Researched effective heuristic strategies for sample selection during iterative training.
Publications
• [1]: Yipei Wang, Shourabh Rawat, Florian Metze, "Exploring Audio Semantic Concepts for Event-based Video
Retrieval", 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy
• [2]: Yipei Wang, Shourabh Rawat, Florian Metze, "Semi-automatic Audio Semantic Concept Discovery for Multimedia
Retrieval", 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy
• [3]: Florian Metze, Shourabh Rawat, and Yipei Wang, "Improved audio features for large-scale multimedia event
detection", 2014 IEEE International Conference on Multimedia and Expo (ICME), Chengdu, China
• [4]: Zhen-Zhong Lan, Lu Jiang, Shoou-I Yu, Shourabh Rawat, Yang Cai, Chengqiang Gao, Shicheng Xu, Haoquan
Shen, Xuanchong Li, Yipei Wang, Wei Tong, Yi Yang, Waito Sze, Susanne Burger, Florian Metze, Rita Singh, Bhiksha
Raj, Richard Stern, Teruko Mitamura, Eric Nyberg, and Alex Hauptmann, "Informedia E-Lamp@TRECVID 2013
Multimedia Event Detection and Recounting", TRECVID 2013 Video Retrieval Evaluation Workshop
Programming Skills
• Languages: C++, Python, Java, Scala, SQL, Pig, MATLAB
• Technologies: Weka, scikit-learn, HDFS, Spark, Kafka, MongoDB
