2. Outcomes
• Contemi
• Big Data presence
• Big Data know-how
• Big Data experience
• Interns
• Linux
• R / Python language
• Machine Learning practice
• Process
• Scrum
• Cross Industry Standard Process for Data Mining (CRISP-DM)
• Kaggle profile
• Hadoop
3. Preparation
• Platform: Ubuntu 12.04 LTS
• Process:
• Scrum
• Cross Industry Standard Process for Data Mining (CRISP-DM)
• Weekly blog
• http://contanalytics.wordpress.com/
4. Headstart for Dung
• 16/09 – 30/09
• Learn R / Python
• Try Digit Recognizer competition on Kaggle.com
• Join the "Introduction to Recommender Systems" and "Web Intelligence and Big Data" courses on Coursera.com
5. 3-month plan
• 1/10 – 31/10
• Go through the typical Machine Learning algorithms; implement, demo, and present each one to Contemi
• 1/11 – 15/11
• Compete in the AMS 2013-2014 Solar Energy Prediction Contest
• URL: http://www.kaggle.com/c/ams-2014-solar-energy-prediction-contest
• 16/11 – 22/11
• Compete in the Accelerometer Biometric Competition
• URL: http://www.kaggle.com/c/accelerometer-biometric-competition
• 23/11 – 31/12 (end of internship)
• Deploy Hadoop
• Learn Java
• Run word-count and sorting experiments on large data (> 1 GB)
• Compete in Facebook Recruiting III – Keyword Extraction (personally)
• Re-optimize the built model using Hadoop
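The word-count and sorting experiments in the last phase could be prototyped locally before moving to Hadoop. The sketch below is a minimal, in-memory illustration of the map and reduce steps in plain Python; it is not Hadoop code, and the function names (`map_words`, `reduce_counts`, `word_count`, `sorted_counts`) and the tokenization rule are assumptions made for this example.

```python
import re
from collections import Counter
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    # Assumption: a "word" is a run of lowercase letters, digits, or apostrophes.
    return [(word, 1) for word in re.findall(r"[a-z0-9']+", line.lower())]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def word_count(lines):
    """Run the map step on every line, then reduce all pairs into totals."""
    return reduce_counts(chain.from_iterable(map_words(line) for line in lines))


def sorted_counts(totals):
    """Sort by descending count, breaking ties alphabetically."""
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ["big data big", "Data plan"]
    for word, count in sorted_counts(word_count(sample)):
        print(word, count)
```

Because `word_count` accepts any iterable of lines, an open file handle can be passed in directly, so a file larger than 1 GB is streamed line by line rather than loaded into memory at once; only the per-word totals are held in memory.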
6. Plan for next internships
• App using Singapore open datasets
• Stock prediction app for Vietnam market
• Visualization
• GitHub
• R-Bloggers