Data Analysis Making Big Data Work 
David Chiu 
2014/11/24
About Me 
Founder of LargitData 
Ex-Trend Micro Engineer 
ywchiu.com
Big Data & Data Science
US Election Prediction 
World Cup Prediction
Hurricane Prediction
Data Science 
http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Being A Data Scientist, You Need to Know That Much? Seriously?
Statistics 
Single Variable, Multi Variable, ANOVA 
Data Munging 
Data Extraction, Transformation, Loading 
Data Visualization 
Figure, Business Intelligence 
Required Skills
What You Probably Need Is A Team 
Business Analyst: knows how to use different tools under different circumstances 
Statistician: how to process big data? 
DBA: how to deal with unstructured data? 
Software Engineer: knows how to use statistics
Four Dimensions 
Single Machine Memory R Local File 
Cloud Distributed Hadoop HDFS 
Statistics Analysis Linear Algebra 
Architect Management Standard 
Concept MapReduce Linear Algebra Logistic Regression 
Tool Hadoop PostgreSQL R 
Analyst How to use these tools 
Hackers R Python Java
“80% are doing summing and averaging” 
Content 
1.Data Munging 
2.Data Analysis 
3.Interpret Result 
What Do Data Scientists Do?
Application of Data Analysis 
Text Mining 
Classify Spam Mail 
Build Index 
Data Search Engine 
Social Network Analysis 
Finding Opinion Leader 
Recommendation System 
What user likes? 
Opinion Mining 
Positive/Negative Opinion 
Fraud Analysis 
Credit Card Fraud
Feed Data to the Computer 
Make the Computer Do the Analysis
Let the Computer Predict for You
Predictive Analysis 
Learn from experience (Data), to predict future behavior 
What to Predict? 
e.g. Who is likely to click on that ad? 
For What? 
e.g. To decide which ad to show, based on click probability and revenue. 
Predictive Analysis
Customers who buy beer will also buy Pampers? 
People who browse telephone rate plans are likely to switch vendors 
People who belong to the same group tend to have the same telecom vendor 
Surprising Conclusion
Based on personal behavior, a predictive model uses personal characteristics to generate a probabilistic score: the higher the score, the more likely the behavior. 
Predictive Model
Linear Model 
e.g. For a cosmetics ad, we can give a 90% weight to female customers and a 10% weight to male customers. Multiplying each weight by the click probability (15%) gives the probability score: 
Female 13.5%, Male 1.5% 
Rule Model 
e.g. 
If the user is “She” 
And Income is over 30k 
And has not seen the ad yet 
The click rate is 11% 
Simple Predictive Model
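The two simple models above can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slide: the weights, the 15% click probability, and the 11% rule come from the slide, while the function names and the choice to return `None` when the rule does not apply are assumptions.

```python
# Linear model: score = audience weight * base click probability.
# The numbers are the ones quoted on the slide.
BASE_CLICK_PROBABILITY = 0.15
AUDIENCE_WEIGHT = {"female": 0.90, "male": 0.10}


def linear_score(gender):
    """Return the click score for one audience segment."""
    return AUDIENCE_WEIGHT[gender] * BASE_CLICK_PROBABILITY


def rule_score(gender, income, has_seen_ad):
    """Rule model: 11% click rate when every condition holds.

    The slide gives only this one rule, so None is returned otherwise
    (an assumption made for this sketch).
    """
    if gender == "female" and income > 30_000 and not has_seen_ad:
        return 0.11
    return None


print(f"female: {linear_score('female'):.1%}, male: {linear_score('male'):.1%}")
print("rule:", rule_score("female", 35_000, has_seen_ad=False))
```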
Induction 
From detail to general 
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E 
-- Tom Mitchell (1998) 
Discover an effective model 
Start from a simple model 
Update the model as new data is fed in 
Keep on improving prediction power 
Machine Learning
Statistical Analysis 
Regression Analysis 
Clustering 
Classification 
Recommendation 
Text Mining 
Application 
Image recognition
Decision Tree 
Rate > 1,299/Month 
Probability to switch vendor 15% 
Probability to switch vendor 3% 
Yes 
No
Decision Tree 
Rate > 1,299/Month 
Probability to switch vendor 3% 
Yes 
No 
Probability to switch vendor 10% 
Probability to switch vendor 22% 
Income>22,000 
Yes 
No
Decision Tree 
Rate > 1,299/Month 
Yes 
No 
Probability to switch vendor 10% 
Probability to switch vendor 22% 
Income>22,000 
Yes 
No 
Probability to switch vendor 1% 
Probability to switch vendor 7% 
Free for intranet 
Yes 
No
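The final tree can be read as nested conditions. The sketch below is one possible reading of the slide: the transcript does not say which probability belongs to which Yes/No branch, so the branch assignment here is an assumption, as are the function and parameter names. Only the thresholds and the four leaf probabilities come from the slide.

```python
def switch_probability(monthly_rate, income, free_intranet_calls):
    """Return the probability of switching vendor for one customer.

    Assumed branch layout (not confirmed by the slide):
      rate > 1,299 and income > 22,000      -> 10%
      rate > 1,299 and income <= 22,000     -> 22%
      rate <= 1,299 and free intranet calls -> 1%
      rate <= 1,299 and no free calls       -> 7%
    """
    if monthly_rate > 1299:
        return 0.10 if income > 22000 else 0.22
    return 0.01 if free_intranet_calls else 0.07


print(switch_probability(monthly_rate=1500, income=18000, free_intranet_calls=False))
```

Each extra split refines one leaf of the previous tree, which is why the 15% and 3% leaves of the first slide no longer appear in the last one.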
Supervised Learning 
Regression 
Classification 
Unsupervised Learning 
Dimension Reduction 
Clustering 
Machine Learning
Supervised Learning
Classification 
e.g. Stock prediction on bull/bear market 
Regression 
e.g. Price prediction 
Supervised Learning
Dimension Reduction 
e.g. Making a new index 
Clustering 
e.g. Customer Segmentation 
Unsupervised Learning
Lift 
The better the lift, the greater the cost? 
The more decision rules, the more campaigns? 
Design a strategy for each persona? 
The lift for 4 campaigns? 
The lift for 20 campaigns? 
Lift
Can we use the production rate of butter to predict stock market? 
Overfitting
Treating noise as information 
Over-assumption 
Over-interpretation 
What an overfitted model learns is not the truth 
Like memorizing all the answers to a single test. 
Overfitting
Testing Model 
Use external data or partial data as testing dataset
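A small sketch of why a held-out testing dataset exposes overfitting. The data is hypothetical: customer ids with labels that are pure noise, so there is nothing real to learn. The "model" simply memorizes every training answer, which mirrors the "memorize all answers in a single test" analogy.

```python
import random

rng = random.Random(0)

# Hypothetical data: the label is random, so no model can truly predict it.
rows = [(customer_id, rng.choice(["yes", "no"])) for customer_id in range(1000)]
rng.shuffle(rows)
train, test = rows[:700], rows[700:]

# An overfitted "model": a lookup table of every training answer.
memory = dict(train)


def predict(customer_id):
    """Return the memorized label, or a default guess for unseen customers."""
    return memory.get(customer_id, "no")


def accuracy(data):
    """Share of rows whose predicted label matches the true label."""
    return sum(predict(cid) == label for cid, label in data) / len(data)


train_accuracy = accuracy(train)
test_accuracy = accuracy(test)
print(f"training accuracy: {train_accuracy:.2f}")
print(f"testing accuracy:  {test_accuracy:.2f}")
```

The training score is perfect because the answers were memorized, while the testing score falls back to roughly what guessing would give.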
Traditional Analysis Tool
Statistics On The Fly 
Built-in Math and Graphic Function 
Free and Open Source 
http://cran.r-project.org/src/base/ 
R Language 
Functional Programming 
Use Function Definition To Retrieve Answer 
Interpreted Language 
Statistics On the Fly 
Object Oriented Language 
S3 and S4 Method 
R Language
Most Used Analytic Language 
The most popular languages are R, Python (39%), SQL (37%), and SAS (20%). 
By Gregory Piatetsky, Aug 27, 2013.
Kaggle 
http://www.kaggle.com/ 
Most often used language in Kaggle competitions
Data Scientist in Google and Apple Use R 
What is your programming language of choice, R, Python or something else? 
“I use R, and occasionally matlab, for data analysis. There is a large, active and extremely knowledgeable R community at Google.” 
http://simplystatistics.org/2013/02/15/interview-with-nick-chamandy-statistician-at-google/ 
“Expert knowledge of SAS (With Enterprise Guide/Miner) required and candidates with strong knowledge of R will be preferred” 
http://www.kdnuggets.com/jobs/13/03-29-apple-sr-data-scientist.html?utm_source=twitterfeed&utm_medium=facebook&utm_campaign=tfb&utm_content=FaceBook&utm_term=analytics#.UVXibgXOpfc.facebook
Discover which customers are likely to churn 
Customer Churn Analysis
Account Information 
state 
account length 
area code 
phone number 
User Behavior 
international plan 
voice mail plan, number vmail messages 
total day minutes, total day calls, total day charge 
total eve minutes, total eve calls, total eve charge 
total night minutes, total night calls, total night charge 
total intl minutes, total intl calls, total intl charge 
number customer service calls 
Target 
Churn (Yes/No) 
Data Description
> install.packages("C50")
> library(C50)
> data(churn)
> str(churnTrain)
> churnTrain = churnTrain[, !names(churnTrain) %in% c("state", "area_code", "account_length")]
> set.seed(2)
> ind <- sample(2, nrow(churnTrain), replace = TRUE, prob = c(0.7, 0.3))
> trainset = churnTrain[ind == 1, ]
> testset = churnTrain[ind == 2, ]
Split data into training and testing dataset 
70% as training dataset 
30% as testing dataset
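For readers following along in Python, the same splitting idea can be sketched with the standard library. As in the R call to `sample`, each row is assigned independently to group 1 or group 2 with probabilities 0.7 and 0.3, so the split is only approximately 70/30. The function name and the stand-in row list are assumptions, and the seed does not reproduce R's random stream.

```python
import random


def split_rows(rows, train_share=0.7, seed=2):
    """Assign each row to the training or testing set at random."""
    rng = random.Random(seed)
    groups = rng.choices([1, 2], weights=[train_share, 1 - train_share], k=len(rows))
    train = [row for row, group in zip(rows, groups) if group == 1]
    test = [row for row, group in zip(rows, groups) if group == 2]
    return train, test


# Stand-in for the churn rows: 1000 hypothetical row ids.
trainset, testset = split_rows(list(range(1000)))
print(len(trainset), len(testset))
```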
library(rpart)
churn.rp <- rpart(churn ~ ., data = trainset)
plot(churn.rp, margin = 0.1)
text(churn.rp, all = TRUE, use.n = TRUE)
Build Classifier 
Classification
> predictions <- predict(churn.rp, testset, type = "class")
> table(testset$churn, predictions)
Prediction Result 
       pred
        no  yes
no     859   18
yes     41  100
> library(caret)
> confusionMatrix(table(predictions, testset$churn))
Confusion Matrix and Statistics

predictions yes  no
        yes 100  18
        no   41 859

               Accuracy : 0.942
                 95% CI : (0.9259, 0.9556)
    No Information Rate : 0.8615
    P-Value [Acc > NIR] : < 2.2e-16
                  Kappa : 0.7393
 Mcnemar's Test P-Value : 0.004181
            Sensitivity : 0.70922
            Specificity : 0.97948
         Pos Pred Value : 0.84746
         Neg Pred Value : 0.95444
             Prevalence : 0.13851
         Detection Rate : 0.09823
   Detection Prevalence : 0.11591
      Balanced Accuracy : 0.84435
       'Positive' Class : yes
Use Confusion Matrix
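The headline statistics in that report follow directly from the four cell counts. A short sketch that recomputes them from the counts shown on the slide, with "yes" (churn) as the positive class; the variable names are illustrative.

```python
# Cell counts from the confusion matrix on the slide ("yes" is the positive class).
true_positive = 100   # predicted yes, actually yes
false_positive = 18   # predicted yes, actually no
false_negative = 41   # predicted no, actually yes
true_negative = 859   # predicted no, actually no

total = true_positive + false_positive + false_negative + true_negative

accuracy = (true_positive + true_negative) / total
sensitivity = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)
pos_pred_value = true_positive / (true_positive + false_positive)
neg_pred_value = true_negative / (true_negative + false_negative)
balanced_accuracy = (sensitivity + specificity) / 2

for name, value in [
    ("Accuracy", accuracy),
    ("Sensitivity", sensitivity),
    ("Specificity", specificity),
    ("Pos Pred Value", pos_pred_value),
    ("Neg Pred Value", neg_pred_value),
    ("Balanced Accuracy", balanced_accuracy),
]:
    print(f"{name:>18}: {value:.5f}")
```

Accuracy looks high at 94%, but sensitivity shows that about 29% of the customers who actually churned were missed, which matters when churners are the minority class.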
Use Testing Data to Validate Result 
library(ROCR)
predictions <- predict(churn.rp, testset, type = "prob")
pred.to.roc <- predictions[, 1]
pred.rocr <- prediction(pred.to.roc, as.factor(testset[, (dim(testset)[[2]])]))
perf.rocr <- performance(pred.rocr, measure = "auc", x.measure = "cutoff")
perf.tpr.rocr <- performance(pred.rocr, "tpr", "fpr")
plot(perf.tpr.rocr, colorize = TRUE, main = paste("AUC:", (perf.rocr@y.values)))
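To make the AUC figure in the plot title less of a black box, here is a minimal sketch of what it measures: the share of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The scores and labels below are hypothetical, and this pairwise version is meant for illustration rather than for large datasets.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probability of the positive class
    labels: 1 for positive, 0 for negative
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: three of the four positive/negative pairs are ranked correctly.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```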
Finding Most Important Variable
library(rminer)
model = fit(churn ~ ., trainset, model = "svm")
VariableImportance = Importance(model, trainset, method = "sensv")
L = list(runs = 1, sen = t(VariableImportance$imp), sresponses = VariableImportance$sresponses)
mgraph(L, graph = "IMP", leg = names(trainset), col = "gray", Grid = 10)
Dynamic Language 
Execution at runtime 
Dynamic Type 
Interpreted Language 
See the result after execution 
OOP 
Python Language 
Cross Platform (Python VM) 
Third-Party Resources 
(Data Analysis, Graphics, Website Development) 
Simple, and easy to learn 
Benefit of Python
Data Analysis 
Scipy 
Numpy 
Scikit-learn 
Pandas 
Companies That Use Python 
Use InfoLite Tool To Extract DOM
Use Python To Build Up Dashboard
Monitor Social Media and News 
Monitor posts on social media 
Configure keywords and alerts 
Use a line plot to show daily post statistics 
Apple Daily, NOWnews, udn, CNA, and Storm Media, plus other financial media
Daily Statistics Report 
Examine Associated Articles 
Configure Alert and Keyword 
Configure Monitor Channel 
Track Specific Article 
Have You Learned Big Data? 
The 3Vs of Big Data
Product Centric 
Customer Centric 
Product Centric vs. Customer Centric
Customer Centric? 
http://goo.gl/iuy4lY
Personal Recommendation
Knowing Who You Are? 
Personal recommendation 
Customer relationship management 
Knowing What the Future Looks Like? 
From history, we can see the future 
Predictive analysis 
Knowing What is Hidden Beneath? 
Correlation, Correlation, Correlation 
So… What is Big Data?
So… How To Analyze?
Apache Project – From Yahoo 
Feature 
Extensible 
Cost Effective 
Flexible 
Highly Fault Tolerant 
Hadoop
Hadoop Eco System 
HDFS 
MR 
IMPALA 
HBASE 
PIG 
HIVE 
SQOOP FLUME 
HUE, Oozie, Mahout
Tools for different scale 
Size | Classification | Tools 
Lines | Sample Data | Analysis and Visualisation: Whiteboard, Bash, ... 
KBs – low MBs | Prototype Data | Analysis and Visualisation: Matlab, Octave, R, Processing, Bash, ... 
MBs – low GBs | Online Data | Storage: MySQL (DBs), ...; Analysis: NumPy, SciPy, Pandas, Weka, ...; Visualisation: Flare, AmCharts, Raphael 
GBs – TBs – PBs | Big Data | Storage: HDFS, HBase, Cassandra, ...; Analysis: Hive, Giraph, Hama, Mahout
Amazon
Facebook
Recommendation System 
Javascript 
Flume 
HDFS 
HBase 
Pig 
Mahout
Item-Based
User-Based
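As a rough illustration of the item-based approach, the sketch below scores an unrated item by how similar it is to the items a user has already rated, using cosine similarity between item rating vectors. The ratings are hypothetical and the function names are assumptions; the deck itself builds its model with Mahout, which this plain-Python version does not reproduce.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data.
ratings = {
    "alice": {"tacos": 5, "soup": 1},
    "sarah": {"tacos": 4, "soup": 1, "grilled cheese": 5},
    "alex": {"soup": 5, "grilled cheese": 1},
}


def item_vector(item):
    """Ratings given to one item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(item_a, item_b):
    """Cosine similarity between two items' rating vectors."""
    va, vb = item_vector(item_a), item_vector(item_b)
    dot = sum(va[u] * vb[u] for u in set(va) & set(vb))
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def recommend(user):
    """Predict ratings for unseen items as a similarity-weighted average."""
    seen = ratings[user]
    all_items = {item for r in ratings.values() for item in r}
    scored = []
    for candidate in all_items - set(seen):
        sims = {item: cosine(candidate, item) for item in seen}
        total = sum(sims.values())
        if total > 0:
            predicted = sum(sims[item] * seen[item] for item in seen) / total
            scored.append((candidate, predicted))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(recommend("alice"))
```

A user-based variant would swap the roles: compare users by the items they rated in common, then borrow ratings from the most similar users.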
Monitor User Rating
Send User Behavior to Backend
Use Flume To Collect Streaming Data 
From /tmp/postlog.txt To /user/cloudera/flume
JSON sample data 
{"food":"Tacos", "person":"Alice", "amount":3} 
{"food":"Tomato Soup", "person":"Sarah", "amount":2} 
{"food":"Grilled Cheese", "person":"Alex", "amount":5} 
Demo Code 
second_table = LOAD 'second_table.json' 
USING JsonLoader('food:chararray, person:chararray, amount:int'); 
Use Pig To Load JSON
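For readers without a Pig installation, the same three sample records can be inspected with Python's standard `json` module. This is only a local stand-in for checking the schema (`food`, `person`, `amount`) that the `JsonLoader` call declares; it is not part of the Flume/Pig pipeline shown in the deck.

```python
import json

# The three sample records from the slide, one JSON object per line.
sample_lines = [
    '{"food":"Tacos", "person":"Alice", "amount":3}',
    '{"food":"Tomato Soup", "person":"Sarah", "amount":2}',
    '{"food":"Grilled Cheese", "person":"Alex", "amount":5}',
]

records = [json.loads(line) for line in sample_lines]

# Check that every record matches the declared schema.
for record in records:
    assert isinstance(record["food"], str)
    assert isinstance(record["person"], str)
    assert isinstance(record["amount"], int)

total_amount = sum(record["amount"] for record in records)
print(f"{len(records)} records, total amount {total_amount}")
```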
Build Recommendation Model
$ hbase shell 
> create 'mydata', 'mycf' 
Build Table In HBase
Examine Data In HDFS
Use Pig To Transfer Data Into HBase
Examine Data In HBase
Build API
Recommendation System
Focus on algorithms 
Divide and Conquer, Trie, Collaborative Filtering 
Be an expert in a single programming language 
But know which tools and algorithms you can use to solve your problem 
Define your role 
Statistician 
Software engineer 
What You Should Do
Website: 
largitdata.com 
ywchiu.com 
Email: 
david@largitdata.com 
tr.ywchiu@gmail.com 
Contacts
What's New in Predictive Analytics IBM SPSS
 
What's New in Predictive Analytics IBM SPSS - Apr 2016
What's New in Predictive Analytics IBM SPSS - Apr 2016What's New in Predictive Analytics IBM SPSS - Apr 2016
What's New in Predictive Analytics IBM SPSS - Apr 2016
 
Intro to Data Analytics with Oscar's Director of Product
 Intro to Data Analytics with Oscar's Director of Product Intro to Data Analytics with Oscar's Director of Product
Intro to Data Analytics with Oscar's Director of Product
 
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersMixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
 
UXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataUXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big Data
 
Data science tutorial
Data science tutorialData science tutorial
Data science tutorial
 
Big data and Marketing by Edward Chenard
Big data and Marketing by Edward ChenardBig data and Marketing by Edward Chenard
Big data and Marketing by Edward Chenard
 

Kürzlich hochgeladen

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...amitlee9823
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...gajnagarg
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 

Kürzlich hochgeladen (20)

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 

Data Analysis - Making Big Data Work

  • 1. Data Analysis: Making Big Data Work. David Chiu, 2014/11/24
  • 2. About Me: Founder of LargitData; ex-Trend Micro engineer; ywchiu.com
  • 3. Big Data & Data Science
  • 9. As a data scientist, do you really need to know that much? Seriously?
  • 10. Required Skills. Statistics: single variable, multivariable, ANOVA. Data munging: data extraction, transformation, loading. Data visualization: figures, business intelligence.
  • 11. What You Probably Need Is a Team. Business analyst: knowing how to use different tools under different circumstances. Statistician: how to process big data? DBA: how to deal with unstructured data. Software engineer: knowing how to use statistics.
  • 12. Four Dimensions. Single machine: memory, R, local file. Cloud: distributed, Hadoop, HDFS. Statistics: analysis, linear algebra. Architect: management, standard. Concept: MapReduce, linear algebra, logistic regression. Tool: Hadoop, PostgreSQL, R. Analyst: how to use these tools. Hackers: R, Python, Java.
  • 13. What Do Data Scientists Do? “80% are doing summing and averaging.” Content: 1. Data munging 2. Data analysis 3. Interpreting results
  • 14. Applications of Data Analysis. Text mining: classifying spam mail. Building an index: data search engine. Social network analysis: finding opinion leaders. Recommendation system: what does the user like? Opinion mining: positive/negative opinion. Fraud analysis: credit card fraud.
  • 15. Feed data to the computer and make the computer do the analysis
  • 17. Predictive Analysis Learn from experience (Data), to predict future behavior What to Predict? e.g. Who is likely to click on that ad? For What? e.g. According to the click possibility and revenue to decide which ad to show. Predictive Analysis
  • 18. Customer buying beer will also buy pampers? People are surfing telephone fee rate are likely to switch its vendor People belong to same group are tend to have same telecom vendor Surprising Conclusion
  • 19. According to personal behavior, predictive model can use personal characteristic to generate a probabilistic score, which the higher the score, the more likely the behavior. Predictive Model
  • 20. Linear Model e.g. Based on a cosmetic ad. We can give 90% weight to female customers, give10% to male customer. Based on the click probability (15%), we can calculate the possibility score (or probability) Female 13.5%,Male1.5% Rule Model e.g. If the user is “She” And Income is over 30k And haven’t seen the ad yet The click rate is 11% Simple Predictive Model
  • 21. Induction From detail to general A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E -- Tom Mitchell (1998) Discover an effective model Start from a simple model Update the model based on feeding data Keep on improving prediction power Machine Learning
  • 22. Statistic Analysis Regression Analysis Clustering Classification Recommendation Text Mining Application 22
  • 24. Decision Tree. Split: Rate > 1,299/month (Yes/No). Leaves: probability to switch vendor 15%; probability to switch vendor 3%.
  • 25. Decision Tree. Splits: Rate > 1,299/month (Yes/No); Income > 22,000 (Yes/No). Leaves: probability to switch vendor 3%; probability to switch vendor 10%; probability to switch vendor 22%.
  • 26. Decision Tree. Splits: Rate > 1,299/month (Yes/No); Income > 22,000 (Yes/No); Free for intranet (Yes/No). Leaves: probability to switch vendor 10%; probability to switch vendor 22%; probability to switch vendor 1%; probability to switch vendor 7%.
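A decision tree like the one built up over slides 24 to 26 is just a set of nested rules. The sketch below is one possible reading: the flattened transcript does not show which branch leads to which leaf, so the assignment of Yes/No branches to leaf probabilities is an assumption, and the function and parameter names are hypothetical.

```python
def switch_probability(monthly_rate: float, income: float, free_intranet: bool) -> float:
    """Probability that a customer switches vendor, read off the final tree.

    Branch-to-leaf mapping is assumed; the slides only list the splits and leaves.
    """
    if monthly_rate > 1299:
        # Assumed: the high-rate branch is refined by income.
        return 0.10 if income > 22000 else 0.22
    # Assumed: the low-rate branch is refined by free intranet calls.
    return 0.01 if free_intranet else 0.07


print(switch_probability(monthly_rate=1500, income=20000, free_intranet=False))
```

Each extra split replaces one leaf with two more specific ones, which is how the tree grows from slide 24 to slide 26.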
  • 27. Supervised Learning Regression Classification Unsupervised Learning Dimension Reduction Clustering Machine Learning
  • 29. Classification e.g. Stock prediction on bull/bear market Regression e.g. Price prediction Supervised Learning
  • 30. Dimension Reduction e.g. Making a new index Clustering e.g. Customer Segmentation Unsupervised Learning
  • 31. Lift The better the lift, the greater the cost? The more decision rule, the more campaign? Design strategy for different persona? The lift for 4 campaign? The lift for 20 ampaign? Lift
  • 32. Overfitting: can we use the production rate of butter to predict the stock market?
  • 33. Overfitting. Using noise as information; over-assumption; over-interpretation. What an overfitted model learns is not the truth; it is like memorizing all the answers to a single test.
  • 34. Testing the Model: use external data, or hold out part of the data, as the testing dataset
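Holding out part of the data can be sketched as a shuffled split. This is a minimal illustration using only the Python standard library; `holdout_split` is a hypothetical helper, and the 70/30 ratio and seed are example values.

```python
import random


def holdout_split(rows, train_fraction=0.7, seed=2):
    """Shuffle the rows reproducibly, then cut them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = holdout_split(range(10))
print(len(train), len(test))
```

The model is fitted on `train` only. Its score on `test` then estimates how it behaves on data it has not memorized, which is the check against overfitting.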
  • 36. R Language. Statistics on the fly; built-in math and graphics functions; free and open source. http://cran.r-project.org/src/base/
  • 37. R Language. Functional programming: use function definitions to retrieve answers. Interpreted language: statistics on the fly. Object-oriented language: S3 and S4 methods.
  • 38. Most Used Analytic Language. The most popular languages are R, Python (39%), SQL (37%), and SAS (20%). By Gregory Piatetsky, Aug 27, 2013.
  • 39. Kaggle (http://www.kaggle.com/): the most often used language in Kaggle competitions
  • 40. Data Scientists at Google and Apple Use R
        Q: What is your programming language of choice, R, Python or something else?
        “I use R, and occasionally matlab, for data analysis. There is a large, active and extremely knowledgeable R community at Google.”
        http://simplystatistics.org/2013/02/15/interview-with-nick-chamandy-statistician-at-google/
        “Expert knowledge of SAS (With Enterprise Guide/Miner) required and candidates with strong knowledge of R will be preferred”
        http://www.kdnuggets.com/jobs/13/03-29-apple-sr-data-scientist.html?utm_source=twitterfeed&utm_medium=facebook&utm_campaign=tfb&utm_content=FaceBook&utm_term=analytics#.UVXibgXOpfc.facebook
  • 41. Customer Churn Analysis: discover which customers are likely to churn
  • 42. Data Description. Account information: state, account length, area code, phone number. User behavior: international plan; voice mail plan; number vmail messages; total day minutes, total day calls, total day charge; total eve minutes, total eve calls, total eve charge; total night minutes, total night calls, total night charge; total intl minutes, total intl calls, total intl charge; number customer service calls. Target: churn (Yes/No).
  • 43. Split data into training and testing datasets (70% as training dataset, 30% as testing dataset)
        > install.packages("C50")
        > library(C50)
        > data(churn)
        > str(churnTrain)
        > churnTrain = churnTrain[, !names(churnTrain) %in% c("state", "area_code", "account_length")]
        > set.seed(2)
        > ind <- sample(2, nrow(churnTrain), replace = TRUE, prob = c(0.7, 0.3))
        > trainset = churnTrain[ind == 1,]
        > testset = churnTrain[ind == 2,]
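  • 44. Build Classifier (Classification)
        library(rpart)                               # rpart() comes from the rpart package
        churn.rp <- rpart(churn ~ ., data=trainset)  # fit a classification tree on the training set
        plot(churn.rp, margin= 0.1)
        text(churn.rp, all=TRUE, use.n = TRUE)       # label nodes with class counts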
  • 45. Prediction Result
        > predictions <- predict(churn.rp, testset, type="class")
        > table(testset$churn, predictions)
             pred
               no yes
          no  859  18
          yes  41 100
  • 46. Use Confusion Matrix
        > library(caret)   # confusionMatrix() comes from the caret package
        > confusionMatrix(table(predictions, testset$churn))
        Confusion Matrix and Statistics
        predictions yes  no
                yes 100  18
                no   41 859
        Accuracy : 0.942
        95% CI : (0.9259, 0.9556)
        No Information Rate : 0.8615
        P-Value [Acc > NIR] : < 2.2e-16
        Kappa : 0.7393
        Mcnemar's Test P-Value : 0.004181
        Sensitivity : 0.70922
        Specificity : 0.97948
        Pos Pred Value : 0.84746
        Neg Pred Value : 0.95444
        Prevalence : 0.13851
        Detection Rate : 0.09823
        Detection Prevalence : 0.11591
        Balanced Accuracy : 0.84435
        'Positive' Class : yes
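Most of the statistics reported on slide 46 follow directly from the four cell counts of the confusion matrix. The sketch below recomputes them by hand in Python, taking "yes" (churn) as the positive class as the output states. The variable names are illustrative and do not come from any library.

```python
# Cell counts from the slide (rows = predicted, columns = actual):
TP, FP = 100, 18   # predicted "yes": actually yes / actually no
FN, TN = 41, 859   # predicted "no":  actually yes / actually no
total = TP + FP + FN + TN

sensitivity = TP / (TP + FN)   # share of real churners that were caught
specificity = TN / (TN + FP)   # share of non-churners correctly left alone

metrics = {
    "accuracy": (TP + TN) / total,
    "sensitivity": sensitivity,
    "specificity": specificity,
    "pos_pred_value": TP / (TP + FP),
    "neg_pred_value": TN / (TN + FN),
    "prevalence": (TP + FN) / total,
    "balanced_accuracy": (sensitivity + specificity) / 2,
}

for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

Accuracy looks high largely because non-churners dominate the test set. The sensitivity of about 0.71 shows that roughly three in ten churners are still missed.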
  • 47. Use Testing Data to Validate Result
        library(ROCR)                                    # prediction() and performance() come from ROCR
        predictions <- predict(churn.rp, testset, type="prob")
        pred.to.roc <- predictions[, 1]                  # class probabilities from the first column
        pred.rocr <- prediction(pred.to.roc, as.factor(testset[,(dim(testset)[[2]])]))  # last column holds churn
        perf.rocr <- performance(pred.rocr, measure = "auc", x.measure = "cutoff")
        perf.tpr.rocr <- performance(pred.rocr, "tpr","fpr")
        plot(perf.tpr.rocr, colorize=TRUE, main=paste("AUC:",(perf.rocr@y.values)))
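The AUC shown in the plot title on slide 47 has a simple interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by brute-force pair counting. It is not how ROCR computes the value internally, and the scores are made-up toy data.

```python
def auc(scores, labels):
    """AUC by pair counting: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Runs in O(n_pos * n_neg), fine for small examples.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy data: one positive (0.4) is ranked below one negative (0.6).
print(auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.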
  • 48. Finding the Most Important Variable
        library(rminer)                                  # fit(), Importance() and mgraph() come from rminer
        model=fit(churn~.,trainset,model="svm")
        VariableImportance=Importance(model,trainset,method="sensv")
        L=list(runs=1,sen=t(VariableImportance$imp),sresponses=VariableImportance$sresponses)
        mgraph(L,graph="IMP",leg=names(trainset),col="gray",Grid=10)
  • 49. Dynamic Language Execution at runtime Dynamic Type Interpreted Language See the result after execution OOP Python Language 49
  • 50. Cross Platform(Python VM) Third-Party Resource (Data Analysis、Graphics、Website Development) Simple, and easy to learn Benefit of Python
  • 51. Data Analysis Scipy Numpy Scikit-learn Pandas 51
  • 52. Company that use python 52
  • 53. Use InfoLite Tool To Extract DOM
  • 54. Use Python To Build Up Dashboard
  • 55. Monitor Social Media and News. Monitor posts on social media; configure keywords and alerts; use a line plot to show daily post statistics. Sources: Apple Daily, NOWnews, UDN, Central News Agency, Storm Media, and other financial media.
  • 58. Configure Alerts and Keywords
  • 61. Have You Learned Big Data?
  • 63. The 3Vs of Big Data
  • 65. Product Centric vs. Customer Centric
  • 68. So… What Is Big Data? Knowing who you are: personal recommendation, customer relationship management. Knowing what the future looks like: from history we can see the future; predictive analysis. Knowing what is hidden beneath: correlation, correlation, correlation.
  • 69. So… How To Analyze?
  • 70. Hadoop: an Apache project, from Yahoo. Features: extensible, cost effective, flexible, highly fault tolerant.
  • 71. Hadoop Ecosystem: HDFS, MapReduce (MR), Impala, HBase, Pig, Hive, Sqoop, Flume, Hue, Oozie, Mahout
  • 72. Tools for Different Scales
        Size | Classification | Tools
        Lines | Sample data | Analysis and visualisation: Whiteboard, Bash, ...
        KBs – low MBs | Prototype data | Analysis and visualisation: Matlab, Octave, R, Processing, Bash, ...
        MBs – low GBs | Online data | Storage: MySQL (DBs), ...; Analysis: NumPy, SciPy, Pandas, Weka, ...; Visualisation: Flare, AmCharts, Raphael
        GBs – TBs – PBs | Big data | Storage: HDFS, HBase, Cassandra, ...; Analysis: Hive, Giraph, Hama, Mahout
  • 75. Recommendation System: JavaScript, Flume, HDFS, HBase, Pig, Mahout
  • 79. Send User Behavior to Backend
  • 80. Use Flume To Collect Streaming Data From /tmp/postlog.txt To /user/cloudera/flume
  • 81. Use Pig To Load JSON
        JSON sample data:
        {"food":"Tacos", "person":"Alice", "amount":3}
        {"food":"Tomato Soup", "person":"Sarah", "amount":2}
        {"food":"Grilled Cheese", "person":"Alex", "amount":5}
        Demo code:
        second_table = LOAD 'second_table.json' USING JsonLoader('food:chararray, person:chararray, amount:int');
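The sample on slide 81 is line-delimited JSON: one object per line with the fields food, person, and amount. For a quick local check of such a file before loading it into Pig, the same records can be parsed with Python's standard `json` module. This sketch embeds the slide's three records as a string; the total at the end is just an illustrative aggregation.

```python
import json

# The three sample records from the slide, one JSON object per line.
SAMPLE = """{"food":"Tacos", "person":"Alice", "amount":3}
{"food":"Tomato Soup", "person":"Sarah", "amount":2}
{"food":"Grilled Cheese", "person":"Alex", "amount":5}"""

# Parse each non-empty line into a dict with the schema food / person / amount.
records = [json.loads(line) for line in SAMPLE.splitlines() if line.strip()]

total_amount = sum(record["amount"] for record in records)
print(len(records), total_amount)
```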
  • 83. Build Table In HBase
        $ hbase shell
        > create 'mydata', 'mycf'
  • 85. Use Pig To Transfer Data Into HBase
  • 89. What You Should Do. Focus on algorithms: divide and conquer, trie, collaborative filtering. Be an expert in a single programming language, but know which tools and algorithms you can use to solve your problem. Define your role: statistician or software engineer.
  • 90. Contacts. Website: largitdata.com, ywchiu.com. Email: david@largitdata.com, tr.ywchiu@gmail.com