Analyzing Realtime News
Raffaele Lorusso – Marco Fusi
Milan, November 2015 #RateMe
CREATING THE NEWS
CONTEXT: This project was carried out during the 2015-2016 master “Business Intelligence and Big Data Analytics” at Università di Milano - Bicocca.
IT achieves a substantial reduction in operating costs by modernizing its data architectures. The innovation includes implementing an Active Archive for cold data, offloading ETL processes, and enriching the data.
	
BIG DATA: What are the technologies and the potential of Big Data?
TWITTER: Twitter as an example of new media and real-time news sharing.
TIMELINE
NEWS LIFECYCLE: How news spreads on Twitter and other new media
[Figure, animated over six slides: a timeline on which a "News" item is joined by a steadily growing number of "Tweet" items.]
CREATE: Twitter is an easy way to create and share news and opinions. It is a new flow of content and information that brings huge opportunities.
ANALYZE: The collected data supports statistical analysis, from which we can extract quantitative and qualitative indicators to identify trends, correlations, flows, sentiment, and so on.
FOLLOW: Follow how the news evolves over time by analyzing it, placing it in its real-world context, and comparing it with the external events that can help generate and modify the news itself.
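As a rough illustration of following a news item over time, the sketch below counts tweets per hour so that peaks can be compared with external events. The function name `tweets_per_hour` and the sample timestamps are hypothetical and are not taken from the project.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Count tweets in each hourly bucket, returned in chronological order."""
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return sorted(buckets.items())


# Hypothetical tweet timestamps, for illustration only.
sample = [
    datetime(2015, 11, 19, 9, 5),
    datetime(2015, 11, 19, 9, 40),
    datetime(2015, 11, 19, 10, 15),
    datetime(2015, 11, 19, 12, 0),
]

hourly = tweets_per_hour(sample)
for hour, count in hourly:
    print(hour.isoformat(), count)
```

Hours with no tweets simply do not appear in the result; a dashboard would probably fill those gaps with zeros before plotting.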
ARCHITECTURE: Main Components
• Data sources
• Batch layer
• Speed layer
• Machine learning
• Presentation layer
ARCHITECTURE: The Lambda Architecture
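The idea behind the Lambda Architecture can be sketched in a few lines: a batch layer periodically recomputes a view from the full dataset, a speed layer keeps incremental counts for data that arrived since the last batch run, and queries merge the two. This is a toy, in-memory sketch; the class `LambdaCounter` and its methods are hypothetical and do not describe the components actually used in the project.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda Architecture that counts hashtag occurrences."""

    def __init__(self):
        self.master = []             # append-only master dataset
        self.batch_view = Counter()  # rebuilt from scratch by the batch layer
        self.speed_view = Counter()  # incremental counts since the last batch

    def ingest(self, hashtag):
        """New data goes to the master dataset and to the speed layer."""
        self.master.append(hashtag)
        self.speed_view[hashtag] += 1

    def run_batch(self):
        """Batch layer: recompute the view over all data, then reset the speed view."""
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, hashtag):
        """Presentation side: merge the batch view with the real-time view."""
        return self.batch_view[hashtag] + self.speed_view[hashtag]


lam = LambdaCounter()
for tag in ["#hadoop", "#spark", "#spark"]:
    lam.ingest(tag)
before = lam.query("#spark")  # answered by the speed layer alone
lam.run_batch()
lam.ingest("#spark")
after = lam.query("#spark")   # batch view plus one fresh event
print(before, after)
```

In a real deployment the batch run takes time and events keep arriving while it executes, so resetting the speed view needs more care than the single `clear()` call used here.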
Case Study: Big Data Ecosystem on Twitter
Big Data Ecosystem
[Figure with two groups labelled BIG DATA FRONTEND and BIG DATA BACKEND.]
Big Data Ecosystem at a glance
[Infographic; the text preserves only the values: 40k, 1 month, 100k, 28k, 170k, 1.2k, 30k.]
SENTIMENT ANALYSIS
From the text of a tweet it is possible to compute a measure of the sentiment associated with it.
In this project we built two different models:
• Dictionary algorithm (Big Data backend)
• Cluster then predict (Big Data frontend)
SENTIMENT ANALYSIS: Dictionary algorithm (Big Data backend)
This model splits a tweet into tokens, one per word, and then assigns a score to each word by looking it up in a dictionary table of positive and negative words, each with a numerical score.
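A minimal sketch of the dictionary approach, under stated assumptions: the five-word `LEXICON`, the regular-expression tokenizer, and the choice to sum the word scores are all illustrative, since the slides do not say which dictionary or aggregation was used. Tweets with no dictionary word are left unscored, and the share of scored tweets is computed as a coverage figure.

```python
import re

# Hypothetical mini-dictionary: word -> sentiment score (assumption).
LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0, "bad": -1.0, "awful": -2.0}


def tokenize(text):
    """Lower-case the tweet and split it into single words."""
    return re.findall(r"[a-z']+", text.lower())


def score_tweet(text, lexicon=LEXICON):
    """Sum the scores of known words; return None if no word is in the dictionary."""
    hits = [lexicon[token] for token in tokenize(text) if token in lexicon]
    return sum(hits) if hits else None


tweets = [
    "I love #Spark, great release!",
    "Hadoop upgrade went bad",
    "Meetup at 6pm tonight",
]
scores = [score_tweet(t) for t in tweets]

# Fraction of tweets that received a score at all.
coverage = sum(s is not None for s in scores) / len(scores)
print(scores, round(coverage, 2))
```

Returning `None` rather than `0` keeps "neutral" and "not covered by the dictionary" apart, which is the distinction a coverage percentage depends on.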
SENTIMENT ANALYSIS: Cluster then predict (Big Data frontend)
This model clusters tweets that share similar words and then applies a Random Forest algorithm to each cluster.
Reference: “Improved Twitter Sentiment prediction through Cluster then Predict Model”, International Journal of Computer Science and Network, August 2015.
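The cluster-then-predict pattern can be sketched with the standard library alone. Two simplifications are made here and should be read as assumptions: clustering is a small k-means-style loop over bag-of-words vectors with cosine similarity, and the per-cluster predictor is a plain majority vote standing in for the Random Forest used in the project. All function names and the sample tweets are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def bag(text):
    """Bag-of-words vector for a tweet."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest(vec, centroids):
    """Index of the centroid most similar to vec."""
    return max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))


def cluster(bags, k, iterations=10):
    """k-means-style clustering, seeded with the first k tweets."""
    centroids = [Counter(b) for b in bags[:k]]
    assignment = []
    for _ in range(iterations):
        assignment = [nearest(b, centroids) for b in bags]
        for c in range(k):
            members = [b for b, a in zip(bags, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members, Counter())
    return assignment, centroids


def fit(texts, labels, k=2):
    """Cluster the tweets, then train one predictor per cluster."""
    assignment, centroids = cluster([bag(t) for t in texts], k)
    votes = defaultdict(Counter)
    for a, label in zip(assignment, labels):
        votes[a][label] += 1
    # Majority label per cluster: a stand-in for a per-cluster Random Forest.
    predictors = {c: v.most_common(1)[0][0] for c, v in votes.items()}
    return centroids, predictors, assignment


def predict(text, centroids, predictors):
    """Route the tweet to its nearest cluster and use that cluster's predictor."""
    return predictors[nearest(bag(text), centroids)]


texts = [
    "spark streaming is fast",
    "hadoop cluster is down",
    "spark streaming is great",
    "hadoop cluster is slow",
]
labels = ["pos", "neg", "pos", "neg"]
centroids, predictors, assignment = fit(texts, labels)
print(assignment, predict("spark is fast today", centroids, predictors))
```

With a library such as scikit-learn available, the two stages would more naturally be a clustering estimator followed by one Random Forest classifier fitted per cluster.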
DASHBOARD
*LIVE DEMO
CONCLUSIONS: Big Data Ecosystem - Architecture
• The «Lambda Architecture» seems a good approach because it balances the need for real-time analysis against batch computation
• The Big Data ecosystem is composed of heterogeneous technologies, each of which solves just one part of the whole problem
• Many technologies are easily interoperable and composable
• The Big Data market has many first movers, but also consolidated players that are nowadays a must-have in a Big Data architecture
CONCLUSIONS: Big Data Ecosystem – Data Analysis
• The most tweeted technologies are not always the ones with the largest market share
• There seems to be no correlation between real Big Data events and tweet volumes
• In this case study, sentiment analysis with the cluster-then-predict model performed worse than with the dictionary algorithm
• The dictionary algorithm depends heavily on a good dictionary with many words: with the dictionary we used, only 42% of the tweets were scored
• The analysis of senders and mentioned users showed that many influencers are closely connected to the technologies, or are even the official accounts of those technologies
• 45% of the tweets were sent from the official apps for the Web platform, Android and iOS
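The sender and mentioned-user analysis can be approximated by counting how often each account is mentioned by others. This is only a sketch of that idea; the function `mention_counts`, the regular expression, and the sample accounts are hypothetical, and the project may have used a different measure of influence.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def mention_counts(tweets):
    """Count mentions per account from (sender, text) pairs, ignoring self-mentions."""
    counts = Counter()
    for sender, text in tweets:
        for user in MENTION.findall(text):
            if user.lower() != sender.lower():
                counts[user.lower()] += 1
    return counts


# Hypothetical tweets and account names, for illustration only.
sample = [
    ("alice", "Trying the new release from @Tool_Official"),
    ("bob", "@tool_official and @other_tool side by side"),
    ("tool_official", "Thanks @alice! (posted by @tool_official)"),
]

counts = mention_counts(sample)
print(counts.most_common(1))
```

Accounts that top this ranking are candidate influencers; checking whether they are official product accounts is a manual step in this sketch.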
Case Study: Data Science seminar @masterBIBDA
Milan, 19 November 2015 #RateMe
Game: Rate this seminar
Players: Our speakers and YOU!
Objectives: Have fun!
#RateMe Rules
• Tweet to @masterbibda
• Reference the keyword by using a hashtag: #datascientistprofiles
• Vote: alto – medio – basso (high – medium – low)
Example (shown as an image on the slide)
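To make the rules concrete, here is a hedged sketch of how a vote tweet could be parsed. The function `parse_vote`, the decision to skip the #RateMe tag when looking for the keyword, and the sample tweets are assumptions; the slides do not show how the project extracted the votes.

```python
import re

# Vote words from the rules, mapped to English labels.
VOTES = {"alto": "high", "medio": "medium", "basso": "low"}


def parse_vote(text):
    """Return (keyword, vote) if the tweet follows the rules, otherwise None."""
    lowered = text.lower()
    if "@masterbibda" not in lowered:
        return None
    # Assumption: #RateMe is the game tag, not a seminar keyword.
    hashtags = [h for h in re.findall(r"#(\w+)", lowered) if h != "rateme"]
    votes = [w for w in re.findall(r"\w+", lowered) if w in VOTES]
    if not hashtags or not votes:
        return None
    return hashtags[0], VOTES[votes[0]]


# Hypothetical tweet written for this sketch, not the example from the slide.
result = parse_vote("@masterbibda #datascientistprofiles alto #RateMe")
print(result)
```

Tweets that miss the mention, the keyword, or the vote word return `None`, so they can still be kept for free-text analysis without being counted as votes.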
LET'S CREATE THE NEWS
and…
Feel free to tweet your thoughts @masterbibda!
Every tweet will be analyzed!
DASHBOARD
*LIVE DEMO
[Figure: the timeline of tweets building up to a news item.]
Enjoy #RateMe
THANKS!

Analyzing Real Time News

  • 1. Analyzing Realtime News Raffaele Lorusso – Marco Fusi Milan, November 2015 #RateMe
  • 2. CREARE LA NOTIZIA This project has been realized during the 2015-2016 master “Business Intelligence and Big Data Analytics” at Università di Milano - BicoccaCONTEXT #RateMe
  • 3. L'IT riesce a conseguire una sostanziale riduzione dei costi operativi attraverso la modernizzazione delle proprie Data Architecture. L'innovazione include l'implementazione di Active Archive per i cold data, l’offloading di processi ETL e l'enrichment dei dati. CREARE LA NOTIZIA BIG DATA Quali son le tecnologie e le potenzialità dei Big Data Twitter as an example of new media and realtime news sharingTWITTER #RateMe
  • 4-9. TIMELINE / NEWS LIFECYCLE: How news spreads on Twitter and other new media. [Animated diagram across six slides: a single "News" item on the timeline is followed by a growing number of "Tweet" items on each successive slide.] #RateMe
  • 10. CREATE: Twitter is an easy way to create and share news and opinions. It is a new flow of content and information that brings huge opportunities. ANALYZE: The collected data supports statistical analysis that yields quantitative and qualitative indicators for identifying trends, correlations, flows, sentiment, and more. FOLLOW: Follow how the news evolves over time by analyzing it, placing it in its real-world context, and comparing it with the external events that can help generate and modify the news itself. #RateMe
  • 11. ARCHITECTURE: Main Components #RateMe
  • 13. Case Study: Big Data Ecosystem on Twitter #RateMe
  • 14. Big Data Ecosystem: BIG DATA FRONTEND, BIG DATA BACKEND #RateMe
  • 15. Big Data Ecosystem at a glance 40k 1 Month 100 k 28 k 170 k 1.2 k 30 k #RateMe
  • 16. Big Data Ecosystem #RateMe
  • 17. SENTIMENT ANALYSIS: From the text of a tweet it is possible to compute a measure of the sentiment associated with it. In this project we built two different models: DICTIONARY ALGORITHM (Big Data backend) and CLUSTER THEN PREDICT (Big Data frontend). #RateMe
  • 18. SENTIMENT ANALYSIS, DICTIONARY ALGORITHM (Big Data backend): This model splits a tweet into single-word tokens, then assigns a score to each word by looking it up in a dictionary table of positive and negative words with numerical scores. #RateMe
  • 19. SENTIMENT ANALYSIS, CLUSTER THEN PREDICT (Big Data frontend): This model clusters tweets with similar words and then applies a Random Forest algorithm to each cluster. Reference: “Improved Twitter Sentiment prediction through Cluster then Predict Model”, International Journal of Computer Science and Network, August 2015. #RateMe
  • 21. CONCLUSIONS (Big Data Ecosystem - Architecture) • The «Lambda Architecture» seems a good approach because it balances the need for real-time analysis with batch computation • The Big Data ecosystem is composed of heterogeneous technologies, each of which solves just one part of the whole problem • Many technologies are easily interoperable and composable • The Big Data market has many first movers, but also consolidated players that are nowadays a must-have in a Big Data architecture #RateMe
  • 22. CONCLUSIONS (Big Data Ecosystem - Data Analysis) • The most tweeted technologies are not always the ones with the largest market share • There seems to be no correlation between real Big Data events and tweet volumes • In this case study, sentiment analysis with the cluster-then-predict model performed worse than with the dictionary algorithm • The dictionary algorithm depends heavily on a good dictionary with a lot of words: with the dictionary we used, only 42% of tweets were scored • The analysis of senders and mentioned users showed that many influencers are closely connected to the technologies, or are even the official accounts of those technologies • 45% of the tweets were sent from the official apps for the Web platform, Android and iOS #RateMe
  • 23. Case Study: Data Science seminar @masterBIBDA Milan, 19 November 2015 #RateMe
  • 24. Game: Rate this seminar. Players: Our speakers and YOU! Objectives: Have fun! Rules: #RateMe #RateMe
  • 25. Tweet to @masterbibda. Reference the keyword with a hashtag: #datascientistprofiles. Vote: alto, medio, or basso (high, medium, or low). Example #RateMe #RateMe
  • 26. LET'S CREATE THE NEWS and… feel free to tweet your thoughts to @masterbibda! Every tweet will be analyzed! #RateMe #RateMe
  • 29. Raffaele Lorusso – Marco Fusi Milan, November 2015 THANKS! Analyzing Realtime News #RateMe
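
The dictionary algorithm on slide 18 (tokenize a tweet, look each word up in a scored word list) can be illustrated with a minimal Python sketch. The slides do not show the dictionary or the code the authors used, so the `LEXICON` entries, the tokenizer, the choice to sum word scores, and the function names below are all assumptions made for illustration. The `coverage` helper shows how a figure like the "42% of tweets scored" on slide 22 could be measured.

```python
import re

# Hypothetical toy lexicon: the dictionary used in the project is not shown in the slides.
LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0,
           "bad": -1.0, "slow": -1.0, "hate": -2.0}

def tokenize(tweet):
    """Lowercase the tweet and split it into single-word tokens."""
    return re.findall(r"[a-z']+", tweet.lower())

def score_tweet(tweet, lexicon=LEXICON):
    """Sum the scores of the words found in the lexicon.

    Returns None when no word is in the lexicon, so that unscored
    tweets can be told apart from tweets with a neutral score of 0.
    Summing is an assumption; an average would be equally plausible.
    """
    hits = [lexicon[token] for token in tokenize(tweet) if token in lexicon]
    return sum(hits) if hits else None

def coverage(tweets, lexicon=LEXICON):
    """Fraction of tweets that contain at least one lexicon word."""
    if not tweets:
        return 0.0
    scored = sum(1 for tweet in tweets if score_tweet(tweet, lexicon) is not None)
    return scored / len(tweets)

sample = ["I love #Hadoop, it's great", "Spark is slow", "New release today"]
print([score_tweet(t) for t in sample])
print(coverage(sample))
```

A small lexicon leaves many tweets unscored, which is consistent with the sensitivity to dictionary size noted in the conclusions.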
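
The voting rules on slide 25 (mention @masterbibda, reference a keyword hashtag, vote alto, medio, or basso) could be checked with a parser along the following lines. This is only a sketch: the exact tweet format is not recoverable from the slide text, so the matching rules, the exclusion of the #RateMe tag from the keywords, and the `parse_vote` name are assumptions.

```python
import re

# Vote keywords from slide 25, mapped to English labels.
VOTES = {"alto": "high", "medio": "medium", "basso": "low"}

def parse_vote(tweet, account="@masterbibda"):
    """Return (keyword_hashtag, vote) for a valid rating tweet, else None.

    Assumed rules: the tweet mentions the account, carries at least one
    hashtag other than #RateMe, and contains exactly one vote word.
    """
    text = tweet.lower()
    if account not in text:
        return None
    hashtags = [tag for tag in re.findall(r"#(\w+)", text) if tag != "rateme"]
    votes = [word for word in re.findall(r"\w+", text) if word in VOTES]
    if not hashtags or len(votes) != 1:
        return None
    return hashtags[0], VOTES[votes[0]]

print(parse_vote("@masterbibda #datascientistprofiles alto #RateMe"))
```

Tweets that omit the mention, the keyword hashtag, or the vote word return `None` and could be routed to the general sentiment analysis instead.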