1/20
The role of Statistics
in the
Internet of Things
Paulo Canas Rodrigues
Research Director
CAST (Centre for Applied Statistics and Data Analytics)
University of Tampere
MINDTREK
October 18, 2016
2/20
CAST – Vision and Mission
• With the increase of (big and unstructured) data collected every day
in many disciplines, appropriate quantitative methods and tools are
needed both for evaluating research hypotheses in a timely manner and
for reducing the risks in decision making
• MISSION: To promote the understanding and good practice of statistics and
data analytics within UTA/Tampere3 and on the global scene
• VISION: To become recognized as a strong partner to researchers and to
industry in statistics and data analytics, locally and globally, especially in
applied sciences
• MEMBERS: wide range of expertise, from methodological statistics to
biostatistics, machine learning, data mining, data visualization, time series,
computational statistics, etc.
3/20
CAST – R&D services
• CAST provides expert quantitative R&D services covering the whole
project, from planning to data analysis and reporting of results.
These services range from tailored and complex statistical research
to traditional survey research, for both the private and the public
sector.
• For companies: operation and maintenance management visualization,
performance prediction and problem case diagnostics, customer and
product content insights, sales predictions, marketing recommendations,
operation evaluation and optimization, data analytics expertise in
developing digital services
• For public organizations: expertise in the quantitative methodology part of
research and development projects, quantitative analytics for operation
evaluation and rationalization, data analytics expertise in developing digital
public services
4/20
CAST – R&D services
• Different kinds of methodologies are available for graphical
visualization, data mining, analysis, modeling and prediction
including but not limited to:
• big data analytics
• Internet of Things
• machine learning
• time series modeling, analysis and prediction
• Bayesian modeling
• data and text mining
• interactive analytics
• survey design and analysis.
• training
• More details at http://www.uta.fi/cast/
5/20
Internet of Things (IoT)
6/20
7/20
How much is data driving decisions today?
• A recent survey asked senior business stakeholders across Europe
about their attitudes toward data and analytics. The findings of
the survey were summarized in the Business Grammar Research
Report.
• Here’s a glimpse of some of the key findings:
• 96% use data and analytics to inform business decisions today
• 59% of European business leaders consider data and analytics
savviness to be one of the two most important skills for new employees
• Data and analytics skills are now considered more important than
industry experience or a second language
8/20
Unstructured data
June 8, 2011 (Joe McKendrick): Unstructured
data: the elephant in the Big Data room
• [… Many organizations are becoming overwhelmed with the
volumes of unstructured information -- audio, video, graphics, social
media messages -- that falls outside the purview of their "traditional"
databases. Organizations that do get their arms around this data will
gain significant competitive edge…]
• [… 91% (in the survey) say unstructured information already lives in
their organizations, but many aren’t sure what to do about it…]
9/20
• But what to do with all this data?
• How to transform data into information
for decision making?
10/20
Statistics: A world of possibilities
Marie Davidian
(North Carolina
State University)
Thomas Louis
(Johns Hopkins
Bloomberg School
of Public Health)
Statistics is the science of learning
from data, and of measuring,
controlling, and communicating
uncertainty; and it thereby
provides the navigation essential
for controlling the course of
scientific and societal advances.
11/20
IoT – A time series challenge
• The world is becoming more and more instrumented,
interconnected and intelligent, resulting in mountains of newly
generated data.
• With storage costs coming down significantly, companies now
want to leverage this instrument-generated data (including
meter, temperature and all types of sensor data over time) for
conducting analysis.
• Among all the types of big data, data from sensors is the most
widespread and is referred to as time-series data.
• So many records, so little time!
Source: http://www.ibmbigdatahub.com/blog/internet-things-time-series-data-challenge/
12/20
Route 13 Hervanta-Ylöjärvi: travel time per bus for 2 days
13/20
Smartphone sensor data – How to identify breakpoints?
Source: http://beautifuldata.net/tag/sensor-data/
Data: smartphone accelerometer data that Datarella provided in its Data Fiction competition.
The dataset shows the acceleration along the three axes of the smartphone:
x – sideways acceleration of the device
y – forward and backward acceleration of the device
z – acceleration up and down
So, for example, the activity of taking the smartphone out of your pocket and reading a
tweet can look as follows:
• y acceleration – the smartphone had been in the pocket top down and is now taken out of the pocket
• z and y acceleration – turning the smartphone so that it is horizontal
• x acceleration – moving the smartphone from the left to the middle of your body
• z acceleration – lifting the smartphone so you can read the fine print of the tweet
14/20
Smartphone sensor data – How to identify breakpoints?
Source: http://beautifuldata.net/tag/sensor-data/
This is the sensor data for one user on one day:
15/20
Smartphone sensor data – How to identify breakpoints?
Source: http://beautifuldata.net/tag/sensor-data/
Let’s zoom in to the period between 12:32 and 13:00:
• In the beginning, the smartphone seems to lie flat on a horizontal surface – the sensor is reading a value
of around 9.8 in the positive direction – this means the gravitational force only affects the z axis and not
the x and y axes.
• But then things change, and after a few movements (our change points) the last observation has the
smartphone in a position where the x axis reads around -9.6, i.e. the smartphone is being
held in landscape orientation pointing to the right.
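
The reading of the two observations above can be sketched in code. This is only an illustration, not part of the original analysis: it assumes readings in m/s², and the function name `dominant_axis` and the tolerance of 1.5 are hypothetical choices. It reports the axis that carries almost all of gravity when the device is at rest.

```python
# Hypothetical helper: find the axis that carries gravity in one accelerometer sample.
G = 9.81  # standard gravity in m/s^2


def dominant_axis(x, y, z, tolerance=1.5):
    """Return (axis, sign) if gravity falls almost entirely on one axis, else None.

    The tolerance is an assumed value for illustration, not taken from the source.
    """
    readings = {"x": x, "y": y, "z": z}
    # Pick the axis with the largest absolute reading.
    axis, value = max(readings.items(), key=lambda kv: abs(kv[1]))
    # The dominant reading must be close to 1 g ...
    if abs(abs(value) - G) > tolerance:
        return None
    # ... and the other two axes must be close to zero.
    others = [abs(v) for k, v in readings.items() if k != axis]
    if max(others) > tolerance:
        return None
    return axis, (1 if value > 0 else -1)


# Values similar to those described on the slide.
print(dominant_axis(0.1, 0.2, 9.8))   # ('z', 1): lying flat, gravity on z only
print(dominant_axis(-9.6, 0.3, 0.4))  # ('x', -1): held sideways, gravity on x
```

A sample taken while the device is moving will usually fail both checks and return `None`, which is one simple way to separate resting positions from the movements in between.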
16/20
Smartphone sensor data – How to identify breakpoints?
Source: http://beautifuldata.net/tag/sensor-data/
• This quick analysis of the acceleration in the x direction gives us 4 change points, where the acceleration
suddenly changes.
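
The slide does not say which change point method was used. As a minimal sketch of the idea, the following uses binary segmentation on the mean: it splits the series where the squared error drops most and repeats on each half. The function names, the `min_gain` threshold and the synthetic signal are all assumptions for illustration.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the segment mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(values, min_size):
    """Return (gain, index) of the split that most reduces the squared error."""
    total = sse(values)
    best = (0.0, None)
    for i in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:i]) - sse(values[i:])
        if gain > best[0]:
            best = (gain, i)
    return best


def change_points(values, min_gain, min_size=3):
    """Binary segmentation: keep splitting while the gain is at least min_gain."""
    found = []

    def recurse(lo, hi):
        segment = values[lo:hi]
        if len(segment) < 2 * min_size:
            return
        gain, i = best_split(segment, min_size)
        if i is None or gain < min_gain:
            return
        found.append(lo + i)
        recurse(lo, lo + i)
        recurse(lo + i, hi)

    recurse(0, len(values))
    return sorted(found)


# Synthetic x-axis signal (not the Datarella data): three flat levels.
signal = [0.0] * 10 + [5.0] * 10 + [-9.6] * 10
print(change_points(signal, min_gain=10.0))  # [10, 20]
```

On real sensor data the threshold matters a lot: too low and noise is reported as change points, too high and genuine shifts are missed, so it would normally be chosen with a penalty criterion rather than by hand.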
17/20
Anomaly Detection with Wikipedia Page View Data
Source: https://www.r-bloggers.com/anomaly-detection-with-wikipedia-page-view-data/
• We choose an interesting Wikipedia page and download 90 days of PageView statistics
• A first plot shows this pattern (for the USA Wikipedia page)
• Now, let’s look for anomalies using the R package AnomalyDetection
18/20
Anomaly Detection with Wikipedia Page View Data
Source: https://www.r-bloggers.com/anomaly-detection-with-wikipedia-page-view-data/
• In our case, the algorithm has discovered 4 anomalies.
• The first, on October 30, 2014, is an exceptionally high value overall
• The second is a very high Sunday
• The third is a high value overall
• The fourth is a high Saturday (normally, this day is also quite weak).
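
The findings above show why the day of the week matters: a Saturday can be anomalous at a level that would be normal for a weekday. The sketch below is a simplified stand-in for that idea, not the algorithm of the AnomalyDetection package. It compares each day with the median of the same weekday and scores the difference with the median absolute deviation. The function name, the threshold of 3.5 and the synthetic page view counts are assumptions.

```python
from statistics import median


def seasonal_anomalies(counts, period=7, threshold=3.5):
    """Flag indices whose value is far from the median of their seasonal slot."""
    # Remove the weekly pattern: subtract the median of the same weekday.
    slot_median = [median(counts[s::period]) for s in range(period)]
    residuals = [c - slot_median[i % period] for i, c in enumerate(counts)]
    # Robust scale: median absolute deviation (MAD) of the residuals.
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, r in enumerate(residuals)
            if 0.6745 * abs(r - center) / mad > threshold]


# Synthetic data (not the Wikipedia series): four weeks, weak weekends.
week = [100, 104, 98, 102, 96, 60, 64]
offsets = [0, 1, -1, 2]
views = [v + o for o in offsets for v in week]
views[5] = 160   # an unusually high weekend day
views[17] = 150  # an unusually high weekday
print(seasonal_anomalies(views))  # [5, 17]
```

Using medians instead of means keeps the anomalies themselves from distorting the baseline they are compared against.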
19/20
Concluding remarks
• The Internet of Things brought a new way of looking at
the world and, with it, mountains of newly generated data
have been collected.
• With decreasing costs of data storage, companies are
now looking at strategies to analyze and to take proper
advantage of the deluge of data they have been storing.
• How to do that? Statistics can help!
20/20
Contact information
• Research Director: Paulo Canas Rodrigues (Paulo.Rodrigues@uta.fi)
www.uta.fi/cast

Paulo Canas Rodrigues - The role of Statistics in the Internet of Things - Mindtrek 2016

  • 1. 1/20 The role of Statistics in the Internet of Things Paulo Canas Rodrigues Research Director CAST (Centre for Applied Statistics and Data Analytics) University of Tampere MINDTREK October 18, 2016
  • 2. 2/20 CAST – Vision and Mission • With the increase of (big and unstructured) data collected every day in many disciplines, appropriate quantitative methodology tools are needed for both evaluating the research hypotheses timely, and for reducing the risks in the decision-making procedure • MISSION: To promote the understanding and good practice of statistics and data analytics within UTA/Tampere3 and in a global scene • VISION: To become recognized as a strong partner to researchers and to industry in statistics and data analytics, locally and globally, especially in applied sciences • MEMBERS: wide range of expertise, from methodological statistics to biostatistics, machine learning, data mining, data visualization, time series, computational statistics, etc.
  • 3. 3/20 CAST – R&D services • CAST provides diverse expert quantitative research R&D services for different purposes starting from project planning to data analysis and results reporting. These services range from tailored and complex statistical research to traditional survey research, for private and public sector. • For companies: operation and maintenance management visualization, performance prediction and problem case diagnostics, customer and product content insights, sales predictions, marketing recommendations, operation evaluation and optimization, data analytics expertise in developing digital services • For public organizations: expertise in quantitative methodology part of a research and development project, quantitative analytics for operation evaluation and rationalization, data analytics expertise in developing digital public services
  • 4. 4/20 CAST – R&D services • Different kinds of methodologies are available for graphical visualization, data mining, analysis, modeling and prediction including but not limited to: • big data analytics • Internet of Things • machine learning • time series modeling, analysis and prediction • Bayesian modeling • data and text mining • interactive analytics • survey design and analysis. • training • More details in http://www.uta.fi/cast/
  • 7. 7/20 How much is data driving decisions today? • recently surveyed senior business stakeholders across Europe to ask about their attitudes about data and analytics. The findings of the survey were summarized in the Business Grammar Research Report. • Here’s a glimpse of some of the key findings: • 96% use data and analytics to inform business decisions today • 59% of European business leaders consider data and analytics savviness to be one of the two most important skills for new employees • Data and analytics skills are now considered more important than industry experience or a second language
  • 8. 8/20 Unstructured data June 8, 2011 (Joe McKendrick): Unstructured data: the elephant in the Big Data room • [… Many organizations are becoming overwhelmed with the volumes of unstructured information -- audio, video, graphics, social media messages -- that falls outside the purview of their "traditional" databases. Organizations that do get their arms around this data will gain significant competitive edge…] • [… 91% (in the survey) say unstructured information already lives in their organizations, but many aren’t sure what to do about it…]
  • 9. 9/20 • But what to do with all this data? • How to transform data into information for decision making?
  • 10. 10/20 Statistics: A world of possibilities Marie Davidian (North Carolina State University) Thomas Louis (Johns Hopkins Bloomberg School of Public Health) Statistics is the science of learning from data, and of measuring, controlling, and communicating uncertainty; and it thereby provides the navigation essential for controlling the course of scientific and societal advances.
  • 11. 11/20 IoT – A time series challenge • The world is becoming more and more instrumented, interconnected and intelligent, resulting in mountains of newly generated data. • With storage costs coming down significantly, companies now want to leverage this instrument-generated data (including meter, temperature and all types of sensor data over time) for conducting analysis. • Among all the types of big data, data from sensors is the most widespread and is referred to as time-series data. • So many records, so little time! Source: http://www.ibmbigdatahub.com/blog/internet-things-time-series-data-challenge/
  • 12. 12/20 Route 13 Hervanta-Ylöjärvi: travel time per bus for 2 days
  • 13. 13/20 Smartphone sensor data – How to identify breakpoints? Source: http://beautifuldata.net/tag/sensor-data/ Data: smartphone accelerometer data that Datarella provided in its Data Fiction competition. The dataset shows the acceleration along the three axes of the smartphone: x – sideways acceleration of the device; y – forward and backward acceleration of the device; z – acceleration up and down. So, for example, the activity of taking the smartphone out of your pocket and reading a tweet can look as follows: • y acceleration – the smartphone had been in the pocket top down and is now taken out of the pocket • z and y acceleration – turning the smartphone so that it is horizontal • x acceleration – moving the smartphone from the left to the middle of your body • z acceleration – lifting the smartphone so you can read the fine print of the tweet
  • 14. 14/20 Smartphone sensor data – How to identify breakpoints? Source: http://beautifuldata.net/tag/sensor-data/ This is the sensor data for one user on one day:
  • 15. 15/20 Smartphone sensor data – How to identify breakpoints? Source: http://beautifuldata.net/tag/sensor-data/ Let’s zoom in to the period between 12:32 and 13:00: • In the beginning, the smartphone seems to lie flat on a horizontal surface – the sensor is reading a value of around 9.8 in the positive direction, which means that the gravitational force affects only the z axis and not the x and y axes. • But then things change, and after a few movements (our change points) the last observation has the smartphone in a position where the x axis reads an acceleration of around -9.6, i.e. the smartphone is being held in landscape orientation pointing to the right.
  • 16. 16/20 Smartphone sensor data – How to identify breakpoints? Source: http://beautifuldata.net/tag/sensor-data/ • This quick analysis of the acceleration in the x direction gives us 4 change points, where the acceleration suddenly changes.
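The slides do not say which method produced the 4 change points, so the following is only a minimal sketch of one simple approach: binary segmentation on the mean, which repeatedly splits a segment at the point that most reduces the squared error and stops when the improvement falls below a penalty. Everything in it is an assumption made for illustration: the signal is synthetic (five hypothetical x-axis acceleration levels, ending near -9.6), and the names `find_change_points`, `penalty` and `min_size` are not taken from the source.

```python
import random
from itertools import accumulate


def segment_cost(prefix, prefix_sq, start, end):
    """Sum of squared deviations from the mean for values[start:end]."""
    n = end - start
    total = prefix[end] - prefix[start]
    total_sq = prefix_sq[end] - prefix_sq[start]
    return total_sq - total * total / n


def find_change_points(values, penalty, min_size=5):
    """Binary segmentation: split while a split reduces the cost by more than `penalty`."""
    # Prefix sums let us compute the cost of any segment in constant time.
    prefix = [0.0] + list(accumulate(values))
    prefix_sq = [0.0] + list(accumulate(v * v for v in values))
    change_points = []
    pending = [(0, len(values))]
    while pending:
        start, end = pending.pop()
        if end - start < 2 * min_size:
            continue  # too short to split into two valid segments
        whole = segment_cost(prefix, prefix_sq, start, end)
        best_gain, best_split = 0.0, None
        for split in range(start + min_size, end - min_size + 1):
            left = segment_cost(prefix, prefix_sq, start, split)
            right = segment_cost(prefix, prefix_sq, split, end)
            gain = whole - left - right
            if gain > best_gain:
                best_gain, best_split = gain, split
        if best_split is not None and best_gain > penalty:
            change_points.append(best_split)
            pending.append((start, best_split))
            pending.append((best_split, end))
    return sorted(change_points)


# Synthetic stand-in for the x-axis acceleration (NOT the Datarella data):
# five constant levels of 60 samples each, plus a little Gaussian noise.
rng = random.Random(0)
levels = [0.2, 3.0, -2.5, -6.0, -9.6]
signal = [level + rng.gauss(0.0, 0.3) for level in levels for _ in range(60)]

change_points = find_change_points(signal, penalty=20.0)
print(change_points)  # indices where the mean level appears to shift
```

The penalty controls how many change points are reported: a smaller value yields more, shorter segments, so in practice it would have to be tuned to the noise level of the real sensor.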
  • 17. 17/20 Anomaly Detection with Wikipedia Page View Data Source: https://www.r-bloggers.com/anomaly-detection-with-wikipedia-page-view-data/ • We choose an interesting Wikipedia page and download 90 days of PageView statistics • A first plot shows this pattern (for the USA Wikipedia page) • Now, let’s look for anomalies using the R package AnomalyDetection
  • 18. 18/20 Anomaly Detection with Wikipedia Page View Data Source: https://www.r-bloggers.com/anomaly-detection-with-wikipedia-page-view-data/ • In our case, the algorithm has discovered 4 anomalies. • The first, on October 30, 2014, is an exceptionally high value overall • The second is a very high Sunday • The third is a high value overall • The fourth is a high Saturday (normally, this day is also quite weak).
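To illustrate why a "high Sunday" can count as an anomaly even when its raw value is not the largest in the series, here is a minimal sketch of seasonality-aware anomaly detection. It is not the algorithm of the R package AnomalyDetection; it is a simpler stand-in that subtracts a per-weekday median and flags residuals that lie far from the center in units of the median absolute deviation (MAD). The page-view counts are synthetic, and the function name `robust_anomalies`, the `threshold` value and the injected anomaly positions are assumptions made for this example.

```python
import random
from statistics import median


def robust_anomalies(counts, period=7, threshold=5.0):
    """Flag indices whose deseasonalized value is far from the median, in MAD units."""
    # Seasonal baseline: the median of all observations sharing a weekday slot.
    seasonal = [median(counts[slot::period]) for slot in range(period)]
    residuals = [value - seasonal[i % period] for i, value in enumerate(counts)]
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        return []  # no variation left after removing the seasonal pattern
    return [i for i, r in enumerate(residuals) if abs(r - center) > threshold * scale]


# Synthetic stand-in for 90 days of page views (NOT real Wikipedia data).
# Day 0 is a Monday, so slot 5 is Saturday and slot 6 is Sunday.
rng = random.Random(1)
weekly_base = [10000, 10000, 10000, 10000, 10000, 7000, 7500]
views = [weekly_base[day % 7] + rng.uniform(-300, 300) for day in range(90)]

# Hypothetical anomalies: two high weekdays, one high Sunday, one high Saturday.
for day, extra in [(10, 8000), (27, 4000), (44, 5000), (68, 4000)]:
    views[day] += extra

anomalies = robust_anomalies(views)
print(anomalies)  # indices of the days flagged as anomalous
```

Because each day is compared with other days of the same weekday, a weekend value can be flagged although it stays below an ordinary weekday level.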
  • 19. 19/20 Concluding remarks • The Internet of Things brought a new way of looking at the world and, with it, mountains of newly generated data have been collected. • With decreasing costs of data storage, companies are now looking at strategies to analyze and to take proper advantage of the deluge of data they have been storing. • How to do that? Statistics can help!
  • 20. 20/20 Contact information • Research Director: Paulo Canas Rodrigues (Paulo.Rodrigues@uta.fi) www.uta.fi/cast