Daniel Tuohy of IES gave a presentation introducing big data at our IES Faculty event, which took place in London on 27th April 2016. The seminar focused on the application and status of Intelligent Big Data in the fields of building services, architecture and construction.
IES Faculty - Introduction to Big Data
1. IES Faculty Event – 27th April 2016
Intelligent Big Data in Buildings -
Introduction to Big Data
2. How much data is out there?
• 40% growth fuelled by more people and enterprises doing everything
online.
• Data from embedded systems (major part of IOT) will account for 10%
of Digital Universe in 2020.
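As a rough worked example of what a 40% growth rate implies, the sketch below assumes the figure is a compound annual rate (the slide does not state the period). It computes how long the data volume would take to double and how much it would grow over seven years. The function names are illustrative only.

```python
import math

ANNUAL_GROWTH = 0.40  # from the slide; ASSUMED here to be a compound annual rate


def growth_multiple(rate: float, years: float) -> float:
    """Factor by which a quantity grows after `years` at compound `rate`."""
    return (1.0 + rate) ** years


def years_to_multiply(rate: float, factor: float) -> float:
    """Years needed to grow by `factor` at compound annual `rate`."""
    return math.log(factor) / math.log(1.0 + rate)


if __name__ == "__main__":
    print(f"Doubling time: {years_to_multiply(ANNUAL_GROWTH, 2):.2f} years")
    print(f"Growth over 7 years: x{growth_multiple(ANNUAL_GROWTH, 7):.1f}")
```

Under that assumption the volume doubles roughly every two years and grows about tenfold in seven.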
3. How much data is out there?
• Next year we will see 3 times more connected devices than people
• Next year mobile data traffic will have grown 13 times in last 5 years
4. Big Data Explosion
• Poses significant challenges:
– Processing
– Storage
– Networking and architecture.
[Chart: data available vs. percentage that can be processed]
5. Characteristics of Big Data – 3V’s
• Volume – 44 ZB by 2020
and % data that can be
processed will decrease.
• Variety – Not just
structured data anymore
(relational e.g. tables).
Much of new data will be
unstructured (audio,
videos, docs)
• Velocity – Demand for
near real time analysis.
• Some people talk about
the fourth ‘V’ of Veracity.
6. Characteristics of Big Data – Volume
• % of overall valuable or “Target-rich” data expected to double by 2020.
• Digital Universe data is mostly transient – e.g. unsaved Netflix movie streams
or online gamer interactions (2020 storage capacity will store 15% of DU)
(5% in 2013)
7. Characteristics of Big Data – Variety
• Structured data that fits into tables & relational databases (e.g.
transactional or financial data) is relatively straightforward to handle
• Data often unstructured - creates problems for storing, mining & analysing
(text, photo, metadata)
8. Characteristics of Big Data – Velocity
• Data-at-rest and data-in-motion.
• Ability to analyse real-time data can bring competitive advantages
• Lifetime of data utility – how long will data be useful? This determines the type of
analysis (no longer only batch analysis)
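The difference between batch analysis of data-at-rest and near real-time analysis of data-in-motion can be sketched as follows. This is a minimal illustration, not a production streaming system: `batch_mean` needs the whole dataset before it can answer, while `rolling_mean` emits an updated result as each reading arrives. Both function names and the sensor-reading scenario are assumptions made for the example.

```python
from collections import deque
from typing import Iterable, Iterator, List


def batch_mean(readings: List[float]) -> float:
    """Batch style: all data must already be stored before analysis starts."""
    return sum(readings) / len(readings)


def rolling_mean(readings: Iterable[float], window: int) -> Iterator[float]:
    """Streaming style: yield the mean of the last `window` readings
    as soon as each new reading arrives."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    total = 0.0
    for value in readings:
        if len(recent) == window:
            total -= recent[0]  # the oldest reading is about to be dropped
        recent.append(value)
        total += value
        yield total / len(recent)


if __name__ == "__main__":
    stream = [21.0, 21.5, 23.0, 22.5]  # hypothetical temperature readings
    for mean in rolling_mean(stream, window=2):
        print(mean)
```

The streaming version keeps only the last `window` readings in memory, which is one reason it suits data whose useful lifetime is short.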
9. Addressing Big Data Challenge
• Cloud Computing
Solutions.
• Large In-Memory
Databases
• Real-time analysis
• Distributed processing
ecosystems.
10. Big Data Technologies - Storage
• Relational databases: built mainly on SQL & a business mainstay
• Issues scaling when dataset gets ‘Big’
• Not designed to be distributed
• Master-slave & sharding approaches
used for scaling up
• Need for other data storage tools.
• NoSQL databases are non-relational
(e.g. document-oriented)
• NoSQL is a response to the growing scale
of databases (Facebook/Twitter) &
falling hardware costs
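The sharding approach mentioned on this slide can be sketched in a few lines. This is a simplified, in-memory illustration under stated assumptions: each "shard" is just a Python dict standing in for a separate database server, and `shard_for` and `ShardedStore` are hypothetical names, not part of any real database product.

```python
import hashlib
from typing import Any, Dict, List, Optional


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard by hashing the key, so the same key always
    lands on the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy sharded store: each dict stands in for one database server."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[str, Any]] = [{} for _ in range(num_shards)]

    def put(self, key: str, row: Any) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key: str) -> Optional[Any]:
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    store.put("user:7", {"name": "example"})
    print(store.get("user:7"))
```

One limitation of this simple modulo scheme is that changing the number of shards remaps most keys, which hints at why scaling a database this way takes careful planning.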
11. Big Data Technologies - Storage
• Very fast and scalable
• Easy to distribute
• BUT – Many data structures
can't be modelled
• Richer data than key/value pairs
• Eventually consistent (current data
eventually visible to all)
• BUT – No ACID type transactions
(important in banking)
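To show why ACID transactions matter in banking, here is a minimal sketch using Python's built-in `sqlite3` module, a relational database that does support them. The `accounts` table, the account names and the helper functions are all invented for this example. A transfer either applies both updates or neither.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    with conn:  # commits the setup on success
        conn.execute(
            "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)]
        )
    return conn


def balance(conn: sqlite3.Connection, name: str) -> int:
    (value,) = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return value


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move money atomically: both updates are committed together,
    or both are rolled back if anything fails."""
    with conn:  # commit on success, rollback if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if balance(conn, src) < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
```

If the second transfer step fails, the debit already made is undone, so money is never lost halfway through. A store without such transactions leaves that bookkeeping to the application.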
12. Big Data Technologies - Processing
• Some problems require the use of a
collection of computers
• Computing problem divided
into parts and worked on in parallel
• Hadoop is an open source data
storage & processing framework
(vendor versions also).
• Hadoop is good for:
– Large data sets & cheap scaling
– Fast parallel data processing
– Data from multiple sources/formats
– Need to move computation to data
– Point of Sale Transaction Analysis,
Ad Targeting, Recommendation
Engine, Risk Modelling
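The idea of dividing a computing problem into parts can be sketched with a point-of-sale example. This is plain Python running in a single process, not the Hadoop API: it only mimics the split, map and reduce pattern that such frameworks distribute across machines. The record shape and function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sale = Tuple[str, int]  # (store name, amount in pence); hypothetical record shape


def split(records: List[Sale], parts: int) -> List[List[Sale]]:
    """Divide the dataset into `parts` chunks, one per worker."""
    return [records[i::parts] for i in range(parts)]


def map_chunk(chunk: List[Sale]) -> Dict[str, int]:
    """Map step with a local combine: total sales per store for one chunk."""
    partial: Dict[str, int] = defaultdict(int)
    for store, amount in chunk:
        partial[store] += amount
    return dict(partial)


def reduce_partials(partials: List[Dict[str, int]]) -> Dict[str, int]:
    """Reduce step: merge the per-chunk totals into overall totals."""
    totals: Dict[str, int] = defaultdict(int)
    for partial in partials:
        for store, amount in partial.items():
            totals[store] += amount
    return dict(totals)


def total_sales_by_store(records: List[Sale], parts: int = 3) -> Dict[str, int]:
    chunks = split(records, parts)
    # On a cluster, each chunk would be processed on a different machine.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce_partials(partials)


if __name__ == "__main__":
    sales = [("london", 1000), ("glasgow", 500), ("london", 700)]
    print(total_sales_by_store(sales))
```

Because each chunk is processed independently, the map step can run wherever the data is stored, which is the sense in which computation moves to the data.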
13. Big Data Skills
• Range of different professionals
• City planner uses data to understand
citizens and plan developments
• Business skills such as communication very
important (complex findings/message)
• Databases and SQL scripting
• Programming (R and Python)
• Advanced Excel, SPSS, SAS
• Business Intelligence & Analytics
(Tableau, Qlik, SAP)
14. Conclusions
• Big Data explosion from human
and machine generated data
• Big Data characteristics pose
different challenges
• Technologies need to keep pace
with data growth & consumer
expectations
• Big data presents huge
opportunity for many businesses.
IT professionals will not be solely
responsible for making sense of
this data.