Presentation: Study: #Big Data in #Austria, Mario Meir-Huber, Big Data Leader Eastern Europe, Teradata GmbH & Martin Köhler, Austrian Institute of Technology, AIT (AT), at the European Data Economy Workshop taking place back to back to SEMANTiCS2015 on 15 September 2015 in Vienna.
Study: #Big Data in #Austria
1. #Big Data in #Austria
Big Data – Challenges and Potentials
Mario Meir-Huber and Martin Köhler
European Data Economy Workshop, Semantics 2015
15.09.2015, Vienna, Austria
2. Study „#BigData in #Austria“
Project duration: 1.11.2013 – 30.04.2014
Project partners:
• IDC Central Europe GmbH
• AIT Austrian Institute of Technology, Mobility Department
Contact persons:
• Mario Meir-Huber, IDC (Teradata)
• Martin Köhler, AIT
Content:
• State-of-the-Art in Big Data
• Market analysis
• Best practice for Big Data projects
Download (in German):
• FFG „Studies of ICT of the future“: https://www.ffg.at/studien-aus-ikt-der-zukunft
#Big Data in #Austria was funded under the „ICT of the future“ programme of the
Austrian Research Promotion Agency (FFG) and the Austrian Ministry for Transport,
Innovation and Technology (BMVIT).
4. Big Data Definition
“Big Data” is a term encompassing the use of techniques to capture, process,
analyse and visualize potentially large datasets in a reasonable timeframe
not accessible to standard IT technologies. By extension, the platform, tools
and software used for this purpose are collectively called “Big Data
technologies”.
NESSI White Paper, December 2012
Four characteristics:
• Volume: In recent years the amount of generated data has increased enormously
• Velocity: Analysing more data in shorter time frames
• Variety: Huge diversity of data formats (Arbitrary -> Relational -> Free text)
• Value: Extracting value (knowledge)
Hardware and software technologies for managing and analysing huge amounts of data
Or simply said
IF DATA IS PART OF THE PROBLEM
6. Big Data Technology Stack
Figure: Big Data technology stack, four layers leading from data to value
• Big Data Utilization: domain expertise, asking the right question, reporting & dashboards, alerting & recommendations, business intelligence, text analysis and search
• Big Data Analytics: data science, transform question to algorithm, machine learning, analysis, integration, query performance, transform, warehousing
• Big Data Platforms: Hadoop ecosystem, data ingestion and processing, efficiency, trust, workload, governance, tools, platform, parallel programming
• Big Data Management: data centers, scalable data storage, IaaS, cloud, virtualization, network, compute, storage, DBMS, NoSQL
• Across the layers: management, security, privacy, governance
7. Big Data Management
Technologies for the efficient management of huge
data amounts
• Storage and management of data
• Provisioning and management of the infrastructure
Cloud resources, (internal) data centers, storage
8. Big Data Platforms
Technologies for (massively) parallel execution of data analytics on
huge amounts of data
• Provisioning of parallelized and scalable execution systems
• Real-time integration of sensor data
• Massively parallel programming: Programming models for data-intensive applications (e.g. MapReduce)
• High-level query languages: Scripting languages and abstraction of low-level data-intensive query languages
• Streaming: Real-time processing of (sensor) data (which cannot be stored)
• Ad-hoc queries: Real-time access to huge data amounts (query optimization: SQL vs. MapReduce)
Examples: Google Pregel, Apache Drill
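The MapReduce programming model named above can be illustrated with a small single-process sketch. It is not taken from the study and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only show how work is split into a map step, a grouping step and a reduce step. In a real cluster each step would run in parallel on many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that belong to one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce one after the other (illustration only)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data in Austria", "Big Data challenges"]
    print(word_count(docs))
```

Because the map and reduce functions only see one document or one key at a time, the same logic can be distributed over many machines without changing it, which is the point of the model.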
9. Big Data Analytics
Technologies for extracting information/knowledge from huge data
amounts
• Pattern recognition
• Pattern matching
• …
10. Big Data Utilization
Technologies for extracting value
• Strengthening the market situation of an organization
• Technologies for (simplified) utilization of data
• Business Intelligence: Provisioning of efficient indicators based on data (reporting, KPIs, audit, …)
• Knowledge Management: Management and representation of knowledge (ontologies, Linked Data, knowledge management systems)
• Decision Support: Supporting decision making; incorporates data management, modelling, innovative and interactive user interfaces
• Visualization: Interactive visualization of complex information and networks on different levels of abstraction (Visual Analytics)
11. Traditional versus Data-intensive Approach
Traditional Approach
• Apply schema on write
• Heavily dependent on IT
• Steps: determine list of questions, design solution, collect structured data, ask questions from list, detect additional questions
• Single query engine: SQL
Hadoop Approach
• Apply schema on read
• Support range of access patterns to data stored in HDFS: polymorphic access
• Steps: iterate over structure, transform and analyze
• Right engine, right job: batch, interactive, real-time, in-memory
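The difference between "schema on write" and "schema on read" can be made concrete with a minimal sketch. Everything here is an assumption for illustration: the `SCHEMA` dictionary, the sensor fields and the in-memory lists standing in for a warehouse and for HDFS are hypothetical and not part of the study.

```python
import json

# Hypothetical schema for the traditional approach: field name -> required type.
SCHEMA = {"sensor_id": str, "value": float}


def write_with_schema(store, record):
    """Schema on write: validate first, reject records that do not conform."""
    for field, field_type in SCHEMA.items():
        if not isinstance(record.get(field), field_type):
            raise ValueError(f"field {field!r} missing or not {field_type.__name__}")
    # Only the fields known to the schema are kept.
    store.append({field: record[field] for field in SCHEMA})


def write_raw(store, line):
    """Schema on read: store the raw line unchanged, no validation."""
    store.append(line)


def read_with_schema(store, fields):
    """Apply a structure at query time; skip lines that cannot be interpreted."""
    for line in store:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and all(f in record for f in fields):
            yield {f: record[f] for f in fields}


if __name__ == "__main__":
    lake = []
    write_raw(lake, '{"sensor_id": "a", "value": 1.5, "unit": "C"}')
    write_raw(lake, "free text that no schema anticipated")
    # A question nobody planned for ("which units occur?") can still be asked.
    print(list(read_with_schema(lake, ["sensor_id", "unit"])))
```

With schema on write, the `unit` field would have been dropped at load time unless someone had planned for it; with schema on read, the raw data is kept and the question can be asked later, at the cost of handling malformed input in every query.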
12. Technical and scientific challenges
Visual Analytics
• Combine the strengths of human and
electronic data processing
Big Data Analytics
• Techniques making use of complete
data set, instead of sampling
Real-time analytics, (cross-)stream processing
• Expect real-time or near real-time
responses from the systems
Content Validation
• Validating the vast amount of
information in content networks, Trust
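The real-time challenge listed above (answers expected while the data is still arriving) can be sketched with a sliding-window average over a stream. This is an illustrative assumption, not a method from the study: the class `SlidingWindowAverage`, the function `alerts`, the window size and the threshold are all hypothetical.

```python
from collections import deque


class SlidingWindowAverage:
    """Keeps only the last `size` readings, so memory use stays constant."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("window size must be positive")
        self.size = size
        self.window = deque()
        self.total = 0.0

    def update(self, reading):
        """Add one reading and return the average of the current window."""
        self.window.append(reading)
        self.total += reading
        if len(self.window) > self.size:
            # Drop the oldest reading; it is never stored anywhere else.
            self.total -= self.window.popleft()
        return self.total / len(self.window)


def alerts(stream, size, threshold):
    """Yield the position of every reading at which the window average exceeds the threshold."""
    average = SlidingWindowAverage(size)
    for position, reading in enumerate(stream):
        if average.update(reading) > threshold:
            yield position


if __name__ == "__main__":
    readings = [1, 1, 10, 10, 1, 1]
    print(list(alerts(readings, size=2, threshold=5)))
```

Each reading is processed once and then discarded when it leaves the window, which matches the case of sensor data that cannot be stored in full.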
Figure labels: Distributed Storage (IaaS, NoSQL) across datacenters; Parallel Stream Processing; MapReduce Extensions; Use Cases and Enterprise Services (Scientific Data, Life Sciences, Business Reporting)
14. Global market
IDC expects the global market to grow from USD 9.8 billion in 2012 to USD 32.4 billion in 2017.
Yearly growth rate: 27%
Austrian market 2013:
• ~23 million euros
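The stated yearly growth rate can be checked against the two market figures. The sketch below assumes that the 27% is a compound annual growth rate over the five years from 2012 to 2017; the slide does not say how IDC calculated it, and the function name `cagr` is only illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: the constant yearly rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Global market figures from the slide, in billion USD.
    rate = cagr(9.8, 32.4, 2017 - 2012)
    print(f"{rate:.1%}")
```

Under that assumption the result rounds to 27%, so the three numbers on the slide are consistent with each other.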
15. Code of practice for big data projects
Support and orientation for the implementation of big data projects
Reference projects
• Medicine
• Mobility
• Earth observation
• Crisis and disaster management
• Trade
Process model, Maturity model, Reference architecture
16. Code of practice for big data projects
„We will soon have a huge skills shortage for data-related jobs.“
Neelie Kroes (ICT 2013, Nov.7, Vilnius)
„Data Scientist: The Sexiest Job of the 21st Century“
http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1