The document discusses Big Data Europe (BDE), a coordination and support action to address challenges around big data in Europe. It outlines two main measures BDE will implement: 1) Coordination of stakeholder engagement to understand requirements for big data infrastructure across societal challenges, and 2) Support for designing, developing, and evaluating a big data aggregator platform to meet requirements and maximize opportunities from European research. BDE will build interest groups and involve stakeholders from Horizon 2020 challenges to coordinate this work, and design a cloud-ready aggregator platform to realize the described measures.
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Big Data Europe Workshop Focuses on Data Value Chain & Big Data Integration
1. BIG DATA EUROPE
HTTP://WWW.BIG-DATA-EUROPE.EU/
Integrating Big Data, Software & Communities for Addressing
Europe’s Societal Challenges
European Data Economy Workshop,
Focus: Data Value Chain & Big Data & Open Data
15 September 2015, University of Economics Vienna
2. Semantic Web Company
(SWC)
SWC was founded 2001, head-quartered in Vienna
25 experts in linked data technologies
Product: PoolParty Semantic Suite (launched 2009)
Serving customers from all over the world
EU- & US-based consulting services
3. Semantic Web Company
(SWC)
Some of our Customers
● Credit Suisse
● Boehringer Ingelheim
● Roche
● Wolters Kluwer
● BMJ Publishing Group
● Red Bull Media House
● Canadian Broadcasting Corporation (CBC)
● Pearson
● Council of the EU
● DG Environment, EC
● Healthdirect Australia
● Ministry of Finance (Austria)
● World Bank Group
● Inter-American Development Bank (IADB)
● International Atomic Energy Agency (IAEA)
● Buildings Performance Institute Europe
(BPIE)
● Renewable Energy & Energy Efficiency P
(REEEP)
● Global Buildings Performance Network
(GBPN)
● American Physical Society
Finance / Automotive / Publisher / Health Care / Public Administration / Energy /
Education
Selected Partners
● EBCONT
● EPAM Systems
● iQuest
● PwC
● Tenforce
● OpenLink Software
● Ontotext
● MarkLogic
● Gravity Zero
● Altotech
● Wolters Kluwer
● Taxonomy Strategies
● Digirati
● Fraunhofer (IAIS)
● University of Leipzig
(INFAI)
We all have one goal in mind: Make machines smart enough so that
they can help us to find those needles in the haystack, which are
really relevant to us.
4. The Motivation – Big Data
Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the
world today has been created in the last two years alone.
This data comes from everywhere: sensors used to gather climate information, posts to
social media sites, digital pictures and videos, purchase transaction records, and cell phone
GPS signals to name a few.
This data is big data. Source:
8. Big Data in Europe: Challenges,
Opportunities
Health
Climate
Energy
Transport
Food
Societies
Security
Lorem
ipsum
dolors
KSDJO
PSCKK
SDKAB
LKASJL
LAWWD
S
wpweppe
pwpisio
we
10101
00110
10100
10101
0
Regional Data Repositories
10101
00110
10100
10101
0
10101
00110
10100
10101
0
#2: Interlink, Centralise Access, Explore
1010101001010101010
0101101000101010101
0010101010100001011
0100010101010100101
0101010010010101010
0101010101001011010
001010
Data
Eleme
nt
#3: Analyse, Discover, Visualize
#4: Mashup, Cross-domain Exploitation
Journalists Authorities
9. Big Data in Europe:
Obstacles
30-sept.-15
#1 Big Data “Variety“ problem
Multiple Data Sources
Required: Integration, Harmonisation
#2 Opening-up Data concerns
Loss of control, lack of tracking
Reservations about large corporations
#3 Limited Skills, Training,
Technology
Lack of Data Scientists
Lack of Generic Architectures, components
10. Big Data in Europe:
Obstacles
30-sept.-15
Extraction, Curation Quality, Linking,
Integration
Publication,
Visualization, Analysis
Extraction, Curation, Quality,
Linking, Integration, Publication,
Visualization, Analysis
Health
Transport
Security
Extraction Curation Quality Linking Integration Publication Visualization Analysis
Data Repositories Linked Open Data
Cloud
Stage 1
Stage 2
Stage 3
Food SocietiesClimate Energy
12. Rationale
Show societal value of Big Data
Lower barrrier for using big data technologies
o Required effort and resources
o Limited data science skills
Help establishing cross-
lingual/organizational/domain Data Value
Chains
30-sept.-15
14. Summary
Two clearly defined coordination and support measures:
Coordination: Engaging with a diverse range of stakeholder groups representing particularly
the Horizon 2020 societal challenges Health, Food & Agriculture, Energy, Transport, Climate,
Social Sciences and Security; Collecting requirements for the ICT infrastructure needed by data-
intensive science practitioners tackling a wide range of societal challenges; covering all aspects
of publishing and consuming semantically interoperable, large-scale data and knowledge assets;
Support: Designing, realizing and evaluating a Big Data Aggregator platform infrastructure
that meets requirements, minimises disruption to current workflows, and maximises the
opportunities to take advantage of the latest European RTD developments (incl. multilingual data
harvesting, data analytics & visualisation).
BigDataEurope will implement and apply two main instruments to successfully realize these
measures:
Build Societal Big Data Interest/Community Groups in the W3C interest group scheme &
involving a large number of stakeholders from the Horizon 2020 societal challenges as well as
technical Big Data experts;
Design, integrate and deploy a cloud-deployment-ready Big Data aggregator platform
15. Orthogonal Dimensions of Big Data
Ecosystems
Generic Big Data Enabling Technologies
Data Value Chain
Data Generation
& Acquisition
Data Analysis &
Processing
Data Storage &
Curation
Data
Visualization &
Usage
Data-driven
Services
SocietalChallenges
DomainSpecificDataAssets&Technology
Healthcare
Food Security
Energy
Intelligent Transport
Climate & Environment
Inclusive & Reflective Societies
Secure Societies
16. BDE Stakeholder Engagement Approach &
Activities
BDE Community Tools – JOIN IN NOW !
• Website: news, events, community, …
• 7 x BDE W3C Community Groups
• 7+1x Mailing Lists
• 7 x SC Workshops/Year = 21 Workshops
• Full set of communication tool-set…
Future Outlook
• BDE Aggregator Platform
• For download / internal use
• Cloud Version
• Big Data Technology Support Tools
17. Domains, Focus Areas & Data
Assets
Societal Domain Preliminary Big Data Focus area Selected Key Data assets
Life Sciences &
Health
Heterogeneous data Linking &
integration
Biomedical Semantic Indexing & QA
ACD Labs / ChemSpider, ChEBI, ChEMBL, Con-ceptWiki, DrugBank, EN-
ZYME, Gene Ontology, GO Annotation, Swis-sProt, UniProt, Wik-iPathways,
PubMed, MeSH, Disease Ontology (DO), Joint Chemical Dic-tionary
(Jochem), Bio-ASQ datasets
Food &
Agriculture
Large-scale distributed data integration
INFOODS, AQUASTAT Green Learning Network (GLN), Agricultural
Bibliography Network (ABN), AGRIS, AquaMaps, Fishbase
Energy
Real-time monitoring, stream
processing, data analytics, and
decision support
European Energy Exchange Data, smart meter measurement data,
gas/fuels/energy market/price data, consumption statistics, equipment
condition monitoring data)
Transport
Streaming sensor network & geo-
spatial data integration
GTFS data, OSM/ LinkedGeoData, MobilityMaps, Transport sensor data,
ROSATTE Road safety attributes, European Road Data Infrastructure -
EuroRoadS
Climate
Real-time monitoring, stream
processing, and data analytics.
European Grid Infrastructure (EGI), Databases hosting atmospheric data.
Several software frameworks for simulation, calibration and reconstruction.
Social Sciences
Statistical and research data linking &
integration
Federated social sciences data catalogs, statistical data from public data
portals and statistical offices (e.g. EuroStats, UNESCO, WorldBank)
Security
Real-time monitoring, stream
processing, and data analytics.
Image data analysis
Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired
from commercial providers and governmental systems) and collateral data
for supporting CFSP/CSDP missions and operations, Databases hosting
atmospheric Data. Experimental and simulation data concerning dispersion
18. Work Packages & Implementation
Phases
Community
Building
M1-M12 M13-M24 M25-M36
Enabling
Technologies
Component
Integration
Uptake
Integrator
Deployment
Community
Assessment
WP3 – Big Data Generic Enabling
Technologies & Architecture
WP5 – Big Data Integrator Instances
WP7 – Dissemination & Communication
WP2 – Community Building & Requirements
WP4 – Big Data Integrator Platform
WP6 – Real-life Deployment & User Evaluation
19. Blueprint of the Data Aggregator
Platform
Batch Layer
Speed Layer
Data Storage
Real-time data &
Transactions …
Batch View
Real-time
View
messagepassing
message passing
Applications & Showcases
Real-time dashboards
Domain-specific BDE apps
Big Data Analytics
In-stream Mining
BDEPlatform&
Intelligence
Input data
Stream
Spatial
Social
Statistical
Temporal
Transaction
al
Imagery
+ Semantic Layer (Retaining Semantics using LD
Lambda Architecture
20. Announcements….
Workshop SC2 (Agriculture & Food): 22.9.2015, Paris, INFOS
Workshop SC7 (Secure Societies): 30.9.2015, Brussels, INFO
Workshop SC4 (Transport): .10.2015, Bordeaux, INFO
Workshop SC6 (Social Science): 18.11.2015, Luxembourg, INFO
21. Martin Kaltenböck, m.kaltenboeck@semantic-web.at
Semantic Web Company GmbH
Mariahilfer Strasse 70/8, A-1070 Vienna
+43-1-4021235
http://www.semantic-web.at
http://www.poolparty-software.com
http://slideshare.net/semwebcompany
http://youtube.com/semwebcompany
Your Questions please….
www.big-data-europe.eu
30-sept.-15
#BigDataEurope