Big Data Workshop - DLD Summer 15
Big Data – Workshop
DLD Summer 15
21/06/15, DLD Summer 15, @rjudas
Understanding Big Data
And getting the right mindset
Agenda
 Syncing
 Defining Big Data
 Hype or Evolution
 Tech Drivers
 Big Data – Big Business?
 What's it all about?
 How do we get there?
Syncing
 Please tell us your opinion about Big Data
 Please tell us about your Big Data projects
Defining Big Data
Definition(s)
“Big Data describes datasets so large they become very
difficult to manage with traditional database tools.”
Big data is “data that exceeds the processing capacity
of conventional database systems. The data is too big,
moves too fast, or doesn’t fit the strictures of your
database architectures.”
“Very pragmatically, it's about building net-new analytic
applications based on new types of data that (an
organization) wasn't previously tracking.”
The 3 Vs
 Variety
 Tables, Images, Videos, XML, Logs
 Velocity
 Batch, Streams, Real-Time
 Volume
 Lots of xBytes
Variety
 Mix of Data types
 BLOBs and CLOBs
 Images, Audio, Videos, Log Files
 Semi-Structured, Unstructured
 Email, EDI Messages, Transaction Logs, Sensor Data
Velocity
 Crucial – Speed of "Feedback Loop"
 Streaming Data
 Complex Event Processing
 From Batch to (Near) Real-Time
 Different Lifetime
Volume - Big?
 KiloByte
 MegaByte
 GigaByte
 TeraByte
 PetaByte
 ExaByte
 ZettaByte
 YottaByte
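To get a feel for the scale of this ladder, here is a small sketch in Python. The helper name `humanize_bytes` is purely illustrative, and decimal (SI) prefixes with a factor of 1000 per step are assumed rather than binary ones.

```python
# Decimal (SI) prefixes: each step up the ladder is a factor of 1000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def humanize_bytes(n: float) -> str:
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(n)
    for unit in UNITS:
        # Stop at the first unit where the number is readable,
        # or at the last unit if the value is larger than the ladder.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(humanize_bytes(10**15))      # one petabyte
print(humanize_bytes(4.4 * 10**21))  # a few zettabytes
```

Each rung is three orders of magnitude, so going from TeraByte to ZettaByte is a factor of a billion.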
Figures
 "Digital Universe" according to the EMC/IDC study (2014): 4.4 zettabytes in 2013, 44 zettabytes in 2020
 All human speech ever spoken: 42 zettabytes (16 kHz, 16 bit)
 2013: speculation about the NSA data center put it at 1 YB; realistic estimates are 3-12 EB
 CERN / LHC data center passes 100 PB
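The growth implied by the two Digital Universe figures can be worked out directly. This is a minimal sketch that assumes steady compound growth between 2013 and 2020; the slide itself does not state a yearly rate.

```python
import math

start_zb, end_zb = 4.4, 44.0   # Digital Universe size in zettabytes (2013, 2020)
years = 2020 - 2013

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (end_zb / start_zb) ** (1 / years) - 1

# Time needed to double at that rate.
doubling_time = math.log(2) / math.log(1 + cagr)

print(f"{cagr:.1%} per year, doubling every {doubling_time:.1f} years")
```

Under that assumption a tenfold increase in seven years corresponds to the data volume doubling roughly every two years.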
Volume – Most famous quote
 2.5 Exabytes of Data Created each Day
(2,500,000,000,000,000,000 bytes) ≈ 1 ZB/Year
 (with 90% of World Data created in the last two
years)
 Source IBM CMO Study 2011
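The step from the daily figure to the yearly one is simple arithmetic, sketched here with decimal units (1 EB = 10^18 bytes, 1 ZB = 10^21 bytes):

```python
EB = 10**18  # bytes in one exabyte (decimal)
ZB = 10**21  # bytes in one zettabyte (decimal)

bytes_per_day = 2.5 * EB
bytes_per_year = bytes_per_day * 365

# Yearly volume in zettabytes: a little over 0.9, which the slide rounds to 1 ZB/year.
zb_per_year = bytes_per_year / ZB
print(f"{zb_per_year:.2f} ZB per year")
```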
Even more Vs
 Veracity
 Uncertainty of Data, Trustworthiness, Accountability
 Value
 Big Data only if it generates value
 Visibility
 Security, stitching together data from various
sources
 Validity
 Logic inference, Correlation vs. Causation
Hype or Evolution?
Old wine?
 OLTP, OLAP, Data Warehouse
- Around since the 1970s
- ACID (Atomicity, Consistency, Isolation, Durability)
- Based on SQL
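What the "A" in ACID buys you can be shown with SQLite from the Python standard library. This is only a sketch: the `accounts` table, the `transfer` helper and the balances are made up for illustration. The transfer credits first and debits second, so that a failing debit has something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above was rolled back.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling `transfer(conn, "alice", "bob", 500)` fails the balance check, and the already executed credit to `bob` is undone together with it.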
Big Data 15 years ago
[Diagram: separate OLTP systems (Orders, Articles, Receiving, etc.) feed a Data Warehouse, which serves Decision Support Systems (OLAP)]
Business Intelligence
Enter Big Data
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
http://www.gartner.com/newsroom/id/1731916
http://chucksblog.emc.com/chucks_blog/2011/06/2011-idc-digital-universe-study-big-data-is-here-now-what.html
“New” Big Data
 New Paradigm
 BASE (Basically Available, Soft state, Eventual consistency)
 New Data Model
 Data Lifecycle and Variability
 Data Linking and referential integrity
 New Analytics
 Real-time/streaming analysis, interactive
 Machine-learning
 New Infrastructure and Tools
 High Performance Computing, Storage, Network
 Multi-Provider Services Integration
 New Data Centric service models and security models
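The BASE idea from the list above can be illustrated with a toy simulation. This is a sketch under loud assumptions: two in-process "replicas", made-up logical timestamps, and last-write-wins as the merge rule. Real systems use a variety of conflict-resolution strategies.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that stores key -> (timestamp, value)."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Last write wins: keep the entry with the newest timestamp.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Anti-entropy step: pull every entry the other replica knows about.
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


a, b = Replica("a"), Replica("b")
a.write("cart", ["book"], timestamp=1)
b.write("cart", ["book", "pen"], timestamp=2)

stale = a.read("cart")   # replica a is available but has not seen the newer write
a.sync_from(b)
b.sync_from(a)           # after syncing, both replicas agree
```

Both replicas answer reads at all times (available), may briefly disagree (soft state), and converge once they have exchanged their writes (eventual consistency).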
[Figure: Big Data landscape map. Recoverable labels: Hadoop on Premise, Big Data Cluster Mgmt / Monitoring, NoSQL / NewSQL Databases, MPP Databases, Graph DB, Crowdsourcing, Transformation, Security, Storage, App Dev, Cross Infrastructure / Cloud Services, Analytics Platform, BI Platforms, For Business Analysts, Data Science Platform, Data Visualization, Unstructured Data, AI, Social Analytics, Analytic Services, Machine Learning, Location/People/Events, Search, Statistical Computing, Log Analytics, Crowdsourced, Real-Time, SMB, Framework, Query, Data Access, Collab. workflow, Stat. Tools, ML, Data Source, Sensors, Data Markets, Incubators, Cloud Deploy, Gov / Regulation, Education / Learning, Health, Finance, Human Capital, Legal, Marketing, Publisher Tools, Ad Optimization]
Big Data
 Hype AND Evolution
 Some Vendors use it to remarket “old” stuff
 Many “new” products/services
Tech Drivers
Drivers
 Vendors
 Hardware, Storage, Network, Software
 Business
 Mobile
 Social
 Customer Insights
 Technology
 Open Source Technology, Cloud Computing
The Elephant in the Room
Hadoop
- Hadoop is an Open Source "Big Data" framework
- Distributed storage (HDFS) and processing (MapReduce)
- Reliable, fault tolerant
- Horizontal scalability from a single node to thousands of cluster nodes
- Cost: $2,500 / TB vs. $250,000 / TB in data warehouses
MapReduce
 Programming Model/Framework for processing
large Data Sets
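The programming model is easiest to see in the classic word count. The sketch below runs all three phases (map, shuffle, reduce) in a single Python process. It is not Hadoop code; a real cluster runs the same phases spread across many nodes. All function names are illustrative.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data", "Big business"]))
```

Because each map call sees only one record and each reduce call only one key, both phases can be parallelised without the functions knowing about each other.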
NoSQL Databases
 Traditional RDBMS outdated for modern paradigms
- Big Data
- Connectivity
- Concurrency
- Diversity
- Cloud
The difference – SQL / Tables
The NoSQL difference
{
  _id: ObjectId("2341"),
  type: "Article",
  author: "Chris Boos",
  title: "Introduction AutoPilot",
  date: ISODate("2015-04-21T13:21:12.343Z")
},
{
  _id: ObjectId("2342"),
  type: "Book",
  author: "Roland Judas",
  title: "Big Data",
  isbn: "978-0-213434235-5-7"
}
Document-based
"User1", "Roland Judas"
"User2", "Chris Boos"
"User3", "Charly Brown"
Key-Value
Graph-Based
Columns
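The four models named on this slide can be mimicked with plain Python structures. This is a rough sketch for orientation, not how any particular database stores data. The `wrote` edge label and the helper names are assumptions added for illustration.

```python
# Document model: each record is a self-describing document with its own fields.
documents = [
    {"_id": "2341", "type": "Article", "author": "Chris Boos",
     "title": "Introduction AutoPilot", "date": "2015-04-21T13:21:12.343Z"},
    {"_id": "2342", "type": "Book", "author": "Roland Judas",
     "title": "Big Data", "isbn": "978-0-213434235-5-7"},
]

# Key-value model: an opaque value looked up by a unique key.
key_value = {"User1": "Roland Judas", "User2": "Chris Boos", "User3": "Charly Brown"}


def to_columns(docs):
    """Column view: one list per field, None where a document lacks the field."""
    fields = sorted({name for doc in docs for name in doc})
    return {name: [doc.get(name) for doc in docs] for name in fields}


def to_edges(docs):
    """Graph view: (author) -[wrote]-> (title) edges derived from the documents."""
    return [(doc["author"], "wrote", doc["title"]) for doc in docs]


print(to_columns(documents)["isbn"])
print(to_edges(documents))
```

The same two records look quite different in each shape, which is why the choice of model follows from the questions you want to ask of the data.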
Pros/Cons Hadoop / NoSQL
 Pro
 Highly flexible, agile, available, performant
 Scalable
 Modern, open technology with Commercial Support
 Support for very large datasets on commodity
hardware
 Cons
 Immature
 No standardization: schema-free means the application needs to know how to retrieve the data
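The last point is worth a small illustration. With no schema enforced by the store, the reading code has to cope with fields that may or may not be present. The function below is a hypothetical example of such "schema on read" handling; the field names follow the document examples from the earlier slide.

```python
def describe(doc: dict) -> str:
    """Render a document whose fields are not guaranteed by any schema."""
    kind = doc.get("type", "Unknown")
    title = doc.get("title", "<untitled>")
    if "isbn" in doc:
        # Only some documents (books) carry an ISBN.
        return f"{kind}: {title} (ISBN {doc['isbn']})"
    if "date" in doc:
        # Others carry a date; keep just the YYYY-MM-DD part.
        return f"{kind}: {title} (published {doc['date'][:10]})"
    return f"{kind}: {title}"
```

Every consumer of the data needs this kind of knowledge, which in an RDBMS would live once in the table definition.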
Even more tools
 Search/Index
 Business Intelligence
 Analytical Programming
 Visualisation
Machine Learning
Big Data – Big Business?
Big Data Market
 Big Data market projected for 2015: $125bn* (in comparison, Public Cloud: $95bn**)
 Big Funding
 Cloudera – $1.2bn
 MongoDB – $300m
 Hortonworks – $250m
 DataStax – $190m
 Birst – $130m
* According to Forbes.com, 2014/12/11, "6 Predictions for Big Data" (IDC Research)
** According to Forrester Research
Shares of Big Data Market
Vendors love Big Data
Vendors REALLY love Big Data!
Latest in Corporate Tech: In-Memory
 Oracle Exalytics
 SAP HANA
"Has SAP Bet The House With The Biggest Update to its ERP in Two Decades?"
http://www.forbes.com/sites/greatspeculations/2015/03/04/has-sap-bet-the-house-with-the-biggest-update-to-its-erp-in-two-decades/
Even more Sales!!!
Best Practices DWH / BI / Big Data
 Analyze problem / data / quality
 Data Cleaning
 Data quality initiatives
 Sync Business / IT
 Buy stuff
 Implement stuff
 Train users
 Use governance / strategic approaches
And the success?
 Through 2017, 60% of big data projects will fail to go
beyond piloting and experimentation and will be
abandoned.
 Through 2017, fewer than half of lagging
organizations will have made cultural or business
model adjustments sufficient to benefit from big
data.
 Through 2018, 90% of deployed data lakes will be
useless as they are overwhelmed with information
assets captured for uncertain use cases.
Gartner: Predicts 2015: Big Data Challenges Move From Technology to the Organization
Challenges
 Usage Scenarios
 Goals
 Skills
 Missing Data Scientists
 Need to understand the Math
 Technical
 Data Integration
 Privacy
 Main discussion in Germany
Syncing
 What's your opinion?
 Do you have experience with big vendors' offerings?
What's it all about?
 Data contains information of great business
value
 If you can extract those insights you can make
far better decisions
 Ultimately - Predicting the future
Common Use Cases
 Customer Insights
 Market Basket/Pricing optimization
 Fraud Detection / Security Analytics
 (Proactive) Monitoring
 Sensor Data (IoT)
 Data Warehouse Optimization
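To make one of these use cases concrete, here is a minimal sketch of market basket analysis: counting which items are bought together and deriving support and confidence for each pair. The function name, the threshold and the sample baskets are illustrative assumptions. Production systems typically use algorithms such as Apriori or FP-Growth on far larger data.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (a, b, support, confidence of a -> b) for frequently co-bought pairs."""
    n = len(baskets)
    # How many baskets contain each item, and each pair of items.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                  # share of baskets containing both items
        if support >= min_support:
            confidence = count / item_counts[a]  # share of a-baskets that also hold b
            rules.append((a, b, support, confidence))
    return rules


baskets = [["bread", "butter"], ["bread", "butter", "jam"], ["bread"], ["jam"]]
print(pair_rules(baskets))
```

Here bread and butter appear together in half of all baskets, and two out of three bread buyers also take butter.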
Understanding is important
[Diagram, axes: Understanding vs. Connectedness] Data → (understanding relations) → Information → (understanding patterns) → Knowledge → (understanding principles) → Intelligence/Wisdom
How do we get there?
Syncing
 Has anyone heard of the "Semantic Web" or "Ontology"?
 Does anyone have experience with or projects around ontologies?
Mapping the territory
 Enterprise Architecture (traditional)
 "Holistic" Approach
 Many "Best practices" and patterns
 Big Data Discovery
 Kind of Self-Service for Big Data
 Next Big Thing?
 Semantic Layer
 Should exist from BI implementation (proprietary)
 Or use modern approach “Linked Data”
Data + Semantics = Knowledge
Key is getting machine readable Data
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:admin="http://webns.net/mvcb/">
  <foaf:PersonalProfileDocument rdf:about="">
    <foaf:maker rdf:resource="#me"/>
    <foaf:primaryTopic rdf:resource="#me"/>
  </foaf:PersonalProfileDocument>
  <foaf:Person rdf:ID="me">
    <foaf:name>Roland Judas</foaf:name>
    <foaf:title>Mr.</foaf:title>
    <foaf:givenname>Roland</foaf:givenname>
    <foaf:family_name>Judas</foaf:family_name>
    <foaf:homepage rdf:resource="http://about.me/rjudas"/>
    <foaf:workplaceHomepage rdf:resource="http://arago.co"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Chris Boos</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
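"Machine readable" means a program can pull facts out of this document without guessing. The sketch below does so with Python's standard XML parser on a shortened copy of the FOAF profile. Note the assumption: it treats the file as plain XML, which is enough for this fixed layout, whereas a dedicated RDF library such as rdflib would handle RDF in general.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Shortened copy of the FOAF profile shown on the slide.
FOAF_DOC = """\
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:ID="me">
    <foaf:name>Roland Judas</foaf:name>
    <foaf:homepage rdf:resource="http://about.me/rjudas"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Chris Boos</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""

root = ET.fromstring(FOAF_DOC)
me = root.find("foaf:Person", NS)  # the top-level person described by the profile

name = me.findtext("foaf:name", namespaces=NS)
# rdf:resource is a namespaced attribute, addressed as {namespace}localname.
homepage = me.find("foaf:homepage", NS).get(f"{{{NS['rdf']}}}resource")
knows = [
    person.findtext("foaf:name", namespaces=NS)
    for person in me.findall("foaf:knows/foaf:Person", NS)
]

print(name, homepage, knows)
```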
Ontologies
 “A Data Model that represents Knowledge
as a set of concepts within a domain and
the relationships between these concepts”
 FOAF
 Schema.org
 DBpedia Ontology
 GoodRelations
 http://www.w3.org/wiki/Good_Ontologies
Triples
 Representation of facts
Subject                  Predicate          Object
Roland                   is a (has type)    Person
http://about.me/rjudas   rdf:type           foaf:Person
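A set of triples can already be queried with very little machinery. In the sketch below, the first triple is the one from the slide. The other two are simplified from the FOAF profile and are assumptions for illustration (a real store would use a resource, not a plain name, as the object of `foaf:knows`). The `match` helper is hypothetical and stands in for what a query language such as SPARQL does properly.

```python
TRIPLES = [
    ("http://about.me/rjudas", "rdf:type", "foaf:Person"),
    ("http://about.me/rjudas", "foaf:name", "Roland Judas"),
    ("http://about.me/rjudas", "foaf:knows", "Chris Boos"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples fitting the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "What type is this resource?"
print(match(TRIPLES, subject="http://about.me/rjudas", predicate="rdf:type"))
```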
From Triples to Graphs
Roland --is a--> Person
Roland --likes--> DLD
Roland --plays--> Songs
Legend: Vertex / Node (Roland, Person, DLD, Songs), Edge (is a, likes, plays)
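Turning triples into a graph is mostly a change of perspective: subjects and objects become nodes, predicates become labelled edges. A minimal sketch follows. The pairing of "likes" with DLD and "plays" with Songs is read off the diagram labels and is an assumption, as are the function names.

```python
from collections import defaultdict

triples = [
    ("Roland", "is a", "Person"),
    ("Roland", "likes", "DLD"),
    ("Roland", "plays", "Songs"),
]


def build_graph(triples):
    """Adjacency list: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
        graph.setdefault(obj, [])  # make sure pure target nodes exist too
    return dict(graph)


def neighbours(graph, node, edge_label=None):
    """Targets reachable from node, optionally restricted to one edge label."""
    return [
        target for label, target in graph.get(node, [])
        if edge_label in (None, label)
    ]


graph = build_graph(triples)
print(neighbours(graph, "Roland"))
print(neighbours(graph, "Roland", "likes"))
```

Once data from several sources uses the same node identifiers, the graphs join up by themselves, which is the point of the following slides.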
Famous Examples
A pragmatic Approach
From the Basement
Bringing Pieces together
Semantic Graphs
Big Data
APIs
http://github.com/arago/ogit
Semantic Data Platform
Visualization
Use Cases from/beyond the IT Department
 Ticket Statistics
 Provider Management
 Network Planning
 Comparing Architectures
 Forecasting Technological Trends
 Data Center Planning
 Application Migration
 Technical Analysis for Business Processes
 IT Organisation Insights
 User Ranking
The right Mindset
Semantics
Graphs
APIs
“New” Big Data Tools
www.autopilot.co www.graphit.co www.tabtab.co
Roland Judas
 Frankfurt, Germany
 Technical Evangelist, Product Manager at arago
 Organizer of Webmontag Frankfurt and Cloudcamp Frankfurt
 Mail: rjudas@arago.de
 Twitter:
 @rjudas (en)
 @rolandjudas (de)
 http://about.me/rjudas
Image References and Licenses
Facebook Datacenter https://www.flickr.com/photos/intelfreepress/ License CC BY 2.0
Winery https://www.flickr.com/photos/joceykinghorn/ License CC BY-SA 2.0
BI Dashboard https://www.flickr.com/photos/ctsi-global/ License CC BY-SA 2.0
Dollars https://www.flickr.com/photos/amagill/ License CC BY 2.0
Old Timer Truck: https://www.flickr.com/photos/ell-r-brown/ License CC BY 2.0
SQL Designer https://www.flickr.com/photos/ejk/ License CC BY-SA 2.0
Crystal Ball https://www.flickr.com/photos/frogman2212/ License CC BY 2.0
MapReduce https://www.flickr.com/photos/lkaestner/ License CC BY-SA 2.0
Foaf https://www.flickr.com/photos/dullhunk/ License CC BY 2.0
Linked Open Data Richard Cyganiak and Anja Jentzsch License CC BY-SA 3.0
Rear-View Mirror https://www.flickr.com/photos/labyrinthx-2/ License CC BY-SA 2.0
Servers-8055_13.jpg https://commons.wikimedia.org/wiki/User:Victorgrigas License CC BY-SA 3.0
Watson https://commons.wikimedia.org/wiki/User:Clockready License CC BY-SA 3.0
Wolfram Alpha https://www.flickr.com/photos/morville/ License CC BY 2.0
Social_Network_Visualization MartinGrandjean http://www.martingrandjean.ch/wp-content/
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

DLD Summer Workshop Big Data

  • 1. Big Data Workshop - DLD Summer 15 Big Data – Workshop DLD Summer 15 21/06/15, DLD Summer 15, @rjudas
  • 2. Understanding Big Data, and getting the right mindset
  • 3. Agenda: Syncing; Defining Big Data; Hype or Evolution; Tech Drivers; Big Data – Big Business?; What's it all about?; How do we get there?
  • 4. Syncing
  • 5. Syncing: Please tell us your opinion about Big Data; please tell us about your Big Data projects
  • 6. Defining Big Data
  • 7. Definition(s): "Big Data describes datasets so large they become very difficult to manage with traditional database tools." Big data is "data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures." "Very pragmatically, it's about building net-new analytic applications based on new types of data that (an organization) wasn't previously tracking."
  • 8. The 3 V's: Variety (tables, images, videos, XML, logs); Velocity (batch, streams, real-time); Volume (lots of xBytes)
  • 9. Variety: mix of data types; BLOBs and CLOBs; images, audio, videos, log files; semi-structured, unstructured; email, EDI messages, transaction logs, sensor data
  • 10. Velocity: crucial is the speed of the "feedback loop"; streaming data; complex event processing; from batch to (near) real-time; different lifetime
  • 11. Volume - Big? Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte, Zettabyte, Yottabyte
  • 12. Figures: "Digital Universe" according to the EMC/IDC study 2014: 4.4 zettabytes in 2013, 44 zettabytes in 2020; all human speech ever spoken: 42 zettabytes (16 kHz, 16 bit); 2013: speculation about the NSA data center holding 1 YB, realistic estimates 3-12 EB; CERN / LHC data center passes 100 PB
  • 13. Volume – Most famous quote: 2.5 exabytes of data created each day (2,500,000,000,000,000,000 bytes) ≈ 1 ZB/year (with 90% of world data created in the last two years). Source: IBM CMO Study 2011
  • 14. Even more V's: Veracity (uncertainty of data, trustworthiness, accountability); Value (Big Data only if it generates value); Visibility (security, stitching together data from various sources); Validity (logical inference, correlation vs. causation)
  • 15. Hype or Evolution?
  • 16. Old wine? OLTP, OLAP, data warehouse: around since the 1970s; ACID (Atomicity, Consistency, Isolation, Durability); based on SQL
  • 17. Big Data 15 years ago: OLTP (orders, articles, receiving, etc.); data warehouse; decision support systems (OLAP)
  • 18. Business Intelligence
  • 19. (no text on slide)
  • 20. Enter Big Data: http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation http://www.gartner.com/newsroom/id/1731916 http://chucksblog.emc.com/chucks_blog/2011/06/2011-idc-digital-universe-study-big-data-is-here-now-what.html
  • 21. "New" Big Data: new paradigm: BASE (Basically Available, Soft state, Eventual consistency); new data model: data lifecycle and variability, data linking and referential integrity; new analytics: real-time/streaming analysis, interactive, machine learning; new infrastructure and tools: high-performance computing, storage, network, multi-provider services integration, new data-centric service models and security models
  • 22. (no text on slide)
  • 23. [Figure: Big Data landscape. Labels: Hadoop on premise, cluster management / monitoring, NoSQL, NewSQL, MPP databases, graph DB, crowdsourcing, transformation, security, storage, app dev, cross infrastructure / cloud services, analytics platform, BI platforms for business analysts, data science platform, data visualization, unstructured data, AI, social analytics, analytic services, machine learning, location/people/events, search, statistical computing, log analytics, real-time, SMB, framework, query / data access, collaboration / workflow, statistical tools, data sources, sensors, data markets, incubators, cloud deploy, government / regulation, education / learning, health, finance, human capital, legal, marketing, publisher tools, ad optimization]
  • 24. Big Data: hype AND evolution; some vendors use it to remarket "old" stuff; many "new" products/services
  • 25. Tech Drivers
  • 26. Drivers: Vendors (hardware, storage, network, software); Business (mobile, social, customer insights); Technology (open source technology, cloud computing)
  • 27. The Elephant in the Room
  • 28. Hadoop: an open source "Big Data" framework; distributed storage (HDFS) and processing (MapReduce); reliable, fault tolerant; horizontal scalability from a single node to thousands of cluster nodes; cost $2,500/TB vs. $250,000/TB in data warehouses
  • 29. MapReduce: programming model/framework for processing large data sets
  • 30. NoSQL Databases: traditional RDBMS outdated for modern paradigms: Big Data, connectivity, concurrency, diversity, cloud
  • 31. The difference – SQL / Tables
  • 32. The NoSQL difference. Document-based: { _id: ObjectId("2341"), type: "Article", author: "Chris Boos", title: "Introduction AutoPilot", date: ISODate("2015-04-21T13:21:12.343Z"), }, { _id: ObjectId("2342"), type: "Book", author: "Roland Judas", title: "Big Data", isbn: "978-0-213434235-5-7" }. Key-value: "User1", "Roland Judas"; "User2", "Chris Boos"; "User3", "Charly Brown". Graph-based. Columns.
  • 33. Pros/Cons Hadoop / NoSQL. Pros: highly flexible, agile, available, performant; scalable; modern, open technology with commercial support; support for very large datasets on commodity hardware. Cons: immature; no standardization; schema-free means the application needs to know how to retrieve data
  • 34. Even more tools: search/index; business intelligence; analytical programming; visualisation
  • 35. Machine Learning
  • 36. Big Data – Big Business?
  • 37. Big Data Market: market projected at $125bn in 2015* (in comparison, public cloud: $95bn**). Big funding: Cloudera $1.2bn; MongoDB $300m; Hortonworks $250m; DataStax $190m; Birst $130m. * According to Forbes.com, 2014/12/11, "6 Predictions for Big Data" / IDC Research. ** According to Forrester Research
  • 38. Shares of Big Data Market
  • 39. Vendors love Big Data
  • 40. Vendors REALLY love Big Data! Latest in corporate tech: in-memory (Oracle Exalytics, SAP HANA). "Has SAP Bet The House With The Biggest Update to its ERP in Two Decades?" http://www.forbes.com/sites/greatspeculations/2015/03/04/has-sap-bet-the-house-with-the-biggest-update-to-its-erp-in-two-decades/
  • 41. Even more Sales!!!
  • 42. Best Practices DWH / BI / Big Data: analyze problem / data / quality; data cleaning; data quality initiatives; sync business / IT; buy stuff; implement stuff; train users; use governance / strategic approaches
  • 43. And the success? Through 2017, 60% of big data projects will fail to go beyond piloting and experimentation and will be abandoned. Through 2017, fewer than half of lagging organizations will have made cultural or business model adjustments sufficient to benefit from big data. Through 2018, 90% of deployed data lakes will be useless as they are overwhelmed with information assets captured for uncertain use cases. Source: Gartner, "Predicts 2015: Big Data Challenges Move From Technology to the Organization"
  • 44. Challenges: Usage scenarios (goals); Skills (missing data scientists, need to understand the math); Technical (data integration); Privacy (main discussion in Germany)
  • 45. Syncing: What's your opinion? Do you have experience with big vendors' offerings?
  • 46. What's it all about?
  • 47. (no text on slide)
  • 48. What's it all about? Data contains information of great business value; if you can extract those insights you can make far better decisions; ultimately: predicting the future
  • 49. Common Use Cases: customer insights; market basket / pricing optimization; fraud detection / security analytics; (proactive) monitoring; sensor data (IoT); data warehouse optimization
  • 50. (no text on slide)
  • 51. Understanding is important. [Figure: axes Connectedness and Understanding; Data, then Information (understanding relations), then Knowledge (understanding patterns), then Intelligence/Wisdom (understanding principles)]
  • 52. How do we get there?
  • 53. Syncing: Anyone heard about "Semantic Web" or "Ontology"? Anyone with experience or projects around ontologies?
  • 54. Mapping the territory: Enterprise Architecture (traditional): "holistic" approach, many "best practices" and patterns; Big Data Discovery: kind of self-service for Big Data, next big thing?; Semantic layer: should exist from the BI implementation (proprietary), or use the modern approach "Linked Data"
  • 55. (no text on slide)
  • 56. (no text on slide)
  • 57. (no text on slide)
  • 58. (no text on slide)
  • 59. Data + Semantics = Knowledge
  • 60. Key is getting machine readable Data <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:admin="http://webns.net/mvcb/"> <foaf:PersonalProfileDocument rdf:about=""> <foaf:maker rdf:resource="#me"/> <foaf:primaryTopic rdf:resource="#me"/> </foaf:PersonalProfileDocument> <foaf:Person rdf:ID="me"> <foaf:name>Roland Judas</foaf:name> <foaf:title>Mr.</foaf:title> <foaf:givenname>Roland</foaf:givenname> <foaf:family_name>Judas</foaf:family_name> <foaf:homepage rdf:resource="http://about.me/rjudas"/> <foaf:workplaceHomepage rdf:resource="http://arago.co"/> <foaf:knows> <foaf:Person> <foaf:name>Chris Boos</foaf:name> </foaf:Person></foaf:knows></foaf:Person> </rdf:RDF>
  • 61. Ontologies: "A Data Model that represents Knowledge as a set of concepts within a domain and the relationships between these concepts". Examples: FOAF; Schema.org; DBpedia Ontology; GoodRelations; http://www.w3.org/wiki/Good_Ontologies
  • 62. Triples: representation of facts as Subject, Predicate, Object. Example: Roland, is a (has type), Person; as RDF: http://about.me/rjudas rdf:type foaf:Person
  • 63. (no text on slide)
  • 64. (no text on slide)
  • 65. From Triples to Graphs. [Figure: Roland is a Person; Roland likes DLD; Roland plays Songs; each subject or object is a vertex / node, each predicate an edge]
  • 66. Famous Examples
  • 67. A pragmatic Approach: From the Basement
  • 68. Bringing Pieces together: Semantics; Graphs; Big Data; APIs
  • 69. http://github.com/arago/ogit
  • 70. (no text on slide)
  • 71. Semantic Data Platform
  • 72. Visualization
  • 73. Use Cases from/beyond the IT Department: ticket statistics; provider management; network planning; comparing architectures; forecasting technological trends; data center planning; application migration; technical analysis for business processes; IT organisation insights; user ranking
  • 74. The right Mindset: Semantics; Graphs; APIs; "New" Big Data tools
  • 75. www.autopilot.co www.graphit.co www.tabtab.co
  • 76. Roland Judas: Frankfurt, Germany; Technical Evangelist, Product Manager at arago; Organizer Webmontag Frankfurt, Cloudcamp Frankfurt; Mail: rjudas@arago.de; Twitter: @rjudas (en), @rolandjudas (de); http://about.me/rjudas
  • 77. Image References and Licenses: Facebook Datacenter: https://www.flickr.com/photos/intelfreepress/ (CC BY 2.0); Winery: https://www.flickr.com/photos/joceykinghorn/ (CC BY-SA 2.0); BI Dashboard: https://www.flickr.com/photos/ctsi-global/ (CC BY-SA 2.0); Dollars: https://www.flickr.com/photos/amagill/ (CC BY 2.0); Old Timer Truck: https://www.flickr.com/photos/ell-r-brown/ (CC BY 2.0); SQL Designer: https://www.flickr.com/photos/ejk/ (CC BY-SA 2.0); Crystal Ball: https://www.flickr.com/photos/frogman2212/ (CC BY 2.0); MapReduce: https://www.flickr.com/photos/lkaestner/ (CC BY-SA 2.0); Foaf: https://www.flickr.com/photos/dullhunk/ (CC BY 2.0); Linked Open Data: Richard Cyganiak and Anja Jentzsch (CC BY-SA 3.0); Rear-View Mirror: https://www.flickr.com/photos/labyrinthx-2/ (CC BY-SA 2.0); Servers-8055_13.jpg: https://commons.wikimedia.org/wiki/User:Victorgrigas (CC BY-SA 3.0); Watson: https://commons.wikimedia.org/wiki/User:Clockready (CC BY-SA 3.0); Wolfram Alpha: https://www.flickr.com/photos/morville/ (CC BY 2.0); Social_Network_Visualization: MartinGrandjean http://www.martingrandjean.ch/wp-content/
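
The MapReduce slide describes the programming model only in one line. As a rough illustration, here is a minimal single-process sketch of the map, shuffle and reduce steps using the classic word-count example. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are made up for this sketch, and a real cluster would run the map and reduce calls in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce one after another in a single process."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, only for illustration.
    print(word_count(["big data big business", "data is the new oil"]))
```

The point of the split is that `map_phase` and `reduce_phase` only ever see one record or one key at a time, which is what lets a framework distribute them across many nodes.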

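The volume figures on the "Most famous quote" and "Figures" slides can be sanity-checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, a 365-day year, and steady compound growth between 2013 and 2020; the function names are made up for illustration.

```python
EXABYTES_PER_ZETTABYTE = 1000  # decimal (SI) units: 1 ZB = 1000 EB


def zettabytes_per_year(exabytes_per_day, days=365):
    """Convert a daily data volume in EB into a yearly volume in ZB."""
    return exabytes_per_day * days / EXABYTES_PER_ZETTABYTE


def annual_growth_factor(start, end, years):
    """Constant yearly factor that turns `start` into `end` after `years` years."""
    return (end / start) ** (1 / years)


if __name__ == "__main__":
    # 2.5 EB/day works out to roughly 0.91 ZB/year, i.e. the "≈ 1 ZB/year" on the slide.
    print(f"{zettabytes_per_year(2.5):.4f} ZB/year")
    # 4.4 ZB (2013) to 44 ZB (2020) is a tenfold increase over 7 years.
    print(f"growth factor per year: {annual_growth_factor(4.4, 44, 7):.3f}")
```

Under these assumptions the tenfold increase over seven years corresponds to the digital universe growing by a bit under 40% per year.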
Editor's notes

  1. http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017
  2. http://www.webopedia.com/TERM/B/big_data.html http://www.forbes.com/sites/edddumbill/2014/05/07/defining-big-data/ http://www.informationweek.com/big-data/big-data-analytics/big-data-a-practical-definition/d/d-id/1111290
  3. CEP: process streams in real time and react to them, e.g. financial trading, fraud detection, (process) monitoring
  4. https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation Monocausation
  5. Top players. Commercial: Microsoft, Hyperion (Oracle), Cognos (IBM), Business Objects (SAP). Open source: Pentaho, Jedox. ACID: computing principle from the 1970s; transaction safety; isolation: concurrency control
  6. Reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics. Dashboards, drill-down, data mining, also predictive. Challenge: unstructured data
  7. Google published the MapReduce paper at the end of 2004 (the GFS paper had appeared in 2003). Doug Cutting, then an engineer at Yahoo, implemented these ideas as Hadoop. Hadoop has been a top-level Apache project since 2008
  8. Cassandra, MongoDB, HBASE
  9. Key/value: e.g. Redis, MemcacheDB. Column: e.g. Cassandra, HBase. Document: e.g. MongoDB, Couchbase. Graph: e.g. OrientDB, Neo4j
  10. Key/value: e.g. Redis, MemcacheDB. Column: e.g. Cassandra, HBase. Document: e.g. MongoDB, Couchbase. Graph: e.g. OrientDB, Neo4j
  11. Solr: enterprise-grade search/index server. Elasticsearch: search/index server. Pentaho: data integration / business / Big Data analytics. Jaspersoft: reporting/analytics. R: statistical programming language (Revolution Analytics was acquired by Microsoft in 2015). Python: programming language; pandas: data analysis library. Gephi: graph visualization tool. D3: JavaScript library for visualization. More tools at http://www.datamation.com/data-center/50-top-open-source-tools-for-big-data-1.html
  12. Apache Mahout: highly scalable machine learning framework based on Hadoop. Apache Spark: cluster-computing framework adding SQL, R queries and ML to Big Data stores/databases. Azure ML: Microsoft machine learning as a service. Resources: https://www.udacity.com/course/intro-to-machine-learning--ud120 http://alex.smola.org/drafts/thebook.pdf
  13. http://www.forbes.com/sites/gilpress/2014/12/11/6-predictions-for-the-125-billion-big-data-analytics-market-in-2015/ For comparison: public cloud $95bn in 2015 according to Forrester Research. http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017
  14. Limited functionality, expensive: 1 TB / $1m/yr / cloud $0.6m. So far, it appears to be limited to 1 TB in size and to analytic workloads, and it doesn't support mission-critical scenarios, but it's fair to assume that SAP is working on extending this. https://blogs.saphana.com/2014/03/06/a-no-brainer-the-tco-of-hana-cloud-platform-vs-on-premise/
  15. According to Gartner
  16. Customer insights: behavioral analytics, customer segmentation
  17. Data: symbols. Information: data that are processed to be useful; provides answers to "who", "what", "where", and "when" questions. Knowledge: application of data and information; answers "how" questions. Understanding: appreciation of "why". Wisdom: evaluated understanding.
  18. http://www.systems-thinking.org/dikw/dikw.htm
  19. Gartner Says Big Data Disruptions Can Be Tamed With Enterprise Architecture: http://www.gartner.com/newsroom/id/1986015 Data wrangling, analytical latencies: 1. data access; 2. data preparation; 3. model development; 4. execution; 5. implementation; 6. model audit & update. This is where the rubber meets the road: speed = value
  20. http://xmlns.com/foaf/spec/ - Wordpress, Identi.ca http://schema.org/ - Google, Yahoo, Bing, Yandex http://dbpedia.org/ontology/ http://www.heppnetz.de/projects/goodrelations/ - Google, Yahoo, Sears, Bestbuy
  21. RDF: Resource Description Framework. OWL: Web Ontology Language
  22. Examples: Facebook (undirected graph); WWW (directed graph); intercity road network (weighted, undirected graph)
  23. http://github.com/arago/ogit
  24. https://graphit.co/ogit/graph.php?dataset=ontology https://cassandra3.tech.arago.de:8443/_static/explorer/index.html
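
The step from triples to graphs (slides 62 and 65, notes 21 and 22) can be sketched in a few lines. The example below is only an illustration in plain Python, not RDF tooling and not the OGIT data model: the triples are one reading of the flattened slide diagram (Roland is a Person, likes DLD, plays Songs), and `build_graph` and `objects` are hypothetical helper names.

```python
from collections import defaultdict

# Facts as (subject, predicate, object) triples.
# Assumption: this is how the diagram on the "From Triples to Graphs" slide reads.
TRIPLES = [
    ("Roland", "is a", "Person"),
    ("Roland", "likes", "DLD"),
    ("Roland", "plays", "Songs"),
]


def build_graph(triples):
    """Build a directed, labelled graph: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
        graph.setdefault(obj, [])  # every object is a node too, even without outgoing edges
    return dict(graph)


def objects(graph, subject, predicate):
    """Follow all edges with the given label that start at `subject`."""
    return [target for label, target in graph.get(subject, []) if label == predicate]


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print(sorted(graph))                      # all vertices / nodes
    print(objects(graph, "Roland", "likes"))  # what does Roland like?
```

Subjects and objects become vertices and predicates become edge labels, so answering a question such as "what does Roland like?" turns into following edges rather than joining tables.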