Big Data & Cloud Expo


This deck discusses big data and cloud computing. It defines big data as data sets so large and complex that they are difficult to process with traditional database tools, and cites a 2011 IDC study projecting 44x growth in data volume, to 35.2 zettabytes per year by 2020. It then presents examples, including Walmart and Target, of companies using big data analytics in the cloud to gain business insight from customer data.

Big Data & Cloud

Infinite Monkey Theorem

CloudCon Expo & Conference
October, 2012

First
What is Big Data?

“data sets so large and complex that it becomes
difficult to process using on-hand database
management tools.”

10/19/2012 Infochimps Confidential 2

Data Volume
Growing 44x

2010 = 1.2 zettabytes/yr
2020 = 35.2 zettabytes/yr

Source: 2011 IDC Digital Universe Study
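
As a rough check on these figures, the short Python sketch below computes the growth ratio and the implied compound annual growth rate from the two endpoints shown on the slide. It assumes steady year-over-year growth, which the slide does not claim, and the function name is only illustrative. The two endpoints alone give a ratio of roughly 29x, so the 44x headline presumably uses a different baseline than the 2010 value shown.

```python
def growth_summary(start_zb: float, end_zb: float, years: int) -> tuple:
    """Return (overall growth ratio, implied compound annual growth rate)."""
    ratio = end_zb / start_zb
    # Assumption: growth compounds at a constant rate every year.
    cagr = ratio ** (1 / years) - 1
    return ratio, cagr


# Endpoints as printed on the slide (zettabytes per year).
ratio, cagr = growth_summary(1.2, 35.2, 2020 - 2010)
print(f"growth ratio: {ratio:.1f}x, implied annual growth: {cagr:.1%}")
```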

Enterprise Data Warehouse
[Diagram: a Request enters the Parsing Engines and an Answer is returned; a BYNET Interconnect connects the Parsing Engines to a row of AMP nodes.]

PARC | 4

Big Data Warehouse
[Diagram: an Analytic Request (Search, Recommend, Rank, Score, Next-Best-Action) goes to the Master (Name Node, Job Tracker), which returns an Answer; an Ethernet Interconnect connects the Master to a row of Slaves (Task Tracker, Data Node) holding Semi-Structured Data.]
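
The Name Node, Job Tracker, Task Tracker, and Data Node labels on this slide are Hadoop components. As a minimal sketch of the split-then-merge style of processing such a cluster performs, the Python example below counts words across several data splits. It runs in a single process and does not use Hadoop; the function names and sample lines are hypothetical and are not taken from the deck.

```python
from collections import Counter
from functools import reduce


def map_split(lines):
    """Map step: count words within one split (stands in for one worker)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_job(splits):
    """Reduce step: merge the per-split counts into a single answer."""
    return reduce(lambda a, b: a + b, (map_split(s) for s in splits), Counter())


# Hypothetical sample data: each inner list plays the role of one data split.
splits = [
    ["big data and cloud", "cloud computing"],
    ["big data warehouse"],
    ["data node data node"],
]
print(run_job(splits).most_common(3))
```

Each split is processed independently, so the map step could run on separate machines; only the small per-split counts need to be brought together in the reduce step.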


[Diagram: a chart with Batch to Real Time on one axis and Large Enterprise to Small Enterprise on the other. Labels: Traditional Operational Application Ecosystem; Deployment in Public/Private Cloud; Analytic Appliances; Toolset Integration; Traditional Decision Support; Hardened.]


Next
Infinite Monkey Theorem (2):

an infinite number of monkeys hitting
keys on a typewriter for a period of time
will almost surely type a given text, such
as Shakespeare's Hamlet.
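
To make "almost surely" concrete, here is a minimal Python sketch under simplifying assumptions that are not in the slides: a hypothetical 27-key typewriter (26 letters plus space), keys hit uniformly at random, and independent attempts that are each as long as the target text. It shows the chance of producing a short target at least once approaching 1 as the number of attempts grows.

```python
import math

ALPHABET_SIZE = 27  # assumption: 26 letters plus space, all equally likely


def prob_at_least_once(target: str, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries types `target`."""
    p_single = (1 / ALPHABET_SIZE) ** len(target)
    # 1 - (1 - p)^n, computed stably for tiny p via log1p/expm1.
    return -math.expm1(attempts * math.log1p(-p_single))


for attempts in (10**3, 10**6, 10**9):
    print(attempts, prob_at_least_once("hamlet", attempts))
```

The probability of a single success shrinks exponentially with the length of the target, which is why the theorem needs unbounded attempts, the role the deck assigns to unlimited computational power.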


“unexperienced and unobservable”
based on
“real experiences and real observations”



infinite number of monkeys = unlimited computational power
keys on a typewriter = processing data
almost surely = statistically significant
Shakespeare's Hamlet = insights


#thisischimpy


Problem
“Little Data For Business Users”


“Big Data For Business Users”


Reduce
Friction

[Diagram labels: Executive, Data]


#thisisreallygood


unlimited computational power
[Cloud types shown: Public, Private, Virtual Private]


Analysts use these images to count shipping containers coming off ships in California, which gives them a sense of overall US import activity.


data processing
[Cloud types shown: Public, Private, Virtual Private]


Walmart


Target


[Diagram labels]
Sources: Images, Docs, Text, Web Logs, Social, Sensors, GPS
Business Transactions & Interactions: Web, Mobile, CRM, ERP, SCM…
Stores: SQL, NoSQL, NewSQL, EDW, MPP
Business Intelligence & Analytics: Dashboards, Reports, Visualization…


statistically significant
[Cloud types shown: Public, Private, Virtual Private]


#lotsofdata + #simplealgorithms


Quarterly Revenue Prediction
[Diagram inputs: Cars In Lot, News Text, Web Pricing, Social Sentiment, Weather Sensors, Local Employment]


insights
[Cloud types shown: Public, Private, Virtual Private]


[Diagram labels]
New Media sources: Gnip Powertrack, Gnip EDC, Moreover Metabase
Traditional Media sources: TV Transcription, Radio Transcription, Print Transcription
Platform: In-Motion Data Delivery Service, NoSQL, APIs
Application: Sentiment Listening Application
Users: Data Scientist, App Developer, Business Users, IT Staff

unlimited computational power | processing data | statistically significant | insights


#1BigDataCloudService


#inspiredbyAvinashKaushik

