This is the deck I presented at the Big Data eXposed event, September 30, at the David Intercontinental, Israel.
In this session I'll take the audience on a short trip through eXelate's cloud and present three big data related challenges and how we faced them.
The digital marketing industry includes a huge ecosystem: publishers, data providers, data management platforms, ad networks, marketing agencies and marketers. Data is the fuel driving this industry. eXelate takes the raw data coming online from publishers' sites, enhances it with data coming from various offline providers, runs a set of deterministic rules and analytical models to mark the users browsing the internet with specific attributes, and sells it online to the marketers (via agencies and DMPs, data management platforms).
The basic business entity we are selling is a segment. A segment is a group of individuals (part of a larger population) who share similar attributes. Different segments have different attributes and may be defined as a target audience that is interesting for marketers. Marketers are always seeking the most relevant target audience in order to lift their sales.

We can identify three major categories of segments. The first is demographic segments, which include demographic characteristics such as age, gender, income, education, employment level etc. These are quite static characteristics that barely change. The second category includes behavioral aspects, like domains of interest (sport, wines, travel, shopping etc.). The third category is intent. As the name suggests, it implies that the person browsing the internet has an immediate intent (like purchasing a specific item). These segments are the most relevant for marketers, since an advertisement related to the user's intent will most likely be the most efficient one from the marketer's point of view and the most relevant for the user.

eXelate "marks" the user browsing the publishers' sites with segments and sells them online to the marketers / DMPs so they can direct their ads to the targeted audience.
Our journey starts in the browser. A user browses to one of our partners' (publishers') sites, like HomeAway, Kayak and Pronto. The site includes a tag (pixel) with a reference to eXelate's URL. In most cases, the publisher adds details to the URL (e.g. publisher id, some details he knows about the user, tags representing user activities on the site) and the data is processed by eXelate.

This is the place to say that eXelate is very sensitive to and aware of privacy issues. eXelate does not keep any attribute identifying the user. The user is represented as an anonymous entity belonging to some groups (segments). We are not interested in the specific user's details but only in the fact that he is part of a large group of individuals with similar attributes.
What is now happening inside the cloud? The request is processed within 200 ms to generate a response: a redirect to up to 5 buyers' URLs. How do we identify those buyers? First we process the individual event information by extracting parameters from the URL and the user's browser agent. From the cookie we can tell whether we have previous information on the user or should generate a new entity representing him. We add historical data and apply a set of rules and analytical models to the data to generate the segments. After the user is "marked" with segments, we run a buyer matching process to select the 5 most relevant buyers and include their URLs in the redirect response.

We have 5 billion events generating 27 GB of data every day. We have a total of 850 million unique users at any given moment, and their data spans 14 TB of storage. We have more than 500,000 rules and 20,000 segments that we generate and sell to more than 100 media platforms.
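The per-event flow above can be sketched as follows. This is a minimal Python sketch for illustration only: the names (`handle_event`, the dict-backed `USER_STORE`) are my assumptions, not eXelate's actual code, and the real pipeline applies rules and models far richer than the tag copy shown here.

```python
# Illustrative sketch of the per-event serving flow (not production code).
from dataclasses import dataclass, field

@dataclass
class User:
    uid: str
    segments: set = field(default_factory=set)
    history: list = field(default_factory=list)

USER_STORE = {}  # stand-in for the real user storage

def load_or_create_user(cookie_uid):
    # The cookie tells us whether this is a known user
    # or a new anonymous entity we should create.
    if cookie_uid not in USER_STORE:
        USER_STORE[cookie_uid] = User(uid=cookie_uid)
    return USER_STORE[cookie_uid]

def handle_event(cookie_uid, url_params, buyers, max_buyers=5):
    user = load_or_create_user(cookie_uid)
    user.history.append(url_params)
    # Rules and models would mark the user with segments here;
    # copying the publisher's tags stands in for that step.
    user.segments.update(url_params.get("tags", []))
    # Buyer matching: pick up to 5 buyers interested in these segments.
    matched = [b for b in buyers if b["wants"] & user.segments][:max_buyers]
    return [b["url"] for b in matched]  # redirect targets
```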
So, we do all the above within a short time of 200 ms. We have 3 challenges we would like to share:
Relevancy: how do we identify the data most relevant to the buyer?
Access time: how do we access a single user's info in the shortest time?
On-demand analytics: how do we process huge data sets on demand to provide meaningful insights to the marketer?
We process 5 billion events per day, which produce a lot of data, but a lot of data means a lot of noise. Smart data, the actionable data, is the signal we are looking for. In the previous slide, intent combined with user characteristics (segments) is a good example of smart data: a well targeted ad will be the most relevant for this user. More than that, it will be most relevant now, not tomorrow and not even in the next hour.
The problem is that our data set is not static: millions of users are browsing the web, executing a lot of actions every hour, and we can learn new things about them. The goal is to mark as many users as we can in segments, to cover most of the target audience. In the classical approach, we would just take a snapshot of the daily data once a day or every few hours, run the rules and analytical models to score the users, and generate the groups of segments. The problem is that we will not meet most of the users again: a user may be active for the next minutes or hours, maybe again after a week. After a month it is most likely that we will not see him again (actually we will see him, identified as another user). The relevancy of the data drops rapidly. We need to perform scoring and segmentation in real time, since the actionable data should be available for the advertiser in real time.
The first step is to generate the analytical models. Our data scientists extract the data from the database; we use Netezza as our data warehouse. This data is event-centric and represents all the events generated by the users over the last 90 days. The data scientists build the models using R and run them on the Amazon cloud. After validating the models, they implement them as Java packages, which we deploy to our eXtream cluster to perform scoring and segmentation in real time.
Inside the cluster, we run the following sequence on every event (URL call):
We run basic rules (defined in an XML document) to generate demographic segments.
Having the event info and the user history, we run association rules: deterministic rules that can be implemented by pattern matching. We use JBoss Drools, which implements the RETE algorithm, the most efficient one for this purpose.
Finally, we apply the analytical models that do the scoring and segmentation based on advanced algorithms like linear regression and collaborative filtering.
In the near future, we will add real-time learning to generate the analytical models on the fly.
We have over 500,000 rules and quite complex algorithms, and still growing. Can we do all that within 200 ms?
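The three-stage sequence might look like the following toy sketch. It is Python for brevity only; in production the basic rules live in XML, the association rules run in Drools, and the models are Java packages. Every rule, field name and threshold below is invented for illustration.

```python
# Illustrative three-stage segmentation sequence (all rules are made up).

def basic_rules(event):
    # Stage 1: demographic segments from simple attribute checks.
    segs = set()
    if event.get("age", 0) >= 18:
        segs.add("adult")
    return segs

def association_rules(event, history):
    # Stage 2: deterministic pattern matching over event + history
    # (Drools/RETE territory in the real system).
    segs = set()
    pages = {e["page"] for e in history} | {event["page"]}
    if {"flights", "hotels"} <= pages:
        segs.add("travel_intent")
    return segs

def analytical_models(event, history):
    # Stage 3: placeholder for scoring models
    # (e.g. linear regression, collaborative filtering).
    return {"frequent_visitor"} if len(history) > 10 else set()

def segment(event, history):
    # Union of all three stages marks the user.
    return (basic_rules(event)
            | association_rules(event, history)
            | analytical_models(event, history))
```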
Our solution is Continuous Incremental Segmentation. We separated the serving layer from the rules computation (segmentation) layer. Each layer contains dedicated hardware most suitable for its role. Communication between the layers is implemented using 0MQ, a blazing fast messaging infrastructure. A single message, a request for segmentation, is sent from the serving cluster to the segmentation cluster. Models and rules run in parallel to generate segments, and results are sent asynchronously to the calling process. Each rule and model sends its results independently of the other models. The serving cluster collects all the results within a specific time frame and builds the response. Results that do not arrive within this time frame are not included in the response. The segments and intermediate results are stored in the user storage.

Why is it continuous? Because the same process will be performed again for the same user on his next action on the page (or even on another page), which will result in a call to eXelate serving. Why is it incremental? Because the next run will include only those models and rules that were not included in the previous response. Eventually, the process spans several iterations (calls) to generate as many segments as possible.
We saw earlier that we process over 5 billion events per day, and we need fast access (both read and write) to the storage hosting the users' information. Moreover, this data must be available to any machine in the cluster, and even to machines in another data center.
What are the requirements for the user storage? For each user we save the segments, delivery information (to whom and when we delivered this user's info as part of a segment), and intermediate results. A user object's size may vary between a few dozen KB and a few hundred KB. We hold 850 million unique users at any given moment. We need fast access time (read and write), within a few milliseconds. We need the user object to be available to every machine in the cluster and across data centers. We examined a few NoSQL databases: Couchbase and MongoDB. Eventually we selected Aerospike.
Aerospike is a key-value NoSQL DB:
Supports billions of objects, works well in a cluster
Above 500K TPS; we gain 1-2 ms access time
Smart data eviction policy
Optimized for SSD, indexed in RAM
Object partitioning by namespaces
Cross data center replication
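The per-user record described above can be pictured with the following stand-in. A plain dict replaces the Aerospike cluster here, and the field names are illustrative assumptions, not the actual schema.

```python
# Toy model of the key-value user record: key = (namespace, user id),
# value = the three kinds of data we keep per user.
store = {}

def put_user(namespace, user_id, segments, delivery, intermediate):
    store[(namespace, user_id)] = {
        "segments": segments,          # segments the user belongs to
        "delivery": delivery,          # to whom / when this user was delivered
        "intermediate": intermediate,  # partial rule/model results
    }

def get_user(namespace, user_id):
    # Returns the user record, or None for an unknown user.
    return store.get((namespace, user_id))
```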
We have a 9-node Aerospike cluster in each of our 4 serving data centers. Aerospike replicates the data across data centers within minutes. Some of the Aerospike challenges we faced:
It was (and still is) a cutting-edge technology. We encountered some instabilities and bugs, but we were actually their beta site and they provided very good support. Version 3 will be released soon, and it looks like they are becoming a more stable and mature product.
Very basic management and monitoring capabilities compared to other products.
A small install base and ecosystem, resulting in a small knowledge base.
Requires specific hardware (SSD), which is not yet commodity hardware.
The background processes generate a lot of valuable data that can provide meaningful insights to another kind of client. This would be a marketing manager, short on time, working on several campaigns and needing to take marketing decisions quickly. For these customers we provide optiX.
OptiX is an interactive application that helps the marketer understand the market and get insights regarding the audience he is targeting. OptiX provides this information based on the huge data cloud in our data warehouse. In this sample we can see a screen providing on-demand calculation of the sizes of the segments selected by the user. These are not simple counts; they include aggregation and de-duplication on the fly.
Here is a more complicated screen with even more numbers to calculate on the fly. The numbers are calculated on the fly; we can't do it ahead of time, since the calculation is based on the user's parameter selection. The challenge is to calculate it fast on big data sets. This could be a problem easily solved by a well-indexed relational database; the problem is that an RDBMS is not capable of processing this amount of data quickly and efficiently.
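The aggregation-plus-de-duplication part is the crux: a user who belongs to several of the selected segments must be counted once, so the per-segment counts cannot simply be summed. A minimal sketch (the in-memory sets are illustrative; the real data lives in the data warehouse):

```python
# Unique audience size across selected segments: union, then count.
def unique_audience(segment_members, selected):
    """segment_members: {segment_name: set of user ids}."""
    audience = set()
    for seg in selected:
        audience |= segment_members.get(seg, set())  # de-duplicates overlaps
    return len(audience)
```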
We selected a solution based on a search engine. A search engine is optimized to count word instances over a large set of documents. This is very similar to our use case: in the on-demand queries, we are not interested in the data itself but in the number of instances (users) who share the same segments. We selected Elasticsearch for this task. Elasticsearch is an open source, fast search engine based on Apache Lucene. We built a 30-node Elasticsearch cluster on the Amazon cloud. The aggregator collects the data from the Netezza data warehouse and reorganizes it in a structure optimized for our search (the data in Netezza is event-centric, while we need it to be user-centric in Elasticsearch). The aggregator generates data files and loads them to Amazon S3. Another process, the loader, loads them into Elasticsearch, where the data is indexed. When a query is issued from the OptiX application server, it is translated by the reporter machine into a set of queries running in the ES cluster. ES runs the queries in parallel and aggregates the results (a la Hadoop style), and the result is returned to the application server within 1 second.
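An on-demand segment count of this kind maps naturally onto the Elasticsearch query DSL. The sketch below builds a request body using a `terms` filter plus a `cardinality` aggregation to de-duplicate users on the ES side; the field names (`segments`, `user_id`) are my assumptions about the user-centric index, not the actual mapping.

```python
# Hypothetical ES request body: count distinct users in selected segments.
def segment_count_query(selected_segments):
    return {
        "size": 0,  # we only need counts, not the matching documents
        "query": {"terms": {"segments": selected_segments}},
        "aggs": {
            # cardinality approximates the number of distinct users
            "unique_users": {"cardinality": {"field": "user_id"}}
        },
    }
```

The body would be posted to the index's `_search` endpoint; the count comes back under `aggregations.unique_users.value`.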
Data relevancy:
Real-time scoring
Parallel processing
Split processing over time

Big data access time:
Front-end, replicated Aerospike cluster

On-demand analytics:
Change your schema to optimize query time
Move processing from the querying phase to the loading phase
Trade-off: Space + Processing -> Performance