Beolink.org
Data replication

Fabrizio Manfredi Furuholmen

FOSDEM 2014

Agenda
•  Introduction
•  Overview
•  Theorem
•  Common Pattern
•  Implementation
•  Filesystem
•  RDBMS
•  NoSQL
•  Framework
•  Example

Data Replication
http://blog.open-e.com/in-a-nutshell-data-replication-snapshots-and-backup/

Data Replication
http://www.dreamstime.com/stock-images-cloud-computing-scalability-reliability-background-concept-word-image34898574

Introduction

World Connection

Main Problem
VS

Main Problem

CAP theorem
According to Brewer's CAP theorem, it is impossible for any distributed computer system to simultaneously provide all three of Consistency, Availability and Partition Tolerance.

You can't have all three at the same time and still get acceptable latency.

CAP
ACID (RDBMS)
Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
Consistent: A transaction cannot leave the database in an inconsistent state.
Isolated: Transactions cannot interfere with each other.
Durable: Completed transactions persist, even when servers restart.
-  Strong consistency for transactions is the highest priority
-  Pessimistic
-  Complex mechanisms

BASE (NoSQL)
Basically Available
Soft state
Eventual consistency
-  Availability and scaling are the highest priorities
-  Weak consistency
-  Optimistic
-  Best effort
-  Simple and fast

Data Distribution
Business decision

Start with some Algorithms
Data Distribution
Replication:
•  Data Placement
•  Data Consistency
•  System Coordination
•  Data Transmission

Data Placement
Better distribution = partitioning
Parallel operation = parallel streams / multi-core

Data Placement

Data placement by HASH
It isn't rocket science!
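As a minimal sketch of placement by hash, assuming a fixed number of nodes and a helper named `node_for` (a hypothetical name, not from the talk), a key can be mapped to a node with a stable hash and a modulo. The sketch also counts how many keys change node when the cluster grows, which is the weakness that consistent hashing is meant to reduce.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Map a key to a node index using a stable hash (not Python's salted hash())."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


keys = [f"object-{i}" for i in range(10_000)]
before = {k: node_for(k, 4) for k in keys}   # placement on 4 nodes
after = {k: node_for(k, 5) for k in keys}    # placement after adding a 5th node
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys change node when going from 4 to 5 nodes")
```

With modulo placement a key keeps its node only when its hash gives the same remainder for both node counts, so roughly four keys out of five are remapped here.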
Data Distribution
http://www.cs.rutgers.edu/~pxk/417/notes/23-lookup.html
•  Consistent hash
•  Chord
•  Space-based / multi-dimensional

Data placement
http://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/
•  Vnode-based
•  Proximity-based
•  Replication
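The consistent hash and vnode-based placement named on these slides can be sketched roughly as follows. This is an illustration under assumptions, not the code of any particular system: the `HashRing` class, the 64 virtual nodes per physical node, and the MD5 token function are all choices made for the example.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Position of a string on the ring (stable across runs)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node owns several tokens (virtual nodes) on the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_token(f"{node}#{i}"), node))

    def replicas(self, key, n=3):
        """Walk clockwise from the key's token and collect n distinct nodes."""
        start = bisect.bisect(self._ring, (_token(key), ""))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"object-{i}" for i in range(10_000)]
before = {k: ring.replicas(k, 1)[0] for k in keys}
ring.add("node-e")
after = {k: ring.replicas(k, 1)[0] for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys moved after adding node-e")
print(ring.replicas("object-1", 3))
```

Only keys whose first clockwise token now belongs to the new node move, so the remapped share is close to the new node's share of the ring rather than most of the key space.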
Data Consistency
http://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/
To avoid a full ACID implementation while still guaranteeing consistency, some solutions leave ownership of the algorithm to the client:
-  Read and write quorum
-  Write quorum, read all
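A client-driven quorum can be sketched as below. The assumptions are loud ones: `QuorumStore` is a hypothetical in-memory stand-in for N replicas, versions come from a single counter instead of a real clock or vector, and replicas never fail. The point it illustrates is only the overlap rule: when R + W > N, every read set shares at least one replica with the latest write set.

```python
import random


class QuorumStore:
    """Toy model: n replicas, each write goes to w of them, each read asks r of them."""

    def __init__(self, n=3, r=2, w=2, seed=0):
        self.n, self.r, self.w = n, r, w
        self.replicas = [dict() for _ in range(n)]  # key -> (version, value)
        self._version = 0                            # assumption: one global version counter
        self._rng = random.Random(seed)

    def write(self, key, value):
        self._version += 1
        for i in self._rng.sample(range(self.n), self.w):
            self.replicas[i][key] = (self._version, value)

    def read(self, key):
        answers = [self.replicas[i].get(key, (0, None))
                   for i in self._rng.sample(range(self.n), self.r)]
        # The client resolves the conflict: highest version wins.
        return max(answers, key=lambda answer: answer[0])[1]


store = QuorumStore(n=3, r=2, w=2)          # read and write quorum
for i in range(200):
    store.write("k", i)
    assert store.read("k") == i             # R + W > N: the read overlaps the last write

read_all = QuorumStore(n=3, r=3, w=2)       # write quorum, read all
read_all.write("k", "v1")
print(read_all.read("k"))
```

Lowering both R and W to 1 keeps the store fast and available but allows stale reads, which is the tunable trade-off that the N, R, W parameters expose.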
Coordination Protocol
Consensus protocol
Paxos, Raft, etc.
Based on the state machine approach, a technique for converting an algorithm into a fault-tolerant, distributed implementation.
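The state machine approach can be illustrated with a small sketch. It assumes the hard part is already done, namely that a consensus protocol such as Paxos or Raft has agreed on the order of the log; `KeyValueMachine` and `catch_up` are hypothetical names for this example. Every replica that applies the same commands in the same order reaches the same state.

```python
from dataclasses import dataclass, field


@dataclass
class KeyValueMachine:
    """Deterministic state machine: same commands, same order, same state."""
    state: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def apply(self, command):
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        self.applied += 1


def catch_up(machine, log):
    """Apply every agreed log entry the replica has not applied yet."""
    for command in log[machine.applied:]:
        machine.apply(command)


# Assumption: this log order was already agreed on by the consensus protocol.
log = [("set", "a", 1), ("set", "b", 2), ("delete", "a", None)]
replicas = [KeyValueMachine() for _ in range(3)]
catch_up(replicas[0], log[:1])      # one replica has only seen the first entry
for replica in replicas:
    catch_up(replica, log)          # all replicas replay whatever they are missing
print([replica.state for replica in replicas])
```

A lagging replica recovers by replaying the entries it missed, which is why the agreed log, rather than the state itself, is what the protocol replicates.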
Epidemic (Gossip)
Epidemic: anybody can infect anyone else with equal probability.
Anti-entropy protocols assume that synchronization is performed on a fixed schedule: every node regularly chooses another node, at random or by some rule, and exchanges database contents, resolving differences.
Spread time: O(log n) rounds
http://www.cis.cornell.edu/IAI/events/Gossip_Tutorial.pdf
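The logarithmic spread can be checked with a small simulation. This is a sketch of plain push gossip under simplifying assumptions (synchronous rounds, no failures, uniform random peer choice); the function name `gossip_rounds` is made up for the example.

```python
import random


def gossip_rounds(node_count, seed=0):
    """Push gossip: in each round every informed node tells one random node."""
    rng = random.Random(seed)
    informed = {0}          # node 0 starts with the update
    rounds = 0
    while len(informed) < node_count:
        targets = {rng.randrange(node_count) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds


for n in (16, 256, 4096):
    print(n, "nodes:", gossip_rounds(n), "rounds")
```

The informed set can at most double per round, so at least log2(n) rounds are needed; the simulation shows the actual count staying within a small multiple of that as n grows.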
Transmission Protocol
Optimization
-  Reordering
-  Deduplication

Transmission
-  By difference (Merkle tree)
-  Callback
-  Compression
-  Auto-correction
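Transfer by difference with a Merkle tree can be sketched as follows. The sketch assumes both replicas hold the same number of fixed-size blocks and that SHA-256 is the hash; `build_tree` and `diff` are illustrative names. Two replicas compare root hashes first and descend only into subtrees whose hashes differ, so unchanged data is never compared block by block.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return the tree as a list of levels, leaves first, root level last."""
    level = [_h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                      # odd level: pair the last hash with itself
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff(a, b):
    """Indices of leaf blocks that differ, visiting only unequal subtrees."""
    if len(a[0]) != len(b[0]):
        raise ValueError("sketch assumes both replicas have the same block count")
    if not a[0]:
        return []
    pending = [(len(a) - 1, 0)]                 # (level, index), starting at the root
    changed = []
    while pending:
        depth, i = pending.pop()
        if a[depth][i] == b[depth][i]:
            continue                            # identical subtree: nothing to transfer
        if depth == 0:
            changed.append(i)
            continue
        for child in (2 * i, 2 * i + 1):
            if child < len(a[depth - 1]):
                pending.append((depth - 1, child))
    return sorted(changed)


local = [f"block-{i}".encode() for i in range(8)]
remote = list(local)
remote[5] = b"changed"
print(diff(build_tree(local), build_tree(remote)))  # prints [5]
```

Only the blocks reported by `diff` need to be sent, which is what makes the technique attractive for anti-entropy between large replicas.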
Locking
-  Distributed locking
-  Multiversioning
-  …

mitosis

Implementation
Answer …no Answer
Block replication, file -> Distributed file system
Information -> RDBMS
Document, blog, session -> NoSQL
Content with a TTL over 1m -> Caching system

Distributed Filesystem
A DFS is a service that provides a single point of reference and a logical tree structure for file system resources that may be physically located anywhere on the network.

One significant responsibility of a file system is to ensure that, regardless of the actions of programs accessing the data, the structure remains consistent…

Filesystem
Properties of DFS
•  Simple from the application's point of view
•  Data consistency

Depending on the solution
•  Partition tolerance
•  Scalability
•  High availability

Filesystem DRBD
DRBD
Replication mode: asynchronous, memory synchronous, synchronous
Transfer optimization: DRBD Proxy

Main Goals
Disk replication, single service availability
Disaster recovery

Filesystem Ceph
Ceph
Data distribution: hash-based (CRUSH)
Consensus protocol: Paxos (used by the monitors)
Write mode: write one, read one; the client is notified when all replicas have been written
Weak consistency with cache pool

OpenStack backend at CERN
1128 OSDs
3 PB
XXX VMs
http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern

Main Goals
-  Block device / base for other filesystems
-  Cloud support, image storage and VM storage

Ceph
Users: > 5000
VMs: > 7000
> 250k VMs spawned
http://www.synnefo.org/resources.html

RDBMS
Properties of RDBMS
•  Quite simple from the application's point of view
•  Data consistency

Depending on the solution
•  Low partition tolerance
•  Low scalability
•  Limited high availability

RDBMS
Asynchronous replication
Semi-synchronous replication

Postgres
Synchronous
Asynchronous

NoSQL
Properties of NoSQL
•  Fast

Depending on the solution
•  Partition tolerance
•  Scalability
•  High availability
•  Simple

NoSQL Performance
http://planetcassandra.org/nosql-performance-benchmarks/

Riak
Geo replication
Tunable trade-offs for distribution and replication (N, R, W)
Distributed hash table

Filesystem over NoSQL
FUSE
In most cases not stable

S3 interface
De facto Internet standard

Filesystem over NoSQL
Wooga
http://www.slideshare.net/wooga/riak-at-woogariak-meetup-sept-2013?qid=4809eca2-8378-4e70-8e75-0db29b635fa5&v=qf1&b=&from_search=3
https://fosdem.org/2014/schedule/event/nyt_cassandra/

Combine different solutions
[Diagram. Components: Edge node (Varnish), NoSQL, local cache, centralized cache, info storage, DFS, Origin (distributed cache), local DB, NoSQL. Arrows: decrease in the number of requests; increase in the age of the data.]

Framework
Build your system if you need …
… do you really need
CERN

Framework
Don't forget rsync!

Framework
Replication or caching?

Build a solution
•  Split into pieces
•  Track versions
•  Transfer when needed
•  Transfer the difference
•  Use notifications when possible
•  Move data close to the computation
•  Move the master close to the write operations
•  Split counters to avoid deadlock
•  In HTTP, don't forget the ETag and Last-Modified headers

openkad
open-chord
openReplica
Raft

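The ETag and Last-Modified advice in the list above amounts to "transfer when needed": the client sends back the validators it holds, and the server answers 304 Not Modified instead of resending the body. The sketch below is a simplified, network-free model of that exchange; `validators` and `respond` are hypothetical helper names, only strong ETags are compared, and the ETag is assumed to be a truncated SHA-256 of the body.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def validators(body: bytes, mtime: float) -> dict:
    """Build the two validators a server would send with a response."""
    return {
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:16] + '"',
        "Last-Modified": formatdate(mtime, usegmt=True),
    }


def respond(request_headers: dict, body: bytes, mtime: float):
    """Return (status, headers, payload) for a GET with optional conditions."""
    headers = validators(body, mtime)
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in tags or headers["ETag"] in tags:
            return 304, headers, b""
        return 200, headers, body
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            unchanged = int(mtime) <= parsedate_to_datetime(since).timestamp()
        except (TypeError, ValueError):
            unchanged = False   # ignore a date that cannot be parsed
        if unchanged:
            return 304, headers, b""
    return 200, headers, body


status, headers, payload = respond({}, b"hello", 1_400_000_000.0)
again = respond({"If-None-Match": headers["ETag"]}, b"hello", 1_400_000_000.0)
print(status, again[0])  # prints 200 304
```

The second request carries no payload at all, so a cache or replica can revalidate its copy for the cost of a header exchange.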
Build a solution
Five pylons

Objects
•  Separation between data and metadata
•  Each element is marked with a revision
•  Each element is marked with a hash

Cache
•  Client side
•  Callback/Notify
•  Persistent

Transmission
•  Parallel operation
•  HTTP-like protocol
•  Compression
•  Transfer by difference

Distribution
•  Resource discovery by DNS
•  Data spread over a multi-node cluster
•  Decentralized
•  Independent clusters
•  Data replication

Security
•  Secure connection
•  Client-side encryption
•  Extended ACL
•  Delegation/Federation
•  Admin delegation

Build a solution
-  Consistent hash
-  ZMQ transport protocol
-  Gossip protocol for failure detection
-  Tunable trade-offs

Pisa provides simple block data replication across a wide range of nodes.
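Gossip-based failure detection, as listed above, is commonly built on heartbeat counters. The sketch below shows that general idea and is not Pisa's actual code: `HeartbeatTable`, the 5-second timeout, and the explicit `now` argument are assumptions made to keep the example deterministic. Each node increments its own counter, merges the tables it receives, and suspects any peer whose counter has not increased for longer than the timeout.

```python
class HeartbeatTable:
    def __init__(self, name, timeout=5.0):
        self.name = name
        self.timeout = timeout
        # node -> (heartbeat counter, local time at which the counter last increased)
        self.entries = {name: (0, 0.0)}

    def beat(self, now):
        """Increment this node's own heartbeat."""
        counter, _ = self.entries[self.name]
        self.entries[self.name] = (counter + 1, now)

    def merge(self, remote_entries, now):
        """Merge a table received via gossip, keeping the highest counter per node."""
        for node, (counter, _) in remote_entries.items():
            if node not in self.entries or counter > self.entries[node][0]:
                self.entries[node] = (counter, now)

    def suspects(self, now):
        """Peers whose heartbeat has not advanced within the timeout."""
        return sorted(node for node, (_, seen) in self.entries.items()
                      if node != self.name and now - seen > self.timeout)


a, b, c = HeartbeatTable("a"), HeartbeatTable("b"), HeartbeatTable("c")
for node in (a, b, c):
    node.beat(1.0)
a.merge(b.entries, 1.0)
a.merge(c.entries, 1.0)
b.merge(a.entries, 1.0)

# Node c stops; a and b keep beating and gossiping with each other.
for t in range(2, 11):
    a.beat(float(t))
    b.beat(float(t))
    a.merge(b.entries, float(t))
    b.merge(a.entries, float(t))

print(a.suspects(10.0), b.suspects(10.0))  # prints ['c'] ['c']
```

Node b never talked to c directly, yet it reaches the same verdict as a because the heartbeat table itself is what gets gossiped.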
And …
"There is always a failure waiting around the corner"
Werner Vogels

Thank you

http://restfs.beolink.org
manfred.furuholmen@gmail.com

OSDC 2014: Fabrizio Manfredi - Data replication