.NET Fest 2018. Антон Молдован. Scaling CAP theorem for modern distributed systems
In this talk I try to lay out the current state of affairs around the CAP theorem with respect to data consistency, and around databases in general. You will see why SQL Server is not always the best choice, and why even query-tuning techniques may not help you. I will try to answer which databases are worth choosing when designing systems built for intensive read/write, elastic scalability, and fault tolerance. We will also walk through patterns for modeling data consistency in distributed systems and see how databases such as Cassandra, Aerospike, and MongoDB cope with it. By the end of the talk you will have a better understanding of the possible trade-offs when designing distributed systems and microservices.
Agenda:
- Problems with using CAP for real distributed systems: what to use instead when designing a system or picking a new technology
- Consistency revolution: the end of the SQL architectural era ("one size fits all", ACID, scaling an RDBMS)
- Popular patterns for dealing with data distribution at massive scale: auto-partitioning, handling temporary failures, and conflict resolution (handling concurrent updates to the same entity on different nodes without locks)

  1. CONSISTENCY: every read receives the most recent write. [Diagram: WRITE X=1 to node A, READ X=1 from node B; both replicas hold X=1.]
  2. CONSISTENCY: every read receives the most recent write. [Diagram: WRITE X=2 arrives at node A while both replicas still hold X=1.]
  3. CONSISTENCY: every read receives the most recent write. [Diagram: node A now holds X=2 but node B still holds X=1; a READ from B is an ERROR.]
  4. AVAILABILITY: reads and writes succeed on all nodes. [Diagram: WRITE X=2 at node A, READ X=1 at node B; the replicas have diverged to X=2 and X=1.]
  5. Scaling Reads. [Diagram: master M replicates synchronously to replicas R1 and R2; writes go to M, reads go to R1 and R2.]
  6. Scaling Reads. [Diagram: master M replicates asynchronously to replicas R1 and R2; writes go to M, reads go to R1 and R2.]
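     With async replication (slide 6) a replica can lag behind the master, so a read served from a replica may return a stale value. A minimal F# sketch of that routing, assuming hypothetical Node/write/read names that are not from the slides:

     // Writes go to the master; reads are spread across the replicas.
     // Under async replication a replica may lag, so reads can be stale.
     type Node = { Name: string; mutable Data: Map<string, int> }

     let master = { Name = "M"; Data = Map.empty }
     let replicas = [ { Name = "R1"; Data = Map.empty }
                      { Name = "R2"; Data = Map.empty } ]

     let write key value =
         master.Data <- master.Data.Add(key, value)   // replication to R1/R2 happens later

     let read key =
         let replica = replicas.[System.Random().Next(replicas.Length)]
         replica.Data.TryFind key                     // may miss a not-yet-replicated write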
  7. PAC-ELC (PA/PC: availability vs consistency under partition; EL/EC: latency vs consistency otherwise):
     DDBS             PA    PC    EL    EC
     SQL Server             Yes         Yes
     CosmosDB         Yes         Yes
     PNUTS                  Yes   Yes
     Cassandra        Yes         Yes
     Riak             Yes         Yes
     VoltDB/H-Store         Yes         Yes
     MongoDB          Yes               Yes
     ElasticSearch    Yes         Yes
  8. ONE SIZE FITS ALL
  9. A C I D: atomicity, consistency, isolation, durability
  10. ACID: "Experience at Amazon has shown that data stores that provide ACID guarantees tend to have poor availability. This has been widely acknowledged by both the industry and academia." (Dynamo paper)
  11. scalability, fault tolerance
  12. Dynamo Databases
  13. Requirements. Query Model: simple read and write operations by a key. ACID: Dynamo targets the design space of an "always writable" data store. Incremental Scale: Dynamo should be able to scale out one node at a time with minimal impact. Efficiency: services should be configurable for durability, availability, and consistency guarantees.
  14. Data Partitioning
  15. [Hash ring with positions 0/250/500/750 and nodes N1, N2, N3; keys A and C are placed by:] let dataPlacement = data |> getMD5hash |> getFirst8Bytes |> toInteger
  16. [Same ring: key A hashes to a position and is owned by the next node clockwise]
  17. [Same ring: key C hashes to a different position and node]
  18. [Ring with N1 removed: N2 and N3 remain, and the keys are re-placed]
  19. [Ring with N2 removed: N1 and N3 remain, and the keys are re-placed]
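     The placement pipeline on slides 15-19 can be expanded into runnable F#. The helper names (getMD5hash, getFirst8Bytes, toInteger) follow the slides; the 0-999 ring, placeOnRing, and the token list are illustrative assumptions:

     open System
     open System.Security.Cryptography
     open System.Text

     let getMD5hash (key: string) =
         use md5 = MD5.Create()
         md5.ComputeHash(Encoding.UTF8.GetBytes key)

     let getFirst8Bytes (bytes: byte[]) = bytes.[0..7]

     let toInteger (bytes: byte[]) = abs (BitConverter.ToInt64(bytes, 0))

     // A key belongs to the first node token at or clockwise after its position.
     let placeOnRing (tokens: (string * int64) list) key =
         let position = key |> getMD5hash |> getFirst8Bytes |> toInteger |> fun h -> h % 1000L
         tokens
         |> List.sortBy snd
         |> List.tryFind (fun (_, token) -> token >= position)
         |> Option.defaultValue (tokens |> List.minBy snd)   // wrap around past 0

     // placeOnRing [ ("N1", 250L); ("N2", 500L); ("N3", 750L) ] "A"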
  20. [Ring densely populated with node tokens]
  21. [Same ring: physical node N1 now appears at two positions]
  22. [Same ring with several tokens removed]
  23. NON-UNIFORM DATA & LOAD DISTRIBUTION
  24. VIRTUAL NODES
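     Virtual nodes fix the non-uniform distribution from slide 23: each physical node claims many tokens on the ring. A sketch reusing the helpers above; virtualTokens and tokensPerNode are illustrative names:

     // Each physical node owns many ring positions, so load spreads evenly
     // and a joining or leaving node only shifts small slices of the ring.
     let virtualTokens (physicalNodes: string list) tokensPerNode =
         [ for node in physicalNodes do
             for i in 1 .. tokensPerNode do
                 let token =
                     sprintf "%s-%d" node i
                     |> getMD5hash |> getFirst8Bytes |> toInteger
                     |> fun h -> h % 1000L
                 yield (node, token) ]

     // let ring = virtualTokens [ "N1"; "N2"; "N3" ] 128
     // placeOnRing ring "A"   // same lookup as before, now over virtual tokens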
  25. WEAK Consistency
  26. SLOPPY QUORUM. R + W > N, where R is the minimum number of nodes for a read, W the minimum number of nodes for a write, and N the number of replicas.
  27. [Diagram: nodes A, B, C, D; PUT with W=2 and GET with R=2 on a cluster of S=4 nodes with N=3 replicas; 2 + 2 > 3, so W + R > N holds.]
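     The overlap argument from slide 27, stated as code (isStrongQuorum is an illustrative name):

     // With N = 3 replicas, W = 2 and R = 2 give 2 + 2 > 3: every read
     // quorum shares at least one replica with the last write quorum,
     // so some replica in the read set holds the newest value.
     let isStrongQuorum r w n = r + w > n

     isStrongQuorum 2 2 3   // true:  reads see the latest write
     isStrongQuorum 1 1 3   // false: a read may miss the latest write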
  28. ALWAYS WRITABLE
  29. [Diagram: nodes A, B, C, D; a PUT with W=2, S=4, N=3]
  30. [Diagram: the same PUT while one of the home replicas is down]
  31. Hinted Handoff. [Diagram: the PUT (W=2, S=4, N=3) is redirected to node D]
  32. Hinted Handoff. [Diagram: node D stores the write temporarily on behalf of the failed replica]
  33. Hinted Handoff. [Diagram: the temporary copy sits on D while B and C serve as home replicas]
  34. Hinted Handoff. [Diagram: the failed replica recovers and D hands the write back]
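     A sketch of hinted handoff in F#; all names (Write, Hint, writeWithHandoff) are illustrative rather than from the slides:

     // If a home replica is down, the write still succeeds: another node
     // accepts it together with a hint naming the intended owner, keeps it
     // temporarily, and replays it when the owner comes back.
     type Write = { Key: string; Value: string }
     type Hint = { IntendedOwner: string; Write: Write }

     let writeWithHandoff isAlive homeReplicas fallbackNode write =
         let up, down = homeReplicas |> List.partition isAlive
         let hints = down |> List.map (fun owner -> { IntendedOwner = owner; Write = write })
         up, (fallbackNode, hints)   // live replicas store the write; the fallback holds the hints

     // writeWithHandoff (fun n -> n <> "C") [ "A"; "B"; "C" ] "D" { Key = "cart"; Value = "pizza" }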
  35. CONFLICTS: LWW, VECTOR CLOCKS, CRDT
  36. CAUSALITY by timestamps: 00.11.12 < 00.11.18
  37. [Timeline: node A in the US data center, node B in the EU data center. t1: ADD cart 'pizza'. t2: the US data center fails, sync fails. t3: ADD cart 'vodka' (Last Write Wins). t4: the US data center recovers, sync resumes. t5: GET cart returns 'vodka' on both sides, so the 'pizza' add is lost.]
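     Slide 37 is the classic last-write-wins lost update: when replicas reconcile purely by timestamp, the older write silently disappears. A minimal sketch (Lww and merge are illustrative names):

     // Merge keeps the value with the higher timestamp, which is exactly
     // how the t1 'pizza' write gets dropped after the partition heals.
     type Lww<'a> = { Timestamp: System.DateTime; Value: 'a }

     let merge a b = if a.Timestamp >= b.Timestamp then a else b

     let us = { Timestamp = System.DateTime(2018, 9, 1, 0, 11, 12); Value = "pizza" }
     let eu = { Timestamp = System.DateTime(2018, 9, 1, 0, 11, 18); Value = "vodka" }
     merge us eu   // Value = "vodka"; the 'pizza' add is lost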
  38. VECTOR CLOCKS (1988)
  39. VECTOR CLOCKS: a (nodeName, counter) map. [Diagram: client 1's write handled by N1 yields D1 [(N1,1)]; another write handled by N1 yields D2 [(N1,2)]. A write handled by N2 yields D3 [(N1,2),(N2,1)], while client 2's write handled by N3 yields D4 [(N1,2),(N3,1)]. D3 and D4 are reconciled and written by N1 as D5 [(N1,3),(N2,1),(N3,1)].]
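     The structure on slide 39 in F# (VClock, increment, descends are illustrative names):

     // A vector clock is a (nodeName -> counter) map. 'increment' records a
     // write handled by a node; 'descends a b' is true when a has seen
     // everything b has. If neither clock descends the other, the writes
     // were concurrent and must be reconciled.
     type VClock = Map<string, int>

     let increment node (vc: VClock) : VClock =
         vc |> Map.add node (1 + defaultArg (vc.TryFind node) 0)

     let descends (a: VClock) (b: VClock) =
         b |> Map.forall (fun node counter -> defaultArg (a.TryFind node) 0 >= counter)

     let d2 = Map.empty |> increment "N1" |> increment "N1"   // D2 [(N1,2)]
     let d3 = d2 |> increment "N2"                            // D3 [(N1,2);(N2,1)]
     let d4 = d2 |> increment "N3"                            // D4 [(N1,2);(N3,1)]
     descends d3 d4 || descends d4 d3   // false: concurrent, needs reconciliation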
  40. CRDT (2009)
  41. CRDT: Conflict-free Replicated Data Type
  42. CRDT. Commutativity: x * y = y * x. Associativity: (x * y) * z = x * (y * z). Idempotency: x * x = x
  43. System.Int32, merge candidate: addition (+)
  44. (+). Commutativity: 1 + 2 = 2 + 1. Associativity: (1 + 2) + 3 = 1 + (2 + 3). Idempotency: 1 + 1 ≠ 1
  45. Set<T>, merge: union (∪)
  46. (∪). Commutativity: 1 ∪ 2 = 2 ∪ 1. Associativity: (1 ∪ 2) ∪ 3 = 1 ∪ (2 ∪ 3). Idempotency: 1 ∪ 1 = 1
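     Slides 42-46 state the merge laws a CRDT needs: set union satisfies all three, while integer addition fails idempotency. Checking them directly with F# sets:

     // Union is commutative, associative, and idempotent, so replica states
     // can be merged in any order, any number of times, and still converge.
     let a, b, c = set [ 1 ], set [ 2 ], set [ 3 ]

     Set.union a b = Set.union b a                               // commutativity: true
     Set.union (Set.union a b) c = Set.union a (Set.union b c)   // associativity: true
     Set.union a a = a                                           // idempotency: true (1 + 1 <> 1 breaks it for (+))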
  47. 42
  48. [Three replicas, all holding { }]
  49. [55 is added at one replica: { }, { 55 }, { }]
  50. [The add propagates: { 55 }, { 55 }, { }]
  51. [All replicas hold { 55 }]
  52. [73 is added at another replica: { 55 }, { 55 }, { 55, 73 }]
  53. [States: { 55 }, { 55 }, { 55, 73 }]
  54. [States: { 55, 73 }, { 55 }, { 55, 73 }]
  55. [States: { 55, 73 }, { 55 }, { 55, 73 }]
  56. [States: { 55, 73 }, { 55 }, { 55, 73 }]
  57. READ REPAIR: { 55, 73 } ∪ { 55 } ∪ { 55, 73 } = { 55, 73 }; the lagging replica is repaired with the difference { 55, 73 } ∆ { 55 } ∆ { 55, 73 } = { 73 }
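     A sketch of that read-repair pass over grow-only sets (readRepair is an illustrative name):

     // A read collects every replica's state, merges with union, returns the
     // merged value, and writes it back so lagging replicas catch up.
     let readRepair (replicaStates: Set<int> list) =
         let merged = replicaStates |> List.reduce Set.union
         merged, replicaStates |> List.map (fun _ -> merged)

     readRepair [ set [ 55; 73 ]; set [ 55 ]; set [ 55; 73 ] ]
     // returns { 55, 73 } and all three replicas repaired to { 55, 73 }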
  58. [All replicas converged: { 55, 73 }, { 55, 73 }, { 55, 73 }]
  59. on CRDT
  60. on LWW-Set
  61. on (timestamp, data)
  62. let toLwwSet (tweet) = (tweet.Timestamp, tweet)
      let tweet = { UserId = "12bf-58ac-9pi6"; Timestamp = "2018-09-01 12:30:00"; Message = "Hi @User1" }
  63. [LWW-Set: node 1 and node 2 each keep an 'add' set and a 'remove' set of (time, tweet) pairs]
  64. [(2018,a) is added on node 1 and replicates: both add sets hold (2018,a)]
  65. [(2018,b) is added on node 1; node 2 has not seen it yet]
  66. [(2018,b) replicates: both add sets hold (2018,a) and (2018,b)]
  67. [(2018,c) is added on node 1]
  68. [Same state: (2018,c) has not replicated yet]
  69. [(2018,c) is removed on node 2: it lands in a remove set]
  70. [Add sets hold (2018,a), (2018,b), (2018,c); a remove set holds (2018,c)]
  71. READ REPAIR. [After merging: the add set holds (2018,a), (2018,b), (2018,c) and the remove set holds (2018,c), so a lookup keeps a and b and drops c]
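     The walkthrough on slides 63-71 is a two-set LWW-Set. A compact sketch (LwwSet, merge, contains are illustrative names):

     // An LWW-element-set is two grow-only sets of (timestamp, value) pairs.
     // Merge takes the union of both; an element is present when its newest
     // add is newer than its newest remove.
     type LwwSet<'a when 'a: comparison> =
         { Adds: Set<System.DateTime * 'a>
           Removes: Set<System.DateTime * 'a> }

     let merge s1 s2 =
         { Adds = Set.union s1.Adds s2.Adds
           Removes = Set.union s1.Removes s2.Removes }

     let contains value s =
         let newest pairs =
             pairs
             |> Set.filter (fun (_, v) -> v = value)
             |> Set.fold (fun latest (t, _) -> max latest t) System.DateTime.MinValue
         newest s.Adds > newest s.Removes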
  72. THANKS! @antyadev, antyadev@gmail.com
