Distributed Counters in Cassandra




Friday, August 13, 2010
I: Goal
II: Design
III: Implementation




I: Goal




Goal




       Low Latency,
       Highly Available
       Counters




II: Design




I: Traditional Counter Design
II: Abstract Strategy
III: Distributed Counter Design




Design



                 I: Traditional Counter Design




Traditional Counter Design
       Atomic Counters


       1. single machine
       2. one order of execution
       3. strongly consistent



Traditional Counter Design
       Problems


       1. single point of failure (SPOF) / single master
       2. high latency
       3. manually sharded



Traditional Counter Design
       Question




                          What constraints can we relax?




Design



               II: Abstract Strategy




Abstract Strategy
       Constraints to Relax



       1. one order of execution
       2. strong consistency




Abstract Strategy
       Relax: One Order of Execution



       commutative operation:
         - operations must be re-orderable



Abstract Strategy
       Relax: Strong Consistency

       partitioned work:
         - each op must occur once
         - unique partition identifier
       idempotent repair:
         - recognize ops from other partitions

Design



            III: Distributed Counter Design




Distributed Counter Design
       Requirements


       1. commutative operation
       2. partitioned work
       3. idempotent repair



Distributed Counter Design
       Commutative Operation


       addition:
         - commutative operation
         - sum ops performed by all replicas
         - a + b = b + a

Distributed Counter Design
       Partitioned Work



       each op assigned to a replica:
         - every replica sums all of its ops
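
A minimal sketch of partitioned work, assuming three illustrative replica ids ("A", "B", "C") and a round-robin assignment that is not taken from the slides. Each op is applied exactly once, on one replica, and the counter value is the sum of the per-replica sums. Because addition is commutative, re-ordering the ops leaves the total unchanged.

```python
from collections import defaultdict

# Hypothetical increments/decrements issued by clients.
ops = [5, -2, 7, 1, 3]
replicas = ["A", "B", "C"]  # illustrative replica ids


def partition(ops, replicas):
    """Assign each op to exactly one replica (round-robin here) and
    let every replica keep a running sum of only its own ops."""
    counts = defaultdict(int)
    for i, delta in enumerate(ops):
        counts[replicas[i % len(replicas)]] += delta
    return dict(counts)


per_replica = partition(ops, replicas)
total = sum(per_replica.values())

# Re-ordering changes which replica sees which op, but not the total.
reordered_total = sum(partition(list(reversed(ops)), replicas).values())
```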



Distributed Counter Design
       Idempotent Repair


       save counts from remote replicas:
         - keep highest count seen
       prevent multiple execution:
         - do not transfer the target replica’s count
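
The two rules above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: counts are plain dicts keyed by hypothetical replica ids, and each replica's count is assumed to only grow, so that the highest count seen is also the most recent one.

```python
def merge_remote(known, incoming):
    """Keep the highest count seen for each remote replica."""
    merged = dict(known)
    for replica_id, count in incoming.items():
        if replica_id not in merged:
            merged[replica_id] = count
        else:
            merged[replica_id] = max(merged[replica_id], count)
    return merged


def strip_target(counts, target_id):
    """Drop the target replica's own count before sending a repair to it."""
    return {rid: c for rid, c in counts.items() if rid != target_id}


seen_by_b = {"A": 4, "C": 2}  # remote counts that replica B has stored
repair = strip_target({"A": 6, "B": 9, "C": 1}, target_id="B")

once = merge_remote(seen_by_b, repair)
twice = merge_remote(once, repair)  # applying the same repair again
```

Taking the maximum makes the repair idempotent: applying it a second time changes nothing. Stripping the target's own entry keeps the repair from feeding a replica's count back into itself.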


III: Implementation




I: Data Structure
II: Single Node
III: Eventual Consistency




I: Data Structure




Data Structure
       Requirements


       local counts:
         - incrementally update
       remote counts:
         - independently track partitions

Data Structure
       Context Format



       list of (replica id, count) tuples:
                 [(replica A, count), (replica B, count), ...]
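
As a rough illustration, the context can be modeled as a plain list of tuples. The type alias, the example counts, and the `total` helper are assumptions made for this sketch; the slides do not describe the serialized layout.

```python
from typing import List, Tuple

Context = List[Tuple[str, int]]  # (replica id, count)

context: Context = [("replica A", 6), ("replica B", 1), ("replica C", 7)]


def total(context: Context) -> int:
    """The counter's value is the sum of every replica's count."""
    return sum(count for _replica_id, count in context)


value = total(context)
```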




Data Structure
       Context Mutations


       local write:
         sum local count and write delta
         note: memtable
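
A local write might look like the sketch below, assuming the context is a list of (replica id, count) tuples. The function name and the choice to return a new list instead of mutating in place are illustrative, not taken from the slides.

```python
def apply_local_write(context, local_id, delta):
    """Return a new context with `delta` added to the local replica's count."""
    updated = []
    found = False
    for replica_id, count in context:
        if replica_id == local_id:
            updated.append((replica_id, count + delta))
            found = True
        else:
            updated.append((replica_id, count))
    if not found:
        # First write seen by this replica: start its count at the delta.
        updated.append((local_id, delta))
    return updated


before = [("A", 6), ("B", 1)]
after = apply_local_write(before, "A", 3)
first = apply_local_write([], "A", 3)
```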



Data Structure
       Context Mutations


       remote repair:
         for each replica,
         keep highest count seen
         (local or from repair)


II: Single Node




Single Node
       Write Path

       client
          1. construct column
             - value: delta (big-endian long)
             - clock: empty
          2. thrift: insert / batch_mutate
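
To make the value encoding concrete, here is a small sketch of packing a delta as an 8-byte big-endian signed long. The `column` dict and its field names are only a stand-in for illustration; they are not the Thrift structures.

```python
import struct


def encode_delta(delta: int) -> bytes:
    """Encode a delta as an 8-byte big-endian signed long."""
    return struct.pack(">q", delta)


def decode_delta(value: bytes) -> int:
    """Inverse of encode_delta."""
    return struct.unpack(">q", value)[0]


# Hypothetical column: the client supplies the delta and leaves the clock empty.
column = {"name": b"page_views", "value": encode_delta(1), "clock": b""}
```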

Single Node
       Write Path

       coordinator
          1. choose partition
             - choose target replica
             - requirement: ConsistencyLevel.ONE
          2. construct clock
             - context format: [(target replica id, count delta)]


Single Node
       Write Path


       target replica
       insert:
                 1. memtable does not contain column
                 2. insert column into memtable



Single Node
       Write Path
       target replica
       update:
                 1. memtable contains column
                 2. retrieve existing column
                 3. create new column
                    - context: sum local count with delta from write
                 4. replace column in ConcurrentSkipListMap
                 5. if failed to replace column, go to step 2.
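
The insert and update steps form a compare-and-replace retry loop. The sketch below imitates it with a small lock-based `AtomicMap` class, a hypothetical stand-in for the `putIfAbsent` and `replace(key, old, new)` operations of Java's ConcurrentSkipListMap. It stores a bare local count per column rather than a full context.

```python
import threading


class AtomicMap:
    """Tiny stand-in for a concurrent map with put-if-absent and
    compare-and-replace semantics."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def put_if_absent(self, key, value):
        """Insert and return None if the key was absent, else return the existing value."""
        with self._lock:
            existing = self._data.get(key)
            if existing is None:
                self._data[key] = value
            return existing

    def replace(self, key, expected, new):
        """Swap in `new` only if the current value still equals `expected`."""
        with self._lock:
            if self._data.get(key) == expected:
                self._data[key] = new
                return True
            return False


def add_delta(memtable, name, delta):
    # Insert path: the memtable does not contain the column yet.
    if memtable.put_if_absent(name, delta) is None:
        return
    # Update path: retry until the replace succeeds.
    while True:
        existing = memtable.get(name)              # retrieve existing column
        new = existing + delta                     # new column with summed count
        if memtable.replace(name, existing, new):  # replace column
            return
        # Another writer got in first; go back and re-read.


memtable = AtomicMap()


def worker():
    for _ in range(1000):
        add_delta(memtable, "hits", 1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No increment is lost under contention, because a writer whose replace fails re-reads the column and tries again.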


Single Node
       Write Path
       Interesting Note:
       Memtables (MTs) are serialized to SSTables (SSTs) as-is
                 - each SST encapsulates the updates made
                   while it was an MT
                 - local count total must be aggregated
                   across the MT and all SSTs

Single Node
       Read Path
       target replica
       read:
                 1. construct collating iterator over:
                    - frozen snapshot of MT
                    - all relevant SSTs
                 2. resolve column
                    - local counts: sum
                    - remote counts: keep max
                 3. construct value
                    - sum local and remote counts (big-endian long)
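
The read steps above can be sketched as follows, assuming contexts are lists of (replica id, count) tuples with at most one entry per replica. `LOCAL_ID` and the sample data are hypothetical. Local counts are partial sums spread across the memtable and SSTables, so they are added. Remote counts are totals, so only the highest is kept.

```python
import struct

LOCAL_ID = "A"  # hypothetical id of the replica serving the read


def resolve(contexts, local_id):
    """Merge the contexts found in the memtable snapshot and each SSTable."""
    resolved = {}
    for context in contexts:
        for replica_id, count in context:
            if replica_id not in resolved:
                resolved[replica_id] = count
            elif replica_id == local_id:
                resolved[replica_id] += count  # local counts: sum
            else:
                resolved[replica_id] = max(resolved[replica_id], count)  # remote: max
    return resolved


memtable_snapshot = [("A", 2), ("B", 5)]
sstables = [[("A", 3), ("B", 4)], [("A", 1), ("C", 7)]]

resolved = resolve([memtable_snapshot] + sstables, LOCAL_ID)

# The value returned to the client: sum of all counts as a big-endian long.
value = struct.pack(">q", sum(resolved.values()))
```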

Single Node
       Compaction

       replica
       compaction:
                 1. construct collating iterator over all SSTs
                 2. resolve every column in the column family (CF)
                    - local counts: sum
                    - remote counts: keep max
                 3. write out resolved CF
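
One way to picture the collating iterator is a k-way merge over name-sorted SSTables, sketched here with `heapq.merge` and `itertools.groupby`. The in-memory SSTable layout (a sorted list of (column name, context) pairs) and the `LOCAL_ID` value are assumptions for illustration only.

```python
import heapq
from itertools import groupby
from operator import itemgetter

LOCAL_ID = "A"  # hypothetical id of the replica running compaction


def resolve(contexts, local_id):
    """Local counts are summed; for remote counts the highest is kept."""
    resolved = {}
    for context in contexts:
        for replica_id, count in context:
            if replica_id not in resolved:
                resolved[replica_id] = count
            elif replica_id == local_id:
                resolved[replica_id] += count
            else:
                resolved[replica_id] = max(resolved[replica_id], count)
    return resolved


def compact(sstables, local_id):
    """Merge name-sorted SSTables into one name-sorted list of resolved columns."""
    collated = heapq.merge(*sstables, key=itemgetter(0))
    return [
        (name, resolve([context for _name, context in group], local_id))
        for name, group in groupby(collated, key=itemgetter(0))
    ]


sst1 = [("clicks", [("A", 3), ("B", 4)]), ("views", [("A", 10)])]
sst2 = [("clicks", [("A", 1), ("B", 6)]), ("likes", [("B", 2)])]

compacted = compact([sst1, sst2], LOCAL_ID)
```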



III: Eventual Consistency




Eventual Consistency
       Read Repair


       coordinator / replica
       read repair:
                 1. calculate resolved (superset) CF
                    - resolve every column (local: sum, remote: max)
                 2. return resolved CF to client




Eventual Consistency
       Read Repair

       coordinator / replica
       read repair:
                 1. calculate repair CF for each replica
                    - calculate diff CF between resolved and received
                    - modify columns to remove target replica’s counts
                 2. send repair CF to each replica
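
A simplified reading of the repair calculation, for a single counter column. It assumes every replica's response already carries its own fully summed local count, so the highest count reported for a replica id is treated as the most complete one. The function names and the dict layout are hypothetical.

```python
def resolve_responses(responses):
    """responses maps each responding replica id to the context it returned
    ({replica id: count}). Keep the highest count reported for every id."""
    resolved = {}
    for context in responses.values():
        for rid, count in context.items():
            if rid not in resolved:
                resolved[rid] = count
            else:
                resolved[rid] = max(resolved[rid], count)
    return resolved


def repair_for(target_id, received, resolved):
    """Diff between resolved and received, minus the target replica's own count."""
    return {
        rid: count
        for rid, count in resolved.items()
        if rid != target_id and received.get(rid) != count
    }


responses = {
    "A": {"A": 6, "B": 1},
    "B": {"A": 4, "B": 5, "C": 7},
}

resolved = resolve_responses(responses)
repairs = {rid: repair_for(rid, ctx, resolved) for rid, ctx in responses.items()}
```

Replica A receives the newer counts for B and C, while replica B receives only the newer count for A. Neither is sent its own count.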



Eventual Consistency
       Anti-Entropy Service


       sending replica
       AES:
                 1. follow normal AES code path
                    - calculate repair SST based on shared ranges
                    - send repair SST



Eventual Consistency
       Anti-Entropy Service

       receiving replica
       AES:
                 1. post-process streamed SST
                    - re-build streamed SST
                    - note: strip out local replica’s counts
                 2. remove temporary descriptor
                 3. add to SSTableTracker



Questions?




More Information
       Issues:
       #580: Vector Clocks
       #1072: Distributed Counters

       Related Work:
       Helland and Campbell, Building on Quicksand, CIDR (2009),
       Sections 5 & 6.


       My email address:
       kakugawa@gmail.com



Distributed Counters in Cassandra (Cassandra Summit 2010)
