6. Services offered
[Figure: service lineup, US vs. JP]
Categories: Media, Membership, C2C Payment, C2C EC, B2C EC, Local
US: Search, Video, Answer, Mail
JP: Search, Knowledge search, Mail, News, YAHUOKU!, Premium, Wallet, Loco
19. Cassandra process goes down when too many clients connect to it
[Diagram: client machines, each running Apache child processes]
20. Cassandra process goes down when too many clients connect to it
• 200 client machines
21. Cassandra process goes down when too many clients connect to it
• 200 client machines * 128 Apache child processes
22. Cassandra process goes down when too many clients connect to it
• 200 client machines * 128 Apache child processes * 2 (request + heartbeat) =
23. Cassandra process goes down when too many clients connect to it
• 200 client machines * 128 Apache child processes * 2 (request + heartbeat) = 51,200 connections / node
24. Cassandra process goes down when too many clients connect to it
[Diagram: client machines, each running Apache child processes]
51,200 > 32,768 (max number of open files)
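The arithmetic on these slides can be checked with a few lines of Python. This is only a sketch of the calculation: the function name and the constant name are made up for illustration, and the numbers are the ones quoted on the slides.

```python
# Figures taken from the slides; the names below are illustrative only.
MAX_OPEN_FILES = 32_768  # "max open file num" quoted on the slide


def connections_per_node(client_machines: int,
                         children_per_machine: int,
                         connections_per_child: int) -> int:
    """Connections one Cassandra node sees if every child connects to it."""
    return client_machines * children_per_machine * connections_per_child


# 200 machines * 128 Apache children * 2 (request + heartbeat)
total = connections_per_node(200, 128, 2)
exceeds_limit = total > MAX_OPEN_FILES
print(f"{total:,} connections, limit {MAX_OPEN_FILES:,}, exceeds: {exceeds_limit}")
```

Each connection consumes one file descriptor on the node, so the product has to stay below the open-file limit with room left for SSTables and other files.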
Photo: Aflo
42. Remark
• This is a summary of the following tickets:
– https://issues.apache.org/jira/browse/CASSANDRA-11206
– https://issues.apache.org/jira/browse/CASSANDRA-9738
44. High level: read path
[Diagram: Row Cache, Key Cache, SSTables, Mem Table]
1. Check the row cache before going to the key cache
2. Check the key cache to get the offsets to the data
3. Find the offsets to the data and retrieve the data
4. Merge data from the SSTables and the memtable
5. Populate the row cache with the newly returned row
http://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlAboutReads.html
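The five steps above can be sketched as a toy model in Python. This is not Cassandra's implementation: the classes `ToySSTable` and `ToyNode` are hypothetical, a row is reduced to a single `(timestamp, value)` pair, and merging is simplified to "newest timestamp wins". Bloom filters, tombstones and per-cell merging are left out.

```python
from typing import Dict, List, Optional, Tuple

Versioned = Tuple[int, str]  # (write timestamp, value); a stand-in for a row


class ToySSTable:
    """Hypothetical SSTable: a data list plus an index from key to offset."""

    def __init__(self, rows: Dict[str, Versioned]) -> None:
        self.data: List[Versioned] = []
        self.index: Dict[str, int] = {}
        for key, row in rows.items():
            self.index[key] = len(self.data)  # offset of the row in data
            self.data.append(row)


class ToyNode:
    """Hypothetical node holding the caches named on the slide."""

    def __init__(self, sstables: List[ToySSTable],
                 memtable: Dict[str, Versioned]) -> None:
        self.sstables = sstables
        self.memtable = memtable
        self.row_cache: Dict[str, Versioned] = {}
        self.key_cache: Dict[Tuple[int, str], int] = {}

    def read(self, key: str) -> Optional[Versioned]:
        # 1. Check the row cache first.
        if key in self.row_cache:
            return self.row_cache[key]
        candidates: List[Versioned] = []
        for sstable_id, sstable in enumerate(self.sstables):
            # 2. Check the key cache for the offset.
            offset = self.key_cache.get((sstable_id, key))
            if offset is None:
                # 3. Otherwise find the offset through the index.
                offset = sstable.index.get(key)
                if offset is None:
                    continue  # this SSTable does not contain the key
                self.key_cache[(sstable_id, key)] = offset
            candidates.append(sstable.data[offset])
        # 4. Merge SSTable data with the memtable (newest timestamp wins).
        if key in self.memtable:
            candidates.append(self.memtable[key])
        if not candidates:
            return None
        newest = max(candidates, key=lambda row: row[0])
        # 5. Populate the row cache with the returned row.
        self.row_cache[key] = newest
        return newest


node = ToyNode(
    sstables=[ToySSTable({"a": (1, "old")}),
              ToySSTable({"a": (3, "new"), "b": (2, "x")})],
    memtable={"a": (2, "mid")},
)
print(node.read("a"))
```

A second `node.read("a")` returns straight from `row_cache`, which corresponds to the cheapest of the patterns discussed on the following slides.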
45. Pattern 1: The row is in the row cache
[Diagram: in memory (heap / off heap): Mem Table, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets; on disk: Partition Index, Data]
1. Read request
2. Return the row if it is in the row cache
46. Pattern 2: The key is in the key cache
[Diagram: in memory (heap / off heap): Mem Table, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets; on disk: Partition Index, Data]
1. Read request
2. Check the bloom filters
3. Check whether the partition key is in the key cache
4. Find the offset to the result set
5. Access the result set
47. Pattern 3: The key is not cached
[Diagram: in memory (heap / off heap): Mem Table, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets; on disk: Partition Index, Data]
1. Read request
2. Row cache miss -> check the bloom filters
3. Check whether the partition key is in the key cache
4. Key cache miss -> binary search for the approximate location in the index
5. Scan the disk to find the offsets
6. Find the offset into the result set
7. Access the result set
8. Update the key cache
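Steps 3 to 8 of the key-cache-miss path can be illustrated with a small Python sketch. It assumes a sorted in-memory list standing in for the on-disk partition index and a sampled summary built from every n-th entry; `build_summary` and `lookup` are hypothetical names, and ordering by plain key (rather than by token) is a simplification.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

IndexEntry = Tuple[str, int]  # (partition key, offset into the data file)


def build_summary(index: List[IndexEntry], interval: int) -> List[Tuple[str, int]]:
    """Sample every `interval`-th index entry: (key, position in the index)."""
    return [(index[i][0], i) for i in range(0, len(index), interval)]


def lookup(index: List[IndexEntry], summary: List[Tuple[str, int]],
           key_cache: Dict[str, int], key: str) -> Optional[int]:
    # Step 3: check the key cache.
    if key in key_cache:
        return key_cache[key]
    # Step 4: binary search the summary for the closest sampled key <= key.
    sampled_keys = [sampled for sampled, _ in summary]
    slot = bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first index entry
    start = summary[slot][1]
    # Step 5: scan the index forward from that position.
    for candidate, offset in index[start:]:
        if candidate == key:
            key_cache[key] = offset  # step 8: update the key cache
            return offset            # steps 6-7: the caller reads data here
        if candidate > key:
            break  # passed the place where the key would be
    return None


index = [("a", 0), ("c", 100), ("e", 250), ("g", 400), ("i", 520)]
summary = build_summary(index, interval=2)
cache: Dict[str, int] = {}
print(lookup(index, summary, cache, "g"), cache)
```

The summary keeps the binary search in memory and bounds the disk scan to at most `interval` index entries.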
49. Partition Index Recap
• http://distributeddatastore.blogspot.jp/2013/08/cassandra-sstable-storage-format.html
50. RowIndexEntry
• Partition size < 64 KB
  – RowIndexEntry
    • Position
    • Serialized size of data
• Partition size > 64 KB
  – IndexedEntry
    • Position
    • Serialized size of data
    • IndexInfo[]
      – Serialize method
      – Offset
      – Width
      – Etc.
Approximation, assuming 16-byte values (partition size : index size / number of objects):
1 MB : 3 KB / > 200 objects
4 MB : 11 KB / > 800 objects
64 MB : 180 KB / > 13k objects
512 MB : 1.4 MB / > 106k objects
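A rough way to read these figures: assuming one IndexInfo entry is written per 64 KB block of a partition (the threshold quoted on this slide), the number of entries grows linearly with partition size. The sketch below computes that count and divides the slide's approximate object counts by it. The function name is hypothetical, and the object counts are the lower bounds from the slide, so the ratios are only indicative.

```python
BLOCK_BYTES = 64 * 1024   # 64 KB threshold from the slide
MB = 1024 * 1024


def index_blocks(partition_bytes: int, block_bytes: int = BLOCK_BYTES) -> int:
    """Estimated IndexInfo entries, assuming one entry per block."""
    if partition_bytes < block_bytes:
        return 0  # small partition: plain RowIndexEntry, no IndexInfo[]
    return -(-partition_bytes // block_bytes)  # ceiling division


# Approximate object counts quoted on the slide (lower bounds).
slide_objects = {1: 200, 4: 800, 64: 13_000, 512: 106_000}

for size_mb, objects in slide_objects.items():
    blocks = index_blocks(size_mb * MB)
    print(f"{size_mb:>3} MB: {blocks:>5} blocks, "
          f"about {objects / blocks:.1f} objects per block")
```

Under this assumption the slide's numbers work out to roughly a dozen heap objects per block at every partition size, which is why large partitions put so much pressure on the garbage collector.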
51. Pattern 3: The key is not cached
[Diagram: in memory (heap / off heap): Mem Table, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets; on disk: Partition Index, Data]
1. Read request
2. Row cache miss -> check the bloom filters
3. Check whether the partition key is in the key cache
4. Key cache miss -> binary search for the approximate location in the index
5. Scan the disk to find the offsets
6. Find the offsets into the result set
7. Access the result set
8. Update the key cache
9. GC, GC, GC…
52. Current solution
• If the serialized size of a partition's index entries < column_index_cache_size_in_kb (configurable)
– IndexedEntry is kept on heap
• Otherwise
– The index entries are always read from disk when needed
• https://issues.apache.org/jira/browse/CASSANDRA-11206
• https://www.youtube.com/watch?v=qa84vABqftM
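The rule on this slide amounts to a single size comparison. The sketch below is only an illustration: `keep_index_on_heap` is a hypothetical helper, not a Cassandra API, and it assumes the comparison is made on the serialized size of the index entries, as the `column_index_cache_size_in_kb` comment in cassandra.yaml describes. The threshold values passed in are arbitrary examples.

```python
def keep_index_on_heap(serialized_index_bytes: int,
                       column_index_cache_size_in_kb: int) -> bool:
    """True if the index entries would stay on heap under the slide's rule."""
    return serialized_index_bytes < column_index_cache_size_in_kb * 1024


# Index sizes from the RowIndexEntry slide: about 3 KB for a 1 MB partition
# and about 180 KB for a 64 MB partition. The thresholds are example values.
for index_kb in (3, 180):
    for threshold_kb in (2, 64):
        on_heap = keep_index_on_heap(index_kb * 1024, threshold_kb)
        where = "on heap" if on_heap else "read from disk"
        print(f"index {index_kb} KB, threshold {threshold_kb} KB -> {where}")
```

Raising the threshold trades heap pressure for fewer disk reads on medium-sized partitions. The very large partitions always fall on the "read from disk" side.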
53. Other possible solutions
• Never keep IndexInfo on heap
– Read it from disk when needed
– Degrades performance when small partitions are read
54. Other possible solutions
• Migrate key cache to be fully off heap
– https://issues.apache.org/jira/browse/CASSANDRA-9738
– Serialization and deserialization cost too much when a large partition is read
• Will Birch help us solve this problem?
– https://issues.apache.org/jira/browse/CASSANDRA-9754
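55. What we are aiming for
• Let's work hard so that we are invited to NGCC again next year!
How? By contributing to the Cassandra community:
1. Become the company that uses Apache Cassandra the most in Japan.
2. Keep improving Cassandra's code and raising issues.
3. Get to know the people in the Cassandra community.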
60. Appendix: NGCC videos
• Next-Gen Schema
– https://www.youtube.com/watch?v=eAWRj0kqpvU
• Change Data Capture
– https://www.youtube.com/watch?v=Y0fOxa3tC98
• Explicit support for time series data
– https://www.youtube.com/watch?v=CmsQNNdDuSA
• Automated Repair
– https://www.youtube.com/watch?v=8sGUn6Q2bUU
• Storage format and key cache changes to support large partitions
– https://www.youtube.com/watch?v=qa84vABqftM
• SASI update
– https://www.youtube.com/watch?v=yUFoSAg6rA4
• Instagram’s use cases
– https://www.youtube.com/watch?v=VwhovoqavT4
• Lightning Talks
– https://www.youtube.com/watch?v=6y5UV4OTawg