SlideShare a Scribd company logo
A Technical Introduction to WiredTiger
Michael Cahill
Director of Engineering (Storage), MongoDB
You may have seen this:
or this…
How does WiredTiger do it?
6
What’s different about WiredTiger?
• Document-level concurrency
• In-memory performance
• Multi-core scalability
• Checksums
• Compression
• Durability with and without journaling
7
This presentation is not…
• How to write stand-alone WiredTiger apps
• How to configure MongoDB with WiredTiger for your workload
WiredTiger Background
9
Why create a new storage engine?
• Minimize contention between threads
– lock-free algorithms, hazard pointers
– eliminate blocking due to concurrency control
• Hotter cache and more work per I/O
– compact file formats
– compression
– big-block I/O
10
MongoDB’s Storage Engine API
• Allows different storage engines to "plug-in"
– Different workloads have different performance characteristics
– mmap is not ideal for all workloads
– More flexibility
• mix storage engines on same replica set/sharded cluster
• Opportunity to integrate further (HDFS, native encrypted,
hardware optimized …)
• Great way for us to demonstrate WiredTiger’s performance
11
Storage Engine Layer
Content
Repo
IoT Sensor
Backend
Ad Service
Customer
Analytics
Archive
MongoDB Query Language (MQL) + Native Drivers
MongoDB Document Data Model
MMAP V1 WT In-Memory ? ?
Supported in MongoDB 3.0 Future Possible Storage Engines
Management
Security
Example Future State
Experimental
WiredTiger Architecture
13
WiredTiger Architecture
WiredTiger Engine
Schema &
Cursors
Python API C API Java API
Database
Files
Transactions
Page
read/ write
Logging
Column
storage
Block
management
Row
storage
Snapshots
Log Files
Cache
In-memory
consistency
Per-file
checkpoints
14
Trees in cache
non-resident
child
ordinary pointer
root page
internal
page
internal
page
root page
leaf page
leaf page leaf page leaf page
1 memory
flush
read required
disk image
15
Pages in cache
cache
data files
page images
on-disk
page
image
index
clean
page on-disk
page
image
indexdirty
page
updates
constructed
during read
skiplist
reconciled
during write
Concurrency
and
multi-core scaling
17
Multiversion Concurrency Control (MVCC)
• Multiple versions of records kept in cache
• Readers see the version that was committed before the operation
started
– MongoDB “yields” turn large operations into small transactions
• Writers can create new versions concurrent with readers
• Concurrent updates to a single record cause write conflicts
– MongoDB retries with back-off
Transforming data during I/O
19
Checksums
• A checksum is stored with every page
• WiredTiger stores the checksum with the page address (typically
in a parent page)
– Extra safety against reading an old, valid page image
• Checksums are validated during page read
– Detects filesystem corruption, random bitflips
20
Compression
• WiredTiger uses snappy compression by default in MongoDB
• supported compression algorithms
– snappy [default]: good compression, low overhead
– zlib: better compression, more CPU
– none
• Indexes are compressed using prefix compression
– allows compression in memory
Durability
with and without
Journal
22
Writing a checkpoint
1. Write the leaves
2. Write the internal pages, including the root
– the old checkpoint is still valid
3. Sync the file
4. Write the new root’s address to the metadata
– free pages from old checkpoints once the metadata is durable
23
Durability without Journaling
• MMAPv1 uses a write-ahead log (journal) to guarantee
consistency
– Running with “nojournal” is unsafe
• WiredTiger doesn't have this need: no in-place updates
– Write-ahead log can be truncated at checkpoints
• Every 2GB or 60sec by default – configurable
– Updates are written to optional journal as they commit
• Not flushed on every commit by default
• Recovery rolls forward from last checkpoint
• Replication can guarantee durability
24
Journal and recovery
• Optional write-ahead logging
• Only written at transaction commit
• snappy compression by default
• Group commit
• Automatic log archive / removal
• On startup, we rely on finding a consistent checkpoint in the
metadata
• Check LSNs in the metadata to figure out where to roll forward
from
25
So, what’s different about WiredTiger?
• Multiversion Concurrent Control
– No lock manager
• Non-locking data structures
– Multi-core scalability
• Different representation of in-memory data vs on-disk
– Enabled checksums, compression
• Copy-on-write storage
– Durability without a journal
What’s next?
27
What’s next for WiredTiger?
• WiredTiger btree access method is optional in 3.0
• plan to make it the default in 3.2
• tuning for (many) more workloads
• make Log Structured Merge (LSM) work well with MongoDB
– out-of-cache, write-heavy workloads
• adding encryption
• more advanced transactional semantics in the storage engine API
Thanks!
Questions?
Michael Cahill
michael.cahill@mongodb.com

More Related Content

What's hot

Common MongoDB Use Cases
Common MongoDB Use Cases Common MongoDB Use Cases
Common MongoDB Use Cases
MongoDB
 
Delta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the HoodDelta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the Hood
Databricks
 
Apache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDBApache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDB
MongoDB
 

What's hot (20)

RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTiger
 
FIFA 온라인 3의 MongoDB 사용기
FIFA 온라인 3의 MongoDB 사용기FIFA 온라인 3의 MongoDB 사용기
FIFA 온라인 3의 MongoDB 사용기
 
Airflow를 이용한 데이터 Workflow 관리
Airflow를 이용한  데이터 Workflow 관리Airflow를 이용한  데이터 Workflow 관리
Airflow를 이용한 데이터 Workflow 관리
 
MongoDB World 2019: MongoDB Read Isolation: Making Your Reads Clean, Committe...
MongoDB World 2019: MongoDB Read Isolation: Making Your Reads Clean, Committe...MongoDB World 2019: MongoDB Read Isolation: Making Your Reads Clean, Committe...
MongoDB World 2019: MongoDB Read Isolation: Making Your Reads Clean, Committe...
 
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
 
MongoDB at Baidu
MongoDB at BaiduMongoDB at Baidu
MongoDB at Baidu
 
Common MongoDB Use Cases
Common MongoDB Use Cases Common MongoDB Use Cases
Common MongoDB Use Cases
 
elasticsearch_적용 및 활용_정리
elasticsearch_적용 및 활용_정리elasticsearch_적용 및 활용_정리
elasticsearch_적용 및 활용_정리
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
Massive service basic
Massive service basicMassive service basic
Massive service basic
 
[Gaming on AWS] AWS와 함께 한 쿠키런 서버 Re-architecting 사례 - 데브시스터즈
[Gaming on AWS] AWS와 함께 한 쿠키런 서버 Re-architecting 사례 - 데브시스터즈[Gaming on AWS] AWS와 함께 한 쿠키런 서버 Re-architecting 사례 - 데브시스터즈
[Gaming on AWS] AWS와 함께 한 쿠키런 서버 Re-architecting 사례 - 데브시스터즈
 
대용량 트래픽을 처리하는 최적의 서버리스 애플리케이션 - 안효빈, 구성완 AWS 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
대용량 트래픽을 처리하는 최적의 서버리스 애플리케이션  - 안효빈, 구성완 AWS 솔루션즈 아키텍트 :: AWS Summit Seoul 2021대용량 트래픽을 처리하는 최적의 서버리스 애플리케이션  - 안효빈, 구성완 AWS 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
대용량 트래픽을 처리하는 최적의 서버리스 애플리케이션 - 안효빈, 구성완 AWS 솔루션즈 아키텍트 :: AWS Summit Seoul 2021
 
Oak, the architecture of Apache Jackrabbit 3
Oak, the architecture of Apache Jackrabbit 3Oak, the architecture of Apache Jackrabbit 3
Oak, the architecture of Apache Jackrabbit 3
 
Delta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the HoodDelta Lake Streaming: Under the Hood
Delta Lake Streaming: Under the Hood
 
Project Reactor By Example
Project Reactor By ExampleProject Reactor By Example
Project Reactor By Example
 
Beyond EXPLAIN: Query Optimization From Theory To Code
Beyond EXPLAIN: Query Optimization From Theory To CodeBeyond EXPLAIN: Query Optimization From Theory To Code
Beyond EXPLAIN: Query Optimization From Theory To Code
 
Toi uu hoa he thong 30 trieu nguoi dung
Toi uu hoa he thong 30 trieu nguoi dungToi uu hoa he thong 30 trieu nguoi dung
Toi uu hoa he thong 30 trieu nguoi dung
 
AWS 6월 웨비나 | AWS에서 MS SQL 서버 운영하기 (김민성 솔루션즈아키텍트)
AWS 6월 웨비나 | AWS에서 MS SQL 서버 운영하기 (김민성 솔루션즈아키텍트)AWS 6월 웨비나 | AWS에서 MS SQL 서버 운영하기 (김민성 솔루션즈아키텍트)
AWS 6월 웨비나 | AWS에서 MS SQL 서버 운영하기 (김민성 솔루션즈아키텍트)
 
Apache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDBApache Jackrabbit Oak on MongoDB
Apache Jackrabbit Oak on MongoDB
 

Similar to MongoDB World 2015 - A Technical Introduction to WiredTiger

Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
MongoDB
 

Similar to MongoDB World 2015 - A Technical Introduction to WiredTiger (20)

https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
 
WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0
 
Beyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage Engines	Beyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage Engines
 
Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0Let the Tiger Roar - MongoDB 3.0
Let the Tiger Roar - MongoDB 3.0
 
Let the Tiger Roar!
Let the Tiger Roar!Let the Tiger Roar!
Let the Tiger Roar!
 
Running MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWSRunning MongoDB 3.0 on AWS
Running MongoDB 3.0 on AWS
 
What'sNnew in 3.0 Webinar
What'sNnew in 3.0 WebinarWhat'sNnew in 3.0 Webinar
What'sNnew in 3.0 Webinar
 
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
MongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage Engine
MongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage EngineMongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage Engine
MongoDB Evenings Boston - An Update on MongoDB's WiredTiger Storage Engine
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
Beyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage EnginesBeyond the Basics 1: Storage Engines
Beyond the Basics 1: Storage Engines
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Conceptos Avanzados 1: Motores de Almacenamiento
Conceptos Avanzados 1: Motores de AlmacenamientoConceptos Avanzados 1: Motores de Almacenamiento
Conceptos Avanzados 1: Motores de Almacenamiento
 
MongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseMongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL Database
 
MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...
MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...
MongoDB 101 & Beyond: Get Started in MongoDB 3.0, Preview 3.2 & Demo of Ops M...
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
 

Recently uploaded

Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
Machine Learning For Career Growth..pptx
Machine Learning For Career Growth..pptxMachine Learning For Career Growth..pptx
Machine Learning For Career Growth..pptx
benishzehra469
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
DilipVasan
 

Recently uploaded (20)

Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDB
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
 
how can i exchange pi coins for others currency like Bitcoin
how can i exchange pi coins for others currency like Bitcoinhow can i exchange pi coins for others currency like Bitcoin
how can i exchange pi coins for others currency like Bitcoin
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
Machine Learning For Career Growth..pptx
Machine Learning For Career Growth..pptxMachine Learning For Career Growth..pptx
Machine Learning For Career Growth..pptx
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
AI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfAI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdf
 

MongoDB World 2015 - A Technical Introduction to WiredTiger

  • 1.
  • 2. A Technical Introduction to WiredTiger Michael Cahill Director of Engineering (Storage), MongoDB
  • 3. You may have seen this:
  • 6. 6 What’s different about WiredTiger? • Document-level concurrency • In-memory performance • Multi-core scalability • Checksums • Compression • Durability with and without journaling
  • 7. 7 This presentation is not… • How to write stand-alone WiredTiger apps • How to configure MongoDB with WiredTiger for your workload
  • 9. 9 Why create a new storage engine? • Minimize contention between threads – lock-free algorithms, hazard pointers – eliminate blocking due to concurrency control • Hotter cache and more work per I/O – compact file formats – compression – big-block I/O
  • 10. 10 MongoDB’s Storage Engine API • Allows different storage engines to "plug-in" – Different workloads have different performance characteristics – mmap is not ideal for all workloads – More flexibility • mix storage engines on same replica set/sharded cluster • Opportunity to integrate further (HDFS, native encrypted, hardware optimized …) • Great way for us to demonstrate WiredTiger’s performance
  • 11. 11 Storage Engine Layer Content Repo IoT Sensor Backend Ad Service Customer Analytics Archive MongoDB Query Language (MQL) + Native Drivers MongoDB Document Data Model MMAP V1 WT In-Memory ? ? Supported in MongoDB 3.0 Future Possible Storage Engines Management Security Example Future State Experimental
  • 13. 13 WiredTiger Architecture WiredTiger Engine Schema & Cursors Python API C API Java API Database Files Transactions Page read/ write Logging Column storage Block management Row storage Snapshots Log Files Cache In-memory consistency Per-file checkpoints
  • 14. 14 Trees in cache non-resident child ordinary pointer root page internal page internal page root page leaf page leaf page leaf page leaf page 1 memory flush read required disk image
  • 15. 15 Pages in cache cache data files page images on-disk page image index clean page on-disk page image indexdirty page updates constructed during read skiplist reconciled during write
  • 17. 17 Multiversion Concurrency Control (MVCC) • Multiple versions of records kept in cache • Readers see the version that was committed before the operation started – MongoDB “yields” turn large operations into small transactions • Writers can create new versions concurrent with readers • Concurrent updates to a single record cause write conflicts – MongoDB retries with back-off
  • 19. 19 Checksums • A checksum is stored with every page • WiredTiger stores the checksum with the page address (typically in a parent page) – Extra safety against reading an old, valid page image • Checksums are validated during page read – Detects filesystem corruption, random bitflips
  • 20. 20 Compression • WiredTiger uses snappy compression by default in MongoDB • supported compression algorithms – snappy [default]: good compression, low overhead – zlib: better compression, more CPU – none • Indexes are compressed using prefix compression – allows compression in memory
  • 22. 22 Writing a checkpoint 1. Write the leaves 2. Write the internal pages, including the root – the old checkpoint is still valid 3. Sync the file 4. Write the new root’s address to the metadata – free pages from old checkpoints once the metadata is durable
  • 23. 23 Durability without Journaling • MMAPv1 uses a write-ahead log (journal) to guarantee consistency – Running with “nojournal” is unsafe • WiredTiger doesn't have this need: no in-place updates – Write-ahead log can be truncated at checkpoints • Every 2GB or 60sec by default – configurable – Updates are written to optional journal as they commit • Not flushed on every commit by default • Recovery rolls forward from last checkpoint • Replication can guarantee durability
  • 24. 24 Journal and recovery • Optional write-ahead logging • Only written at transaction commit • snappy compression by default • Group commit • Automatic log archive / removal • On startup, we rely on finding a consistent checkpoint in the metadata • Check LSNs in the metadata to figure out where to roll forward from
  • 25. 25 So, what’s different about WiredTiger? • Multiversion Concurrent Control – No lock manager • Non-locking data structures – Multi-core scalability • Different representation of in-memory data vs on-disk – Enabled checksums, compression • Copy-on-write storage – Durability without a journal
  • 27. 27 What’s next for WiredTiger? • WiredTiger btree access method is optional in 3.0 • plan to make it the default in 3.2 • tuning for (many) more workloads • make Log Structured Merge (LSM) work well with MongoDB – out-of-cache, write-heavy workloads • adding encryption • more advanced transactional semantics in the storage engine API

Editor's Notes

  1. Different in-memory representation vs on-disk – opportunities to optimize plus some interesting fetures Writing a page: Combine the previous page image with in-memory changes Allocate new space in the file If the result is too big, split Always write multiples of the allocation size Can configure direct I/O to keep reads and writes out of the filesystem cache