VM and I/O Topics in Linux
 Page Replacement, Swap and I/O


 Jiannan Ouyang
 Ph.D. Student
 Computer Science Department
 University of Pittsburgh
 05/05/2011
Outline

        • Overview of Linux Memory Management

        • Page Reclamation

        • Swap & I/O




Jiannan Ouyang, CS PhD@PITT                     2
Describing Physical Memory
                              Node: a NUMA memory region (bank)
                              Zone: a range of memory within a node, grouped by memory type
                              struct page: describes one physical page frame




Physical Page Allocation




          Binary Buddy Allocator:
          • If no block of the desired size is available, a larger block is split in half; the two
             halves are buddies of each other. One half is used for the allocation and the other
             stays free. Blocks are halved repeatedly until a block of the desired size is available.
          • When a block is later freed, its buddy is examined, and the two are coalesced if the
             buddy is also free.
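
A minimal sketch of the index arithmetic behind splitting and coalescing, assuming blocks are identified by the index of their first page frame and that a block of order k spans 2^k pages. The function names are hypothetical and are not the kernel's own.

```c
#include <assert.h>
#include <stddef.h>

/* Buddy of the block that starts at page_idx and spans 2^order pages.
 * Flipping bit `order` selects the other half of the parent block. */
static inline size_t buddy_index(size_t page_idx, unsigned order)
{
    /* a block of order k must start on a multiple of 2^k pages */
    assert((page_idx & (((size_t)1 << order) - 1)) == 0);
    return page_idx ^ ((size_t)1 << order);
}

/* Start of the merged block (order + 1) once a block and its buddy coalesce. */
static inline size_t merged_index(size_t page_idx, unsigned order)
{
    return page_idx & ~((size_t)1 << order);
}

/* Number of halvings needed to cut a free block of order `have`
 * down to the requested order `want`. */
static inline unsigned splits_needed(unsigned have, unsigned want)
{
    assert(have >= want);
    return have - want;
}
```

For example, the order-2 block starting at page 8 has its buddy at page 12, and the two merge into an order-3 block starting at page 8.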

Page Table Management

        • Three-level mapping (PGD, PMD, PTE)
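
A sketch of how a linear address could be split into three table indices plus a page offset. The bit widths are an assumption (32-bit x86 with PAE: 2 + 9 + 9 + 12 bits); other architectures use different widths. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 32-bit x86 with PAE (2 + 9 + 9 + 12 bits). */
#define PAGE_SHIFT   12
#define PMD_SHIFT    21
#define PGDIR_SHIFT  30
#define PTRS_PER_PTE 512u
#define PTRS_PER_PMD 512u

struct va_parts {
    unsigned pgd;     /* index into the page global directory */
    unsigned pmd;     /* index into the page middle directory */
    unsigned pte;     /* index into the page table */
    unsigned offset;  /* byte offset inside the page */
};

static inline struct va_parts split_linear_address(uint32_t va)
{
    struct va_parts p;
    p.pgd    = va >> PGDIR_SHIFT;
    p.pmd    = (va >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
    p.pte    = (va >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
    p.offset = va & ((1u << PAGE_SHIFT) - 1);
    assert(p.pgd < 4); /* only 2 bits remain for the top level */
    return p;
}
```

With this layout, address 0xC0201234 splits into PGD index 3, PMD index 1, PTE index 1 and offset 0x234.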




Kernel Memory Mapping
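          [Figure: physical memory (0x00000000 to 0x3FFFFFFF, 1 GB) next to the 4 GB virtual
           address space (starting at 0x00000000). The first 896 MB of physical memory is mapped
           at virtual address 0xC0000000; display memory and device memory are shown at the
           top of the virtual address space.]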
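
The fixed offset in the figure can be sketched as a pair of conversions. This assumes the classic 32-bit split shown on the slide (kernel space starting at 0xC0000000, 896 MB directly mapped); the helper names are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET  0xC0000000u   /* start of kernel virtual space (from the slide) */
#define LOWMEM_BYTES (896u << 20)  /* 896 MB of directly mapped physical memory */

/* Physical -> kernel virtual address, valid only for the directly mapped range. */
static inline uint32_t lowmem_phys_to_virt(uint32_t pa)
{
    assert(pa < LOWMEM_BYTES);
    return pa + PAGE_OFFSET;
}

/* Kernel virtual -> physical address, valid only for the directly mapped range. */
static inline uint32_t lowmem_virt_to_phys(uint32_t va)
{
    assert(va >= PAGE_OFFSET && va - PAGE_OFFSET < LOWMEM_BYTES);
    return va - PAGE_OFFSET;
}
```

Physical address 0x00100000 (1 MB) would appear at kernel virtual address 0xC0100000 under these assumptions.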
User Memory Mapping
          [Figure: a process's virtual memory, with kernel space at the top and the 3 GB user
           space below it (stack, mappings, data, text); the user stack, text and data map onto
           pages of physical memory.]
User Memory Mapping
          [Figure: two processes, each with its own virtual memory (kernel space on top, user
           space holding stack, data and text). The stack, data and text of both processes map
           onto pages of the same physical memory.]
Outline

        • Overview of Linux Memory Management

        • Page Reclamation

        • Swap & I/O




Memory Customers
       [Figure: the buddy system serves page requests from, and reclaims pages from, its
        customers: kernel code & data, slab cache, icache & dcache, user code & data, and
        the page cache.]

       • All memory except “User Code & Data” is used by the kernel.
       • “User Code & Data” is managed in user space (e.g. malloc/free);
         the kernel can only swap out user pages.
Slab Cache




            • A cache of commonly used objects, kept in an initialized state and
              available for use by the kernel.
            • Saves the time spent allocating, initializing and freeing the same kind of object.
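
The idea can be illustrated with a user-space toy: freed objects go back on a free list still initialized, so the next allocation skips both the allocator and the constructor. This is only a sketch of the concept; `struct obj`, `struct obj_cache` and both functions are hypothetical and much simpler than the kernel's slab allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct obj {
    int initialized;        /* set once by the "constructor" */
    struct obj *next_free;  /* link used while the object sits in the cache */
};

struct obj_cache {
    struct obj *free_list;  /* initialized objects ready for reuse */
    unsigned constructed;   /* how many times the constructor actually ran */
};

static inline struct obj *cache_alloc(struct obj_cache *c)
{
    struct obj *o = c->free_list;
    if (o) {                      /* fast path: reuse an initialized object */
        c->free_list = o->next_free;
        return o;
    }
    o = malloc(sizeof *o);        /* slow path: allocate and construct */
    if (!o)
        return NULL;
    o->initialized = 1;
    o->next_free = NULL;
    c->constructed++;
    return o;
}

static inline void cache_free(struct obj_cache *c, struct obj *o)
{
    assert(o->initialized);       /* objects return in their initialized state */
    o->next_free = c->free_list;
    c->free_list = o;
}
```

Allocating, freeing and allocating again returns the same object while the constructor count stays at one.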
Disk-related caches

        • Dcache (metadata): dentry objects
          representing filesystem pathnames.
        • Icache (metadata): inode objects
          representing disk inodes.
        • Page cache (data): pages of data read from disk;
          the main disk cache.


Memory Customers Review
       [Figure (repeated): the buddy system serves requests from, and reclaims pages from,
        kernel code & data, slab cache, icache & dcache, user code & data, and the page cache.]

       Next: when the kernel starts reclaiming pages, which pages it
       reclaims, and the replacement policy.

Reclamation: When?

        Zone Watermarks
        • Pages Low: when the number of free pages drops to this level,
          kswapd is woken up by the buddy allocator to start freeing
          pages. By default the value is twice that of Pages Min.
        • Pages Min: when free pages drop to this level, the allocator
          does kswapd's work synchronously; this is sometimes called
          the direct-reclaim path.
        • Pages High: once free pages reach this level, kswapd goes
          back to sleep. By default the value is three times that of
          Pages Min.
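
The three watermarks can be read as a small decision rule. The sketch below uses the default ratios from the slide (low = 2 × min, high = 3 × min); the type and function names are hypothetical, and whether each comparison is strict or inclusive is an assumption.

```c
#include <assert.h>

enum reclaim_action { RECLAIM_NONE, WAKE_KSWAPD, DIRECT_RECLAIM };

struct watermarks { unsigned long min, low, high; };

/* Default ratios from the slide: low is twice min, high is three times min. */
static inline struct watermarks default_watermarks(unsigned long pages_min)
{
    struct watermarks w;
    assert(pages_min > 0);
    w.min  = pages_min;
    w.low  = 2 * pages_min;
    w.high = 3 * pages_min;
    return w;
}

/* What the allocator does for a given number of free pages in the zone. */
static inline enum reclaim_action on_allocation(unsigned long free_pages,
                                                struct watermarks w)
{
    if (free_pages <= w.min)
        return DIRECT_RECLAIM;   /* allocator reclaims synchronously */
    if (free_pages <= w.low)
        return WAKE_KSWAPD;      /* background reclaim starts */
    return RECLAIM_NONE;
}

/* kswapd keeps freeing pages until the high watermark is reached. */
static inline int kswapd_may_sleep(unsigned long free_pages, struct watermarks w)
{
    return free_pages >= w.high;
}
```

With Pages Min set to 100, kswapd would be woken at 200 free pages and go back to sleep at 300.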


Reclamation: Which?




Reclamation: Which? (Cont.)

        • Mapped & Anonymous Pages
                – Mapped: backed by a file
                – Anonymous: belongs to an anonymous memory region of a
                  process (no backing file)

        • Shared & Non-shared Pages
                – A shared page must be unmapped from all page table
                  entries at once: done through reverse mapping, an
                  important improvement in the Linux 2.6 kernel
Reclamation: Which? (Cont.)

       shrink_caches reclaims from the following, in order, until the target number of pages is met:
       1. slab cache (kmem_cache_reap)
       2. user pages & page cache (refill & shrink_cache)
       3. dcache and icache


Replacement Policy
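                              (active, ref) = {11, 10, 01, 00}

          [Figure: state diagram with the active list (active=1) above the inactive list
           (active=0). Edges labelled "access" lead toward the active list; on the active list,
           "Ref=1, clear" keeps the page and "Ref=0" sends it to the inactive list; pages
           leave the inactive list through "reclaim".]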
Moving pages across the list




           mark_page_accessed():
                      on each access, advance the (active, ref) state: 00 -> 01 -> 10 -> 11;
                      when active becomes 1, move the page inactive -> active;
           refill_inactive_zone():
                      if (ref == 1) { ref = 0; move to head of active list; }
                      else { move active -> inactive; }
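
The two rules above can be modelled as a tiny state machine over the (active, ref) bits. This is a sketch with hypothetical names that only tracks the bits and reports when a page should change lists; the list manipulation itself is left out.

```c
#include <assert.h>

struct page_state {
    unsigned active : 1;  /* 1 = on the active list, 0 = on the inactive list */
    unsigned ref    : 1;  /* referenced since the last check */
};

/* Models mark_page_accessed(): 00 -> 01 -> 10 -> 11.
 * Returns 1 when the page must move from the inactive to the active list. */
static inline int mark_accessed(struct page_state *p)
{
    if (!p->ref) {            /* 00 -> 01 or 10 -> 11 */
        p->ref = 1;
        return 0;
    }
    if (!p->active) {         /* 01 -> 10: second access promotes the page */
        p->active = 1;
        p->ref = 0;
        return 1;
    }
    return 0;                 /* 11 stays 11 */
}

/* Models the scan of one active page in refill_inactive_zone().
 * Returns 1 when the page must move from the active to the inactive list. */
static inline int refill_scan(struct page_state *p)
{
    assert(p->active);        /* only pages on the active list are scanned here */
    if (p->ref) {             /* referenced: clear the bit, page stays active */
        p->ref = 0;
        return 0;
    }
    p->active = 0;            /* not referenced: demote */
    return 1;
}
```

A page therefore needs two accesses to reach the active list, and two scans without an access in between to fall back to the inactive list.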
Outline

        • Overview of Linux Memory Management

        • Page Reclamation

        • Swap & I/O




Swap

        • Swapping lets the kernel reclaim all the page frames
          obtained by a process, not only those that
          have an image on disk:
                – anonymous pages (user stack or heap)
                – dirty pages that belong to a private memory
                  mapping of a process
                – IPC shared pages

Swap (Cont.)
        • Set up “swap areas” on disk.
        • Allocate and free “page slots” in swap
          areas.
        • Provide functions both to “swap out” pages
          from RAM into a swap area and to “swap in”
          pages from a swap area into RAM.
        • Mark page table entries to keep track of the
          positions of data in the swap areas.
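
The last bullet can be sketched as a non-present page table entry that stores a swap area number and a page slot. The bit layout below is an assumption chosen for illustration (present bit 0, area in bits 1 to 5, slot in bits 8 to 31); real layouts are architecture-specific, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT      0x1u  /* bit 0: page is in RAM */
#define SWP_TYPE_SHIFT   1     /* assumed: swap area number in bits 1..5 */
#define SWP_TYPE_BITS    5
#define SWP_OFFSET_SHIFT 8     /* assumed: page slot in bits 8..31 */

/* Build a non-present entry that remembers where the page went. */
static inline uint32_t make_swap_pte(unsigned area, uint32_t slot)
{
    assert(area < (1u << SWP_TYPE_BITS));
    assert(slot < (1u << (32 - SWP_OFFSET_SHIFT)));
    return ((uint32_t)area << SWP_TYPE_SHIFT) | (slot << SWP_OFFSET_SHIFT);
}

static inline unsigned swap_pte_area(uint32_t pte)
{
    return (pte >> SWP_TYPE_SHIFT) & ((1u << SWP_TYPE_BITS) - 1);
}

static inline uint32_t swap_pte_slot(uint32_t pte)
{
    return pte >> SWP_OFFSET_SHIFT;
}

/* In this sketch an all-zero entry means "no page at all", so a
 * swapped-out page is any non-zero entry with the present bit clear. */
static inline int pte_is_swapped(uint32_t pte)
{
    return pte != 0 && !(pte & PTE_PRESENT);
}
```

On a page fault, an entry of this kind would tell the kernel which swap area and slot to read the page back from.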

Example
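         #include <stdlib.h>
         #include <string.h>
         #define N (1 << 20)            /* assumed chunk size; the slide does not give N */
         int main(void) {
           while (1) {                  /* never frees, so memory use keeps growing */
             char *p = malloc(N);
             if (p) memset(p, 0, N);    /* touching the pages triggers demand paging */
           }
         }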


         $ free -m        (before running the loop)
                       total     used     free   shared   buffers   cached
         Mem:           2013     1811      201        0       157      872
         -/+ buffers/cache:       782     1231
         Swap:           397        0      397

         $ free -m        (while the loop runs; (+)/(-) mark the change from above)
                       total     used     free   shared   buffers   cached
         Mem:           2013  1956(+)    56(-)        0      4(-)   109(-)
         -/+ buffers/cache:   1842(+)   170(-)
         Swap:           397        8      389
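
The "-/+ buffers/cache" row can be recomputed from the "Mem:" row: subtract buffers and cached from used, and add them to free. The sketch below does this with hypothetical names. Because free -m rounds each column to whole MB, the recomputed values can differ from the printed row by 1 MB.

```c
#include <assert.h>

/* One "Mem:" row of free -m, in MB. */
struct free_mem { long total, used, free, buffers, cached; };

/* Memory used by applications: used minus what the kernel holds as cache. */
static inline long used_by_apps(struct free_mem m)
{
    assert(m.used >= m.buffers + m.cached);
    return m.used - m.buffers - m.cached;
}

/* Memory applications could still get: free plus reclaimable cache. */
static inline long available_to_apps(struct free_mem m)
{
    return m.free + m.buffers + m.cached;
}
```

For the first output this gives 1811 - 157 - 872 = 782 MB used and 201 + 157 + 872 = 1230 MB available (printed as 1231). For the second it gives 1843 MB and 169 MB (printed as 1842 and 170), which shows the loop has pushed the buffers and page cache out of memory.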


Linux I/O Architecture
                              • The default file I/O APIs, such as
                                fwrite(), are buffered

                              • File system: maps
                                (dir, name, offset) -> LBA

                              • Device file: not a regular
                                file



                              • How can these layers be bypassed?
I/O Bypassing

        • Disk cache
                – Open the file with O_DIRECT

        • File system
                – Access the device file directly

        • I/O scheduler
                – To be solved
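
A sketch of the O_DIRECT route on Linux. O_DIRECT generally requires the buffer, file offset and transfer size to be aligned; the 4096-byte alignment used here is an assumption, and the helper names are hypothetical. The same call works on a device file path, which would also skip the file system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* fallback so the sketch still compiles where the
                                  flag is not exposed; the read is then buffered */
#endif

#define DIRECT_ALIGN 4096      /* assumed alignment for buffer and size */

/* Allocate a buffer suitable for direct I/O; release it with free(). */
static inline void *alloc_direct_buffer(size_t size)
{
    void *buf = NULL;
    assert(size % DIRECT_ALIGN == 0);
    if (posix_memalign(&buf, DIRECT_ALIGN, size) != 0)
        return NULL;
    return buf;
}

/* Read `size` bytes from the start of `path`, bypassing the page cache.
 * Returns the number of bytes read, or -1 on error. */
static inline ssize_t read_direct(const char *path, void *buf, size_t size)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;             /* e.g. missing file, or no O_DIRECT support */
    ssize_t n = read(fd, buf, size);
    close(fd);
    return n;
}
```

Some file systems reject O_DIRECT, so callers should be prepared for open() or read() to fail and fall back to buffered I/O.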

Thanks
                               Q&A




BACKUP SLIDES


Page Table Management

        • Three-level mapping (PGD, PMD, PTE)




Page Table Management (Cont.)

          [Figure: the MMU takes the PGD address and a linear address and produces the
           physical address.]





Ensuring Technical Readiness For Copilot in Microsoft 365
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

VM and IO Topics in Linux

  • 1. VM and I/O Topics in Linux: Page Replacement, Swap and I/O. Jiannan Ouyang, Ph.D. Student, Computer Science Department, University of Pittsburgh, 05/05/2011
  • 2. Outline
      • Overview of Linux Memory Management
      • Page Reclamation
      • Swap & I/O
    Jiannan Ouyang, CS PhD@PITT
  • 3. Describing Physical Memory
      • Node: NUMA memory region
      • Zone: memory type
      • struct page: page frame
  • 4. Physical Page Allocation: Binary Buddy Allocator
      • If a block of the desired size is not available, a larger block is split in half, and the two halves are buddies of each other. One half is used for the allocation and the other stays free. Blocks are halved repeatedly until a block of the desired size is available.
      • When a block is later freed, its buddy is examined, and the two are coalesced if the buddy is also free.
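The split and coalesce steps on this slide can be sketched in a few lines of C. This is a minimal user-space illustration, not the kernel's code: the function names are hypothetical, blocks are identified by the page frame number (pfn) of their first page, and a block of order k is assumed to hold 2^k pages aligned to its own size, so that a block's buddy differs from it in exactly one address bit.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order whose block (2^order pages) can hold `pages` pages. */
static unsigned int order_for_pages(size_t pages)
{
    unsigned int order = 0;

    assert(pages > 0);
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* First pfn of the buddy of the order-`order` block starting at `pfn`.
 * Assumes blocks are aligned to their size, so flipping bit `order`
 * switches between the two halves of the parent block. */
static size_t buddy_of(size_t pfn, unsigned int order)
{
    return pfn ^ ((size_t)1 << order);
}

/* Split the block at `pfn` from order `from` down to order `to`.
 * The lower half is kept for the allocation at every step; the upper
 * half is recorded as a free buddy. Returns the number of free buddies. */
static unsigned int split_block(size_t pfn, unsigned int from, unsigned int to,
                                size_t free_pfn[], unsigned int free_order[])
{
    unsigned int n = 0;

    assert(from >= to);
    while (from > to) {
        from--;
        free_pfn[n] = pfn + ((size_t)1 << from);
        free_order[n] = from;
        n++;
    }
    return n;
}
```

On free, an allocator in this style would look up `buddy_of(pfn, order)`, and if that block is free at the same order, merge the two and repeat one order higher.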
  • 5. Page Table Management
      • Three-level mapping
  • 6. Kernel Memory Mapping [Figure: physical memory (0x00000000 to 0x3FFFFFFF, 1 GB, with display memory and device memory regions) beside the 4 GB virtual address space; an 896 MB region of physical memory is mapped starting at virtual address 0xC0000000.]
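The numbers in this figure correspond to the fixed-offset mapping used for low memory on 32-bit x86, which the sketch below illustrates. The constant and function names are chosen for this example (the kernel's own helpers are macros such as `__pa` and `__va`), and the 896 MB limit is taken from the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET  UINT32_C(0xC0000000)    /* start of the kernel mapping (from the slide) */
#define LOWMEM_SIZE  (UINT32_C(896) << 20)   /* 896 MB of directly mapped physical memory */

/* True if the physical address lies in the directly mapped region. */
static bool is_lowmem_phys(uint32_t phys)
{
    return phys < LOWMEM_SIZE;
}

/* Physical -> kernel virtual address: add the fixed offset. */
static uint32_t phys_to_virt_sketch(uint32_t phys)
{
    assert(is_lowmem_phys(phys));
    return phys + PAGE_OFFSET;
}

/* Kernel virtual -> physical address: subtract the fixed offset. */
static uint32_t virt_to_phys_sketch(uint32_t virt)
{
    assert(virt >= PAGE_OFFSET && virt - PAGE_OFFSET < LOWMEM_SIZE);
    return virt - PAGE_OFFSET;
}
```

Physical memory above the 896 MB limit has no such fixed address and has to be mapped on demand.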
  • 7. User Memory Mapping [Figure: virtual memory with kernel space above a 3 GB user space (stack, mappings, data, text), and the stack, data and text pages it maps to in physical memory.]
  • 8. User Memory Mapping [Figure: two virtual address spaces, each with kernel space and a user space holding stack, data and text, mapped onto the same physical memory.]
  • 9. Outline
      • Overview of Linux Memory Management
      • Page Reclamation
      • Swap & I/O
  • 10. Memory Customers [Figure: the memory customers (kernel code & data, slab cache, icache & dcache, user code & data, page cache) request pages from the buddy system, which reclaims pages from them.]
      • All memory except "User Code & Data" is used by the kernel.
      • "User Code & Data" is managed in user space (e.g. with malloc/free); the kernel can only swap user pages out.
  • 11. Slab Cache
      • A cache of commonly used objects, kept in an initialized state and ready for use by the kernel.
      • Saves the time of repeatedly allocating, initializing and freeing the same kind of object.
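The idea of keeping objects in an initialized state can be illustrated with a small user-space object cache. This is only a sketch of the principle: `struct obj`, `struct obj_cache` and all function names are hypothetical, and the real slab allocator organizes objects into slabs of whole pages rather than a single free list.

```c
#include <assert.h>
#include <stdlib.h>

struct obj {                    /* hypothetical cached object type */
    int refcount;
    struct obj *next_free;      /* links free objects inside the cache */
};

struct obj_cache {
    struct obj *free_list;      /* initialized objects ready for reuse */
    unsigned long ctor_calls;   /* how often an object had to be constructed */
};

/* Constructor: runs once per object, not once per allocation. */
static void obj_ctor(struct obj *o)
{
    o->refcount = 0;
    o->next_free = NULL;
}

static struct obj *cache_alloc(struct obj_cache *c)
{
    struct obj *o = c->free_list;

    if (o) {                    /* fast path: reuse an initialized object */
        c->free_list = o->next_free;
        o->next_free = NULL;
        return o;
    }
    o = malloc(sizeof(*o));     /* slow path: allocate and construct */
    if (!o)
        return NULL;
    obj_ctor(o);
    c->ctor_calls++;
    return o;
}

static void cache_free(struct obj_cache *c, struct obj *o)
{
    assert(o->refcount == 0);   /* caller returns the object in its initialized state */
    o->next_free = c->free_list;
    c->free_list = o;
}

/* Release all cached objects back to the underlying allocator. */
static void cache_destroy(struct obj_cache *c)
{
    while (c->free_list) {
        struct obj *o = c->free_list;
        c->free_list = o->next_free;
        free(o);
    }
}
```

An allocate, free, allocate sequence on the same cache returns the same object the second time without calling the constructor again, which is the time saving the slide refers to.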
  • 12. Disk-Related Caches
      • Dcache (metadata): dentry objects representing filesystem pathnames.
      • Icache (metadata): inode objects representing disk inodes.
      • Page cache (data): data pages from disk; the main disk cache used by the kernel.
  • 13. Memory Customers Review [Figure: the memory customers (kernel code & data, slab cache, icache & dcache, user code & data, page cache) request pages from the buddy system, which reclaims pages from them.] Next: when the kernel starts reclaiming pages, which pages it reclaims, and the replacement policy.
  • 14. Reclamation: When? Zone Watermarks
      • Pages low: when free pages fall to this level, the buddy allocator wakes kswapd to start freeing pages. By default the value is twice pages min.
      • Pages min: when free pages fall to this level, the allocator does the kswapd work synchronously, sometimes referred to as the direct-reclaim path.
      • Pages high: once this many pages are free, kswapd goes back to sleep. By default the value is three times pages min.
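The three watermarks can be read as a small decision function, sketched below. The type and function names are hypothetical, and the exact comparisons (`<=` versus `<`) are an assumption made for the example; only the default ratios (low = 2 × min, high = 3 × min) come from the slide.

```c
#include <assert.h>

enum reclaim_action {
    RECLAIM_NONE,          /* enough free pages, nothing to do */
    RECLAIM_WAKE_KSWAPD,   /* background reclaim by kswapd */
    RECLAIM_DIRECT         /* allocator reclaims synchronously */
};

struct zone_watermarks {
    unsigned long pages_min;
    unsigned long pages_low;
    unsigned long pages_high;
};

/* Default ratios from the slide: low = 2 * min, high = 3 * min. */
static struct zone_watermarks watermarks_from_min(unsigned long pages_min)
{
    struct zone_watermarks w = { pages_min, 2 * pages_min, 3 * pages_min };
    return w;
}

/* What the allocator does when it sees `free_pages` free pages in a zone. */
static enum reclaim_action on_allocation(const struct zone_watermarks *w,
                                         unsigned long free_pages)
{
    if (free_pages <= w->pages_min)
        return RECLAIM_DIRECT;
    if (free_pages <= w->pages_low)
        return RECLAIM_WAKE_KSWAPD;
    return RECLAIM_NONE;
}

/* kswapd keeps freeing pages until the high watermark is reached. */
static int kswapd_may_sleep(const struct zone_watermarks *w,
                            unsigned long free_pages)
{
    return free_pages >= w->pages_high;
}
```

The gap between the low and high watermarks gives kswapd some hysteresis, so it does not wake and sleep on every allocation.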
  • 17. Reclamation: Which? (Cont.)
      • Mapped & anonymous pages
          – Mapped: backed by a file
          – Anonymous: belongs to an anonymous memory region of a process
      • Shared & non-shared pages
          – A shared page is unmapped from all page table entries at once using reverse mapping, an important improvement in the Linux 2.6 kernel
  • 18. Reclamation: Which? (Cont.) shrink_caches runs until the given target number of pages is met, working through:
      1. Slab cache (kmem_cache_reap)
      2. User pages & page cache (refill & shrink_cache)
      3. Dcache and icache
  • 19. Replacement Policy [Figure: state diagram over (active, ref) ∈ {11, 10, 01, 00}. Labels: active list (active=1), inactive list (active=0), transitions "access" (ref=1) and "clear" (ref=0), and "reclaim" from the inactive list.]
  • 20. Moving Pages Across the Lists
      • mark_page_accessed(): on each access, increase the (active, ref) counter; if active becomes 1, move the page from the inactive list to the active list.
      • refill_inactive_zone(): if (ref == 1) { ref = 0; move to head of active list; } else { move from active list to inactive list; }
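The two functions on this slide form a small state machine over the (active, ref) pair, sketched below. The `_sketch` names and `struct page_state` are hypothetical, list manipulation is reduced to a return value telling the caller whether the page changes lists, and the counter is assumed to saturate at (1, 1).

```c
#include <assert.h>
#include <stdbool.h>

struct page_state {
    bool active;   /* page is on the active list */
    bool ref;      /* page was referenced since the last scan */
};

/* Called on each access: counts 00 -> 01 -> 10 -> 11, saturating at 11.
 * Returns true when the page must move from the inactive to the active list. */
static bool mark_page_accessed_sketch(struct page_state *p)
{
    if (!p->ref) {             /* 00 -> 01 or 10 -> 11 */
        p->ref = true;
        return false;
    }
    if (!p->active) {          /* 01 -> 10: promote to the active list */
        p->active = true;
        p->ref = false;
        return true;
    }
    return false;              /* 11 stays 11 */
}

/* Called when the active list is scanned.
 * Returns true when the page must move from the active to the inactive list. */
static bool refill_inactive_sketch(struct page_state *p)
{
    assert(p->active);
    if (p->ref) {              /* 11 -> 10: second chance, back to the list head */
        p->ref = false;
        return false;
    }
    p->active = false;         /* 10 -> 00: demote to the inactive list */
    return true;
}
```

Under these assumptions a page needs two accesses to reach the active list and two scans without an access to leave it again.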
  • 21. Outline
      • Overview of Linux Memory Management
      • Page Reclamation
      • Swap & I/O
  • 22. Swap
      • Lets the kernel reclaim all page frames obtained by a process, not only those that have an image on disk:
          – Anonymous pages (user stack or heap)
          – Dirty pages that belong to a private memory mapping of a process
          – IPC shared pages
  • 23. Swap (Cont.)
      • Set up "swap areas" on disk.
      • Allocate and free "page slots" in swap areas.
      • Provide functions both to "swap out" pages from RAM into a swap area and to "swap in" pages from a swap area into RAM.
      • Mark page table entries to keep track of the positions of data in the swap areas.
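The last point, marking page table entries with the position of swapped-out data, can be sketched as a bit-packing exercise. The layout below is an assumption for illustration: a 32-bit entry with bit 0 as the Present flag (cleared for a swapped-out page), bits 1 to 7 holding the swap area number, and bits 8 to 31 holding the page slot index. The function names are hypothetical, and slot 0 is assumed to be reserved so that an all-zero entry can still mean "no mapping".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT      UINT32_C(1)    /* bit 0: page is in RAM */
#define SWAP_AREA_SHIFT  1
#define SWAP_AREA_BITS   7
#define SWAP_SLOT_SHIFT  (SWAP_AREA_SHIFT + SWAP_AREA_BITS)   /* 8 */
#define SWAP_AREA_MASK   ((UINT32_C(1) << SWAP_AREA_BITS) - 1)

/* Build the entry stored in the page table when a page is swapped out. */
static uint32_t make_swap_pte(uint32_t area, uint32_t slot)
{
    assert(area <= SWAP_AREA_MASK);
    assert(slot != 0);          /* assumed reserved, keeps 0 as the null entry */
    assert(slot < (UINT32_C(1) << (32 - SWAP_SLOT_SHIFT)));
    return (slot << SWAP_SLOT_SHIFT) | (area << SWAP_AREA_SHIFT);
}

/* A non-zero entry with the Present bit clear refers to a swapped-out page. */
static bool pte_is_swapped(uint32_t pte)
{
    return pte != 0 && !(pte & PTE_PRESENT);
}

static uint32_t swap_pte_area(uint32_t pte)
{
    return (pte >> SWAP_AREA_SHIFT) & SWAP_AREA_MASK;
}

static uint32_t swap_pte_slot(uint32_t pte)
{
    return pte >> SWAP_SLOT_SHIFT;
}
```

On a page fault, a handler in this style would check `pte_is_swapped`, then use the area and slot to find the page on disk and swap it in.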
  • 24. Example
        while (1) {
            p = malloc(N);
            memset(p, 0, N);    /* touching the pages triggers demand paging */
        }
    $ free -m, before running the loop:
                           total       used       free     shared    buffers     cached
        Mem:                2013       1811        201          0        157        872
        -/+ buffers/cache:              782       1231
        Swap:                397          0        397
    $ free -m, while the loop is running ((+) and (-) mark the direction of change):
                           total       used       free     shared    buffers     cached
        Mem:                2013    1956(+)      56(-)          0       4(-)     109(-)
        -/+ buffers/cache:          1842(+)     170(-)
        Swap:                397          8        389
  • 25. Linux I/O Architecture
      • The default file I/O APIs, such as fwrite(), are buffered.
      • File system: maps (dir, name, offset) to an LBA.
      • Device file: not a normal file.
      • How can these layers be bypassed?
  • 26. I/O Bypassing
      • Disk cache
          – O_DIRECT
      • File system
          – Device file
      • I/O scheduler
          – To be solved
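Bypassing the disk cache with O_DIRECT might look like the sketch below. It assumes Linux with glibc, where `O_DIRECT` is exposed by defining `_GNU_SOURCE`, and a 4096-byte alignment for buffer and transfer size, which is an assumption since the real requirement depends on the device and file system. The helper names are hypothetical. Some file systems reject O_DIRECT, in which case `open` fails and the function returns -1.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_ALIGN 4096          /* assumed alignment and block size */

static int is_aligned(const void *p)
{
    return ((uintptr_t)p % DIO_ALIGN) == 0;
}

/* O_DIRECT transfers need an aligned buffer; plain malloc() is not enough. */
static void *alloc_aligned(size_t len)
{
    void *buf = NULL;

    assert(len % DIO_ALIGN == 0);
    if (posix_memalign(&buf, DIO_ALIGN, len) != 0)
        return NULL;
    return buf;
}

/* Write one aligned block to `path`, bypassing the page cache.
 * Returns the number of bytes written, or -1 on any failure. */
static ssize_t direct_write_block(const char *path, unsigned char fill)
{
    ssize_t n = -1;
    int fd = -1;
    void *buf = alloc_aligned(DIO_ALIGN);

    if (!buf)
        return -1;
    memset(buf, fill, DIO_ALIGN);
#ifdef O_DIRECT
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
#endif
    if (fd >= 0) {
        n = write(fd, buf, DIO_ALIGN);
        close(fd);
    }
    free(buf);
    return n;
}
```

Opening a block device file instead of a regular file with the same flags would also skip the file system layer, which is the second bypass listed on the slide.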
  • 27. Thanks. Q&A
  • 28. References
      • Understanding the Linux Kernel, 3rd edition
      • Understanding the Linux Virtual Memory Manager
  • 30. Page Table Management
      • Three-level mapping
  • 31. Page Table Management (Cont.) [Figure: starting from the PGD address, the MMU translates a linear address into a physical address.]
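A three-level translation of the kind the MMU performs can be sketched in software as follows. The bit split is an assumption made for the example (2 bits for the PGD index, 9 for the PMD index, 9 for the PTE index, 12 for the page offset, as on 32-bit x86 with PAE), and the table types and `translate` function are hypothetical simplifications: real entries pack the frame number and flags into one word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PTE_BITS   9
#define PMD_BITS   9
#define PGD_BITS   2
#define PMD_SHIFT  (PAGE_SHIFT + PTE_BITS)   /* 21 */
#define PGD_SHIFT  (PMD_SHIFT + PMD_BITS)    /* 30 */

/* Split a 32-bit linear address into its three table indices and the offset. */
static unsigned pgd_index(uint32_t va)   { return (va >> PGD_SHIFT)  & ((1u << PGD_BITS) - 1); }
static unsigned pmd_index(uint32_t va)   { return (va >> PMD_SHIFT)  & ((1u << PMD_BITS) - 1); }
static unsigned pte_index(uint32_t va)   { return (va >> PAGE_SHIFT) & ((1u << PTE_BITS) - 1); }
static unsigned page_offset(uint32_t va) { return va & ((1u << PAGE_SHIFT) - 1); }

typedef struct {
    uint64_t pfn[1u << PTE_BITS];            /* page frame number per entry */
    unsigned char present[1u << PTE_BITS];   /* 1 if the page is in RAM */
} pte_table_t;

typedef struct { pte_table_t *pte[1u << PMD_BITS]; } pmd_table_t;
typedef struct { pmd_table_t *pmd[1u << PGD_BITS]; } pgd_table_t;

/* Walk PGD -> PMD -> PTE. Returns 0 and stores the physical address in *pa,
 * or -1 if any level is missing (where hardware would raise a page fault). */
static int translate(const pgd_table_t *pgd, uint32_t va, uint64_t *pa)
{
    const pmd_table_t *pmd = pgd->pmd[pgd_index(va)];
    if (!pmd)
        return -1;

    const pte_table_t *pt = pmd->pte[pmd_index(va)];
    if (!pt)
        return -1;

    unsigned i = pte_index(va);
    if (!pt->present[i])
        return -1;

    *pa = (pt->pfn[i] << PAGE_SHIFT) | page_offset(va);
    return 0;
}
```

With this split, the linear address 0xC0201234 selects PGD entry 3, PMD entry 1 and PTE entry 1, with a page offset of 0x234.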