SlideShare a Scribd company logo
1 of 19
Download to read offline
Managing Shared L2 Caches
  on Multicore Systems
       in Software
David Tam, Reza Azimi, Livio Soares, Michael Stumm

                University of Toronto
          {tamda, azimi, livio, stumm}@eecg.toronto.edu



                        WIOSCA 2007
                                                          Software Partitioning 1
Multicores Today
                                         Core



                                        Hardware
                                        Thread
                    Shared L2 Cache




Shared L2 Issues:
    Uncontrolled sharing  threads compete for space
●

      LRU eviction policy     interference


                                                   Software Partitioning 2
Uncontrolled Sharing is Bad
                               Dual Core Power5
                          100 100
                    100
                                              89            87
                     90

                     80
                                                      72
                     70

                                        57
                     60
      IPC (%)




                                                                     mcf
                     50                                              art

                     40

                     30

                     20

                     10
                                                                            Our System,
                      0
                                                                            In Software
                          Single-Prg   Multi-Prg,     Multi-Prg,
                                       Uncontrolled   Controlled

    Hardware partitioning mechanisms:
●


                     [Qureshi06],[Rafique06],[Chandra05],[Iyer04],[Suh04]
                ●




                     Not available today
                ●
                                                                               Software Partitioning 3
Software Partitioning
    Implemented on real system
●


    Apply classic page-coloring technique
●

            [Cho06],[Bershad94],[Lynch92],[Kessler92]
        ●



    Guided phys pages allocation
●

         controlled L2 cache line usage
                                                   Virtual Pages
                                                   Process A
                                 Physical Pages
                                 Color A


                  L2 Cache
                             }
        Color A                  Color B




                             }
        Color B


                                                   Virtual Pages
                  Physically                       Process B
                                 Color A
                  Indexed


                                 Color B
                         {


                      Fixed Mapping
                                            {
                        (Hardware)         OS Managed              Software Partitioning 4
Software-Based Properties
    Deployable right now
●


        OS, VMM
    ●



    Flexible
●


        Can allocate for specific: users, applications, processes,
    ●

        threads
        Unaffected by: # threads, location
    ●




                                                               MySQL
                                                           M
               A A MM       MA MM   AAAA          A MMM

                                                               Apache
                                                           A



                25%                   75%
               Apache                MySql
               (4 colors)           (12 colors)



               MMA A        MMMA    MMMM          A A MM

                                                                  Software Partitioning 5
Linux Implementation
        Color-aware physical page allocation
    ●


             Minimal changes to Linux
         ●



        List of local free physical pages:
    ●

             Single list
Old:
(per logical CPU)

        To find target color quickly:
    ●

                               Multiple lists
                           0

                           1

        New:
(per logical CPU)
                       15


        Looks like single list to rest of Linux
    ●

                                                  Software Partitioning 6
Experimental Setup
                    Power5


         Victim
                    1.9MB L2
                    14 cycles
         36MB
        90 cycles



                      Local         Remote


                        4 GB         4 GB
                      280 cycles    310 cycles


    16 partitions (colors):
●


            120 KB each (L2), 2.25 MB each (L3)
        ●


    Linux 2.6.15
●


    SPECcpu2000, SPECjbb2000
●



                                                  Software Partitioning 7
Overview of Results

    Show ability to control cache usage
●


        Single-programmed
    ●



    Show benefits of partitioning
●


        Multi-programmed
    ●



    Metric selection for performance analysis
●




                                            Software Partitioning 8
Overview of Results

    Show ability to control cache usage
●


        Single-programmed
    ●



    Show benefits of partitioning
●


        Multi-programmed
    ●



    Metric selection for performance analysis
●




                                            Software Partitioning 9
Controlling Cache Usage
    Mechanism can control cache usage
●

    Single-programmed
●




                                        Software Partitioning 10
Overview of Results

    Show ability to control cache usage
●


        Single-programmed
    ●



    Show benefits of partitioning
●


        Multi-programmed
    ●



    Metric selection for performance analysis
●




                                           Software Partitioning 11
Benefits of Partitioning
    Multiprogrammed (1 application per core)
●

    ● Base: multi-prg, no partitioning




                                               Software Partitioning 12
Benefits of Partitioning




                       Software Partitioning 13
Mechanism Issues
Inherent:
    L3 victim cache partitioned correspondingly
●


    Super-pages reduce # of partitions
●


    Time granularity
●


        Allocating more/less partitions
    ●




Current Implementation:
    Static partitioning
●

     ● Future work: dynamic partitioning




                                                  Software Partitioning 14
Overview of Results

    Show ability to control cache usage
●


        Single-programmed
    ●



    Show benefits of partitioning
●


        Multi-programmed
    ●



    Metric selection for performance analysis
●




                                          Software Partitioning 15
Experiences with Metrics
    Explaining & predicting performance
●



Obvious Metric: L2 Miss Rate Curve (MRC)
      Rate = per billion cycles
     ●


    Miss rate ignores cost variations [Moreto07],[Qureshi06]
●

        ● Miss latencies

        ● Memory level parallelism [Qureshi06]




Alternative Metric: Stall Rate Curve (SRC)
    Stall rate more inclusive [McGregor05]
●

         ● Instruction retirement stall

         ● Caused by memory latencies

         ● Directly measures pipeline degradation


    Measurable by hardware performance counters
●




                                                               Software Partitioning 16
L2 MRC Failure in Art
    Single-programmed art
●


     ● L2 MRC does not show trend

     ● SRC shows trend

     ● (per billion cycles)


                                                       D-Cache SRC
    Exec Time              L2 MRC




                                          (millions)
         L1 data cache miss SRC is adequate
     ●




                                                         Software Partitioning 17
L2 MRC Failure in Art
                 Single-programmed art
             ●



                     L3 and main memory hit rates explain trend
                 ●


                     (per billion cycles)
                 ●




                                            L3 Hit Rate   Main Memory Hit Rate
                 D-Cache SRC
(millions)




                                                                  Software Partitioning 18
Conclusions
    Software partitioning is effective
●


         Up to 17% improvement
     ●


         Deployable today
     ●


    Stall metric is better than miss metric
●


         More accurate for performance analysis
     ●


         Measurable on real system
     ●




                 Work In Progress
    More workloads
●


    Dynamically allocate more/less partitions
●


    Dynamically determine optimal # of partitions
●

     ● Using hardware performance counters


                                                    Software Partitioning 19

More Related Content

What's hot

Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011Membase
 
Scaling With Sun Systems For MySQL Jan09
Scaling With Sun Systems For MySQL Jan09Scaling With Sun Systems For MySQL Jan09
Scaling With Sun Systems For MySQL Jan09Steve Staso
 
Gluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGlusterFS
 
Scalability
ScalabilityScalability
Scalabilityfelho
 
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, HortonworksHadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, HortonworksCloudera, Inc.
 
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS StorageWebinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS StorageGlusterFS
 
Using Novell Sentinel Log Manager to Monitor Novell Applications
Using Novell Sentinel Log Manager to Monitor Novell ApplicationsUsing Novell Sentinel Log Manager to Monitor Novell Applications
Using Novell Sentinel Log Manager to Monitor Novell ApplicationsNovell
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesDataWorks Summit
 
Sql Server 2005 Memory Internals
Sql Server 2005 Memory InternalsSql Server 2005 Memory Internals
Sql Server 2005 Memory InternalsDmitry Geyzersky
 
Embedded Database Technology | Interbase From Embarcadero Technologies
Embedded Database Technology | Interbase From Embarcadero TechnologiesEmbedded Database Technology | Interbase From Embarcadero Technologies
Embedded Database Technology | Interbase From Embarcadero TechnologiesMichael Findling
 
Embracing Open Source: Practice and Experience from Alibaba
Embracing Open Source: Practice and Experience from AlibabaEmbracing Open Source: Practice and Experience from Alibaba
Embracing Open Source: Practice and Experience from AlibabaWensong Zhang
 
SLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System zSLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System zIBM India Smarter Computing
 
Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011GlusterFS
 
My experience with embedding PostgreSQL
 My experience with embedding PostgreSQL My experience with embedding PostgreSQL
My experience with embedding PostgreSQLJignesh Shah
 
数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战Weiwei Fang
 
Exaflop In 2018 Hardware
Exaflop In 2018   HardwareExaflop In 2018   Hardware
Exaflop In 2018 HardwareJacob Wu
 
VNSISPL_DBMS_Concepts_ch17
VNSISPL_DBMS_Concepts_ch17VNSISPL_DBMS_Concepts_ch17
VNSISPL_DBMS_Concepts_ch17sriprasoon
 

What's hot (20)

Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011
 
Scaling With Sun Systems For MySQL Jan09
Scaling With Sun Systems For MySQL Jan09Scaling With Sun Systems For MySQL Jan09
Scaling With Sun Systems For MySQL Jan09
 
Gluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFS
 
Cosbench apac
Cosbench apacCosbench apac
Cosbench apac
 
cosbench-openstack.pdf
cosbench-openstack.pdfcosbench-openstack.pdf
cosbench-openstack.pdf
 
Scalability
ScalabilityScalability
Scalability
 
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, HortonworksHadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
Hadoop World 2011: HDFS Federation - Suresh Srinivas, Hortonworks
 
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS StorageWebinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
Webinar Sept 22: Gluster Partners with Redapt to Deliver Scale-Out NAS Storage
 
Using Novell Sentinel Log Manager to Monitor Novell Applications
Using Novell Sentinel Log Manager to Monitor Novell ApplicationsUsing Novell Sentinel Log Manager to Monitor Novell Applications
Using Novell Sentinel Log Manager to Monitor Novell Applications
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual Machines
 
Sql Server 2005 Memory Internals
Sql Server 2005 Memory InternalsSql Server 2005 Memory Internals
Sql Server 2005 Memory Internals
 
Embedded Database Technology | Interbase From Embarcadero Technologies
Embedded Database Technology | Interbase From Embarcadero TechnologiesEmbedded Database Technology | Interbase From Embarcadero Technologies
Embedded Database Technology | Interbase From Embarcadero Technologies
 
Embracing Open Source: Practice and Experience from Alibaba
Embracing Open Source: Practice and Experience from AlibabaEmbracing Open Source: Practice and Experience from Alibaba
Embracing Open Source: Practice and Experience from Alibaba
 
SLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System zSLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System z
 
Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011Intro to GlusterFS Webinar - August 2011
Intro to GlusterFS Webinar - August 2011
 
My experience with embedding PostgreSQL
 My experience with embedding PostgreSQL My experience with embedding PostgreSQL
My experience with embedding PostgreSQL
 
数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战数据中心网络研究:机遇与挑战
数据中心网络研究:机遇与挑战
 
Exaflop In 2018 Hardware
Exaflop In 2018   HardwareExaflop In 2018   Hardware
Exaflop In 2018 Hardware
 
VNSISPL_DBMS_Concepts_ch17
VNSISPL_DBMS_Concepts_ch17VNSISPL_DBMS_Concepts_ch17
VNSISPL_DBMS_Concepts_ch17
 
1
11
1
 

Viewers also liked

SecureCore RTAS2013
SecureCore RTAS2013SecureCore RTAS2013
SecureCore RTAS2013mkyoon83
 
Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems
Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore SystemsParallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems
Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore SystemsHeechul Yun
 
MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...
MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...
MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...Heechul Yun
 
Multi-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O SynchronizationMulti-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O Synchronizationrtsljekim
 
Cache, Set Associative, Write-Through, Write-Back
Cache, Set Associative, Write-Through, Write-BackCache, Set Associative, Write-Through, Write-Back
Cache, Set Associative, Write-Through, Write-Back동호 이
 
Practical real-time operating system security for the masses
Practical real-time operating system security for the massesPractical real-time operating system security for the masses
Practical real-time operating system security for the massesMilosch Meriac
 
Resilient IoT Security: The end of flat security models
Resilient IoT Security: The end of flat security modelsResilient IoT Security: The end of flat security models
Resilient IoT Security: The end of flat security modelsMilosch Meriac
 

Viewers also liked (8)

SecureCore RTAS2013
SecureCore RTAS2013SecureCore RTAS2013
SecureCore RTAS2013
 
Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems
Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore SystemsParallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems
Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems
 
MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...
MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...
MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isola...
 
Multi-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O SynchronizationMulti-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O Synchronization
 
Cache, Set Associative, Write-Through, Write-Back
Cache, Set Associative, Write-Through, Write-BackCache, Set Associative, Write-Through, Write-Back
Cache, Set Associative, Write-Through, Write-Back
 
RapidMRC
RapidMRCRapidMRC
RapidMRC
 
Practical real-time operating system security for the masses
Practical real-time operating system security for the massesPractical real-time operating system security for the masses
Practical real-time operating system security for the masses
 
Resilient IoT Security: The end of flat security models
Resilient IoT Security: The end of flat security modelsResilient IoT Security: The end of flat security models
Resilient IoT Security: The end of flat security models
 

Similar to Cache-partitioning

Profiling Multicore Systems to Maximize Core Utilization
Profiling Multicore Systems to Maximize Core Utilization Profiling Multicore Systems to Maximize Core Utilization
Profiling Multicore Systems to Maximize Core Utilization mentoresd
 
Os Madsen Block
Os Madsen BlockOs Madsen Block
Os Madsen Blockoscon2007
 
Practice and challenges from building IaaS
Practice and challenges from building IaaSPractice and challenges from building IaaS
Practice and challenges from building IaaSShawn Zhu
 
SlapOS Presentation at VW2011 Seoul
SlapOS Presentation at VW2011 SeoulSlapOS Presentation at VW2011 Seoul
SlapOS Presentation at VW2011 SeoulJean-Paul Smets
 
Asset site 2010-dec-03_13-57
Asset site 2010-dec-03_13-57Asset site 2010-dec-03_13-57
Asset site 2010-dec-03_13-57yonathan01
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...Chester Chen
 
V Labs Product Presentation
V Labs  Product PresentationV Labs  Product Presentation
V Labs Product PresentationWil Huijben
 
Cloudcon East Presentation
Cloudcon East PresentationCloudcon East Presentation
Cloudcon East Presentationbr7tt
 
Cloudcon East Presentation
Cloudcon East PresentationCloudcon East Presentation
Cloudcon East Presentationbr7tt
 
Shak larry-jeder-perf-and-tuning-summit14-part1-final
Shak larry-jeder-perf-and-tuning-summit14-part1-finalShak larry-jeder-perf-and-tuning-summit14-part1-final
Shak larry-jeder-perf-and-tuning-summit14-part1-finalTommy Lee
 
N2os overview
N2os overviewN2os overview
N2os overviewhwjeon1
 
XPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, Intel
XPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, IntelXPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, Intel
XPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, IntelThe Linux Foundation
 
Continuum PCAP
Continuum PCAP Continuum PCAP
Continuum PCAP rwachsman
 
Linux on System z Update: Current & Future Linux on System z Technology
Linux on System z Update: Current & Future Linux on System z TechnologyLinux on System z Update: Current & Future Linux on System z Technology
Linux on System z Update: Current & Future Linux on System z TechnologyIBM India Smarter Computing
 
Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...
Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...
Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...Jeff Larkin
 
Vdi demo
Vdi demoVdi demo
Vdi demotomcku
 
Open Storage Sun Intel European Business Technology Tour
Open Storage Sun Intel European Business Technology TourOpen Storage Sun Intel European Business Technology Tour
Open Storage Sun Intel European Business Technology TourWalter Moriconi
 
Yahoo Communities Architecture Unlikely Bedfellows
Yahoo Communities Architecture Unlikely BedfellowsYahoo Communities Architecture Unlikely Bedfellows
Yahoo Communities Architecture Unlikely BedfellowsConSanFrancisco123
 

Similar to Cache-partitioning (20)

Profiling Multicore Systems to Maximize Core Utilization
Profiling Multicore Systems to Maximize Core Utilization Profiling Multicore Systems to Maximize Core Utilization
Profiling Multicore Systems to Maximize Core Utilization
 
Os Madsen Block
Os Madsen BlockOs Madsen Block
Os Madsen Block
 
Practice and challenges from building IaaS
Practice and challenges from building IaaSPractice and challenges from building IaaS
Practice and challenges from building IaaS
 
SlapOS Presentation at VW2011 Seoul
SlapOS Presentation at VW2011 SeoulSlapOS Presentation at VW2011 Seoul
SlapOS Presentation at VW2011 Seoul
 
Asset site 2010-dec-03_13-57
Asset site 2010-dec-03_13-57Asset site 2010-dec-03_13-57
Asset site 2010-dec-03_13-57
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
V Labs Product Presentation
V Labs  Product PresentationV Labs  Product Presentation
V Labs Product Presentation
 
Cloudcon East Presentation
Cloudcon East PresentationCloudcon East Presentation
Cloudcon East Presentation
 
Cloudcon East Presentation
Cloudcon East PresentationCloudcon East Presentation
Cloudcon East Presentation
 
Shak larry-jeder-perf-and-tuning-summit14-part1-final
Shak larry-jeder-perf-and-tuning-summit14-part1-finalShak larry-jeder-perf-and-tuning-summit14-part1-final
Shak larry-jeder-perf-and-tuning-summit14-part1-final
 
N2os overview
N2os overviewN2os overview
N2os overview
 
XPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, Intel
XPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, IntelXPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, Intel
XPDDS17: Intel New QoS (RDT) Features Introduction - Yi Sun, Intel
 
Continuum PCAP
Continuum PCAP Continuum PCAP
Continuum PCAP
 
What's New in RHEL 6 for Linux on System z?
What's New in RHEL 6 for Linux on System z?What's New in RHEL 6 for Linux on System z?
What's New in RHEL 6 for Linux on System z?
 
Linux on System z Update: Current & Future Linux on System z Technology
Linux on System z Update: Current & Future Linux on System z TechnologyLinux on System z Update: Current & Future Linux on System z Technology
Linux on System z Update: Current & Future Linux on System z Technology
 
Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...
Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...
Maximizing Application Performance on Cray XT6 and XE6 Supercomputers DOD-MOD...
 
Sivp presentation
Sivp presentationSivp presentation
Sivp presentation
 
Vdi demo
Vdi demoVdi demo
Vdi demo
 
Open Storage Sun Intel European Business Technology Tour
Open Storage Sun Intel European Business Technology TourOpen Storage Sun Intel European Business Technology Tour
Open Storage Sun Intel European Business Technology Tour
 
Yahoo Communities Architecture Unlikely Bedfellows
Yahoo Communities Architecture Unlikely BedfellowsYahoo Communities Architecture Unlikely Bedfellows
Yahoo Communities Architecture Unlikely Bedfellows
 

Recently uploaded

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

Cache-partitioning

  • 1. Managing Shared L2 Caches on Multicore Systems in Software David Tam, Reza Azimi, Livio Soares, Michael Stumm University of Toronto {tamda, azimi, livio, stumm}@eecg.toronto.edu WIOSCA 2007 Software Partitioning 1
  • 2. Multicores Today Core Hardware Thread Shared L2 Cache Shared L2 Issues: Uncontrolled sharing threads compete for space ● LRU eviction policy interference Software Partitioning 2
  • 3. Uncontrolled Sharing is Bad Dual Core Power5 100 100 100 89 87 90 80 72 70 57 60 IPC (%) mcf 50 art 40 30 20 10 Our System, 0 In Software Single-Prg Multi-Prg, Multi-Prg, Uncontrolled Controlled Hardware partitioning mechanisms: ● [Qureshi06],[Rafique06],[Chandra05],[Iyer04],[Suh04] ● Not available today ● Software Partitioning 3
  • 4. Software Partitioning Implemented on real system ● Apply classic page-coloring technique ● [Cho06],[Bershad94],[Lynch92],[Kessler92] ● Guided phys pages allocation ● controlled L2 cache line usage Virtual Pages Process A Physical Pages Color A L2 Cache } Color A Color B } Color B Virtual Pages Physically Process B Color A Indexed Color B { Fixed Mapping { (Hardware) OS Managed Software Partitioning 4
  • 5. Software-Based Properties Deployable right now ● OS, VMM ● Flexible ● Can allocate for specific: users, applications, processes, ● threads Unaffected by: # threads, location ● MySQL M A A MM MA MM AAAA A MMM Apache A 25% 75% Apache MySql (4 colors) (12 colors) MMA A MMMA MMMM A A MM Software Partitioning 5
  • 6. Linux Implementation Color-aware physical page allocation ● Minimal changes to Linux ● List of local free physical pages: ● Single list Old: (per logical CPU) To find target color quickly: ● Multiple lists 0 1 New: (per logical CPU) 15 Looks like single list to rest of Linux ● Software Partitioning 6
  • 7. Experimental Setup Power5 Victim 1.9MB L2 14 cycles 36MB 90 cycles Local Remote 4 GB 4 GB 280 cycles 310 cycles 16 partitions (colors): ● 120 KB each (L2), 2.25 MB each (L3) ● Linux 2.6.15 ● SPECcpu2000, SPECjbb2000 ● Software Partitioning 7
  • 8. Overview of Results Show ability to control cache usage ● Single-programmed ● Show benefits of partitioning ● Multi-programmed ● Metric selection for performance analysis ● Software Partitioning 8
  • 9. Overview of Results Show ability to control cache usage ● Single-programmed ● Show benefits of partitioning ● Multi-programmed ● Metric selection for performance analysis ● Software Partitioning 9
  • 10. Controlling Cache Usage Mechanism can control cache usage ● Single-programmed ● Software Partitioning 10
  • 11. Overview of Results Show ability to control cache usage ● Single-programmed ● Show benefits of partitioning ● Multi-programmed ● Metric selection for performance analysis ● Software Partitioning 11
  • 12. Benefits of Partitioning Multiprogrammed (1 application per core) ● ● Base: multi-prg, no partitioning Software Partitioning 12
  • 13. Benefits of Partitioning Software Partitioning 13
  • 14. Mechanism Issues Inherent: L3 victim cache partitioned correspondingly ● Super-pages reduce # of partitions ● Time granularity ● Allocating more/less partitions ● Current Implementation: Static partitioning ● ● Future work: dynamic partitioning Software Partitioning 14
  • 15. Overview of Results Show ability to control cache usage ● Single-programmed ● Show benefits of partitioning ● Multi-programmed ● Metric selection for performance analysis ● Software Partitioning 15
  • 16. Experiences with Metrics Explaining & predicting performance ● Obvious Metric: L2 Miss Rate Curve (MRC) Rate = per billion cycles ● Miss rate ignores cost variations [Moreto07],[Qureshi06] ● ● Miss latencies ● Memory level parallelism [Qureshi06] Alternative Metric: Stall Rate Curve (SRC) Stall rate more inclusive [McGregor05] ● ● Instruction retirement stall ● Caused by memory latencies ● Directly measures pipeline degradation Measurable by hardware performance counters ● Software Partitioning 16
  • 17. L2 MRC Failure in Art Single-programmed art ● ● L2 MRC does not show trend ● SRC shows trend ● (per billion cycles) D-Cache SRC Exec Time L2 MRC (millions) L1 data cache miss SRC is adequate ● Software Partitioning 17
  • 18. L2 MRC Failure in Art Single-programmed art ● L3 and main memory hit rates explain trend ● (per billion cycles) ● L3 Hit Rate Main Memory Hit Rate D-Cache SRC (millions) Software Partitioning 18
  • 19. Conclusions Software partitioning is effective ● Up to 17% improvement ● Deployable today ● Stall metric is better than miss metric ● More accurate for performance analysis ● Measurable on real system ● Work In Progress More workloads ● Dynamically allocate more/less partitions ● Dynamically determine optimal # of partitions ● ● Using hardware performance counters Software Partitioning 19