SlideShare a Scribd company logo
1 of 48
Download to read offline
PmcTools: Whole-system, low-overhead
             performance measurement in FreeBSD

                                   Joseph Koshy

                                   FreeBSD Developer


                                   18 April 2009




Joseph Koshy (FreeBSD Developer)     FreeBSD/PmcTools   18 April 2009   1 / 48
Outline

1   Introduction
       Introducing PmcTools
2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation
3   Status & Future Work
      Status
      Future Projects
      Research Topics
4   Conclusion

Joseph Koshy (FreeBSD Developer)   FreeBSD/PmcTools   18 April 2009   2 / 48
Introduction


Goals Of This Talk




      Introduce FreeBSD/PmcTools.
      Introduce BSD culture.




Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools   18 April 2009   3 / 48
Introduction


About FreeBSD




      http://www.freebsd.org/
      Popular among appliance makers, ISPs, web hosting providers:
              Fast, stable, high-quality code, liberal license.
      FreeBSD culture in one sentence: “Shut up and code”.


Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools           18 April 2009   4 / 48
Introduction


About jkoshy@FreeBSD.org




      FreeBSD developer since 1998.
      Technical interests:
              Performance analysis; the design of high performance software.
              Low power computing.
              Higher order, typed languages.
              Writing clean, well-designed code.




Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools           18 April 2009   5 / 48
Introduction


The Three Big Questions in Performance Analysis




  1   What is the system doing?
  2   Where in the code does the behaviour arise?
  3   What is to be done about it?




Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools   18 April 2009   6 / 48
Introduction


Question 1: What is the system doing?


      System behaviour:
              Traditional UNIX tools: vmstat, iostat, top, systat, ktrace,
              truss.
              Counters under the sysctl hierarchy.
              Compile time options such as LOCK PROFILING.
              New tools like dtrace.
      Machine behaviour:
              Modern CPUs have in-CPU hardware counters measuring
              hardware behaviour: bus utilization, cache operations, instructions
              decoded and executed, branch behaviour, floating point and vector
              operations, speculative execution, . . .
              Near zero overheads, good precision.




Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools             18 April 2009   7 / 48
Introduction


Question 2: Which portion of the system is
responsible?


      Which subsystems are involved?
      Where specifically in the code is the problem arising?
      System performance is a “global” property.
              “Local” inspection of code not always sufficient.
      As a community we are still exploring the domain of performance
      analysis tools:
              Which data to collect.
              Collecting it with low-overheads.
              Making sense of the information collected.




Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools          18 April 2009   8 / 48
Introduction   Introducing PmcTools




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools                  18 April 2009   9 / 48
Introduction   Introducing PmcTools


PmcTools Project Goals



                                                Low−overheads
            Whole−system
            Measurements
                                                                              FreeBSD Tier−1
                                                                              Architectures
                                                PmcTools
       Use in production



                          Platform for                               Have Fun!
                          Interesting tools




Joseph Koshy (FreeBSD Developer)              FreeBSD/PmcTools                      18 April 2009   10 / 48
Introduction   Introducing PmcTools


Performance Analysis: Conventional vs. PmcTools


  Description                                Conventional                PmcTools
  Need special binaries                      Yes                         No
  Dynamically loaded objects                 No                          Yes
  Profiling scope                             Executable                  Process & System
  Need restart                               Yes                         No
  Measurement overheads                      High                        Low
  Profiling tick                              Time                        Many options
  Profile inside critical sections            No                          Yes (x86)
  Cross-architecture analysis                No                          Yes
  Distributed profiling                       No                          Yes
  Production use                             No                          Yes



Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools                        18 April 2009   11 / 48
Introduction   Introducing PmcTools


Related open-source projects



         Linux Many projects related to PMCs:
                  Oprofile: http://oprofile.sourceforge.net/
                  Perfmon: http://perfmon2.sourceforge.net/
                  Perfctr: http://sourceforge.net/projects/perfctr/
                  Rabbit: http://www.scl.ameslab.gov/Projects/Rabbit/
       Solaris CPC(3) library.
     NetBSD A pmc(3) API.




Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools                  18 April 2009   12 / 48
PmcTools   Architectural Overview




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools                   18 April 2009   13 / 48
PmcTools                   Architectural Overview


Overview: Architecture




                                                                         process
                                   process



                                                   process
                                                                                              log
                                                                                     libpmc




                                              kernel                               hwpmc
                                   CPU

                                             CPU

                                                             CPU

                                                                   CPU

                                                                           CPU

                                                                                      CPU

      A platform to build tools that use PMC data.



Joseph Koshy (FreeBSD Developer)                   FreeBSD/PmcTools                                 18 April 2009   14 / 48
PmcTools   Architectural Overview


Components



      hwpmc kernel bits
kernel changes (see later)
       libpmc application API
pmccontrol management tool
     pmcstat proof-of-concept application
pmcannotate contributed tool
          etc... others in the future




Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools                   18 April 2009   15 / 48
PmcTools    Architectural Overview


PMC Scopes


                                                                          #0
                                                                          #1
                                                                          #2
                                                                          #3
                                                                          #4

                          CPU 0    CPU 1        CPU 2           CPU 3


                                           Process−scope PMC

                                           System−scope PMC



1 process-scope PMC & 2 system-scope PMCs simultaneously active.



Joseph Koshy (FreeBSD Developer)       FreeBSD/PmcTools                        18 April 2009   16 / 48
PmcTools   Architectural Overview


Counting vs. Sampling




                       Process−scope,                    System−scope,
                       Counting                          Counting




                       Process−scope,                    System−scope,
                       Sampling                          Sampling




Joseph Koshy (FreeBSD Developer)         FreeBSD/PmcTools                   18 April 2009   17 / 48
PmcTools      Architectural Overview


Using system-scope PMCs




                                                    process


                                                              process
                                        kernel




      Three system scope PMCs, on three CPUs.
      Measure behaviour of the system as a whole.

Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools                      18 April 2009   18 / 48
PmcTools       Architectural Overview


Using process-scope PMCs




                                                  owner




                                                                             target
                                                            2: attach




                                   1: allocate
                                                              kernel




      A process-scope PMC is allocated & attached to a target process.
      Entire “row” of PMCs reserved across CPUs.

Joseph Koshy (FreeBSD Developer)                   FreeBSD/PmcTools                      18 April 2009   19 / 48
PmcTools   API




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools   18 April 2009   20 / 48
PmcTools   API


API Overview


Categories:
      Administration (2).
      Convenience Functions (8).
      Initialization (1).
      Log file handling (3).
      PMC Management (10).
      Queries (7).
      Arch-specific functions (1).
32 functions documented in 15 manual pages. 10 manual pages for
supported hardware events.



Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools   18 April 2009   21 / 48
PmcTools     API


Example: API Usage

                                                     attach
                              allocate

                                                              start


                               release
                                                    stop              read

            pmc     allocate()           Allocate a PMC; returns a handle.
            pmc     attach()             Attach a PMC to a target process.
            pmc     start()              Start a PMC.
            pmc     read()               Read PMC values.
            pmc     stop()               Stop a PMC.
            pmc     release()            Release resources.

Joseph Koshy (FreeBSD Developer)          FreeBSD/PmcTools                   18 April 2009   22 / 48
PmcTools   Design Issues




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools          18 April 2009   23 / 48
PmcTools   Design Issues


PMCs Vary A Lot

    AMD Athlon XP                  4 PMCs, 48 bits wide.
    AMD Athlon64                   4 PMCs, Different set of hardware events.
    Intel Pentium MMX              2 PMCs. 40 bits wide. Counting only.
    Intel Pentium Pro              2 PMCs, 40 bits for reads, 32 bits for writes.
    Intel Pentium IV               18 PMCs shared across logical CPUs. En-
                                   tirely different programming model.
    Intel Core                     Number of PMCs and widths vary. Has
                                   programmable & fixed-function PMCs.
    Intel Core/i7                  As above, but also has per-package PMCs.

      PMCs are closely tied to CPU micro-architecture.
      PMC capabilities, supported events, access methods,
      programming constraints can vary across CPU generations.


Joseph Koshy (FreeBSD Developer)         FreeBSD/PmcTools             18 April 2009   24 / 48
PmcTools   Design Issues


API Design Issues

Issues:
      Designing an extensible programming interface for application use.
      Allowing knowledgeable applications to make full use of hardware.
PmcTools philosophy:
      Make simple things easy to do.
      Make complex things possible.
Current “UI” uses name=value pairs:

% pmcstat -p k8-bu-fill-request-l2-miss,
             mask=dc-fill+ic-fill,usr



Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools          18 April 2009   25 / 48
PmcTools   Profiling




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools     18 April 2009   26 / 48
PmcTools   Profiling


Conventional Statistical Profiling




      Needs specially compiled binaries (cc -pg).
      Sampling runs off the clock tick.
              Cannot profile inside kernel critical sections.
      “In-place” record keeping.
      Call graph is approximated.




Joseph Koshy (FreeBSD Developer)     FreeBSD/PmcTools          18 April 2009   27 / 48
PmcTools   Profiling


PmcTools’ Statistical Profiling




      Sets up PMCs to interrupt the CPU on overflow.
      Uses an NMI to drive sampling (on x86):
              Can profile inside kernel critical sections.
              Needs lock-free implementation techniques.
      Separates record keeping from data collection.
      Captures the exact callchain at the point of the sample.




Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools        18 April 2009   28 / 48
PmcTools         Profiling


Profiling with NMIs

                                 HWPMC(4)
                                                                               hwpmc_hook()

                     machine dependent trap handler
                                                                               trap()
                            low level trap handler
                                                                                              log file
           NMI
                                                              #0
                                                              #1
                                                              #2
                                                              #3



                                                                    hardclock()                   hwpmc helper
                                                                                                  thread

 record sample
                                                                   lock−less ring       per−owner
                    CPU 0      CPU 1      CPU 2       CPU 3
                                                                   buffers              log buffer


Joseph Koshy (FreeBSD Developer)                   FreeBSD/PmcTools                            18 April 2009   29 / 48
PmcTools       Profiling


Profiling Workflow


             System/application
             under investigation               hwpmc(4) logfile


                                   pmcstat
                                                  pmcstat
                                                                          gprof(1) data


                                                                       gprof

                                                                                          gprof(1) output




      Uses gprof(1) to do user reports (currently):
              Needs to be redone: gprof(1) limitations.
      Call chains are captured and used to generate call graphs.


Joseph Koshy (FreeBSD Developer)              FreeBSD/PmcTools                                       18 April 2009   30 / 48
PmcTools        Profiling


Profiling of Shared Objects

                 0                                                                             0xFFFFFFFF

                     text        data   bss       heap        mmap            stack          Kernel



                      unmapped                 rtld                                          command line
                                                                                             arguments




                                               data




                                                                       data




                                                                                      data
                                        text




                                                                text
      Each executable object in the system gets its own gmon.out file:
              kernel
              kernel modules
              executables
              shared libraries
              run time loader

Joseph Koshy (FreeBSD Developer)                      FreeBSD/PmcTools                                      18 April 2009   31 / 48
PmcTools      Profiling


Remote Profiling




                                          log data




                        system under                            analysis
                        measurement                             console



      pmc configure log() takes a file descriptor.
      Can log to a disk file, a pipe, or to a network socket.
      Events in log file carry timestamps for disambiguation.



Joseph Koshy (FreeBSD Developer)        FreeBSD/PmcTools                   18 April 2009   32 / 48
PmcTools   Implementation




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools           18 April 2009   33 / 48
PmcTools   Implementation


Implementation Information


      Module                                     Comments
      sys/dev/hwpmc,                             31K LoC, i386&amd64
      sys/sys/pmc*.h
      lib/libpmc                                 3.3K LoC
      usr.sbin/*                                 5.4K LoC
      documentation                              29 manual pages, 11K LoD

      All public APIs have manual pages.
      All hardware events, and their modifiers are documented.
      The internal API between libpmc and hwpmc is also documented.
See also: “Beautiful Code Exists, If You Know Where To Look”, CACM,
July 2008.

Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools             18 April 2009   34 / 48
PmcTools   Implementation


Impact on Base Kernel

Space Requirements 2 bits (P HWPMC, TDP CALLCHAIN). Uses free
           bits in existing flags words.
Kernel Changes Clock handling, kernel linker, MD code, process
           handling, scheduler, VM (options HWPMC HOOKS).
Kernel Callbacks
                                          scheduler                       low−level (assembler)

                           kernel linker                                      trap()

                        VM (mmap)                        hwpmc


                                   exec

                                     clock handling


Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools                      18 April 2009   35 / 48
PmcTools   Implementation


Portability




Application Portability High.
Portability of libpmc Moderate. Requires a POSIX-like system.
Adding support for new PMC hardware Moderate.
 Kernel bits Low.




Joseph Koshy (FreeBSD Developer)    FreeBSD/PmcTools           18 April 2009   36 / 48
Status & Future Work   Status




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools   18 April 2009   37 / 48
Status & Future Work   Status


Current State



      Proof-of-concept application pmcstat is the current “user
      interface”.
              Crufty.
      Low overheads (design goal: 5%) and tunable.
      In production use. Being shipped on customer boxes by appliance
      vendors.
      Support load on the rise (esp. requests for support of new
      hardware).




Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools   18 April 2009   38 / 48
Status & Future Work   Status


Support load

      Volunteer project. Initial hardware bought from pocket, or on loan.
      Current hardware support:
              12 combinations of {PMC hardware × 32/64 bit OS variants} × 9
              OS versions [FreeBSD 6.0 · · · 6.4, 7.0 · · · 7.2, 8.0] = 108
              combinations!
              Need a hardware lab to manage testing and bug reports.
      Need an automated test suite that is run continuously.
              Also useful for detecting OS & application performance regressions
              early.
      Email support load is on the rise:
              Rise in FreeBSD adoption.
              FreeBSD users and developers worldwide now chipping in with
              features, bug fixes, offering tutorials and spreading the word.


Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools   18 April 2009   39 / 48
Status & Future Work   Future Projects




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools            18 April 2009   40 / 48
Status & Future Work   Future Projects


Profiling the Cloud



                                                                              traffic




                      Control/Data




                      Analysis Console




Joseph Koshy (FreeBSD Developer)                  FreeBSD/PmcTools                      18 April 2009   41 / 48
Status & Future Work   Future Projects


Other project ideas



      A graphical visualizer “console”.
      Enhance gprof, or write a report generator afresh.
      Link up with existing profile based optimization frameworks.
      Allow performance analysis of non-native architectures.
      Support non-x86 PMCs.
      Integrate PmcTools and DTrace.
      Port to other BSDs and/or OpenSolaris.




Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools            18 April 2009   42 / 48
Status & Future Work   Research Topics




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools            18 April 2009   43 / 48
Status & Future Work    Research Topics


Profile guided system layout
                                       applications




                             kernel




                                      L1/L2/L3
                                      Cache




      Lay out the whole system to help “hot” portions remain in cache.
      Would require an augmented toolchain
      (http://elftoolchain.sourceforge.net/) & enhancements to
      hwpmc(4).
      Useful for low end devices using direct-mapped caches.
Joseph Koshy (FreeBSD Developer)                      FreeBSD/PmcTools           18 April 2009   44 / 48
Status & Future Work   Research Topics


Detection of SMP data structure layout bugs


struct shared
  ...
  char sh foo;
  int sh bar;
  char sh buzz;
  ...
;


      Would use a combination of static analysis & hwpmc(4) data.
      Detection of the poor cache line layout behaviour.
              Cache line ping-ponging between CPUs.



Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools            18 April 2009   45 / 48
Status & Future Work   Research Topics


Profiling for power use




      What part of the system consumes power?
      Where in the code is power being spent?




Joseph Koshy (FreeBSD Developer)                FreeBSD/PmcTools            18 April 2009   46 / 48
Conclusion




1   Introduction
       Introducing PmcTools

2   PmcTools
      Architectural Overview
      API
      Design Issues
      Profiling
      Implementation

3   Status & Future Work
      Status
      Future Projects
      Research Topics

4   Conclusion


Joseph Koshy (FreeBSD Developer)      FreeBSD/PmcTools   18 April 2009   47 / 48
Conclusion


Talk Summary




      FreeBSD/PmcTools was introduced.
      The design & implementation of PmcTools was looked at.
      Possible future development and research directions for the
      project were touched upon.




Joseph Koshy (FreeBSD Developer)      FreeBSD/PmcTools    18 April 2009   48 / 48

More Related Content

Viewers also liked

FreeBSD Unified Configuration
FreeBSD Unified ConfigurationFreeBSD Unified Configuration
FreeBSD Unified ConfigurationAndrew Pantyukhin
 
ZFS and FreeBSD Jails
ZFS and FreeBSD JailsZFS and FreeBSD Jails
ZFS and FreeBSD Jailsapeiron
 
Python on FreeBSD
Python on FreeBSDPython on FreeBSD
Python on FreeBSDpycontw
 
Chef - Evolving with Infrastructure Automation
Chef - Evolving with Infrastructure AutomationChef - Evolving with Infrastructure Automation
Chef - Evolving with Infrastructure AutomationNathaniel Brown
 
Baking Docker Using Chef
Baking Docker Using ChefBaking Docker Using Chef
Baking Docker Using ChefMukta Aphale
 
FreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsFreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsBrendan Gregg
 
SysAdm: Simplifying FreeBSD Administration
SysAdm: Simplifying FreeBSD AdministrationSysAdm: Simplifying FreeBSD Administration
SysAdm: Simplifying FreeBSD AdministrationKen Moore
 
FreeBSD: The Next 10 Years (MeetBSD 2014)
FreeBSD: The Next 10 Years (MeetBSD 2014)FreeBSD: The Next 10 Years (MeetBSD 2014)
FreeBSD: The Next 10 Years (MeetBSD 2014)iXsystems
 
FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...
FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...
FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...iXsystems
 
600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)
600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)
600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)iXsystems
 

Viewers also liked (11)

FreeBSD Unified Configuration
FreeBSD Unified ConfigurationFreeBSD Unified Configuration
FreeBSD Unified Configuration
 
ZFS and FreeBSD Jails
ZFS and FreeBSD JailsZFS and FreeBSD Jails
ZFS and FreeBSD Jails
 
Python on FreeBSD
Python on FreeBSDPython on FreeBSD
Python on FreeBSD
 
Chef - Evolving with Infrastructure Automation
Chef - Evolving with Infrastructure AutomationChef - Evolving with Infrastructure Automation
Chef - Evolving with Infrastructure Automation
 
Baking Docker Using Chef
Baking Docker Using ChefBaking Docker Using Chef
Baking Docker Using Chef
 
FreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsFreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame Graphs
 
SysAdm: Simplifying FreeBSD Administration
SysAdm: Simplifying FreeBSD AdministrationSysAdm: Simplifying FreeBSD Administration
SysAdm: Simplifying FreeBSD Administration
 
FreeBSD: Dev to Prod
FreeBSD: Dev to ProdFreeBSD: Dev to Prod
FreeBSD: Dev to Prod
 
FreeBSD: The Next 10 Years (MeetBSD 2014)
FreeBSD: The Next 10 Years (MeetBSD 2014)FreeBSD: The Next 10 Years (MeetBSD 2014)
FreeBSD: The Next 10 Years (MeetBSD 2014)
 
FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...
FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...
FreeNAS 10: Challenges of Building a Modern Storage Appliance based on FreeBS...
 
600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)
600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)
600M+ Unsuspecting FreeBSD Users (MeetBSD California 2014)
 

Similar to PmcTools: Whole-System, Low-Overhead Performance Measurement in FreeBSD

Overview of FreeBSD PMC Tools
Overview of FreeBSD PMC ToolsOverview of FreeBSD PMC Tools
Overview of FreeBSD PMC ToolsACMBangalore
 
Creating an Embedded System Lab
Creating an Embedded System LabCreating an Embedded System Lab
Creating an Embedded System LabNonamepro
 
Persistent Memory Development Kit (PMDK): State of the Project
Persistent Memory Development Kit (PMDK): State of the ProjectPersistent Memory Development Kit (PMDK): State of the Project
Persistent Memory Development Kit (PMDK): State of the ProjectIntel® Software
 
Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Jace Liang
 
The Lives of Others: Open-Source Development Practices Elsewhere
The Lives of Others: Open-Source Development Practices ElsewhereThe Lives of Others: Open-Source Development Practices Elsewhere
The Lives of Others: Open-Source Development Practices ElsewherePeter Eisentraut
 
ERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsChristian Charreyre
 
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Hajime Tazaki
 
Instalando Cacti no CentOS 5
Instalando Cacti no CentOS 5Instalando Cacti no CentOS 5
Instalando Cacti no CentOS 5Carlos Eduardo
 
pythonOCC PDE2009 presentation
pythonOCC PDE2009 presentationpythonOCC PDE2009 presentation
pythonOCC PDE2009 presentationThomas Paviot
 
Using open source software to build an industrial grade embedded linux platfo...
Using open source software to build an industrial grade embedded linux platfo...Using open source software to build an industrial grade embedded linux platfo...
Using open source software to build an industrial grade embedded linux platfo...SZ Lin
 
Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPFRogerColl2
 
PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsHenry Schreiner
 
Final presentasi gnome asia
Final presentasi gnome asiaFinal presentasi gnome asia
Final presentasi gnome asiaAnton Siswo
 
The Yocto Project
The Yocto ProjectThe Yocto Project
The Yocto Projectrossburton
 

Similar to PmcTools: Whole-System, Low-Overhead Performance Measurement in FreeBSD (20)

Overview of FreeBSD PMC Tools
Overview of FreeBSD PMC ToolsOverview of FreeBSD PMC Tools
Overview of FreeBSD PMC Tools
 
Creating an Embedded System Lab
Creating an Embedded System LabCreating an Embedded System Lab
Creating an Embedded System Lab
 
MULTICS
MULTICSMULTICS
MULTICS
 
Advancement on embedded linux-v2
Advancement on embedded linux-v2Advancement on embedded linux-v2
Advancement on embedded linux-v2
 
Persistent Memory Development Kit (PMDK): State of the Project
Persistent Memory Development Kit (PMDK): State of the ProjectPersistent Memory Development Kit (PMDK): State of the Project
Persistent Memory Development Kit (PMDK): State of the Project
 
Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology
 
Rhce ppt
Rhce pptRhce ppt
Rhce ppt
 
The Lives of Others: Open-Source Development Practices Elsewhere
The Lives of Others: Open-Source Development Practices ElsewhereThe Lives of Others: Open-Source Development Practices Elsewhere
The Lives of Others: Open-Source Development Practices Elsewhere
 
ERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projects
 
Japan's post K Computer
Japan's post K ComputerJapan's post K Computer
Japan's post K Computer
 
HPC Workbench Presentation
HPC Workbench PresentationHPC Workbench Presentation
HPC Workbench Presentation
 
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
 
Instalando Cacti no CentOS 5
Instalando Cacti no CentOS 5Instalando Cacti no CentOS 5
Instalando Cacti no CentOS 5
 
pythonOCC PDE2009 presentation
pythonOCC PDE2009 presentationpythonOCC PDE2009 presentation
pythonOCC PDE2009 presentation
 
Using open source software to build an industrial grade embedded linux platfo...
Using open source software to build an industrial grade embedded linux platfo...Using open source software to build an industrial grade embedded linux platfo...
Using open source software to build an industrial grade embedded linux platfo...
 
Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPF
 
PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python Extensions
 
Final presentasi gnome asia
Final presentasi gnome asiaFinal presentasi gnome asia
Final presentasi gnome asia
 
Multicore
MulticoreMulticore
Multicore
 
The Yocto Project
The Yocto ProjectThe Yocto Project
The Yocto Project
 

Recently uploaded

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

PmcTools: Whole-System, Low-Overhead Performance Measurement in FreeBSD

  • 1. PmcTools: Whole-system, low-overhead performance measurement in FreeBSD Joseph Koshy FreeBSD Developer 18 April 2009 Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 1 / 48
  • 2. Outline 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 2 / 48
  • 3. Introduction Goals Of This Talk Introduce FreeBSD/PmcTools. Introduce BSD culture. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 3 / 48
  • 4. Introduction About FreeBSD http://www.freebsd.org/ Popular among appliance makers, ISPs, web hosting providers: Fast, stable, high-quality code, liberal license. FreeBSD culture in one sentence: “Shut up and code”. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 4 / 48
  • 5. Introduction About jkoshy@FreeBSD.org FreeBSD developer since 1998. Technical interests: Performance analysis; the design of high performance software. Low power computing. Higher order, typed languages. Writing clean, well-designed code. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 5 / 48
  • 6. Introduction The Three Big Questions in Performance Analysis 1 What is the system doing? 2 Where in the code does the behaviour arise? 3 What is to be done about it? Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 6 / 48
  • 7. Introduction Question 1: What is the system doing? System behaviour: Traditional UNIX tools: vmstat, iostat, top, systat, ktrace, truss. Counters under the sysctl hierarchy. Compile time options such as LOCK PROFILING. New tools like dtrace. Machine behaviour: Modern CPUs have in-CPU hardware counters measuring hardware behaviour: bus utilization, cache operations, instructions decoded and executed, branch behaviour, floating point and vector operations, speculative execution, . . . Near zero overheads, good precision. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 7 / 48
  • 8. Introduction Question 2: Which portion of the system is responsible? Which subsystems are involved? Where specifically in the code is the problem arising? System performance is a “global” property. “Local” inspection of code not always sufficient. As a community we are still exploring the domain of performance analysis tools: Which data to collect. Collecting it with low-overheads. Making sense of the information collected. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 8 / 48
  • 9. Introduction Introducing PmcTools 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 9 / 48
  • 10. Introduction Introducing PmcTools PmcTools Project Goals Low−overheads Whole−system Measurements FreeBSD Tier−1 Architectures PmcTools Use in production Platform for Have Fun! Interesting tools Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 10 / 48
  • 11. Introduction Introducing PmcTools Performance Analysis: Conventional vs. PmcTools Description Conventional PmcTools Need special binaries Yes No Dynamically loaded objects No Yes Profiling scope Executable Process & System Need restart Yes No Measurement overheads High Low Profiling tick Time Many options Profile inside critical sections No Yes (x86) Cross-architecture analysis No Yes Distributed profiling No Yes Production use No Yes Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 11 / 48
  • 12. Introduction Introducing PmcTools Related open-source projects Linux Many projects related to PMCs: Oprofile: http://oprofile.sourceforge.net/ Perfmon: http://perfmon2.sourceforge.net/ Perfctr: http://sourceforge.net/projects/perfctr/ Rabbit: http://www.scl.ameslab.gov/Projects/Rabbit/ Solaris CPC(3) library. NetBSD A pmc(3) API. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 12 / 48
  • 13. PmcTools Architectural Overview 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 13 / 48
  • 14. PmcTools Architectural Overview Overview: Architecture process process process log libpmc kernel hwpmc CPU CPU CPU CPU CPU CPU A platform to build tools that use PMC data. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 14 / 48
  • 15. PmcTools Architectural Overview Components hwpmc kernel bits kernel changes (see later) libpmc application API pmccontrol management tool pmcstat proof-of-concept application pmcannotate contributed tool etc... others in the future Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 15 / 48
  • 16. PmcTools Architectural Overview PMC Scopes #0 #1 #2 #3 #4 CPU 0 CPU 1 CPU 2 CPU 3 Process−scope PMC System−scope PMC 1 process-scope PMC & 2 system-scope PMCs simultaneously active. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 16 / 48
  • 17. PmcTools Architectural Overview Counting vs. Sampling Process−scope, System−scope, Counting Counting Process−scope, System−scope, Sampling Sampling Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 17 / 48
  • 18. PmcTools Architectural Overview Using system-scope PMCs process process kernel Three system scope PMCs, on three CPUs. Measure behaviour of the system as a whole. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 18 / 48
  • 19. PmcTools Architectural Overview Using process-scope PMCs owner target 2: attach 1: allocate kernel A process-scope PMC is allocated & attached to a target process. Entire “row” of PMCs reserved across CPUs. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 19 / 48
  • 20. PmcTools API 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 20 / 48
  • 21. PmcTools API API Overview Categories: Administration (2). Convenience Functions (8). Initialization (1). Log file handling (3). PMC Management (10). Queries (7). Arch-specific functions (1). 32 functions documented in 15 manual pages. 10 manual pages for supported hardware events. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 21 / 48
  • 22. PmcTools API Example: API Usage attach allocate start release stop read pmc allocate() Allocate a PMC; returns a handle. pmc attach() Attach a PMC to a target process. pmc start() Start a PMC. pmc read() Read PMC values. pmc stop() Stop a PMC. pmc release() Release resources. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 22 / 48
  • 23. PmcTools Design Issues 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 23 / 48
  • 24. PmcTools Design Issues PMCs Vary A Lot AMD Athlon XP 4 PMCs, 48 bits wide. AMD Athlon64 4 PMCs, Different set of hardware events. Intel Pentium MMX 2 PMCs. 40 bits wide. Counting only. Intel Pentium Pro 2 PMCs, 40 bits for reads, 32 bits for writes. Intel Pentium IV 18 PMCs shared across logical CPUs. En- tirely different programming model. Intel Core Number of PMCs and widths vary. Has programmable & fixed-function PMCs. Intel Core/i7 As above, but also has per-package PMCs. PMCs are closely tied to CPU micro-architecture. PMC capabilities, supported events, access methods, programming constraints can vary across CPU generations. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 24 / 48
  • 25. PmcTools Design Issues API Design Issues Issues: Designing an extensible programming interface for application use. Allowing knowledgeable applications to make full use of hardware. PmcTools philosophy: Make simple things easy to do. Make complex things possible. Current “UI” uses name=value pairs: % pmcstat -p k8-bu-fill-request-l2-miss, mask=dc-fill+ic-fill,usr Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 25 / 48
  • 26. PmcTools Profiling 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 26 / 48
  • 27. PmcTools Profiling Conventional Statistical Profiling Needs specially compiled binaries (cc -pg). Sampling runs off the clock tick. Cannot profile inside kernel critical sections. “In-place” record keeping. Call graph is approximated. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 27 / 48
  • 28. PmcTools Profiling PmcTools’ Statistical Profiling Sets up PMCs to interrupt the CPU on overflow. Uses an NMI to drive sampling (on x86): Can profile inside kernel critical sections. Needs lock-free implementation techniques. Separates record keeping from data collection. Captures the exact callchain at the point of the sample. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 28 / 48
  • 29. PmcTools Profiling Profiling with NMIs HWPMC(4) hwpmc_hook() machine dependent trap handler trap() low level trap handler log file NMI #0 #1 #2 #3 hardclock() hwpmc helper thread record sample lock−less ring per−owner CPU 0 CPU 1 CPU 2 CPU 3 buffers log buffer Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 29 / 48
  • 30. PmcTools Profiling Profiling Workflow System/application under investigation hwpmc(4) logfile pmcstat pmcstat gprof(1) data gprof gprof(1) output Uses gprof(1) to do user reports (currently): Needs to be redone: gprof(1) limitations. Call chains are captured and used to generate call graphs. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 30 / 48
  • 31. PmcTools Profiling Profiling of Shared Objects 0 0xFFFFFFFF text data bss heap mmap stack Kernel unmapped rtld command line arguments data data data text text Each executable object in the system gets its own gmon.out file: kernel kernel modules executables shared libraries run time loader Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 31 / 48
  • 32. PmcTools Profiling Remote Profiling log data system under analysis measurement console pmc configure log() takes a file descriptor. Can log to a disk file, a pipe, or to a network socket. Events in log file carry timestamps for disambiguation. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 32 / 48
  • 33. PmcTools Implementation 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 33 / 48
  • 34. PmcTools Implementation Implementation Information Module Comments sys/dev/hwpmc, 31K LoC, i386&amd64 sys/sys/pmc*.h lib/libpmc 3.3K LoC usr.sbin/* 5.4K LoC documentation 29 manual pages, 11K LoD All public APIs have manual pages. All hardware events, and their modifiers are documented. The internal API between libpmc and hwpmc is also documented. See also: “Beautiful Code Exists, If You Know Where To Look”, CACM, July 2008. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 34 / 48
  • 35. PmcTools Implementation Impact on Base Kernel Space Requirements 2 bits (P HWPMC, TDP CALLCHAIN). Uses free bits in existing flags words. Kernel Changes Clock handling, kernel linker, MD code, process handling, scheduler, VM (options HWPMC HOOKS). Kernel Callbacks scheduler low−level (assembler) kernel linker trap() VM (mmap) hwpmc exec clock handling Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 35 / 48
  • 36. PmcTools Implementation Portability Application Portability High. Portability of libpmc Moderate. Requires a POSIX-like system. Adding support for new PMC hardware Moderate. Kernel bits Low. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 36 / 48
  • 37. Status & Future Work Status 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 37 / 48
  • 38. Status & Future Work Status Current State Proof-of-concept application pmcstat is the current “user interface”. Crufty. Low overheads (design goal: 5%) and tunable. In production use. Being shipped on customer boxes by appliance vendors. Support load on the rise (esp. requests for support of new hardware). Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 38 / 48
  • 39. Status & Future Work Status Support load Volunteer project. Initial hardware bought from pocket, or on loan. Current hardware support: 12 combinations of {PMC hardware × 32/64 bit OS variants} × 9 OS versions [FreeBSD 6.0 · · · 6.4, 7.0 · · · 7.2, 8.0] = 108 combinations! Need a hardware lab to manage testing and bug reports. Need an automated test suite that is run continuously. Also useful for detecting OS & application performance regressions early. Email support load is on the rise: Rise in FreeBSD adoption. FreeBSD users and developers worldwide now chipping in with features, bug fixes, offering tutorials and spreading the word. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 39 / 48
  • 40. Status & Future Work Future Projects 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 40 / 48
  • 41. Status & Future Work Future Projects Profiling the Cloud traffic Control/Data Analysis Console Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 41 / 48
  • 42. Status & Future Work Future Projects Other project ideas A graphical visualizer “console”. Enhance gprof, or write a report generator afresh. Link up with existing profile based optimization frameworks. Allow performance analysis of non-native architectures. Support non-x86 PMCs. Integrate PmcTools and DTrace. Port to other BSDs and/or OpenSolaris. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 42 / 48
  • 43. Status & Future Work Research Topics 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 43 / 48
  • 44. Status & Future Work Research Topics Profile guided system layout applications kernel L1/L2/L3 Cache Lay out the whole system to help “hot” portions remain in cache. Would require an augmented toolchain (http://elftoolchain.sourceforge.net/) & enhancements to hwpmc(4). Useful for low end devices using direct-mapped caches. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 44 / 48
  • 45. Status & Future Work Research Topics Detection of SMP data structure layout bugs struct shared ... char sh foo; int sh bar; char sh buzz; ... ; Would use a combination of static analysis & hwpmc(4) data. Detection of the poor cache line layout behaviour. Cache line ping-ponging between CPUs. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 45 / 48
  • 46. Status & Future Work Research Topics Profiling for power use What part of the system consumes power? Where in the code is power being spent? Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 46 / 48
  • 47. Conclusion 1 Introduction Introducing PmcTools 2 PmcTools Architectural Overview API Design Issues Profiling Implementation 3 Status & Future Work Status Future Projects Research Topics 4 Conclusion Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 47 / 48
  • 48. Conclusion Talk Summary FreeBSD/PmcTools was introduced. The design & implementation of PmcTools was looked at. Possible future development and research directions for the project were touched upon. Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 48 / 48