Parsing and Type Checking
all 2^10000 Configurations
of the Linux Kernel

10,000 features, 6 million lines of C code

Christian Kästner
Feature-Oriented Product Lines
Database Engine
Printer Firmware
Linux Kernel
Software Product Lines in Industry
Boeing
Bosch Group
Cummins, Inc.
Ericsson
General Dynamics
General Motors
Hewlett Packard
Lockheed Martin
Lucent
NASA
Nokia
Philips
Siemens
…
Variability ~ Complexity
33 features (optional, independent):
a unique configuration for every person on this planet

320 features (optional, independent):
more configurations than estimated atoms in the universe
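
Both comparisons follow from the fact that n optional, independent features give 2^n configurations. The following C sketch makes the arithmetic explicit. The function names are invented for this illustration. The comparison values in the note after it (roughly 8 billion people, roughly 10^80 atoms) are commonly quoted estimates and do not come from the slides.

```c
#include <stdint.h>

/* Exact number of configurations for n optional, independent features.
 * Only valid for n < 64, because the result must fit into 64 bits. */
uint64_t configurations(unsigned n) {
    return (uint64_t)1 << n;
}

/* Number of decimal digits of 2^n, computed as floor(n * log10(2)) + 1,
 * so that exponents such as 320 or 10000 do not overflow. */
unsigned decimal_digits_of_pow2(unsigned n) {
    const double LOG10_OF_2 = 0.30102999566398120;
    return (unsigned)(n * LOG10_OF_2) + 1;
}
```

With these helpers, configurations(33) is 8589934592, which exceeds a world population of roughly 8 billion. 2^320 has 97 decimal digits, well beyond the roughly 10^80 atoms usually quoted, and 2^10000 has 3011 digits.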
Correctness?
Product-Line Implementation
Excerpt from Oracle’s Berkeley DB:

static int _rep_queue_filedone(
   DB_ENV *dbenv,
   REP *rep,
   __rep_fileinfo_args *rfp) {
#ifdef NO_QUEUE
   COMPQUIET(rep, NULL);
   COMPQUIET(rfp, NULL);
   return (__db_no_queue_am(dbenv));
#else
   db_pgno_t first, last;
   u_int32_t flags;
   int empty, ret, t_ret;
#ifdef DIAGNOSTIC
   DB_MSGBUF mb;
#endif
   // over 100 lines of additional code
#endif
}

Conditional Compilation
#ifdef X
void foo();
#endif

void bar() {
  foo();
}

Conditional Compilation
Objections / Criticism

Designed in the 1970s and hardly evolved since

“#ifdef considered harmful”
“#ifdef hell”
“maintenance becomes a ‘hit or miss’ process”
“is difficult to determine if the code being viewed is actually compiled into the system”
“incomprehensible source texts”
“programming errors are easy to make and difficult to detect”
“CPP makes maintenance difficult”
“source code rapidly becomes a maze”
“preprocessor diagnostics are poor”
Virtual Separation of Concerns
[ICSE’08, ASE’08, Tools’09, GPCE’09, EASE’11, ..]

Views
Visual representation
Disciplined mapping
Consistency checking
Refactorings
…

open source, fosd.net/cide
10,000 features, 6 million lines of C code
[ICSE’10, AOSD’11]

40 Open-Source C Projects
[Scatter plot: number of features over lines of code]
apache, berkeley db, cherokee, clamav, dia, emacs, freebsd, gcc, ghostscript, gimp, glibc, gnumeric, gnuplot, irssi, libxml, lighttpd, linux, lynx, minix, mplayer, mpsolve, openldap, opensolaris, openvpn, parrot, php, pidgin, postgresql, privoxy, python, sendmail, sqlite, subversion, sylpheed, tcl, vim, xfig, xine-lib, xorg-server, xterm
Correctness?
Printer Firmware

Checking Products (Printer Firmware)
                 2000 Features
                 100 Printers
                 30 New Printers per Year

Checking Products (Linux Kernel)
                 10000 Features
                 2^10000 Configurations

Checking Product Line (Linux Kernel)
                 Implementation with 10000 Features
                 + Generator
Variability-Aware Analysis
                 Parser
                 Type System
                 Static Analysis
                 Bug Finding
                 Testing
                 Model Checking
                 Theorem Proving
                 …
Product Generation
[Diagram labels: Variability-Aware Analysis, Conventional Analysis]
We aim for a sound and complete approach
Variability-Aware Analysis
                 Parser
                 Type System
                 Static Analysis
                 Bug Finding
                 Testing
                 Model Checking
                 Theorem Proving
                 …
References



             Conflicts
Presence Conditions
[Code figure annotated with the presence conditions WORLD, BYE, true, true]
Reachability: pc(caller) -> pc(target)
Conflicts: ¬(pc(def1) ∧ pc(def2))

¬(WORLD ∧ BYE)
true -> (WORLD ∨ BYE)
true -> true
Reachability: pc(caller) -> pc(target)
Conflicts: ¬(pc(def1) ∧ pc(def2))

¬(WORLD ∧ BYE)
true -> (WORLD ∨ BYE)
true -> true

Found 2 type errors:
 - [WORLD & BYE] file hello.c:8:8
      redefinition of msg
 - [!WORLD & !BYE] file hello.c:11:8
      msg undeclared
Variability Model:
[Feature diagram: root P with the features WORLD and BYE]

VM -> ¬(WORLD ∧ BYE)
VM -> (true -> (WORLD ∨ BYE))
VM -> (true -> true)
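
The conflict and reachability constraints can be tried out by enumerating configurations. The C sketch below is a hypothetical illustration, not the tool's implementation. It assumes that msg is defined once under WORLD and once under BYE and is used unconditionally, which matches the two reported type errors. The model vm_xor (exactly one of the two features) is an assumption, since the actual feature diagram is not recoverable from the text. All identifiers are invented for the example.

```c
#include <stdbool.h>

/* A configuration assigns a truth value to each feature. */
typedef struct { bool world; bool bye; } Config;

/* Presence conditions and variability models are predicates over configurations. */
typedef bool (*Formula)(Config);

/* Assumed presence conditions for the msg example. */
static bool pc_def_world(Config c) { return c.world; }           /* first definition of msg  */
static bool pc_def_bye(Config c)   { return c.bye; }             /* second definition of msg */
static bool pc_use(Config c)       { (void)c; return true; }     /* use of msg in main       */

/* Two candidate variability models (both hypothetical). */
bool vm_none(Config c) { (void)c; return true; }                 /* no constraints at all    */
bool vm_xor(Config c)  { return c.world != c.bye; }              /* exactly one feature      */

/* Enumerates all 2^2 configurations allowed by vm and counts violations of
 *   conflicts:    not (pc(def1) and pc(def2))
 *   reachability: pc(use) -> (pc(def1) or pc(def2))
 * Returns the total number of violating cases. */
int count_violations(Formula vm, int *conflicts, int *dangling) {
    *conflicts = 0;
    *dangling = 0;
    for (int bits = 0; bits < 4; bits++) {
        Config c = { (bits & 1) != 0, (bits & 2) != 0 };
        if (!vm(c)) continue;                                    /* configuration excluded   */
        if (pc_def_world(c) && pc_def_bye(c)) (*conflicts)++;
        if (pc_use(c) && !(pc_def_world(c) || pc_def_bye(c))) (*dangling)++;
    }
    return *conflicts + *dangling;
}
```

Without a model (vm_none) the check finds one conflict and one dangling reference, corresponding to the configurations [WORLD & BYE] and [!WORLD & !BYE]. Under the assumed vm_xor both disappear. Enumeration only works for a handful of features; for realistic feature counts the same formulas would be handed to a SAT solver.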
AST with Variability Information

                      greet.c

         …
printf       VWORLD        VBYE         main

         msg     ε      msg       ε     printf

                                         msg

Extended Lookup Mechanism
msg: WORLD, BYE
¬(WORLD ∧ BYE)
true -> (WORLD ∨ BYE)
true -> true
[ASE’08, TOSEM‘11]

Formalization: CFJ




Theorem (Product Generation Preserves Typing): All
products that are generated for valid feature selections
from a well-typed product line are well typed.
Product Generation
[Diagram labels: Variability-Aware Analysis, Conventional Analysis]
Surface Complexity
Inherent Complexity
SAT Problem
[TOSEM’11]

Product Line         LOC      Features   Products      Time per Product (sec)   Time for entire Product Line (sec)
MobileMedia          5700     14         2784          0.3                      2
Mobile RSS Reader    20 000   14         2048          1                        8
Lampiro              45 000   11         2048          2                        19
Berkeley DB          70 000   42         3.6 billion   3                        21
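
To put the last two columns in perspective, the cost of checking every product separately can be estimated as the number of products times the time per product. The C sketch below does that arithmetic with the figures from the table. The function names are invented for this illustration, and the estimate ignores the cost of generating each product.

```c
/* Seconds needed to check every product one by one. */
double brute_force_seconds(double products, double seconds_per_product) {
    return products * seconds_per_product;
}

/* Converts seconds to years, assuming 365-day years. */
double seconds_to_years(double seconds) {
    return seconds / (365.0 * 24.0 * 3600.0);
}
```

For Berkeley DB, brute_force_seconds(3.6e9, 3.0) gives 1.08e10 seconds, which is about 342 years, compared with 21 seconds for checking the entire product line at once. For MobileMedia the same estimate is about 835 seconds (roughly 14 minutes) against 2 seconds.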
[ICSE’10, AOSD’11]

40 Open-Source C Projects
[Scatter plot: variable code in C files (in %) over lines of code]
apache, berkeley db, cherokee, clamav, dia, emacs, freebsd, gcc, ghostscript, gimp, glibc, gnumeric, gnuplot, irssi, libxml, lighttpd, linux, lynx, minix, mplayer, mpsolve, openldap, opensolaris, openvpn, parrot, php, pidgin, postgresql, privoxy, python, sendmail, sqlite, subversion, sylpheed, tcl, vim, xfig, xine-lib, xorg-server, xterm
Parse, Type Check, Linker checks
[Figure: three .c files, each shown as the greet.c AST with variability information (printf, VWORLD, VBYE, main, msg, ε); the stages are labeled Parse, Type Check, and Linker checks]
Challenges
Real-world C code
C preprocessor
Huge size
Module system / linker checks
Variability-Aware Analysis
                 Parser
                 Type System
                 Static Analysis
                 Bug Finding
                 Testing
                 Model Checking
                 Theorem Proving
                 …
[OOPSLA‘11]




                      greet.c

         …
printf       VWORLD        VBYE         main

         msg     ε      msg       ε     printf

                                         msg

 AST with Variability Information
Parsing C without Preprocessing
Macro expansion needed for parsing
Undisciplined annotations
Alternative macros

[Figure: greet.c AST with question marks; nodes: + printf, VWORLD, VBYE, + main, + msg, ε, + msg, ε, printf, msg]
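
The three obstacles named on this slide can each be shown in a few lines of C. The following sketch is a made-up example, not code from any of the analyzed projects. The macros BIG, STRICT, LIMIT, BEGIN_CHECK and END_CHECK and the function classify are all hypothetical.

```c
/* Alternative macros: LIMIT has two definitions, selected by the feature BIG. */
#ifdef BIG
#define LIMIT 5
#else
#define LIMIT 4
#endif

/* Macro expansion needed for parsing: the braces of the if statement only
 * become visible after BEGIN_CHECK and END_CHECK have been expanded. */
#define BEGIN_CHECK(x) if ((x) > LIMIT) {
#define END_CHECK }

int classify(int x) {
    int r = 0;
    BEGIN_CHECK(x)
        r = 1;
    END_CHECK
    /* Undisciplined annotation: the #ifdef wraps only part of an expression,
     * not a whole statement or declaration. */
    if (x < 0
#ifdef STRICT
        || x == 0
#endif
       )
        r = -1;
    return r;
}
```

Compiled without BIG and STRICT, classify(5) returns 1, classify(4) returns 0 and classify(-1) returns -1. A parser that wants to cover all four combinations of BIG and STRICT without preprocessing each one has to deal with all three patterns at once.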
Previous Solutions

Disciplined Subset
   Requires Code Preparation

Heuristics and Partial Analysis
  Inaccurate, False Positives

Brute Force
  Infeasible Effort
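TypeChef

Variability-Aware Lexer  ->  Variability-Aware Parser  ->  Variability-Aware Analysis

[Figure: the lexer produces the conditional token stream ( 2 * 3 ) + 4[A] 5[¬A]; the parser produces an AST with the nodes +, *, 2, 3 and a choice node VA over 4 and 5]

https://github.com/ckaestne/TypeChef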
[OOPSLA’11]

Conditional token stream (presence conditions in brackets, all other tokens: true):
  ( 3 + 4[A] ([¬A] 4[¬A∧B] +[¬A∧B] 6[¬A] )[¬A] )

Resulting AST:
  +
    3
    VA
      4
      VB
        +
          4
          6
        6

Library of Variability-Aware Parser Combinators in Scala
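
What a conditional token stream denotes can be illustrated by projecting it onto individual configurations. The C sketch below is only an illustration of the data structure. The token stream is a reading of the slide's figure, and all identifiers are invented. A variability-aware parser does the opposite of what this code does: it parses the stream once instead of once per configuration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A configuration of the two features A and B. */
typedef struct { bool a; bool b; } Config;

/* A presence condition is a predicate over configurations. */
typedef bool (*Pc)(Config);

static bool pc_true(Config c)        { (void)c; return true; }
static bool pc_a(Config c)           { return c.a; }
static bool pc_not_a(Config c)       { return !c.a; }
static bool pc_not_a_and_b(Config c) { return !c.a && c.b; }

/* A token together with the condition under which it is present. */
typedef struct { const char *text; Pc pc; } CondToken;

/* ( 3 + 4[A] ([!A] 4[!A & B] +[!A & B] 6[!A] )[!A] ) */
static const CondToken STREAM[] = {
    {"(", pc_true}, {"3", pc_true}, {"+", pc_true},
    {"4", pc_a},
    {"(", pc_not_a}, {"4", pc_not_a_and_b}, {"+", pc_not_a_and_b},
    {"6", pc_not_a}, {")", pc_not_a},
    {")", pc_true},
};

/* Writes the tokens present in configuration c into out, separated by
 * single spaces. cap is the size of out and must be at least 1. */
char *project(Config c, char *out, size_t cap) {
    size_t n = sizeof STREAM / sizeof STREAM[0];
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (!STREAM[i].pc(c)) continue;   /* token absent in this configuration */
        if (out[0] != '\0') strncat(out, " ", cap - strlen(out) - 1);
        strncat(out, STREAM[i].text, cap - strlen(out) - 1);
    }
    return out;
}
```

Projecting onto A gives "( 3 + 4 )", onto ¬A∧B gives "( 3 + ( 4 + 6 ) )", and onto ¬A∧¬B gives "( 3 + ( 6 ) )". These are the three alternatives that the choice nodes VA and VB in the AST represent.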
Parsing Linux 2.6.33.3 (X86)
7665 C files (x86)
 353 included header files per C file
8590 distinct macros per C file
  72 % conditional
  30 seconds per file (median)
   0 syntax errors

Type Checking Linux 2.6.33.3 (X86)
  20 seconds per file
511 files
260,000 lines of C code
    811 features
     51 minutes parsing
      6 minutes type checking
      4 seconds linker checking

Type Checking BusyBox
//… skipped 260 lines
struct globals {
     double cur_time;
     //… skipped 11 lines
#if ENABLE_FEATURE_NTPD_SERVER
     int   listen_fd;
#endif
     unsigned verbose;
     //… skipped 73 lines
};

//… skipped 1761 lines

int ntpd_main(int argc UNUSED_PARAM, char **argv)
{
#undef G
      struct globals G;
      //… skipped 81 lines
      if (i > (ENABLE_FEATURE_NTPD_SERVER && G.listen_fd != -1)) {
         …
      }
      …
}

ntpd.c: 2128
[CONFIG_NTPD && !CONFIG_FEATURE_NTPD_SERVER]
         field listen_fd unknown in struct globals
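
The reported error can be reproduced in isolation. The C sketch below is a reduced, hypothetical stand-in for the ntpd code. Only struct globals, listen_fd and ENABLE_FEATURE_NTPD_SERVER come from the excerpt. The default value of the macro and both functions are assumptions made for the example. The point is that the right-hand side of && is still type checked even when the left-hand side is the constant 0.

```c
/* Stand-in for the configuration macro; compile with
 * -DENABLE_FEATURE_NTPD_SERVER=0 to reproduce the error. */
#ifndef ENABLE_FEATURE_NTPD_SERVER
#define ENABLE_FEATURE_NTPD_SERVER 1
#endif

struct globals {
    double cur_time;
#if ENABLE_FEATURE_NTPD_SERVER
    int listen_fd;          /* only present when the server feature is enabled */
#endif
    unsigned verbose;
};

/* Pattern from the excerpt: with the macro set to 0 this does not compile,
 * because g->listen_fd refers to a field that no longer exists. */
int has_listener_buggy(const struct globals *g) {
    return ENABLE_FEATURE_NTPD_SERVER && g->listen_fd != -1;
}

/* One possible repair (an assumption, not necessarily the upstream fix):
 * guard the field access with the same condition as the field itself. */
int has_listener(const struct globals *g) {
#if ENABLE_FEATURE_NTPD_SERVER
    return g->listen_fd != -1;
#else
    (void)g;
    return 0;
#endif
}
```

With the default setting both functions compile and agree. With the macro set to 0, has_listener_buggy is rejected by the compiler, which is the error found in the configuration [CONFIG_NTPD && !CONFIG_FEATURE_NTPD_SERVER].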
FUTURE DIRECTIONS
Correctness?
Product Generation
[Diagram labels: Variability-Aware Analysis, Conventional Analysis]
Variability-Aware Analysis
                 Parser
                 Type System
                 Static Analysis
                 Bug Finding
                 Testing
                 Model Checking
                 Theorem Proving
                 …
Product-Line Evolution
615 trillion config.   553 quintillion config.
[GPCE’09; Grant Prop.]

Reengineering Variability

Legacy System:
  #ifdef
  parameters
  branches in VCS
  clones
  domain knowledge

refactoring

Disciplined Variability Implementation:
  plug-ins
  feature modules
  aspects
  disciplined annotations
  runtime variability
Compositional Approaches                   [ICSE’09, J.ASE’10, SCP’10, TSE’12]

Base / Platform
class Stack {
  void push(Object o) {
    elementData[size++] = o;
  }
  ...
}

Feature: Queue
refines class Stack {
  void push(Object o) {
    Lock l = lock(o);
    Super.push(o);
    l.unlock();
  }
  ...
}

Feature: Diagnostic
aspect Diagnostics {
  ...
}

Composition
class Stack {
  void push(Object o) {
    Lock l = lock(o);
    elementData[size++] = o;
    l.unlock();
  }
  ...
}

Module
Components
Frameworks, Plug-ins
Feature-Modules / Mixin Layers / …
Aspects / Subjects, Hyper/J, Deltas
Predicting Nonfunctional Properties [SPLC’11, SQJ’11, ICSE’12]

Empirical methods & human factors [AOSD’11, ESEM’11, EASE’11]

Domain-Specific Languages: SugarJ [GPCE’11, OOPSLA’11]

Runtime Updates for Java [APSEC’08, J. SP&E’11]
Parsing and Type Checking
all 2^10000 Configurations
of the Linux Kernel

[Figure: conditional token stream and the greet.c AST with variability information]

https://github.com/ckaestne/TypeChef
TypeChef (Analyzing Real-World C Code)
Virtual Separation of Concerns (Tool Support for Annotation-Based Variability Implementation)
Aspect-Oriented Decomposition of Berkeley DB
AOP
Compiler Extensions

M.Sc. Thesis        Ph.D. Thesis        Post Doc
1982 …  2006  2007  2008  2009  2010  2011  2012
Austin, Texas       Magdeburg, Germany  Marburg, Germany

Weitere ähnliche Inhalte

Was ist angesagt? (7)

ADB(Android Debug Bridge): How it works?
ADB(Android Debug Bridge): How it works?ADB(Android Debug Bridge): How it works?
ADB(Android Debug Bridge): How it works?
 
Checkpoint/Restore: are we there yet?
Checkpoint/Restore: are we there yet?Checkpoint/Restore: are we there yet?
Checkpoint/Restore: are we there yet?
 
Supercharging Cassandra - GOTO Amsterdam
Supercharging Cassandra - GOTO AmsterdamSupercharging Cassandra - GOTO Amsterdam
Supercharging Cassandra - GOTO Amsterdam
 
Java se7 features
Java se7 featuresJava se7 features
Java se7 features
 
Me3D: A Model-driven Methodology Expediting Embedded Device Driver Development
Me3D: A Model-driven Methodology  Expediting Embedded Device  Driver DevelopmentMe3D: A Model-driven Methodology  Expediting Embedded Device  Driver Development
Me3D: A Model-driven Methodology Expediting Embedded Device Driver Development
 
Mv unmasked.w.code.march.2013
Mv unmasked.w.code.march.2013Mv unmasked.w.code.march.2013
Mv unmasked.w.code.march.2013
 
What is new and cool j2se & java
What is new and cool j2se & javaWhat is new and cool j2se & java
What is new and cool j2se & java
 

Andere mochten auch

Andere mochten auch (7)

Variability-Aware Parsing -- OOPSLA Talk
Variability-Aware Parsing -- OOPSLA TalkVariability-Aware Parsing -- OOPSLA Talk
Variability-Aware Parsing -- OOPSLA Talk
 
Virtual Separation of Concerns (2011 Update)
Virtual Separation of Concerns (2011 Update)Virtual Separation of Concerns (2011 Update)
Virtual Separation of Concerns (2011 Update)
 
Semseo socialamedier-hr
Semseo socialamedier-hrSemseo socialamedier-hr
Semseo socialamedier-hr
 
Virtual Separation of Concerns
Virtual Separation of ConcernsVirtual Separation of Concerns
Virtual Separation of Concerns
 
Seeing Software
Seeing SoftwareSeeing Software
Seeing Software
 
Variability-Aware Analysis (FOSD Dagstuhl 2011)
Variability-Aware Analysis (FOSD Dagstuhl 2011)Variability-Aware Analysis (FOSD Dagstuhl 2011)
Variability-Aware Analysis (FOSD Dagstuhl 2011)
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 

Ähnlich wie Parsing and Type checking all 2^10000 configurations of the Linux kernel

Direct Code Execution - LinuxCon Japan 2014
Direct Code Execution - LinuxCon Japan 2014Direct Code Execution - LinuxCon Japan 2014
Direct Code Execution - LinuxCon Japan 2014
Hajime Tazaki
 
Using Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsUsing Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systems
Serge Stinckwich
 
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
Benton "Ben" Bovée
 

Ähnlich wie Parsing and Type checking all 2^10000 configurations of the Linux kernel (20)

Genode Compositions
Genode CompositionsGenode Compositions
Genode Compositions
 
Direct Code Execution - LinuxCon Japan 2014
Direct Code Execution - LinuxCon Japan 2014Direct Code Execution - LinuxCon Japan 2014
Direct Code Execution - LinuxCon Japan 2014
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications developmentPVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications development
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
 
Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...
 
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
 
How to Test Enterprise Java Applications
How to Test Enterprise Java ApplicationsHow to Test Enterprise Java Applications
How to Test Enterprise Java Applications
 
Challenges in Debugging Bootstraps of Reflective Kernels
Challenges in Debugging Bootstraps of Reflective KernelsChallenges in Debugging Bootstraps of Reflective Kernels
Challenges in Debugging Bootstraps of Reflective Kernels
 
Squeak DBX
Squeak DBXSqueak DBX
Squeak DBX
 
Using Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsUsing Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systems
 
Introduction to .NET
Introduction to .NETIntroduction to .NET
Introduction to .NET
 
Build Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVMBuild Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVM
 
淺談探索 Linux 系統設計之道
淺談探索 Linux 系統設計之道 淺談探索 Linux 系統設計之道
淺談探索 Linux 系統設計之道
 
PIL - A Platform Independent Language
PIL - A Platform Independent LanguagePIL - A Platform Independent Language
PIL - A Platform Independent Language
 
Linux kernel bug hunting
Linux kernel bug huntingLinux kernel bug hunting
Linux kernel bug hunting
 
App container rkt
App container rktApp container rkt
App container rkt
 
Azure Day Rome Reloaded 2019 - Deconstructing Kubernetes using AKS
Azure Day Rome Reloaded 2019 - Deconstructing Kubernetes using AKSAzure Day Rome Reloaded 2019 - Deconstructing Kubernetes using AKS
Azure Day Rome Reloaded 2019 - Deconstructing Kubernetes using AKS
 
olibc: Another C Library optimized for Embedded Linux
olibc: Another C Library optimized for Embedded Linuxolibc: Another C Library optimized for Embedded Linux
olibc: Another C Library optimized for Embedded Linux
 
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
 
Harmonia open iris_basic_v0.1
Harmonia open iris_basic_v0.1Harmonia open iris_basic_v0.1
Harmonia open iris_basic_v0.1
 

Kürzlich hochgeladen

Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
FIDO Alliance
 

Kürzlich hochgeladen (20)

State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The InsideCollecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
Collecting & Temporal Analysis of Behavioral Web Data - Tales From The Inside
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 

Parsing and Type checking all 2^10000 configurations of the Linux kernel

  • 1. Parsing and Type checking all 210000 configurations of the Linux kernel 10,000 features, 6 million lines of C code Christian Kästner
  • 2.
  • 7. 7
  • 8. Software Product Lines Boeing Bosch Group in Industry Cummins, Inc. Ericsson General Dynamics General Motors Hewlett Packard Lockheed Martin Lucent NASA Nokia Philips Siemens …
  • 10. 33 features optional, independent a unique configuration for every person on this planet
  • 11. 320 features optional, independent more configurations than estimated atoms in the universe
  • 14. static int _rep_queue_filedone( Excerpt from DB_ENV *dbenv, Oracle’s Berkeley DB REP *rep, __rep_fileinfo_args *rfp) { #ifdef NO_QUEUE COMPQUIET(rep, NULL); COMPQUIET(rfp, NULL); return (__db_no_queue_am(dbenv)); #else db_pgno_t first, last; u_int32_t flags; int empty, ret, t_ret; #ifdef DIAGNOSTIC DB_MSGBUF mb; #endif // over 100 lines of add. code } #endif Conditional Compilation
  • 15. static int _rep_queue_filedone( Excerpt from DB_ENV *dbenv, Oracle’s Berkeley DB REP *rep, __rep_fileinfo_args *rfp) { #ifdef NO_QUEUE COMPQUIET(rep, NULL); COMPQUIET(rfp, NULL); return (__db_no_queue_am(dbenv)); #else db_pgno_t first, last; u_int32_t flags; int empty, ret, t_ret; #ifdef X #ifdef DIAGNOSTIC DB_MSGBUF mb;void foo(); #endif #endif // over 100 lines of add. code } void bar() { #endif foo(); } Conditional Compilation
  • 16. Objections / Criticism Designed in the 70th and hardly evolved since “#ifdef considered harmful” “#ifdef hell” “maintenance becomes a ‘hit or miss’ process” “is difficult to determine if the code being viewed is actually compiled into the system” “incomprehensible source texts” “programming errors are easy “CPP makes maintenance difficult” to make and difficult to detect” “source code rapidly “preprocessor diagnostics are poor” becomes a maze”
  • 17. [ICSE’08,ASE’08,Tools‘09,GPCE’09,EASE‘11,..] Views Visual representation Disciplined mapping Consistency checking Refactorings … open source, fosd.net/cide Virtual Separation of Concerns
  • 18. 10,000 features, 6 million lines of C code
  • 19. [ICSE’10, AOSD’11] 40 Open-Source C Projects: apache, berkeley db, cherokee, clamav, dia, emacs, freebsd, gcc, ghostscript, gimp, glibc, gnumeric, gnuplot, irssi, libxml, lighttpd, linux, lynx, minix, mplayer, mpsolve, openldap, opensolaris, openvpn, parrot, php, pidgin, postgresql, privoxy, python, sendmail, sqlite, subversion, sylpheed, tcl, vim, xfig, xine-lib, xorg-server, xterm [Plot axes: number of features, lines of code]
  • 22. Checking Products (Printer Firmware): 2000 features; 100 printers; 30 new printers per year
  • 23. Checking Products (Linux Kernel): 10000 features; 2^10000 configurations
  • 24. Checking Product Line Implementation with 10000 Features + Generator Linux Kernel
  • 25. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 26. Product Generation; Conventional Analysis; Variability-Aware Analysis
  • 27. We aim for a sound and complete approach
  • 28. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 29.
  • 30. References Conflicts
  • 31. Presence Conditions WORLD BYE true true
  • 32. Reachability: pc(caller) -> pc(target). Conflicts: ¬(pc(def1) ∧ pc(def2)). Checks: ¬(WORLD ∧ BYE); true -> (WORLD ∨ BYE); true -> true
  • 33. Reachability: pc(caller) -> pc(target). Conflicts: ¬(pc(def1) ∧ pc(def2)). Checks: ¬(WORLD ∧ BYE); true -> (WORLD ∨ BYE); true -> true. Found 2 type errors: - [WORLD & BYE] file hello.c:8:8 redefinition of msg - [!WORLD & !BYE] file hello.c:11:8 msg undeclared
  • 34. P Variability Model (VM): WORLD, BYE. Checks: VM -> ¬(WORLD ∧ BYE); VM -> (true -> (WORLD ∨ BYE)); VM -> (true -> true)
  • 35. AST with Variability Information; Extended Lookup Mechanism. Checks: ¬(WORLD ∧ BYE); true -> (WORLD ∨ BYE); true -> true [Figure: AST of greet.c with printf, main and two optional msg nodes under the presence conditions WORLD and BYE]
  • 36. [ASE’08, TOSEM‘11] Formalization: CFJ Theorem (Product Generation Preserves Typing): All products that are generated for valid feature selections from a well-typed product line are well typed.
  • 37. Product Generation; Conventional Analysis; Variability-Aware Analysis
  • 38. Surface Complexity Inherent Complexity SAT Problem
  • 39. [TOSEM’11]
        Product Line        LOC      Features   Products      Time per Product (sec)   Time for entire Product Line (sec)
        MobileMedia         5700     14         2784          0.3                      2
        Mobile RSS Reader   20 000   14         2048          1                        8
        Lampiro             45 000   11         2048          2                        19
        Berkeley DB         70 000   42         3.6 billion   3                        21
  • 40. [ICSE’10, AOSD’11] 40 Open-Source C Projects: apache, berkeley db, cherokee, clamav, dia, emacs, freebsd, gcc, ghostscript, gimp, glibc, gnumeric, gnuplot, irssi, libxml, lighttpd, linux, lynx, minix, mplayer, mpsolve, openldap, opensolaris, openvpn, parrot, php, pidgin, postgresql, privoxy, python, sendmail, sqlite, subversion, sylpheed, tcl, vim, xfig, xine-lib, xorg-server, xterm [Plot axes: variable code in C files (in %), lines of code]
  • 41.
  • 42. Parse; Type Check; Linker checks [Figure: one AST with variability information per .c file (greet.c, main.c, …), each parsed and type checked separately, then combined by linker checks]
  • 43. Challenges: Real-world C code; C preprocessor; Huge size; Module system / linker checks
  • 44. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 45. [OOPSLA’11] AST with Variability Information [Figure: AST of greet.c with printf, main and optional msg nodes under WORLD and BYE]
  • 46. Parsing C without Preprocessing
  • 47. Macro expansion needed for parsing; Undisciplined annotations; Alternative macros [Figure: how to get from greet.c to an AST with variability information?]
  • 48.
  • 49.
  • 50. Previous Solutions: Disciplined Subset (requires code preparation); Heuristics and Partial Analysis (inaccurate, false positives); Brute Force (infeasible effort)
  • 51. TypeChef: Variability-Aware Lexer; Variability-Aware Parser; Variability-Aware Analysis [Figure: token stream with conditional tokens such as 4 under A and 5 under ¬A, parsed into an AST with variability] https://github.com/ckaestne/TypeChef
  • 52. [OOPSLA’11] Library of Variability-Aware Parser Combinators in Scala. Conditional token stream: ( 3 + 4[A] ([¬A] 4[¬A∧B] +[¬A∧B] 6[¬A] )[¬A] ) [Figure: the parser splits at conditional tokens and joins the results into one AST: + with operand 3 and a choice on A between 4 and a choice on B between 4 + 6 and 6]
  • 53. Parsing Linux 2.6.33.3 (x86): 7665 C files; 353 included header files per C file; 8590 distinct macros per C file; 72 % conditional; 30 seconds per file (median); 0 syntax errors
  • 54. Type Checking Linux 2.6.33.3 (x86): 20 seconds per file
  • 55. Type Checking BusyBox: 511 files; 260,000 lines of C code; 811 features; 51 minutes parsing; 6 minutes type checking; 4 seconds linker checking
  • 56. ntpd.c
        //… skipped 260 lines
        struct globals {
          double cur_time;
          //… skipped 11 lines
        #if ENABLE_FEATURE_NTPD_SERVER
          int listen_fd;
        #endif
          unsigned verbose;
          //… skipped 73 lines
        };
        //… skipped 1761 lines
        int ntpd_main(int argc UNUSED_PARAM, char **argv) {
        #undef G
          struct globals G;
          //… skipped 81 lines
          if (i > (ENABLE_FEATURE_NTPD_SERVER && G.listen_fd != -1)) {   // ntpd.c: 2128
            …
          }
          …
        }
        [CONFIG_NTPD && !CONFIG_FEATURE_NTPD_SERVER] field listen_fd unknown in struct globals
  • 59. Product Generation; Conventional Analysis; Variability-Aware Analysis
  • 60. Variability-Aware Analysis Parser Type System Static Analysis Bug Finding Testing Model Checking Theorem Proving …
  • 61. Product-Line Evolution 615 trillion config. 553 quintillion config.
  • 62. Reengineering Variability [GPCE’09; Grant Prop.]: Legacy System (#ifdef, parameters, branches in VCS, clones, domain knowledge, runtime variability) -> refactoring -> Disciplined Variability Implementation (plug-ins, feature modules, aspects, disciplined annotations)
  • 63. Compositional Approaches [ICSE’09, J.ASE’10, SCP’10, TSE’12]
        Base / Platform:
        class Stack {
          void push(Object o) {
            elementData[size++] = o;
          }
          ...
        }
        Feature: Queue
        refines class Stack {
          void push(Object o) {
            Lock l = lock(o);
            Super.push(o);
            l.unlock();
          }
          ...
        }
        Feature: Diagnostic
        aspect Diagnostics {
          ...
        }
        Composition:
        class Stack {
          void push(Object o) {
            Lock l = lock(o);
            elementData[size++] = o;
            l.unlock();
          }
          ...
        }
        Module Components; Frameworks, Plug-ins; Feature-Modules / Mixin Layers / …; Aspects / Subjects, Hyper/J, Deltas
  • 64. Predicting Nonfunctional Properties [SPLC’11, SQJ’11, ICSE’12]
  • 66. Domain-Specific Languages: SugarJ [GPCE’11, OOPSLA’11]; Runtime Updates for Java [APSEC’08, J. SP&E’11]
  • 67. Parsing and Type Checking all 2^10000 Configurations of the Linux Kernel [Figure: conditional token stream and AST of greet.c with variability information] https://github.com/ckaestne/TypeChef
  • 68. Projects: TypeChef (Analyzing Real-World C Code); Virtual Separation of Concerns (Tool Support for Annotation-Based Variability Implementation); Aspect-Oriented Decomposition of Berkeley DB; AOP Compiler Extensions. Stages: M.Sc. Thesis; Ph.D. Thesis; Post Doc. Timeline: 1982 … 2006, 2007, 2008, 2009, 2010, 2011, 2012. Locations: Austin, Texas; Magdeburg, Germany; Marburg, Germany