Open Source Verification
Under a Cloud
Peter Breuer, Dept. Comp. Sci., Univ. Birmingham, UK
Simon Pickin, Dpto. Ing. Tel., U. Carlos III de Madrid, Spain
What we’ve done
Started with monolithic analysis software
Restructured it into
database plus verification problem server
ad hoc network of remote verification solver clients
Thrown it at verification problems tackled previously
Checking Linux kernel code for SMP locking errors
just over 1,000,000 lines of C code/assembler
look for possible take of spinlock in dangerous context
possible double-take of spinlock before release, etc.
took about 9,000,000s of system time
Outcome is intrinsically 50× as slow as original
takes 100 clients to go 2× as fast
but it scales well!
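The figures on this slide can be checked with a little arithmetic. The sketch below assumes that the 9,000,000 s is the total system time of the distributed run, that "50× as slow" compares that total with the original monolithic run, and that work divides evenly among clients; the function and constant names are illustrative only.

```python
# Assumption: 9,000,000 s is the total system time of the distributed run.
TOTAL_SYSTEM_SECONDS = 9_000_000
# Assumption: the distributed computation costs 50x the original in total.
SLOWDOWN = 50


def ideal_wall_clock_hours(clients: int) -> float:
    """Wall-clock hours if the total work divides evenly among the clients."""
    return TOTAL_SYSTEM_SECONDS / clients / 3600


# Implied running time of the original monolithic analysis, in hours.
original_hours = TOTAL_SYSTEM_SECONDS / SLOWDOWN / 3600

# Speed-up of 100 evenly loaded clients over the original.
speedup_100 = original_hours / ideal_wall_clock_hours(100)

print(original_hours, ideal_wall_clock_hours(100), speedup_100)
```

Under these assumptions the original run comes to 50 hours and 100 clients to 25 hours, which matches the "2× as fast" claim.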
Why we’ve done it
For a future vision
formal methods specialists contribute standard analyses
developers upload code for analysis to cloud
unskilled open source supporters help favourite project
run client solvers at home on their spare CPU cycles
specialists report regressions and new errors
requires humans to eliminate false positives
Hopefully we get a positive reinforcement loop
more developers develop formal methods skills
more formal methods people understand coding problems
more unskilled contributors develop skills
Perhaps to run faster!
500 clients analyse 1,000,000 LOC in 3h
What our Ad Hoc Volunteer Cloud of Solvers Looked Like
[Figure: solver clients on the Internet send queries (e.g. "3x<1+y?", "p/q=>q->p?") and store results (0/1) through a firewall to a DB server with a high-speed disk]
How the Calculation is Organised
P Parse produces syntax tree $T$
1,000,000 LOC produces 10,000,000 syntax tree nodes
A Decorate $T$ with post-conditions $post$ to get $T^{post}$
E.g. ‘$x \le 1$’ where $x$ counts locks taken
H Add evaluation $[post \wedge d]$ for defect $d$ to get $T^{post}_{eval}$
E.g. $x \le 1 \wedge d$, where $d$ is $x \ge 1$ if $p = \mathtt{lock()}$ and false otherwise
L Where there’s a nonzero evaluation, flag the point $p$
X Certify intermediates $T^{post}_{eval}$ and defect list $L$
customized approximating logic
Symbolic Approximation
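The A, H and L steps can be illustrated on a toy scale. The sketch below is not the authors' logic: it assumes an interval abstraction for the lock count x, handles only a straight-line list of statements, and evaluates the defect against the condition holding on entry to each point (the post-condition of the preceding statement). All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Approximate description of the lock count x: lo <= x <= hi."""
    lo: int
    hi: int


def step(pre: Interval, stmt: str) -> Interval:
    """Post-condition on x after executing one statement."""
    if stmt == "lock":
        return Interval(pre.lo + 1, pre.hi + 1)
    if stmt == "unlock":
        return Interval(max(pre.lo - 1, 0), max(pre.hi - 1, 0))
    return pre  # any other statement leaves the lock count unchanged


def find_double_takes(stmts):
    """Flag points where lock() may be reached with x >= 1 already."""
    x = Interval(0, 0)
    flagged = []
    for point, stmt in enumerate(stmts):
        # Defect d at a lock() point: x >= 1 is satisfiable on entry.
        if stmt == "lock" and x.hi >= 1:
            flagged.append(point)
        x = step(x, stmt)
    return flagged


print(find_double_takes(["lock", "work", "lock", "unlock", "unlock"]))
```

The second lock() in the example is flagged because the lock count may already be 1 when it is reached.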
Rearranging the Calculation for a Cloud
Parsing P is currently done monolithically first of all
Logical analysis is done piecewise at each syntax node: $A = \circ_p A_p$
Ditto checking: $H = \circ_p H_p$
Organised as work-units of analysis and checking: $A \circ H = \circ_p (A_p \circ H_p)$
Subject to dependencies induced by syntax: $p = P_i(p_i)$
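One way to picture work-units ordered by syntactic dependencies is a topological schedule over the syntax tree. The sketch below assumes that a node's work-unit can run once the work-units of its sub-nodes are done; the slides do not state the direction of the dependencies, and the tree and names here are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule_work_units(children):
    """Group syntax nodes into batches that could be handed out in parallel.

    `children` maps each node to the sub-nodes it depends on.
    """
    sorter = TopologicalSorter(children)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes with all dependencies done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical syntax tree: function f contains a sequence of two statements.
tree = {"f": ["seq"], "seq": ["lock", "unlock"], "lock": [], "unlock": []}
print(schedule_work_units(tree))
```

Each batch contains work-units with no unmet dependencies, so a server could distribute a whole batch to different clients at once.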
Certification
Store the evidence (intermediate and final results) $T^{post}_{eval}$
Store parse, logic configuration, defect definitions
Provide list of defects L and certificate X
Certificate contains signatures of all items above
Idea is that a doubter can ask to see the data
Any part of the computation can then be repeated . . .
. . . to confirm or refute what the certificate claims
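The challenge-and-repeat idea can be sketched as follows. The slides do not say what kind of signature is used; this example assumes plain SHA-256 digests over a JSON encoding as a stand-in, and every name in it is hypothetical.

```python
import hashlib
import json


def digest(item) -> str:
    """SHA-256 digest of a JSON-serialisable item (stand-in for a signature)."""
    data = json.dumps(item, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def make_certificate(items: dict) -> dict:
    """Record one digest per stored item (parse, configuration, results...)."""
    return {name: digest(value) for name, value in items.items()}


def check(certificate: dict, name: str, recomputed) -> bool:
    """A doubter repeats part of the computation and compares digests."""
    return certificate.get(name) == digest(recomputed)


stored = {"parse": ["lock", "work", "lock"], "defects": [2]}
certificate = make_certificate(stored)
print(check(certificate, "defects", [2]), check(certificate, "defects", [3]))
```

A recomputation that reproduces the stored result confirms the certificate; any other result refutes it.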
Experimental Data
746,844 function definitions in >1,000,000 LOC
many turned out to be ‘static inline’ duplications
Reduced to 78,619 syntactically different definitions
Clients each initially given 10 min to analyse 1 function
Abort after 10 min and try another
only 373 tasks needed longer than 10 min
Time-limit on task raised to 15 min, etc.
129 functions needed longer than 1 h
Were split up syntax-wise and checkpointed every 5 min
24 functions remained unanalyzed at experiment end
complexity explosion in logic accounts for most
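The abort-and-retry policy with rising time limits can be simulated without running any real analysis. The sketch below assumes each task has a fixed, known running time and that every pending task is retried at each new limit; only the 10 and 15 minute limits come from the slides, and the later limit and the task data are illustrative.

```python
def run_rounds(needed_seconds, limits):
    """Simulate retrying tasks under successively larger time limits.

    Returns (limit under which each task finished, tasks still unfinished).
    """
    pending = dict(needed_seconds)
    finished_in = {}
    for limit in limits:
        for task, need in list(pending.items()):
            if need <= limit:  # task completes within this round's limit
                finished_in[task] = limit
                del pending[task]
    return finished_in, sorted(pending)


# Hypothetical tasks; limits of 10 min, 15 min and (illustratively) 1 h.
tasks = {"a": 30, "b": 700, "c": 5000}
print(run_rounds(tasks, [600, 900, 3600]))
```

Cheap tasks finish in the first round, so the short initial limit keeps clients from being tied up by the few expensive ones.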
Duplicated Function Definitions are a Problem!
[Plot: top-level definitions with multiple instances (Σxy = 746844); #uniquedefs (log scale, 1 to 100000) against #instances (log scale, 1 to 1000)]
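Collapsing duplicated definitions before analysis can be sketched as below. The slides only say that definitions were reduced to "syntactically different" ones; this example assumes a much cruder criterion (identical text after whitespace normalisation), and the sample definitions are made up.

```python
from collections import Counter


def normalise(source: str) -> str:
    """Collapse all runs of whitespace so layout differences are ignored."""
    return " ".join(source.split())


def count_instances(definitions):
    """Map each distinct (normalised) definition to its number of instances."""
    return Counter(normalise(d) for d in definitions)


defs = [
    "static inline int f(void) { return 1; }",
    "static inline int f(void)  {\n return 1; }",
    "int g(void) { return 2; }",
]
counts = count_instances(defs)
print(len(counts), counts[normalise(defs[0])])
```

Only one analysis task is then needed per distinct definition, however many instances it has.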
Time taken per analysis task
[Plot: #tasks (log scale, 1 to 100) against time in seconds (log scale, 1 to 100000)]
Time taken per analysis task (cumulative count)
[Plot: cumulative %tasks (0 to 100) against time in seconds (log scale, 1 to 100000)]
Percentage of total time taken per analysis task (cumulative)
[Plot: cumulative %time (0 to 100) against time taken per task in seconds (log scale, 1 to 100000)]
Practical Difficulties
Populating DB remotely was going to take a month!
parsing/loading locally took about 24h
Remote DB transactions can take seconds each
beaten by caching (95% hits) of DB queries on clients
solver attempts 150-500 queries/s (90% reads)
5-10 queries/s escape caches to the net
Solver CPU loading is only a few %, peaking occasionally
DB server limits transaction rate
limit at 100 queries/s for 3 GB RAM, 1.8 GHz, 64-bit Athlon
needed to up RAM to DB size to avoid disk I/O limits
CPU loaded to about 15% by each active thread
Client/net breakdowns mean only 25-50% operating level
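Client-side caching of DB reads can be sketched as a simple read-through cache. This is an assumption about the general technique, not the authors' implementation: it has no invalidation or size limit, and the class and the `fetch` callable are hypothetical.

```python
class CachingClient:
    """Read-through cache in front of a slow remote lookup."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that performs the remote DB read
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, key):
        """Return the cached value, going to the remote DB only on a miss."""
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._fetch(key)
        self._cache[key] = value
        return value


remote_calls = []


def slow_fetch(key):
    """Stand-in for a remote transaction; records that it was called."""
    remote_calls.append(key)
    return key.upper()


client = CachingClient(slow_fetch)
results = [client.read("a"), client.read("a"), client.read("b")]
print(results, client.hits, client.misses, remote_calls)
```

Repeated reads of the same key are served locally, so only first-time reads reach the network.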
Summary
Prototype verification cloud software
Based on symbolic approximation
customized approximating logic
Computation handled incrementally and piecewise
Intermediate results retained for accountability
any part repeated/duplicated if challenged
Produces certificate and a list of defects
Experiment on 1,000,000 LOC (Linux kernel)
9,000,000s of (normalized 1GHz CPU) system time
< 24h for 100 clients
< 6h for 500 clients
provided can find DB servers to handle 2500 queries/s!
(fortunately queries are strongly localized)
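The 2500 queries/s figure is consistent with one simple reading of the earlier numbers: 500 clients each letting about 5 queries/s escape their caches. The sketch below works that through and, assuming every DB server had the 100 queries/s limit measured on the test machine, estimates how many servers that would take. Both readings are assumptions, and the function names are illustrative.

```python
import math


def required_db_rate(clients: int, escaped_per_client: float) -> float:
    """Aggregate queries/s reaching the DB servers from all clients."""
    return clients * escaped_per_client


def servers_needed(rate: float, per_server_limit: float) -> int:
    """Number of servers needed if load spreads evenly (an assumption)."""
    return math.ceil(rate / per_server_limit)


rate = required_db_rate(clients=500, escaped_per_client=5)
print(rate, servers_needed(rate, per_server_limit=100))
```

Spreading load evenly over several servers is plausible only because queries are strongly localized, as the slide notes.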

Open Source Verification under a Cloud (OpenCert 2010)
