Formal Verification of Programming Languages


Formal Verification of Programming Language Implementations
Ph.D. Literature Seminar

Jason S. Reich
<jason@cs.york.ac.uk>

University of York

December 8, 2009


Compiling an arithmetic language

Compile from a simple arithmetic language to machine code for a
simple register machine.

Example taken from [McCart67]




Source language

Numeric constants
Variables
Addition
e.g. (x + 3) + (x + (y + 2))





Target language

Load Immediate into ac
LOAD into ac from address/register
STOre ac value to address/register
ADD register value to ac



Arithmetic expression compiler in Haskell

compile :: Source -> Int -> Target
compile (Const v) t = [Li v]
compile (Var x) t = [Load x]
compile (Sum e1 e2) t =
  compile e1 t
  ++ [Sto ("t + " ++ show t)]
  ++ compile e2 (t + 1)
  ++ [Add ("t + " ++ show t)]



When compiled and executed, is the value in the accumulator the
result of the source arithmetic expression?
(x + 3) + (x + (y + 2)) compiled to machine code?

LOAD x
STO t
LI 3
ADD t
STO t
LOAD x
STO t + 1
LOAD y
STO t + 2
LI 2
ADD t + 2
ADD t + 1
ADD t
n.b. Where x and y are known memory locations and t + k are registers.
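
The compiler above can be loaded and run once its data types are spelled out. The following is a minimal sketch: the slides only show compile, so the Source and Instr declarations, the helper temp and the value example are assumptions inferred from its equations. Compiling example with a starting temporary index of 0 reproduces the thirteen-instruction listing, with "t + 0" standing for the register written as t.

```haskell
-- Hypothetical data types: the constructor names are inferred from the
-- equations of compile on the slide, not taken from the original source.
data Source
  = Const Int
  | Var String
  | Sum Source Source

data Instr
  = Li Int       -- load an immediate value into the accumulator
  | Load String  -- load from an address/register into the accumulator
  | Sto String   -- store the accumulator to an address/register
  | Add String   -- add an address/register value to the accumulator
  deriving (Eq, Show)

type Target = [Instr]

-- Name of the temporary register at offset t (helper added for this sketch).
temp :: Int -> String
temp t = "t + " ++ show t

-- Same equations as the slide; t is the first free temporary register.
compile :: Source -> Int -> Target
compile (Const v)   _ = [Li v]
compile (Var x)     _ = [Load x]
compile (Sum e1 e2) t =
  compile e1 t ++ [Sto (temp t)] ++ compile e2 (t + 1) ++ [Add (temp t)]

-- (x + 3) + (x + (y + 2))
example :: Source
example =
  Sum (Sum (Var "x") (Const 3))
      (Sum (Var "x") (Sum (Var "y") (Const 2)))
```

In GHCi, mapM_ print (compile example 0) lists the instructions one per line.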


Why use high-level languages?

Rapid development
Easier to understand, maintain and modify
Less likely to make mistakes
Easier to reason about and infer properties
Architecture portability
But...


Can you trust your compiler?

Use a compiler to translate from a high-level language to a
low-level one
Compilers are programs (generally) written by people
People make mistakes
Can silently turn “a correct program into an incorrect
executable” [Leroy09]
GHC 6.10.x is ≈ 800,000 lines of code and has had 737 bugs
reported in the bug tracker as of 04/12/2009 [GHC]
Can we formally verify a compiler?


McCarthy and Painter, 1967

“Correctness of a compiler for arithmetic expressions”
[McCart67]
Describe, in first-order predicate logic:
Source language semantics
Target language semantics
A compilation process
Reason that the compiler maintains semantic equivalence


McCarthy and Painter, 1967

Semantic equivalence in [McCart67]
∀e ∈ Expressions, ∀µ : Variable Mappings •
interpret(e, µ) ≡ acValue(emulate(compile(e), mkState(µ)))

Very limited, small toy source and target language
Proof performed by hand
Logical framework and proof presented in under ten pages
Shows that proving a compiler correct is possible
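
The equivalence statement can be made executable for the arithmetic language. This is a sketch under stated assumptions, not the formalisation in [McCart67]: the names interpret, emulate, mkState and acValue follow the formula, while the data types, the association-list Mapping, the helper valueOf (unbound names read as 0) and the predicate equivalent are assumptions. It tests the property on chosen inputs and proves nothing.

```haskell
data Source = Const Int | Var String | Sum Source Source

data Instr = Li Int | Load String | Sto String | Add String

type Target = [Instr]

-- The compiler from the earlier slide, started at temporary index 0 below.
compile :: Source -> Int -> Target
compile (Const v)   _ = [Li v]
compile (Var x)     _ = [Load x]
compile (Sum e1 e2) t =
  compile e1 t ++ [Sto (temp t)] ++ compile e2 (t + 1) ++ [Add (temp t)]
  where temp k = "t + " ++ show k

-- A variable mapping (µ in the formula) as an association list.
type Mapping = [(String, Int)]

-- Assumption: unbound names read as 0, which keeps the sketch total.
valueOf :: String -> Mapping -> Int
valueOf name = maybe 0 id . lookup name

-- Source semantics: evaluate the expression directly.
interpret :: Source -> Mapping -> Int
interpret (Const v)   _  = v
interpret (Var x)     mu = valueOf x mu
interpret (Sum e1 e2) mu = interpret e1 mu + interpret e2 mu

-- Machine state: the accumulator plus named addresses/registers.
-- Assumption: source variables are never named like temporaries ("t + k").
data State = State { acValue :: Int, memory :: Mapping }

mkState :: Mapping -> State
mkState mu = State { acValue = 0, memory = mu }

-- Target semantics: execute one instruction.
step :: State -> Instr -> State
step s (Li v)   = s { acValue = v }
step s (Load a) = s { acValue = valueOf a (memory s) }
step s (Sto a)  = s { memory = (a, acValue s) : memory s }
step s (Add a)  = s { acValue = acValue s + valueOf a (memory s) }

emulate :: Target -> State -> State
emulate prog s = foldl step s prog

-- The equivalence from the slide, checked for one e and one µ.
equivalent :: Source -> Mapping -> Bool
equivalent e mu =
  interpret e mu == acValue (emulate (compile e 0) (mkState mu))

-- (x + 3) + (x + (y + 2))
example :: Source
example =
  Sum (Sum (Var "x") (Const 3))
      (Sum (Var "x") (Sum (Var "y") (Const 2)))
```

For instance, equivalent example [("x", 5), ("y", 7)] compares the interpreted value with the accumulator after emulation. A proof, by hand or by machine, has to establish this for every e and µ rather than for sampled ones.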


Milner and Weyhrauch, 1972

“Proving compiler correctness in a mechanised logic”
[Milner72]
Provide an LCF machine-checked proof of the
McCarthy-Painter example
Proceed towards mechanically proving a compiler for a more
complex language to a stack machine
Claim to have “no significant doubt that the remainder of the
proof can be done on machine” [Milner72]


Morris, 1973

“Advice on structuring compilers and proving them correct”
[Morris73]
Proves by hand the correctness of a compiler for a source
language that contains assignment, conditionals, loops,
arithmetic, boolean operations and local definitions

“Essence” of the advice presented in [Morris73]

                      compile
Source language  ------------->  Target language
       |                                |
Source semantics                 Target semantics
       |                                |
       v                                v
Source meanings  <-------------  Target meanings
                      decode


Thatcher, Wagner and Wright, 1980
Advice presented in [Thatch80]

                      compile
Source language  ------------->  Target language
       |                                |
Source semantics                 Target semantics
       |                                |
       v                                v
Source meanings  ------------->  Target meanings
                      encode

“More on advice on structuring compilers and proving them
correct” [Thatch80]
Provides a correct compiler for a more advanced target
language than [Morris73]
Claim that mechanised theorem proving tools required further
development


The “structuring compilers” series

Discuss constructing algebras to describe languages
How to move from one algebra to another
Encode abstract state to concrete or decode to abstract?
“there is not enough information in the [abstract] state to
recover the [concrete] state completely” [Moore89]
Further paper “Even more on advice on structuring compilers
and proving them correct: changing an arrow” [Orejas81]
[Moore89] discusses this issue from a practical perspective
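
The two directions of the bottom arrow can be compared on a toy example. The sketch below is hypothetical throughout: the expression language, the stack machine and every name in it (sourceSem, targetSem, decode, encode, morrisSquare, thatcherSquare) are illustrative assumptions, not definitions from [Morris73], [Thatch80] or [Moore89]. It states each commuting square as a Boolean check.

```haskell
-- Hypothetical toy source language and stack-machine target.
data Expr = Lit Int | Plus Expr Expr

data Op = Push Int | AddTop
  deriving (Eq, Show)

-- Source semantics: an expression means an integer.
sourceSem :: Expr -> Int
sourceSem (Lit n)    = n
sourceSem (Plus a b) = sourceSem a + sourceSem b

compile :: Expr -> [Op]
compile (Lit n)    = [Push n]
compile (Plus a b) = compile a ++ compile b ++ [AddTop]

-- Target semantics: a program means the stack it leaves behind
-- when started on the empty stack.
targetSem :: [Op] -> [Int]
targetSem = foldl exec []
  where
    exec st           (Push n) = n : st
    exec (y : x : st) AddTop   = (x + y) : st
    exec st           AddTop   = st  -- ill-formed program: ignored in this sketch

-- Decode arrow: target meanings back to source meanings.
decode :: [Int] -> Maybe Int
decode (n : _) = Just n
decode []      = Nothing

-- Encode arrow: source meanings forward to target meanings.
encode :: Int -> [Int]
encode n = [n]

-- The square with decode along the bottom.
morrisSquare :: Expr -> Bool
morrisSquare e = decode (targetSem (compile e)) == Just (sourceSem e)

-- The square with encode along the bottom.
thatcherSquare :: Expr -> Bool
thatcherSquare e = encode (sourceSem e) == targetSem (compile e)
```

In this sketch decode discards everything below the top of the stack, so it is many-to-one, whereas encode has to commit to one concrete target meaning for each source meaning. Which arrow is easier to define depends on how much extra structure the concrete state carries.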


Meijer, 1994

“More advice on proving a compiler correct: Improve a correct
compiler” [Meijer94]
Given an interpreter for a source language, can we transform it
into a compiler to, and a residual interpreter for, the target
language?
A functional decomposition problem (i.e.
interpreter = emulator ◦ compiler )
Demonstrate this technique for a first-order imperative
language compiling to a three-address code machine
While quite feasible for first-order languages, becomes far
more difficult for higher-order languages


Berghofer and Stecker, 2003

“Extracting a formally verified, fully executable compiler from
a proof assistant” [Bergho03]
Proves correct a compiler from a subset of the Java source
language to Java bytecode
Includes typechecking, abstract syntax tree annotation and
bytecode translation
Isabelle/HOL used to prove properties about an abstract
compiler
Isabelle code extraction to produce an executable compiler


Dave, 2003

Maulik A. Dave’s bibliography for “Compiler Verification” [Dave03]
Ninety-nine papers listed
Ninety-one of those listed were published after 1990
Interestingly, neither the Milner and Weyhrauch paper nor the
Meijer paper is included
[Figure: papers listed against decade published]


Recent work

Leroy’s “A formally verified compiler back-end” [Leroy09]
Proves correct a compiler from Cminor to PowerPC assembly
Chlipala’s “A verified compiler for an impure functional
language” [Chlipa10]
Compiles a toy (but still quite feature-rich) functional source
language to instructions for a register-based machine
Both use the Coq proof assistant and code extraction
Both decompose the problem into compilation to several
intermediate languages
Both express worries that the proof assistant itself may contain
bugs that would invalidate correctness


Conclusions

Compilers have been proved correct for progressively larger
source languages
It rapidly became apparent that some kind of proof assistant is
required
Decomposition of large compilers is a key factor for success
Programs are only verified when all surrounding elements are
verified


Open questions

What about compilers for larger target languages and more
advanced compilation facilities?
Are our mechanised assistants producing valid proofs?
Are there other ways to decompose the problem?
Are particular language paradigms more amenable to compiler
verification?
Why haven’t the concepts of [Meijer94] been more widely
used?
What other ways are there of decomposing the compiler
verification problem?


More information

Slides and bibliography will be made available at:
http://www-users.cs.york.ac.uk/~jason/

Jason S. Reich
<jason@cs.york.ac.uk>
