Extensible domain-specific programming
for the sciences
Eric Van Wyk
University of Minnesota

VBI, December 5, 2013

slides available at http://www.cs.umn.edu/~evw

1 / 45
Current trends / topics in PL
Formal verification
CompCert - http://compcert.inria.fr/
Astrée - http://www.astree.ens.fr/
Hoare logic (1960’s)
{P} code {Q}
Proof assistants: Coq, Abella, Isabelle, ...
use required in some PL publishing venues
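
A Hoare triple {P} code {Q} states that if P holds before the code runs, then Q holds afterwards. The following is a minimal sketch that checks P and Q with runtime assertions instead of proving them; the function `integer_sqrt` is a made-up example and does not come from the talk.

```python
def integer_sqrt(n: int) -> int:
    """Largest r with r*r <= n, annotated in the style of a Hoare triple."""
    # {P}: precondition
    assert n >= 0
    r = 0
    while (r + 1) * (r + 1) <= n:
        # loop invariant: r*r <= n
        assert r * r <= n
        r += 1
    # {Q}: postcondition
    assert r * r <= n < (r + 1) * (r + 1)
    return r
```

A proof assistant or a verifier would establish the postcondition for every input, whereas the assertions here only check it for the inputs that are actually run.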

2 / 45
3 / 45
4 / 45
Current trends / topics in PL
Parallel programming - multiple cores, everywhere.

“no more free lunch”
need
new abstractions: e.g. Cilk, MapReduce, FP
new semantics: e.g. deterministic parallel Java
5 / 45
Current trends / topics in PL
Expressive and safe static typing
extending richer static types, e.g.
  append :: ( [a], [a] ) -> [a]
to dependent types
  append :: ( [a|n], [a|m] ) -> [a|n+m]
turns array out-of-bounds and null-pointer bugs into static type errors

6 / 45
Extensible languages
Allow programmers to select the features to be used in their
programming languages.
new syntax / notations
new semantic analyses / error-checking
Why would anyone want to do that?

7 / 45
Programming language features
General purpose features
assignment statements, loops, if-then-else statements
functions (perhaps higher-order) and procedures
I/O facilities
modules
data: integer, strings, arrays, records
Domain-specific features
matrix operations (MATLAB)
regular expression matching (Perl, Python)
statistics functions (R)
computational geometry operations (LN)
parallel computing (SISAL, X10, NESL, etc.)
Many similarities, needless differences.
Working with multiple (domain-specific) languages is a
headache.
8 / 45
Extensible languages
Allow programmers to select the features to be used in their
programming languages.
new syntax / notations
new semantic analyses / error-checking
Pick a general purpose host language (e.g. ANSI C),
extend with domain-specific features.
myProgram.xc ⟹ myProgram.c

9 / 45
Regular expressions
#include "stdio.h"
#include "regex.h"
int main (int argc, char *argv[]) {
  char *text = readFileContents("X.data");
  // eukaryotic messenger RNA sequences
  regex foo = /^ATG[ATGC]{3,10}A{5,10}$/ ;
  if (text =~ foo)
    printf("Matches...\n");
  else
    printf("Doesn't match...\n");
}
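
For comparison, the same pattern can be tried out with an ordinary regex library. This is a sketch using Python's standard `re` module; the names `MRNA_PATTERN` and `matches_mrna` are made up for illustration and are not part of the extension.

```python
import re

# Same pattern as on the slide: "ATG", then 3 to 10 bases,
# then a run of 5 to 10 A's, anchored at both ends.
MRNA_PATTERN = re.compile(r"^ATG[ATGC]{3,10}A{5,10}$")


def matches_mrna(text: str) -> bool:
    """True if the whole string has the shape described by the pattern."""
    return MRNA_PATTERN.search(text) is not None
```

In the extended C program the regex is part of the language syntax, so a malformed pattern can be rejected at translation time rather than when the program runs.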
10 / 45
Mining Climate Data - Ocean Eddies

Spinning pools of water
Transport heat, salt, and
nutrients
Learning about their
behavior is difficult

11 / 45
A time slice for a point in the ocean

12 / 45
main (int argc, char **argv) {
  Matrix float <3> data
    = readMatrix("ssh.data");
  Matrix float <3> scores
    = matrixMap(scoreTS, data, [2]);
  writeMatrix("temporalScores.data",
              scores);
}

13 / 45
Matrix float <1> scoreTS (Matrix float <1> ts)
{
  int i = 0, beginning, n = dimSize(ts, 0);
  Matrix float <1> scores
    = init(Matrix float <1>, dimSize(ts, 0));
  while (ts[i] < ts[i+1]) { i = i+1; }
  Matrix float [0] trough;
  while (i < n-1) {
    (trough, beginning, i)
      = getTrough(ts, i);
    scores[beginning :: i]
      = computeArea(trough);
  }
  return scores;
}

14 / 45
Matrix float <1> computeArea
  (Matrix float <1> areaOfInterest)
{
  float y1 = areaOfInterest[0];
  float y2 = areaOfInterest[end];
  int x1 = 0;
  int x2 = dimSize(areaOfInterest, 0) - 1;
  // straight line through the first and last points of the trough
  float m = (y1 - y2) / ((float) (x1 - x2));
  float b = y1 - m * x1;
  Matrix float <1> line = (x1 :: x2) * m + b;
  // area between that line and the trough
  float area
    = with (x1 <= i < x2)
        fold(+, 0.0, line - areaOfInterest);
  return
    with (0 <= i < dimSize(line, 0))
      genarray([dimSize(line, 0)], area);
}
15 / 45
(Matrix float <1>, int, int) getTrough
  (Matrix float <1> ts, int i)
{
  int beginning = i;
  int n = dimSize(ts, 0);
  while (i+1 < n && ts[i] >= ts[i+1])
    i = i+1;
  while (i+1 < n && ts[i] < ts[i+1])
    i = i+1;
  return (ts[beginning :: i], beginning, i);
}
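
The three functions scoreTS, computeArea and getTrough can be traced in plain Python. This is a sketch under several assumptions: `a :: b` is read as an inclusive range, `init` is taken to produce zeros, and a bounds guard is added to the first loop of `score_ts` that the slide code does not have. The snake_case names are illustrative.

```python
from typing import List, Tuple


def get_trough(ts: List[float], i: int) -> Tuple[List[float], int, int]:
    """Walk down, then up, from index i; return the slice and its bounds."""
    beginning = i
    n = len(ts)
    while i + 1 < n and ts[i] >= ts[i + 1]:
        i += 1
    while i + 1 < n and ts[i] < ts[i + 1]:
        i += 1
    return ts[beginning:i + 1], beginning, i


def compute_area(area_of_interest: List[float]) -> List[float]:
    """Area between the trough and the line joining its end points."""
    y1, y2 = area_of_interest[0], area_of_interest[-1]
    x1, x2 = 0, len(area_of_interest) - 1
    m = (y1 - y2) / float(x1 - x2)
    b = y1 - m * x1
    line = [x * m + b for x in range(x1, x2 + 1)]
    area = sum((line[i] - area_of_interest[i] for i in range(x1, x2)), 0.0)
    # Every point of the trough gets the same score.
    return [area] * len(line)


def score_ts(ts: List[float]) -> List[float]:
    n = len(ts)
    scores = [0.0] * n
    i = 0
    # Skip the initial rising run (bounds guard added in this sketch).
    while i + 1 < n and ts[i] < ts[i + 1]:
        i += 1
    while i < n - 1:
        trough, beginning, i = get_trough(ts, i)
        scores[beginning:i + 1] = compute_area(trough)
    return scores
```

Because `get_trough` is only called with `i < n - 1`, every trough has at least two points, so the slope computation in `compute_area` never divides by zero.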

16 / 45
Matrix extensions
several features from MATLAB
with, fold, and genarray from Single Assignment C
all translated down to expected C code
straightforward parallel implementations of matrixMap,
with, fold, and genarray.
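
One possible reading of `matrixMap(scoreTS, data, [2])` is "apply the function to every 1-D slice along dimension 2 of the 3-D matrix". The sketch below assumes that reading and represents the matrix as nested Python lists; `matrix_map_last_axis` is a hypothetical name.

```python
from typing import Callable, List

Series = List[float]


def matrix_map_last_axis(f: Callable[[Series], Series],
                         data: List[List[Series]]) -> List[List[Series]]:
    """Apply f independently to each 1-D slice along the last axis."""
    return [[f(series) for series in row] for row in data]
```

Each call to `f` touches only its own slice, which is why a parallel implementation of such a map is straightforward.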

17 / 45
Dimension analysis

pound-seconds ≠ newton-seconds
18 / 45
#include "stdio.h"
int main (int argc, char *argv[]) {
  int meter x = 3.4 ;
  int meter y = 5.6 ;
  int meter^2 area = x * y ;

  printf("%d\n", x + y) ;     // OK
  printf("%d\n", x + area) ;  // Error
}

19 / 45
#include "stdio.h"
int main (int argc, char *argv[]) {
  int meter x = 3.4 ;
  int meter y = 5.6 ;
  int meter^2 area = x * y ;

  printf("%d\n", x + y) ;        // OK
  // printf("%d\n", x + area) ;  // Error
}

20 / 45
#include "stdio.h"
int main (int argc, char *argv[]) {
  int x = 3.4 ;
  int y = 5.6 ;
  int area = x * y ;

  printf("%d\n", x + y) ;  // OK
}

Extensions of this form find errors, but otherwise are “erased”
during translation.
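
The rule being checked can be sketched in Python. Note the swap: the extension reports the error statically, before translation, while this sketch can only raise it at run time. The class `Quantity` and its field `meter_exp` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    meter_exp: int  # exponent of the meter dimension: 1 = meter, 2 = meter^2

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition requires identical dimensions.
        if self.meter_exp != other.meter_exp:
            raise TypeError(
                f"cannot add meter^{self.meter_exp} and meter^{other.meter_exp}")
        return Quantity(self.value + other.value, self.meter_exp)

    def __mul__(self, other: "Quantity") -> "Quantity":
        # Multiplication adds the exponents.
        return Quantity(self.value * other.value,
                        self.meter_exp + other.meter_exp)
```

With `x = Quantity(3.0, 1)` and `y = Quantity(5.0, 1)`, `x + y` is accepted and `x * y` has dimension meter^2, while adding a length to an area raises `TypeError`.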

21 / 45
Extension composition
Programmers can select the extensions that they want.
May want to use multiple extensions in the same program.
Distinguish between
1. extension user
has no knowledge of language design or implementations

2. extension developer
must know about language design and implementation

Tools build a custom .xc ⟹ .c translator for them
How can that be done?

22 / 45
Building translators from composable extensible
languages
Two primary challenges:
1. composable syntax — enables building a scanner, parser
context-aware scanning [GPCE’07]
modular determinism analysis [PLDI’09]
Copper

2. composable semantics — analysis and translations
attribute grammars with forwarding, collections and
higher-order attributes
set union of specification components
sets of productions, non-terminals, attributes
sets of attribute defining equations, on a production
sets of equations contributing values to a single attribute

modular well-definedness analysis [SLE’12a]
modular termination analysis [SLE’12b, Krishnan-PhD]
Silver
23 / 45
Generating parsers and scanners from grammars
and regular expressions
nonterminals: Stmt, Expr
terminals: Id   /[a-zA-Z][a-zA-Z0-9]*/
           Num  /[0-9]+/
           Eq   '='
           Semi ';'
           Plus '+'
           Mult '*'
Stmt ::= Stmt Semi Stmt
Stmt ::= Id Eq Expr
Expr ::= Expr Plus Expr
Expr ::= Expr Mult Expr
Expr ::= Id
24 / 45
Parse tree:
Stmt
  Stmt
    Id(x)
    Eq
    Expr
      Expr
        Id(y)
      Plus
      Expr
        Expr
          Num(3)
        Mult
        Expr
          Id(z)
  Semi
  Stmt
    Id(a)
    Eq
    Expr
      Id(b)

Token sequence:
Id(x), Eq, Id(y), Plus, Num(3), Mult, Id(z), Semi, Id(a), Eq, Id(b)

Input text:
“x = y + 3 * z ; a = b”
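
The scanning step, from input text to token sequence, can be sketched with Python's `re` module. This is a conventional, context-free scanner built from the terminal regexes of the grammar slide; it is not the scanner that Copper generates, and the names `TERMINALS`, `MASTER` and `tokenize` are illustrative.

```python
import re
from typing import List

# Terminals from the grammar slide, combined into one alternation.
TERMINALS = [
    ("Id", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("Num", r"[0-9]+"),
    ("Eq", r"="),
    ("Semi", r";"),
    ("Plus", r"\+"),
    ("Mult", r"\*"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TERMINALS))


def tokenize(text: str) -> List[str]:
    """Turn the input text into the token sequence shown on the slide."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        name = m.lastgroup
        # Only Id and Num carry their lexeme.
        tokens.append(f"{name}({m.group()})" if name in ("Id", "Num") else name)
        pos = m.end()
    return tokens
```

The parser then groups this token sequence into the tree according to the productions.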
25 / 45
Attribute Grammars
add semantics — meaning — to context free grammars
nodes (non-terminals) have attributes
that is, semantic values

Expr may be attributed with
type - the type of the expression
errors - list of error messages
env - mapping variable names to their types

Stmt may be attributed with errors and env

26 / 45
[Figure: attributed parse tree for “x = y + 3 * y ; x = z”. The attribute env = [x→int, y→int, z→string] is passed down to every Stmt and Expr node. In the first statement the Expr nodes have type = int; errors = [ ] and the Stmt has errors = [ ]. In the second statement the Expr for Id(z) has type = string, so that Stmt and the root have errors = [ERROR].]
27 / 45
Attribute grammar specifications
Equations associated with productions define attribute values.
abstract production addition
e::Expr ::= l::Expr '+' r::Expr
{
  e.errors := l.errors ++ r.errors ++
              ... check that l and r are integers ...
  e.type = int ;
  l.env = e.env ;
  r.env = e.env ;
}
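
The equations of the addition production can be read as a recursive evaluator: env flows down (inherited), type and errors flow up (synthesized). The sketch below is plain Python, not Silver; the classes `Id`, `Num`, `Add` and the function `decorate` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Id:
    name: str


@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Id, Num, Add]
Env = Dict[str, str]


def decorate(e: Expr, env: Env) -> Tuple[str, List[str]]:
    """Return the synthesized (type, errors) of e under the inherited env."""
    if isinstance(e, Num):
        return "int", []
    if isinstance(e, Id):
        if e.name in env:
            return env[e.name], []
        return "error", [f"undeclared variable {e.name}"]
    # Add: l.env = e.env; r.env = e.env
    l_type, l_errors = decorate(e.left, env)
    r_type, r_errors = decorate(e.right, env)
    # e.errors := l.errors ++ r.errors ++ (check that l and r are integers)
    errors = l_errors + r_errors
    if l_type != "int" or r_type != "int":
        errors = errors + ["operands of + must be integers"]
    # e.type = int
    return "int", errors
```

An attribute grammar system derives such an evaluation order from the equations, so the specification itself stays declarative.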

28 / 45
Modern attribute grammars
higher-order attributes
reference attributes
collection attributes
forwarding
module systems
separate compilation
etc.

29 / 45
for-loop as an extension
abstract production for
s::Stmt ::= i::Name lower::Expr upper::Expr body::Stmt
{
  s.errors := lower.errors ++ upper.errors ++ body.errors ++
              ... check that i is an integer ...
  forwards to
    // i=lower; while (i <= upper) { body; i=i+1; }
    seq(assignment(varRef(i), lower),
        while(lte(varRef(i), upper),
              block(seq(body,
                        assignment(varRef(i),
                                   add(varRef(i), intLit("1")))))));
}
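
Forwarding means that an extension construct supplies an equivalent host-language tree, and any attribute it does not define itself (here, the translation to C) is taken from that tree. The sketch below imitates this in Python; all class names, `forwards_to` and `to_c` are hypothetical and much simpler than Silver's mechanism.

```python
from dataclasses import dataclass


@dataclass
class Raw:  # host: an opaque statement or expression given as C text
    text: str


@dataclass
class Assign:
    name: str
    value: object


@dataclass
class While:
    cond: object
    body: object


@dataclass
class Seq:
    first: object
    second: object


@dataclass
class Lte:
    left: object
    right: object


@dataclass
class Add:
    left: object
    right: object


@dataclass
class For:  # extension construct, unknown to the host translation
    var: str
    lower: object
    upper: object
    body: object

    def forwards_to(self):
        # i = lower; while (i <= upper) { body; i = i + 1; }
        i = Raw(self.var)
        return Seq(Assign(self.var, self.lower),
                   While(Lte(i, self.upper),
                         Seq(self.body, Assign(self.var, Add(i, Raw("1"))))))


def to_c(node) -> str:
    """Host translation: handles host constructs, otherwise follows forwarding."""
    if isinstance(node, Raw):
        return node.text
    if isinstance(node, Assign):
        return f"{node.name} = {to_c(node.value)};"
    if isinstance(node, While):
        return f"while ({to_c(node.cond)}) {{ {to_c(node.body)} }}"
    if isinstance(node, Seq):
        return f"{to_c(node.first)} {to_c(node.second)}"
    if isinstance(node, Lte):
        return f"{to_c(node.left)} <= {to_c(node.right)}"
    if isinstance(node, Add):
        return f"{to_c(node.left)} + {to_c(node.right)}"
    if hasattr(node, "forwards_to"):
        return to_c(node.forwards_to())
    raise TypeError(f"no translation for {node!r}")
```

The extension still defines its own `errors`, so messages can refer to the for-loop the programmer wrote rather than to the generated while-loop.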
30 / 45
Building an attribute grammar evaluator from composed
specifications.

$AG^{H} \cup^{*} \{AG^{E}_{1}, \ldots, AG^{E}_{n}\}$
$\forall i \in [1, n].\ \mathit{modComplete}(AG^{H}, AG^{E}_{i})$
$\implies \mathit{complete}(AG^{H} \cup \{AG^{E}_{1}, \ldots, AG^{E}_{n}\})$
Monolithic analysis - not too hard, but not too useful.
Modular analysis - harder, but required [SLE’12a].

31 / 45
Challenges in scanning

Keywords in embedded languages may be identifiers in host
language:
int SELECT ;
...
rs = using c query { SELECT last_name
                     FROM person WHERE ...

32 / 45
Challenges in scanning

Different extensions use same keyword
connection c "jdbc:derby:./derby/db/testdb"
  with table person [ person_id INTEGER,
                      first_name VARCHAR ];
...
b = table ( c1 : T F ,
c2 : F * ) ;

33 / 45
Challenges in scanning

Operators with different precedence specifications:
x = 3 + y * z ;
...
str = /[a-z][a-z0-9]*\.java/

34 / 45
Challenges in scanning

Terminals that are prefixes of others
List<List<Integer>> dlist ;
...
x = y >> 4 ;

35 / 45
Need for context

Traditionally, parser and scanner are disjoint.
Scanner → Parser → Semantic Analysis
In context aware scanning, they communicate:
Scanner ⇄ Parser → Semantic Analysis

36 / 45
Context aware scanning
Scanner recognizes only tokens valid for current “context”
keeps embedded sub-languages, in a sense, separate
Consider:
chan in, out;
for i in a { a[i] = i*i ; }

Two terminal symbols that match “in”.
terminal IN 'in' ;
terminal ID /[a-zA-Z_][a-zA-Z_0-9]*/
  submits to {keyword};
terminal FOR 'for' lexer class {keyword};

example is part of AbleP [SPIN’11]
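
A context-aware scanner can be sketched as an ordinary longest-match scanner that takes one extra argument: the set of terminals the parser can accept in its current state. The sketch below is Python, not Copper; treating IN as a member of the keyword class is an assumption, and the valid sets are written by hand here whereas a generated scanner would obtain them from the parse table.

```python
import re
from typing import Optional, Set, Tuple

# Terminals from the slide; IN being in the keyword class is an assumption.
TERMINALS = {
    "IN": re.compile(r"in"),
    "FOR": re.compile(r"for"),
    "ID": re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*"),
}
KEYWORD_CLASS = {"IN", "FOR"}


def scan(text: str, pos: int, valid: Set[str]) -> Optional[Tuple[str, str]]:
    """Longest match at pos, considering only the currently valid terminals."""
    best_len = 0
    best = []
    for name in sorted(valid):
        m = TERMINALS[name].match(text, pos)
        if m is None:
            continue
        length = m.end() - pos
        if length > best_len:
            best_len, best = length, [name]
        elif length == best_len:
            best.append(name)
    if not best:
        return None
    # "submits to {keyword}": on a tie, a keyword terminal wins over ID.
    keywords = [name for name in best if name in KEYWORD_CLASS]
    chosen = keywords[0] if keywords else best[0]
    return chosen, text[pos:pos + best_len]
```

After `chan` only an identifier is valid, so `scan("in, out;", 0, {"ID"})` yields an ID; after `for i` only IN is valid, so the same two characters yield the keyword. The "submits to" rule matters only where both terminals are valid at once.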

37 / 45
Parsing C as an extension to Promela
c_decl {
  typedef struct Coord {
    int x, y; } Coord;
}
c_state "Coord pt" "Global"   /* goes in state vector */
int z = 3;                    /* standard global decl */

active proctype example()
{ c_code { now.pt.x = now.pt.y = 0; };
  do :: c_expr { now.pt.x == now.pt.y }
          -> c_code { now.pt.y++; }
     :: else -> break
  od;
  c_code { printf("values %d: %d, %d,%d\n",
    Pexample->_pid, now.z, now.pt.x, now.pt.y);
  }
}
38 / 45
Context aware scanning
This scanning algorithm subordinates the
disambiguation principle of maximal munch
to the principle of
disambiguation by context.
It will return a shorter valid match before a longer invalid
match.
In List<List<Integer>>, when the scanner reaches the first closing “>”,
“>” is in the valid lookahead but “>>” is not.
A context aware scanner is essentially an implicitly-moded
scanner.
There is no explicit specification of valid look ahead.
It is generated from standard grammars and terminal
regexs.
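
The effect on maximal munch can be shown with two terminals. In this Python sketch the valid-lookahead sets are written by hand for illustration; in a generated scanner they come from the parser state. The terminal names GT and RSHIFT are assumptions.

```python
import re
from typing import Optional, Set, Tuple

TERMINALS = {
    "GT": re.compile(r">"),
    "RSHIFT": re.compile(r">>"),
}


def scan(text: str, pos: int, valid: Set[str]) -> Optional[Tuple[str, str]]:
    """Maximal munch restricted to the terminals valid in this context."""
    matches = []
    for name in sorted(valid):
        m = TERMINALS[name].match(text, pos)
        if m is not None:
            matches.append((m.end() - pos, name))
    if not matches:
        return None
    length, name = max(matches)  # longest match among the valid ones
    return name, text[pos:pos + length]


decl = "List<List<Integer>> dlist ;"
expr = "x = y >> 4 ;"
close = decl.index(">")   # first closing angle bracket
shift = expr.index(">")

# Inside the type argument list only ">" is valid, so the shorter match wins.
in_type_context = scan(decl, close, {"GT"})
# Without context, plain maximal munch picks ">>" and the parse fails.
without_context = scan(decl, close, {"GT", "RSHIFT"})
# In an expression, ">>" is valid and is still preferred as the longest match.
in_expr_context = scan(expr, shift, {"GT", "RSHIFT"})
```

Maximal munch is still applied, but only after the candidates have been filtered by context.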
39 / 45
With a smarter scanner, LALR(1) is not so brittle.
We can build syntactically composable language
extensions.
Context aware scanning makes composable syntax “more
likely”
But it does not give a guarantee of composability.

40 / 45
Building a parser from composed specifications.

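$CFG^{H} \cup^{*} \{CFG^{E}_{1}, \ldots, CFG^{E}_{n}\}$
$\forall i \in [1, n].\ \mathit{isComposable}(CFG^{H}, CFG^{E}_{i}) \wedge \mathit{conflictFree}(CFG^{H} \cup CFG^{E}_{i})$
$\implies \mathit{conflictFree}(CFG^{H} \cup \{CFG^{E}_{1}, \ldots, CFG^{E}_{n}\})$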
Monolithic analysis - not too hard, but not too useful.
Modular analysis - harder, but required [PLDI’09].
Non-commutative composition of restricted LALR(1)
grammars.
41 / 45
42 / 45
Expressiveness versus safe composition

Compare to
other parser generators
libraries
The modular compositionality analysis does not require
context aware scanning.
But, context aware scanning makes it practical.

43 / 45
Future Work
ableC - extensible C11 specification
builds on lessons learned from extensible specifications of
Java [ECOOP’07], Lustre [FASE’07], Modelica,
Promela [SPIN’11].
incorporate existing language extensions

composition of language extensions at compile time
language specific analysis
new applications of AGs

44 / 45
Thanks for your attention.

Questions?
http://melt.cs.umn.edu
evw@cs.umn.edu

45 / 45
Eric Van Wyk and August Schwerdfeger.
Context-aware scanning for parsing extensible languages.
In Intl. Conf. on Generative Programming and Component Engineering (GPCE), pages 63–72. ACM, 2007.

Eric Van Wyk, Derek Bodin, Jimin Gao, and Lijesh Krishnan.
Silver: an extensible attribute grammar system.
Science of Computer Programming, 75(1–2):39–54, January 2010.

August Schwerdfeger and Eric Van Wyk.
Verifiable composition of deterministic grammars.
In Proc. of Conf. on Programming Language Design and Implementation (PLDI), pages 199–210. ACM, June 2009.

Ted Kaminski and Eric Van Wyk.
Modular well-definedness analysis for attribute grammars.
In Proc. of Intl. Conf. on Software Language Engineering (SLE), volume 7745 of LNCS, pages 352–371. Springer-Verlag, September 2012.

Lijesh Krishnan and Eric Van Wyk.
Termination analysis for higher-order attribute grammars.
In Proc. of Intl. Conf. on Software Language Engineering (SLE), volume 7745 of LNCS, pages 44–63. Springer-Verlag, September 2012.

Lijesh Krishnan.
Composable Semantics Using Higher-Order Attribute Grammars.
PhD thesis, University of Minnesota, Department of Computer Science and Engineering, 2012.
http://purl.umn.edu/144010

Yogesh Mali and Eric Van Wyk.
Building extensible specifications and implementations of Promela with AbleP.
In Proc. of Intl. SPIN Workshop on Model Checking of Software, volume 6823 of LNCS, pages 108–125. Springer-Verlag, July 2011.

Eric Van Wyk, Lijesh Krishnan, August Schwerdfeger, and Derek Bodin.
Attribute grammar-based language extensions for Java.
In Proc. of European Conf. on Object Oriented Prog. (ECOOP), volume 4609 of LNCS, pages 575–599. Springer-Verlag, 2007.

Jimin Gao, Mats Heimdahl, and Eric Van Wyk.
Flexible and extensible notations for modeling languages.
In Fundamental Approaches to Software Engineering, FASE 2007, volume 4422 of LNCS, pages 102–116. Springer-Verlag, March 2007.

Gen AI in Business - Global Trends Report 2024.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Talk at the Virginia Bioinformatics Institute (VBI), December 5, 2013

  • 1. Extensible domain-specific programming for the sciences. Eric Van Wyk, University of Minnesota. VBI, December 5, 2013. Slides available at http://www.cs.umn.edu/~evw 1 / 45
  • 2. Current trends / topics in PL Formal verification CompCert - http://compcert.inria.fr/ Astrée - http://www.astree.ens.fr/ Hoare logic (1960s) {P} code {Q} Proof assistants: Coq, Abella, Isabelle, ... use required in some PL publishing venues 2 / 45
  • 5. Current trends / topics in PL Parallel programming - multiple cores, everywhere. “no more free lunch” need new abstractions: e.g. Cilk, MapReduce, FP new semantics: e.g. deterministic parallel Java 5 / 45
  • 6. Current trends / topics in PL Expressive and safe static typing: extending richer static types, e.g. append :: ([a], [a]) -> [a], to dependent types, e.g. append :: ([a|n], [a|m]) -> [a|n+m]. This turns array out-of-bounds and null-pointer bugs into static type errors. 6 / 45
  • 7. Extensible languages Allow programmers to select the features to be used in their programming languages. new syntax / notations new semantic analyses / error-checking Why would anyone want to do that? 7 / 45
  • 8. Programming language features General purpose features assignment statements, loops, if-then-else statements functions (perhaps higher-order) and procedures I/O facilities modules data: integer, strings, arrays, records Domain-specific features matrix operations (MATLAB) regular expression matching (Perl, Python) statistics functions (R) computational geometry operations (LN) parallel computing (SISAL, X10, NESL, etc.) Many similarities, needless differences. Working with multiple (domain-specific) languages is a headache. 8 / 45
  • 9. Extensible languages Allow programmers to select the features to be used in their programming languages. new syntax / notations new semantic analyses / error-checking Pick a general purpose host language (e.g. ANSI C), extend with domain-specific features. myProgram.xc ⟹ myProgram.c 9 / 45
  • 10. Regular expressions
          #include "stdio.h"
          #include "regex.h"
          int main (int argc, char *argv[]) {
            char *text = readFileContents("X.data") ;
            // eukaryotic messenger RNA sequences
            regex foo = /^ATG[ATGC]{3,10}A{5,10}$/ ;
            if (text =~ foo)
              printf("Matches...\n") ;
            else
              printf("Doesn't match...\n") ;
          }
        10 / 45
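For readers without the extended C translator, the pattern on slide 10 can be tried in plain Python. This is only a sketch of what the `=~` test checks, not the code the extension generates; the function name `matches_mrna` and the sample strings are assumptions made for illustration.

```python
import re

# Pattern copied from the slide: the start codon ATG, then 3-10 arbitrary
# bases, then a tail of 5-10 adenines, anchored at both ends.
MRNA_PATTERN = re.compile(r"^ATG[ATGC]{3,10}A{5,10}$")


def matches_mrna(text: str) -> bool:
    """Return True when the whole string fits the slide's pattern."""
    return MRNA_PATTERN.match(text) is not None


if __name__ == "__main__":
    # Hypothetical sample sequences, not data from the talk.
    for sample in ("ATGCCCAAAAA", "ATGAAAA", "GGGCCCAAAAA"):
        print(sample, "matches" if matches_mrna(sample) else "doesn't match")
```

The extension's advantage over this library-style use is that the regular expression is checked for well-formedness when the program is translated, not when it runs.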
  • 11. Mining Climate Data - Ocean Eddies Spinning pools of water Transport heat, salt, and nutrients Learning about their behavior is difficult 11 / 45
  • 12. A time slice for a point in the ocean 12 / 45
  • 13.
          main (int argc, char **argv) {
            Matrix float <3> data = readMatrix("ssh.data") ;
            Matrix float <3> scores = matrixMap(scoreTS, data, [2]) ;
            writeMatrix("temporalScores.data", scores) ;
          }
        13 / 45
  • 14.
          Matrix float <1> scoreTS (Matrix float <1> ts) {
            int i = 0, beginning, n = dimSize(ts, 0) ;
            Matrix float <1> scores = init(Matrix float <1>, dimSize(ts, 0)) ;
            while (ts[i] < ts[i+1]) { i = i+1 ; }
            Matrix float [0] trough ;
            while (i < n-1) {
              (trough, beginning, i) = getTrough(ts, i) ;
              scores[beginning :: i] = computeArea(trough) ;
            }
            return scores ;
          }
        14 / 45
  • 15.
          Matrix float <1> computeArea (Matrix float <1> areaOfInterest) {
            float y1 = areaOfInterest[0] ;
            float y2 = areaOfInterest[end] ;
            int x1 = 0 ;
            int x2 = dimSize(areaOfInterest, 0) - 1 ;
            float m = (y1 - y2) / ((float) (x1 - x2)) ;
            float b = y1 - m * x1 ;
            Matrix float <1> Line = (x1 :: x2) * m + b ;
            float area = with (x1 <= i < x2)
                           fold(+, 0.0, Line - areaOfInterest) ;
            return with (0 <= i < dimSize(Line, 0))
                     genarray([dimSize(Line, 0)], area) ;
          }
        15 / 45
  • 16.
          (Matrix float <1>, int, int) getTrough (Matrix float <1> ts, int i) {
            int beginning = i ;
            int n = dimSize(ts, 0) ;
            while (i+1 < n && ts[i] >= ts[i+1]) i = i+1 ;
            while (i+1 < n && ts[i] < ts[i+1]) i = i+1 ;
            return (ts[beginning :: i], beginning, i) ;
          }
        16 / 45
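The three functions on slides 14 to 16 can be followed more easily in a language that runs without the matrix extension. The sketch below transliterates them to plain Python lists, assuming that `a :: b` denotes an inclusive index range and that `with ... fold` sums over the half-open range given. The two bounds guards marked in the comments are additions for this sketch and are not on the slides.

```python
from typing import List, Tuple


def get_trough(ts: List[float], i: int) -> Tuple[List[float], int, int]:
    """Walk downhill, then uphill, from i; return the slice and its inclusive bounds."""
    beginning = i
    n = len(ts)
    while i + 1 < n and ts[i] >= ts[i + 1]:
        i += 1
    while i + 1 < n and ts[i] < ts[i + 1]:
        i += 1
    return ts[beginning:i + 1], beginning, i


def compute_area(area_of_interest: List[float]) -> List[float]:
    """Area between the trough and the straight line joining its end points."""
    x1, x2 = 0, len(area_of_interest) - 1
    if x2 == 0:  # guard added for this sketch: a single point has no area
        return [0.0]
    y1, y2 = area_of_interest[0], area_of_interest[-1]
    m = (y1 - y2) / float(x1 - x2)
    b = y1 - m * x1
    line = [x * m + b for x in range(x1, x2 + 1)]
    area = sum(line[x] - area_of_interest[x] for x in range(x1, x2))
    # Every point of the trough receives the same score.
    return [area] * len(line)


def score_ts(ts: List[float]) -> List[float]:
    """Score each point of one time series by the area of the trough it lies in."""
    n = len(ts)
    scores = [0.0] * n
    i = 0
    # Skip the initial rise; the i + 1 < n check is added for this sketch.
    while i + 1 < n and ts[i] < ts[i + 1]:
        i += 1
    while i < n - 1:
        trough, beginning, i = get_trough(ts, i)
        scores[beginning:i + 1] = compute_area(trough)
    return scores
```

In the talk's version, `matrixMap(scoreTS, data, [2])` applies this scoring along one dimension of a three-dimensional matrix, which is the step the extension can parallelize.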
  • 17. Matrix extensions several features from MATLAB with, fold, and genarray from Single Assignment C all translated down to expected C code straightforward parallel implementations of matrixMap, with, fold, and genarray. 17 / 45
  • 18. Dimension analysis pound-seconds ≠ newton-seconds 18 / 45
  • 19.
          #include "stdio.h"
          int main (int argc, char *argv[]) {
            int meter x = 3.4 ;
            int meter y = 5.6 ;
            int meter^2 area = x * y ;
            printf("%d\n", x + y) ;  // OK
            printf("%d\n", x + z) ;  // Error
          }
        19 / 45
  • 20.
          #include "stdio.h"
          int main (int argc, char *argv[]) {
            int meter x = 3.4 ;
            int meter y = 5.6 ;
            int meter^2 area = x * y ;
            printf("%d\n", x + y) ;  // OK
            // printf("%d\n", x + z) ;  // Error
          }
        20 / 45
  • 21.
          #include "stdio.h"
          int main (int argc, char *argv[]) {
            int x = 3.4 ;
            int y = 5.6 ;
            int area = x * y ;
            printf("%d\n", x + y) ;  // OK
          }
        Extensions of this form find errors, but otherwise are “erased” during translation.
        21 / 45
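The check performed before the units are erased can be illustrated with a small static checker. This is a sketch under stated assumptions, not the extension's implementation: expressions are modelled as nested tuples, units as sorted tuples of (name, exponent) pairs, and the names `units`, `mul_units` and `check` are hypothetical.

```python
from typing import Dict, List, Tuple

# A unit is a sorted tuple of (base unit, exponent) pairs, e.g. (("meter", 2),).
Units = Tuple[Tuple[str, int], ...]


def units(**exponents: int) -> Units:
    """Build a canonical unit, dropping base units with exponent zero."""
    return tuple(sorted((u, e) for u, e in exponents.items() if e != 0))


def mul_units(a: Units, b: Units) -> Units:
    """Multiplying quantities adds the exponents of their units."""
    total: Dict[str, int] = dict(a)
    for unit, exponent in b:
        total[unit] = total.get(unit, 0) + exponent
    return units(**total)


def check(expr, env: Dict[str, Units], errors: List[str]) -> Units:
    """Compute the unit of expr, recording an error for ill-dimensioned sums.

    expr is either a variable name or a tuple (op, left, right) with op
    being '+' or '*'.
    """
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    left_units = check(left, env, errors)
    right_units = check(right, env, errors)
    if op == "*":
        return mul_units(left_units, right_units)
    if left_units != right_units:
        errors.append(f"cannot add {dict(left_units)} and {dict(right_units)}")
    return left_units
```

Because the checker only reads unit annotations and never touches values, dropping the annotations afterwards leaves an ordinary C program, which is the sense in which slide 21 calls the extension “erased”.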
  • 22. Extension composition
        Programmers can select the extensions that they want.
        May want to use multiple extensions in the same program.
        Distinguish between
          1. extension user: has no knowledge of language design or implementations
          2. extension developer: must know about language design and implementation
        Tools build a custom .xc ⟹ .c translator for them.
        How can that be done?
        22 / 45
  • 23. Building translators from composable extensible languages
        Two primary challenges:
          1. composable syntax: enables building a scanner and parser
               context-aware scanning [GPCE’07]
               modular determinism analysis [PLDI’09]
               tool: Copper
          2. composable semantics: analysis and translations
               attribute grammars with forwarding, collections and higher-order attributes
               set union of specification components
                 sets of productions, non-terminals, attributes
                 sets of attribute defining equations, on a production
                 sets of equations contributing values to a single attribute
               modular well-definedness analysis [SLE’12a]
               modular termination analysis [SLE’12b, Krishnan-PhD]
               tool: Silver
        23 / 45
  • 24. Generating parsers and scanners from grammars and regular expressions
          nonterminals: Stmt, Expr
          terminals: Id   /[a-zA-Z][a-zA-Z0-9]*/
                     Num  /[0-9]+/
                     Eq   '='
                     Semi ';'
                     Plus '+'
                     Mult '*'
          Stmt ::= Stmt Semi Stmt
          Stmt ::= Id Eq Expr
          Expr ::= Expr Plus Expr
          Expr ::= Expr Mult Expr
          Expr ::= Id
        24 / 45
  • 25. [Figure: parse tree for the input below, built from Stmt and Expr nodes over the token sequence.]
        Tokens: Id(x), Eq, Id(y), Plus, Num(3), Mult, Id(z), Semi, Id(a), Eq, Id(b)
        Input: “x = y + 3 * z ; a = b”
        25 / 45
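The scanning half of slides 24 and 25 can be reproduced with a few lines of Python. The sketch below builds one combined regular expression from the terminal declarations on slide 24 and produces the token sequence shown on slide 25. It only illustrates the idea of deriving a scanner from terminal regexes; it is not how Copper generates scanners, and it covers scanning only, not parsing. Python's alternation picks the first alternative that matches instead of the longest one, which is harmless here because no two of these terminals can match at the same position.

```python
import re
from typing import List

# Terminal declarations from slide 24, as (name, regex) pairs.
TERMINALS = [
    ("Id", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("Num", r"[0-9]+"),
    ("Eq", r"="),
    ("Semi", r";"),
    ("Plus", r"\+"),
    ("Mult", r"\*"),
]

# One named group per terminal, tried in declaration order.
SCANNER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TERMINALS))


def scan(text: str) -> List[str]:
    """Turn the input text into a list of token names, skipping whitespace."""
    tokens: List[str] = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = SCANNER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        name = match.lastgroup
        # Identifiers and numbers carry their lexeme; the others do not need it.
        tokens.append(f"{name}({match.group()})" if name in ("Id", "Num") else name)
        pos = match.end()
    return tokens
```

Calling `scan("x = y + 3 * z ; a = b")` yields the eleven tokens listed on slide 25.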
  • 26. Attribute Grammars add semantics — meaning — to context free grammars nodes (non-terminals) have attributes that is, semantic values Expr may be attributed with type - the type of the expression errors - list of error messages env - mapping variable names to their types Stmt may be attributed with errors and env 26 / 45
  • 27. [Figure: parse tree decorated with attribute values. The environment env = [x→int, y→int, z→string] is passed down to every Stmt and Expr node. Expr nodes carry type and errors, e.g. type = int; errors = [ ]. The Stmt that assigns the string-typed Id(z) to Id(x) has errors = [ERROR], and so does the root Stmt.]
        27 / 45
  • 28. Attribute grammar specifications
        Equations associated with productions define attribute values.
          abstract production addition
          e::Expr ::= l::Expr '+' r::Expr
          {
            e.errors := l.errors ++ r.errors ++
                        ... check that l and r are integers ...
            e.type = int ;
            l.env = e.env ;
            r.env = e.env ;
          }
        28 / 45
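The equations of the addition production can be read as a recursive evaluation: `env` flows down from parent to children (an inherited attribute), while `type` and `errors` flow up (synthesized attributes). The Python sketch below mimics that flow for a two-construct expression language. The class names `Var` and `Add`, the function `decorate` and the error messages are assumptions for illustration; Silver evaluates attributes on demand and is not implemented this way.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Var:
    name: str


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Add]
Env = Dict[str, str]  # maps variable names to type names


def decorate(e: Expr, env: Env) -> Tuple[str, List[str]]:
    """Return the synthesized (type, errors) of e for the inherited env."""
    if isinstance(e, Var):
        if e.name not in env:
            return "error", [f"undeclared variable {e.name}"]
        return env[e.name], []
    # Addition: l.env = e.env and r.env = e.env, so both children see env.
    l_type, l_errors = decorate(e.left, env)
    r_type, r_errors = decorate(e.right, env)
    local: List[str] = []
    # "check that l and r are integers"; operands already in error are not
    # reported a second time.
    for operand_type in (l_type, r_type):
        if operand_type not in ("int", "error"):
            local.append(f"operand of + has type {operand_type}, expected int")
    # e.errors := l.errors ++ r.errors ++ local checks ; e.type = int
    return "int", l_errors + r_errors + local
```

With the environment from slide 27, `x + y` decorates to type `int` with no errors, while `x + z` reports one error because `z` is a string.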
  • 29. Modern attribute grammars higher-order attributes reference attributes collection attributes forwarding module systems separate compilation etc. 29 / 45
  • 30. for-loop as an extension
          abstract production for
          s::Stmt ::= i::Name lower::Expr upper::Expr body::Stmt
          {
            s.errors := lower.errors ++ upper.errors ++ body.errors ++
                        ... check that i is an integer ...
            forwards to
              // i=lower ; while (i <= upper) { body ; i=i+1; }
              seq(assignment(varRef(i), lower),
                  while(lte(varRef(i), upper),
                        block(seq(body,
                                  assignment(varRef(i),
                                             add(varRef(i), intLit("1"))))))) ;
          }
        30 / 45
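Forwarding means that the extension construct defines only what is specific to it (here, its error checks) and obtains everything else from the host-language tree it forwards to. The Python sketch below shows that idea with a tiny interpreter: the host knows `seq`, `assign` and `while`, and the `for` node is executed by running its forward. The tuple encoding and the function names are assumptions for illustration, and the `block` wrapper from the slide is omitted for brevity.

```python
from typing import Dict


def for_loop(i, lower, upper, body):
    """Extension construct: for i from lower to upper do body."""
    return ("for", i, lower, upper, body)


def forward(stmt):
    """Host-language tree a 'for' node forwards to:
    i = lower ; while (i <= upper) { body ; i = i + 1 }"""
    _, i, lower, upper, body = stmt
    increment = ("assign", i, ("add", ("var", i), ("int", 1)))
    return ("seq",
            ("assign", i, lower),
            ("while", ("lte", ("var", i), upper), ("seq", body, increment)))


def eval_expr(e, store: Dict[str, int]):
    tag = e[0]
    if tag == "int":
        return e[1]
    if tag == "var":
        return store[e[1]]
    if tag == "add":
        return eval_expr(e[1], store) + eval_expr(e[2], store)
    if tag == "lte":
        return eval_expr(e[1], store) <= eval_expr(e[2], store)
    raise ValueError(f"unknown expression {tag}")


def run(stmt, store: Dict[str, int]) -> None:
    tag = stmt[0]
    if tag == "for":
        # The extension defines no execution rule of its own; it forwards.
        run(forward(stmt), store)
    elif tag == "seq":
        run(stmt[1], store)
        run(stmt[2], store)
    elif tag == "assign":
        store[stmt[1]] = eval_expr(stmt[2], store)
    elif tag == "while":
        while eval_expr(stmt[1], store):
            run(stmt[2], store)
    else:
        raise ValueError(f"unknown statement {tag}")
```

Running a loop from 1 to 4 whose body adds `i` to `total` leaves `total` at 10, even though the interpreter contains no loop-specific rule for `for`.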
  • 31. Building an attribute grammar evaluator from composed specifications.
        ... $AG^{H} \cup^{*} \{AG^{E_1}, \ldots, AG^{E_n}\}$
        $\forall i \in [1,n].\ \mathit{modComplete}(AG^{H}, AG^{E_i}) \Longrightarrow \mathit{complete}(AG^{H} \cup \{AG^{E_1}, \ldots, AG^{E_n}\})$
        Monolithic analysis - not too hard, but not too useful.
        Modular analysis - harder, but required [SLE’12a].
        31 / 45
  • 32. Challenges in scanning Keywords in embedded languages may be identifiers in host language: int SELECT ; ... rs = using c query { SELECT last_name FROM person WHERE ... 32 / 45
  • 33. Challenges in scanning Different extensions use same keyword: connection c "jdbc:derby:./derby/db/testdb" with table person [ person_id INTEGER, first_name VARCHAR ]; ... b = table ( c1 : T F , c2 : F * ) ; 33 / 45
  • 34. Challenges in scanning Operators with different precedence specifications: x = 3 + y * z ; ... str = /[a-z][a-z0-9]*.java/ 34 / 45
  • 35. Challenges in scanning Terminals that are prefixes of others List<List<Integer>> dlist ; ... x = y >> 4 ; 35 / 45
  • 36. Need for context Traditionally, parser and scanner are disjoint: Scanner → Parser → Semantic Analysis. In context aware scanning, they communicate: Scanner ⇄ Parser → Semantic Analysis 36 / 45
  • 37. Context aware scanning
        Scanner recognizes only tokens valid for current “context”;
        keeps embedded sub-languages, in a sense, separate.
        Consider:
          chan in, out;
          for i in a { a[i] = i*i ; }
        Two terminal symbols that match “in”:
          terminal IN 'in' ;
          terminal ID /[a-zA-Z_][a-zA-Z_0-9]*/ submits to {keyword} ;
          terminal FOR 'for' lexer class {keyword} ;
        example is part of AbleP [SPIN’11]
        37 / 45
  • 38. Parsing C as an extension to Promela
          c_decl {
            typedef struct Coord {
              int x, y; } Coord;
          }
          c_state "Coord pt" "Global"   /* goes in state vector */
          int z = 3;                    /* standard global decl */
          active proctype example()
          {
            c_code { now.pt.x = now.pt.y = 0; };
            do
            :: c_expr { now.pt.x == now.pt.y }
                 -> c_code { now.pt.y++; }
            :: else -> break
            od;
            c_code { printf("values %d: %d, %d,%d\n",
                       Pexample->_pid, now.z, now.pt.x, now.pt.y);
            }
          }
        38 / 45
  • 39. Context aware scanning This scanning algorithm subordinates the disambiguation principle of maximal munch to the principle of disambiguation by context. It will return a shorter valid match before a longer invalid match. In List<List<Integer>>, before the first “>”, “>” is in the valid lookahead but “>>” is not. A context aware scanner is essentially an implicitly-moded scanner. There is no explicit specification of valid lookahead. It is generated from standard grammars and terminal regexes. 39 / 45
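The behaviour described on slides 37 and 39 can be sketched as a scanner that is handed the set of terminals valid in the current parser state and applies maximal munch only within that set. In the Python sketch below the valid sets are written by hand, whereas a generated context-aware scanner obtains them from the parse table. The terminal names `GT` and `RSHIFT` and the function `scan_one` are assumptions for illustration.

```python
import re
from typing import Optional, Set, Tuple

TERMINALS = {
    "IN": r"in",
    "FOR": r"for",
    "ID": r"[a-zA-Z_][a-zA-Z_0-9]*",
    "GT": r">",
    "RSHIFT": r">>",
}
# ID "submits to" the keyword lexer class, which contains FOR (slide 37).
KEYWORDS = {"FOR"}


def scan_one(text: str, pos: int, valid: Set[str]) -> Optional[Tuple[str, str]]:
    """Return (terminal, lexeme) for the longest match among valid terminals."""
    matches = []
    for name in sorted(valid):
        m = re.compile(TERMINALS[name]).match(text, pos)
        if m is not None:
            matches.append((name, m.group()))
    if not matches:
        return None
    longest = max(len(lexeme) for _, lexeme in matches)
    candidates = [(n, lx) for n, lx in matches if len(lx) == longest]
    if len(candidates) == 1:
        return candidates[0]
    # Several terminals match the same text: a keyword wins over ID.
    keywords = [c for c in candidates if c[0] in KEYWORDS]
    if len(keywords) == 1:
        return keywords[0]
    raise ValueError(f"lexical ambiguity among {[n for n, _ in candidates]}")
```

After `chan`, only an identifier is valid, so `scan_one("in, out;", 0, {"ID"})` returns `("ID", "in")`; inside the for-loop header only `IN` is valid, so the same text scans as a keyword. Likewise `scan_one(">> dlist", 0, {"GT"})` returns the shorter `">"` because `RSHIFT` is not valid in that context.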
  • 40. With a smarter scanner, LALR(1) is not so brittle. We can build syntactically composable language extensions. Context aware scanning makes composable syntax “more likely” But it does not give a guarantee of composability. 40 / 45
  • 41. Building a parser from composed specifications.
        ... $CFG^{H} \cup^{*} \{CFG^{E_1}, \ldots, CFG^{E_n}\}$
        $\forall i \in [1,n].\ \mathit{isComposable}(CFG^{H}, CFG^{E_i}) \wedge \mathit{conflictFree}(CFG^{H} \cup CFG^{E_i}) \Longrightarrow \mathit{conflictFree}(CFG^{H} \cup \{CFG^{E_1}, \ldots, CFG^{E_n}\})$
        Monolithic analysis - not too hard, but not too useful.
        Modular analysis - harder, but required [PLDI’09].
        Non-commutative composition of restricted LALR(1) grammars.
        41 / 45
  • 43. Expressiveness versus safe composition Compare to other parser generators libraries The modular compositionality analysis does not require context aware scanning. But, context aware scanning makes it practical. 43 / 45
  • 44. Future Work ableC - extensible C11 specification builds on lessons learned from extensible specifications of Java [ECOOP’07], Lustre [FASE’07], Modelica, Promela [SPIN’11]. incorporate existing language extensions composition of language extensions are compile-time language specific analysis new applications of AGs 44 / 45
  • 45. Thanks for your attention. Questions? http://melt.cs.umn.edu evw@cs.umn.edu 45 / 45
  • 46. [GPCE’07] Eric Van Wyk and August Schwerdfeger. Context-aware scanning for parsing extensible languages. In Intl. Conf. on Generative Programming and Component Engineering (GPCE), pages 63–72. ACM, 2007.
        Eric Van Wyk, Derek Bodin, Jimin Gao, and Lijesh Krishnan. Silver: an extensible attribute grammar system. Science of Computer Programming, 75(1–2):39–54, January 2010.
        [PLDI’09] August Schwerdfeger and Eric Van Wyk. Verifiable composition of deterministic grammars. In Proc. of Conf. on Programming Language Design and Implementation (PLDI), pages 199–210. ACM, June 2009.
  • 47. [SLE’12a] Ted Kaminski and Eric Van Wyk. Modular well-definedness analysis for attribute grammars. In Proc. of Intl. Conf. on Software Language Engineering (SLE), volume 7745 of LNCS, pages 352–371. Springer-Verlag, September 2012.
        [SLE’12b] Lijesh Krishnan and Eric Van Wyk. Termination analysis for higher-order attribute grammars. In Proc. of Intl. Conf. on Software Language Engineering (SLE), volume 7745 of LNCS, pages 44–63. Springer-Verlag, September 2012.
        [Krishnan-PhD] Lijesh Krishnan. Composable Semantics Using Higher-Order Attribute Grammars. PhD thesis, University of Minnesota, Department of Computer Science and Engineering, 2012. http://purl.umn.edu/144010
  • 48. [SPIN’11] Yogesh Mali and Eric Van Wyk. Building extensible specifications and implementations of Promela with AbleP. In Proc. of Intl. SPIN Workshop on Model Checking of Software, volume 6823 of LNCS, pages 108–125. Springer-Verlag, July 2011.
        [ECOOP’07] Eric Van Wyk, Lijesh Krishnan, August Schwerdfeger, and Derek Bodin. Attribute grammar-based language extensions for Java. In Proc. of European Conf. on Object Oriented Prog. (ECOOP), volume 4609 of LNCS, pages 575–599. Springer-Verlag, 2007.
  • 49. [FASE’07] Jimin Gao, Mats Heimdahl, and Eric Van Wyk. Flexible and extensible notations for modeling languages. In Fundamental Approaches to Software Engineering (FASE), volume 4422 of LNCS, pages 102–116. Springer-Verlag, March 2007.