SlideShare ist ein Scribd-Unternehmen logo
1 von 67
Downloaden Sie, um offline zu lesen
SLATE 2015, Madrid, Spain June 18-19, 2015
1/67
The Application of Grammar
Inference to Software
Language Engineering
Marjan Mernik
University of Maribor, Slovenia
SLATE 2015, Madrid, Spain June 18-19, 2015
2/67
Outline of the Presentation
• What is a Grammar Inference (GI)?
• A short introduction to Software Language
Engineering (SLE)
• Some applications of GI to SLE and SE (CFG
inference, Metamodel inference, Graph
grammar inference, semantic inference)
• Inside GI
• Conclusion
SLATE 2015, Madrid, Spain June 18-19, 2015
3/67
What is a Grammar Inference (GI)?
• Grammatical inference is a process of
learning the grammar from positive (and
negative) language samples (sentences).
• Grammatical inference attracts researchers
from different fields such as pattern
recognition, computational linguistic,
natural language acquisition, software
engineering, ...
SLATE 2015, Madrid, Spain June 18-19, 2015
4/67
What is a Grammar Inference (GI)?
• Given a sentence ps and a grammar G we can
tell whether ps belongs to L(G) (ps ∈ L(G)).
Such sentence is called positive sample.
• A set of positive samples is denoted with S+. In
similar manner we can defined set of negative
samples S-. Those samples do not belong to
L(G) and can no be derived from starting
symbol S.
SLATE 2015, Madrid, Spain June 18-19, 2015
5/67
What is a Grammar Inference (GI)?
• Given a set S+ and S-, which might be also
empty, the task of grammar inference is to
find at least one grammar G such that
S+⊆L(G) and S-⊆L (G).
• A set of positive samples S+ of a L(G) is
structurally complete if each grammar
production is used in the generation of at
least one sentence in S+.
SLATE 2015, Madrid, Spain June 18-19, 2015
6/67
What is a Grammar Inference (GI)?
print 5
print a where a=10
print b+1 where b=1
print a+b+2 where a=1, b=2
What
computer
language
she used?
Try out our newly
developed grammar
inference algorithm!
What is a
grammar of
this
language?
SLATE 2015, Madrid, Spain June 18-19, 2015
7/67
A short introduction to SLE
• Software Language Engineering (SLE) is a
young engineering discipline with the aim of
establishing a systematic and rigorous
approach to the development, use, and
maintenance of computer languages, which
comprises specification, modeling and
programming languages.
• SLE is equally focused on General-Purpose
Langauges (GPLs) and Domain-Specific
Languages (DSLs)
SLATE 2015, Madrid, Spain June 18-19, 2015
8/67
A short introduction to SLE
Technicalliterature
Existing
implementations
Customsurveys
Expertadvice
Currentandfuture
requirements
Terminology
Concepts
Commonalities
Variations
Syntax
Semantics
DSLDSL
Possible
existing
implementations
Requirements
DSL
DSL development phases
SLATE 2015, Madrid, Spain June 18-19, 2015
9/67
Some applications of GI to SLE
• A. Stevenson, J. R. Cordy. A Survey of grammatical inference
in software engineering. Science of Computer Programming
96 (2014) 444-459.
– Inference of GPLs
– Inference of DSLs (grammar and metamodel
based)
– Inference of Visual Languages (based on graph
grammars)
– Inference of execution traces
SLATE 2015, Madrid, Spain June 18-19, 2015
10/67
Some applications of GI to SLE
• Inference of GPLs
– Problem is still too complex to infer a GPL with several
hundreds of production rules.
– Several successful attempts have been developed to infer a
dialect GPL grammar G’ from existing GPL grammar G.
M. Di Penta, P. Lombardi, K. Taneja, L. Troiano. Search-
Based Inference of Dialect Grammars. Soft Computing 12
(2008) 51-66.
A. Dubej, P. Jalote, S. Aggarval. Learning context free
grammar rules from a set of programs. IET Software 2
(2008) 223-240.
SLATE 2015, Madrid, Spain June 18-19, 2015
11/67
Some applications of GI to SLE
• Inference of GPLs
– Restrictive assumption have been declared in these works:
1. G and G’ share the same set of non-terminals
2. T ⊆ T’
3. (T’-T) consists only of keywords and operators
4. Production rules can be only added and not removed
5. Missing rules always begin with a keyword
SLATE 2015, Madrid, Spain June 18-19, 2015
12/67
Some applications of GI to SLE
• HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard.
Embedding DSLS into GPLS: A Grammatical Inference
Approach. Information Technology and Control , 2011, vol.
40, no. 4, pp. 307-315.
• Our approach can be used also for syntax
extensions and for DSL embedding
– To embed domain-specific language (e.g, SQL)
into another programming language (GPL or DSL)
SLATE 2015, Madrid, Spain June 18-19, 2015
13/67
Some applications of GI to SLE
• Initial grammar (ANSI C):
1. translation unit ::= external decl
2. translation unit ::= translation unit external decl
3. external decl ::= function denition
4. external decl ::= decl
6. function denition ::= declarator decl list compound stat
9. decl ::= decl specs init declarator list ;
10. decl ::= decl specs ;
11. decl list ::= decl
12. decl list ::= decl list decl
15. decl specs ::= type spec decl specs
27. type spec ::= int | long | ...
45. init declarator list ::= init declarator
46. init declarator list ::= init declarator list , init declarator
47. init declarator ::= declarator
64. enumerator ::= id
65. enumerator ::= id = const exp
67. declarator ::= direct declarator
68. direct declarator ::= id
69. direct declarator ::= ( declarator )
70. direct declarator ::= direct declarator [ const exp ]
71. direct declarator ::= direct declarator [ ]
72. direct declarator ::= direct declarator ( param type list )
73. direct declarator ::= direct declarator ( id list )
74. direct declarator ::= direct declarator ( )
88. id list ::= id
89. id list ::= id list , id
90. initializer ::= assignment exp
91. initializer ::= initializer list
93. initializer list ::= initializer
94. initializer list ::= initializer list , initializer
110. stat ::= labeled stat | exp stat | compound stat | selection stat
114. stat ::= iteration stat | jump stat
116. labeled stat ::= id : stat
117. labeled stat ::= case const exp : stat
118. labeled stat ::= default : stat
119. exp stat ::= exp ;
120. exp stat ::= ;
121. compound stat ::= decl list stat list
125. stat list ::= stat
126. stat list ::= stat list stat
127. selection stat ::= if ( exp ) stat
129. selection stat ::= switch ( exp ) stat
130. iteration stat ::= while ( exp ) stat
131. iteration stat ::= do stat while ( exp ) ;
132. iteration stat ::= for ( exp ; exp ; exp ) stat
140. jump stat ::= goto id ; | continue ; | break ; | return exp ;
145. exp ::= assignment exp
146. exp ::= exp , assignment exp
147. assignment exp ::= conditional exp
148. assignment exp ::= conditional exp assignment operator
assignment exp
205. const ::= int const | char const | oat const
SLATE 2015, Madrid, Spain June 18-19, 2015
14/67
Some applications of GI to SLE
• Initial grammar (ANSI C):
int main() {
char str[][];
int i;
printf("Students:");
for(i = 0; i < str.length; i++) {
printf(str[i]);
}
return 0;
}
int main() {
char str[][] = { SELECT Name FROM
Students };
int i;
printf("Students:");
for(i = 0; i < str.length; i++) {
printf(str[i]);
}
return 0;
}
true positive sample false negative samples:
int main() {
char str[][] = { SELECT Name, Surname
FROM Students, Professors };
int i;
printf("Students and Professors:");
for(i = 0; i < str.length; i++) {
printf(str[i]);
}
return 0;
}
SLATE 2015, Madrid, Spain June 18-19, 2015
15/67
Some applications of GI to SLE
• Inferred Grammar:
1. translation unit ::= external decl
2. translation unit ::= translation unit external decl
3. external decl ::= function denition
4. external decl ::= decl
6. function denition ::= declarator decl list compound stat
9. decl ::= decl specs init declarator list ;
10. decl ::= decl specs ;
11. decl list ::= decl
12. decl list ::= decl list decl
15. decl specs ::= type spec decl specs
27. type spec ::= int | long | ...
45. init declarator list ::= init declarator
46. init declarator list ::= init declarator list , init declarator
47. init declarator ::= declarator
64. enumerator ::= id
65. enumerator ::= id = const exp
67. declarator ::= direct declarator NT1
68. direct declarator ::= id
69. direct declarator ::= ( declarator )
70. direct declarator ::= direct declarator [ const exp ]
71. direct declarator ::= direct declarator [ ]
72. direct declarator ::= direct declarator ( param type list )
73. direct declarator ::= direct declarator ( id list )
74. direct declarator ::= direct declarator ( )
88. id list ::= id
89. id list ::= id list , id
90. initializer ::= assignment exp
91. initializer ::= initializer list
93. initializer list ::= initializer
94. initializer list ::= initializer list , initializer
110. stat ::= labeled stat | exp stat | compound stat | selection stat
114. stat ::= iteration stat | jump stat
116. labeled stat ::= id : stat
117. labeled stat ::= case const exp : stat
118. labeled stat ::= default : stat
119. exp stat ::= exp ;
120. exp stat ::= ;
121. compound stat ::= decl list stat list
125. stat list ::= stat
126. stat list ::= stat list stat
127. selection stat ::= if ( exp ) stat
129. selection stat ::= switch ( exp ) stat
130. iteration stat ::= while ( exp ) stat
131. iteration stat ::= do stat while ( exp ) ;
132. iteration stat ::= for ( exp ; exp ; exp ) stat
140. jump stat ::= goto id ; | continue ; | break ; | return exp ;
145. exp ::= assignment exp
146. exp ::= exp , assignment exp
147. assignment exp ::= conditional exp
148. assignment exp ::= conditional exp assignment operator
assignment exp
205. const ::= int const | char const | oat const
208. NT1 ::= = SELECT id NT2 FROM id NT2 | ϵ
210. NT2 ::= , id NT2 | ϵ
SLATE 2015, Madrid, Spain June 18-19, 2015
16/67
Some applications of GI to SLE
• Inference of DSLs
– A DSL grammar is usually small enough that the inference
process is feasible.
– Inference of grammar-based DSLs.
– Inference of metamodel-base DSLs.
HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard,
JAVED, Faizan. A memetic grammar inference algorithm for
language learning. Applied Soft Computing, 2012, vol. 12, iss. 3,
pp. 1006-1020.
HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard.
Improving grammar inference by a memetic algorithm. IEEE
Transactions on Systems, Man, and Cybernetics - Part C, 2012,
vol. 42, no. 5, pp. 692-703.
SLATE 2015, Madrid, Spain June 18-19, 2015
17/67
Some applications of GI to SLE
• 12 input samples of DESK language on which the
algorithm was tested:
1. print a
2. print 3
3. print b + 14
4. print a + b + c
5. print a where b = 14
6. print 10 where d = 15
7. print 9 + b where b = 16
8. print 1 + 2 where id = 1
9. print a where b = 5, c = 4
10. print 21 where a = 6, b = 5
11. print 5 + 6 where a = 3, c = 14
12. print a + b + c where a = 4, b = 3, c = 2
SLATE 2015, Madrid, Spain June 18-19, 2015
18/67
Some applications of GI to SLE
Original grammar:
1. DESK ::= print E C
2. E ::= E + F
3. E ::= F
4. F ::= id
5. F ::= num
6. C ::= where Ds
7. C ::= ε
8. Ds ::= D
9. Ds ::= Ds , D
10. D ::= id = num
Inferred grammar:
1: NT1 -> print NT3 NT5
2: NT2 -> + NT3
3: NT2 -> ε
4: NT3 -> num NT2
5: NT3 -> id NT2
6: NT4 -> , id = num NT4
7: NT4 -> ε
8: NT5 -> where id = num NT4
9: NT5 -> ε
SLATE 2015, Madrid, Spain June 18-19, 2015
19/67
Some applications of GI to SLE
DSL for hypertree description
SLATE 2015, Madrid, Spain June 18-19, 2015
20/67
Some applications of GI to SLE
Inferred grammar for hypertree description DSL
SLATE 2015, Madrid, Spain June 18-19, 2015
21/67
Some applications of GI to SLE
• As a model conforms to a metamodel in a
similar manner to how a program conforms to a
grammar, the metamodel inference can be
defined as follows.
• The set of all models that conform to a given
metamodel MM will be called the language of
the metamodel and denoted L(MM). Given a
model instance m and a metamodel MM we can
tell whether m conforms to MM (m ∈ L(MM)).
SLATE 2015, Madrid, Spain June 18-19, 2015
22/67
Some applications of GI to SLE
• A set of positive samples is denoted with S+.
Conversely, a negative sample belongs to
L(MM), which denotes a set of all models that
do not conform to metamodel MM. A set of
negative samples is denoted with S-.
• A set of positive samples S+ of a metamodel
MM is structurally complete if each
metamodel element appears in at least one
model in S+.
SLATE 2015, Madrid, Spain June 18-19, 2015
23/67
Some applications of GI to SLE
• Given a set of positive samples S+ and set of
negative samples S-, which might be also
empty, the task of metamodel inference is to
find at least one metamodel MM such that
S+⊆L(MM) and S-⊆L(MM).
SLATE 2015, Madrid, Spain June 18-19, 2015
24/67
Some applications of GI to SLE
• JAVED, Faizan, MERNIK, Marjan, GRAY, Jeffrey G., BRYANT,
Barrett Richard. MARS: A Metamodel Recovery System Using
Grammar Inference. Information and Software Technology,
2008, vol. 50, iss. 9-10, pp. 948-968.
• Qichao Liu, Jeff Gray, Marjan Mernik, Barrett R. Bryant.
Application of Metamodel Inference with Large-Scale
Metamodels. Int. J. Software and Informatics 6(2): 201-231
(2012)
SLATE 2015, Madrid, Spain June 18-19, 2015
25/67
Some applications of GI to SLE
• ESML (Embedded System Modeling Language)
SLATE 2015, Madrid, Spain June 18-19, 2015
26/67
Some applications of GI to SLE
• Original ESML metamodel - Configuration viewpoint
SLATE 2015, Madrid, Spain June 18-19, 2015
27/67
Some applications of GI to SLE
• Inferred ESML metamodel - Configuration viewpoint
SLATE 2015, Madrid, Spain June 18-19, 2015
28/67
Some applications of GI to SLE
• Inference of Visual Languages (based on graph
grammars)
FÜRST, Luka, MERNIK, Marjan, MAHNIČ, Viljan. Graph grammar
induction as a parser-controlled heuristic search process.
AGTIVE’12, pp. 121-136.
FÜRST, Luka, MERNIK, Marjan, MAHNIČ, Viljan. Converting
metamodels to graph grammars: doing without advanced graph
grammar features. Software and System Modeling (SoSym),
2013, Article in Press.
SLATE 2015, Madrid, Spain June 18-19, 2015
29/67
Graph grammar inference
Positive and negative samples for hydrocarbons
with single and double bonds
SLATE 2015, Madrid, Spain June 18-19, 2015
30/67
Graph grammar inference
Inferred graph grammar
SLATE 2015, Madrid, Spain June 18-19, 2015
31/67
Graph grammar inference
Positive samples
for flowcharts
SLATE 2015, Madrid, Spain June 18-19, 2015
32/67
Graph grammar inference
Inferred graph grammar
SLATE 2015, Madrid, Spain June 18-19, 2015
33/67
Some applications of GI to SLE
• Inference of execution traces
– GI is well suited to analyses involving program
execution because the events in execution traces
are temporally ordered in the same way that
tokens are spatially ordered.
– GI can be used in the dynamic analysis of
software.
N. Walkinshaw, K. Bogdanov, M. Holcombe, S. Salahuddin.
Improving dynamic software analysis by applying grammar
inference principles. J. Softw. Maint. Evol. 20 (2008) 269-
290.
SLATE 2015, Madrid, Spain June 18-19, 2015
34/67
Some applications of GI to SLE
• Javier Canovas, Jordi Cabot, Jesus Lopez-Fernandez, Jesus
Sanchez, Cuadrado, Esther Guerra, Juan De Lara. Engaging
End-Users in the Collaborative Development of Domain-
Specific Modelling Languages, CDVE 2013: 101-110.
SLATE 2015, Madrid, Spain June 18-19, 2015
35/67
Some applications of GI to SLE
• Javier Canovas, Jordi Cabot, Jesus Lopez-Fernandez, Jesus
Sanchez, Cuadrado, Esther Guerra, Juan De Lara. Engaging
End-Users in the Collaborative Development of Domain-
Specific Modelling Languages. CDVE 2013: 101-110.
SLATE 2015, Madrid, Spain June 18-19, 2015
36/67
Some applications of GI to SLE
• Model evolution using metamodel inference
SLATE 2015, Madrid, Spain June 18-19, 2015
37/67
Some applications of GI to SE
• Applications to Software Engineering (SE)
• Grammar-based systems
• Marjan Mernik, Matej Crepinsek, Tomaz Kosar,
Damijan Rebernak, Viljem Zumer. Grammar-Based
Systems: Definition and Examples. Informatica
28(3): 245-255 (2004)
SLATE 2015, Madrid, Spain June 18-19, 2015
38/67
Some applications of GI to SE
• A grammar-based system is any system that uses a
grammar and/or sentences produced by this
grammar to solve various problems outside the
domain of programming language definition and its
implementation. The vital component of such a
system is well structured and expressed with a
grammar or with sentences produced by this
grammar in an explicit or implicit manners.
SLATE 2015, Madrid, Spain June 18-19, 2015
39/67
Some applications of GI to SE
• Applications to Software Engineering (SE)
• Wil van der Aalst. Process Mining: Making Sense of Processes
Hidden in Big Event Data. Invited talk at FedCSIS 2013
• Sentence = Trace in event log
• Grammar = Process model
SLATE 2015, Madrid, Spain June 18-19, 2015
40/67
Inside GI
• Several theoretical learning models.
– Identification in the limit (allows the inference
algorithm to converge on the target grammar
given a sufficiently large quantity of sentences).
– Teacher and queries (introduces a teacher who
knows the target language and can answer
particular types of queries from the learner)
– Probably Approximately Correct (PAC) learning
(doesn’t guarantee exact identification).
SLATE 2015, Madrid, Spain June 18-19, 2015
41/67
Inside GI
• Gold Theorem (1967) - it is impossible to
identify any of the four classes of languages
in the Chomsky hierarchy in the limit using
only positive samples.
• Using both negative and positive samples,
the Chomsky hierarchy languages can be
identified in the limit.
SLATE 2015, Madrid, Spain June 18-19, 2015
42/67
Inside GI
• Intuitively, Gold's theorem can be explained
by recognizing the fact that the final
generalization of positive samples would be
an automation that accept all strings.
• Singular use of positive samples results in an
uncertainty as to when the generalization
steps should be stopped. This implies the
need for some restrictions or background
knowledge on the generalization process.
SLATE 2015, Madrid, Spain June 18-19, 2015
43/67
Inside GI
• A lot of research has been done on
extraction of context-free grammars, but
the problem is still not solved sufficiently
mainly due to immense search space.
SLATE 2015, Madrid, Spain June 18-19, 2015
44/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
45/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
46/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
47/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
48/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
49/67
Inside GI
• Memetic algorithms are evolutionary
algorithms with local search operator
– use of evolutionary concepts (population,
evolutionary operators)
– improves the search for solutions with local
search.
SLATE 2015, Madrid, Spain June 18-19, 2015
50/67
Inside GI
example n
example 1
regular
definitions
...
initiali-
zation
local
search
mutation
generali-
zation
selection
evolutionary
cycle
found
grammars
parse positive
examples
MAGIc
evaluate
(LISA
parser)
- simple
- Sequitur
diff
• Memetic Algorithm for Grammatical Inference
SLATE 2015, Madrid, Spain June 18-19, 2015
51/67
Inside GI
• Sequitur: http://sequitur.info/
• abcabdabcabd
0 → 1 1
1 → 2 c 2 d
2 → a b
• p i w i=n, i=n // print id where id=n, id=n
0 → p 1 w 2, 2
1 → i
2 → 1 = n
SLATE 2015, Madrid, Spain June 18-19, 2015
52/67
Inside GI
print a where c=2
print 5+b where b = 10print id where id=num
print num+id where id=num
SLATE 2015, Madrid, Spain June 18-19, 2015
53/67
Inside GI
Apply diff command!
1a2,3
> num
> +
print id where id=num
print num+id where id=num
What is the difference
among two samples?
print id where id=num
print num+id where id=num
But where to change the
grammar?
SLATE 2015, Madrid, Spain June 18-19, 2015
54/67
Inside GI
Start with the grammar
that parses first sample:
print a where c=2
N1 ::= print N2 where id = num
N2 ::= id
Use information from
LR(1) parsing on 2nd
sample.
Configurations returned from the
LR(1) parser:
Nx → α1 • α2
Ny → β •
Nz → • γ
SLATE 2015, Madrid, Spain June 18-19, 2015
55/67
Inside GI
• Input samples:
s1,s2,...,sn (true positive)
s1,s2,...,sk,a1,...,am,sk+1,...sn (false negative)
– difference: a1,...,am
SLATE 2015, Madrid, Spain June 18-19, 2015
56/67
Inside GI
• Nx → α1 • α2
– if
Nx ::= α1 N1 α2
N1 ::= ai+1 ... am
N1 ::= ε
– if
Nx ::= α1 N1
N1 ::= α2
N1 ::= ai+1 ... am
– if
change in this configuration can’t be made
)FIRST(αs 21k ∈+
FOLLOW(Nx)s)FIRST(αs 1k21k ∈∧∉ ++
FOLLOW(Nx)s)FIRST(αs 1k21k ∉∧∉ ++
SLATE 2015, Madrid, Spain June 18-19, 2015
57/67
Inside GI
print a where c=2
print 5+b where b = 10
N1 → print • N2 where id = num
N1 ::= print N2 where id = num
N2 ::= id
N1 ::= print N3 N2 where id = num
N2 ::= id
N3 ::= num +
N3 ::= ε
SLATE 2015, Madrid, Spain June 18-19, 2015
58/67
Inside GI
Production: Nx ::= α1 Ny α2
Option
Nx ::= α1 Nz α2
Nz ::= Ny
Nz ::= ε
But, how mutation is
done?
SLATE 2015, Madrid, Spain June 18-19, 2015
59/67
Inside GI
Nx ::= α Ny Nx ::= Ny Ny
Ny ::= α Ny ::= α
Ny ::= β Ny ::= βWhat about
generalization step?
Nx ::= Ny Ny Nx ::= Ny
Ny ::= α Ny ::= α Ny
Ny ::= β Ny ::= β Ny
Ny ::= ε
SLATE 2015, Madrid, Spain June 18-19, 2015
60/67
Semantic inference
a) Grammar G:
E ::= T EE
EE ::= + T EE
EE ::= ε
T ::= int
b) Attributes and their types:
int synthesized vals
int inherited inVal
c) Set of terminals T = {‘nonterminal.attribute’,
‘lexValue’}
d) Set of functions F = {sum}
Set of positive programs with
associated meanings:
(5, 5)
(2+5, 7)
(10+5+8, 23)
SLATE 2015, Madrid, Spain June 18-19, 2015
61/67
Semantic inference
// infered attribute grammars
E ::= T EE
{ E[0].vals = EE[0].vals;
EE[0].inVal = T[0].vals;}
EE ::= + T EE
{ EE[0].vals = EE[1].vals;
EE[1].inVal = sum(T[0].vals, EE[0].inVal);}
EE ::= ε
{EE[0].vals = EE[0].inVal;}
T ::= int
{T[0].vals = Integer.parseInt(#int.value());}
SLATE 2015, Madrid, Spain June 18-19, 2015
62/67
Semantic inference
SLATE 2015, Madrid, Spain June 18-19, 2015
63/67
Semantic inference
Actually, the prototypical implementation found two additional
attribute grammars, which differ in the semantics for the 2nd
production EE ::= + T EE, but both are valid. The following two
semantic rules for the 2nd production were found:
a) EE ::= + T EE
{ EE[0].vals = sum(EE[1].vals, T[0].vals);
EE[1].inVal = EE[0].inVal;}
b) EE ::= + T EE
{ EE[0].vals = sum(EE[1].vals, EE[0].inVal);
EE[1].inVal = T[0].vals;}
SLATE 2015, Madrid, Spain June 18-19, 2015
64/67
Semantic inference
SLATE 2015, Madrid, Spain June 18-19, 2015
65/67
Semantic inference
SLATE 2015, Madrid, Spain June 18-19, 2015
66/67
Conclusion
Hope that I convinced you
that grammatical inference is
interesting and useful.
Yes, I will used in my
current project on
business process
mining.
SLATE 2015, Madrid, Spain June 18-19, 2015
67/67
Conclusion
Collaborators:
M. Črepinšek, D. Hrnčič1,
B. Bryant2, A. Sprague2, J. Gray2, J. Faizan2, Q. Liu2
L. Fürst3, V. Mahnič3
1University of Maribor, Slovenia
2The University of Alabama at Birmingham, USA
3University of Ljubljana, Slovenia
Sent comments/questions to:
marjan.mernik@um.si

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (15)

TMPA-2017: Static Checking of Array Objects in JavaScript
TMPA-2017: Static Checking of Array Objects in JavaScriptTMPA-2017: Static Checking of Array Objects in JavaScript
TMPA-2017: Static Checking of Array Objects in JavaScript
 
Functional programming-advantages
Functional programming-advantagesFunctional programming-advantages
Functional programming-advantages
 
Isorc18 keynote
Isorc18 keynoteIsorc18 keynote
Isorc18 keynote
 
Ch6
Ch6Ch6
Ch6
 
Deep C
Deep CDeep C
Deep C
 
C++ Tokens
C++ TokensC++ Tokens
C++ Tokens
 
Complete C++ programming Language Course
Complete C++ programming Language CourseComplete C++ programming Language Course
Complete C++ programming Language Course
 
C language
C languageC language
C language
 
C C++ tutorial for beginners- tibacademy.in
C C++ tutorial for beginners- tibacademy.inC C++ tutorial for beginners- tibacademy.in
C C++ tutorial for beginners- tibacademy.in
 
C Programming
C ProgrammingC Programming
C Programming
 
Ada lab manual
Ada lab manualAda lab manual
Ada lab manual
 
Learning c - An extensive guide to learn the C Language
Learning c - An extensive guide to learn the C LanguageLearning c - An extensive guide to learn the C Language
Learning c - An extensive guide to learn the C Language
 
C++ unit-1-part-7
C++ unit-1-part-7C++ unit-1-part-7
C++ unit-1-part-7
 
C++ for beginners
C++ for beginnersC++ for beginners
C++ for beginners
 
Glimpses of C++0x
Glimpses of C++0xGlimpses of C++0x
Glimpses of C++0x
 

Ähnlich wie The Application of Grammar Inference to Software Language Engineering

Generic Programming seminar
Generic Programming seminarGeneric Programming seminar
Generic Programming seminar
Gautam Roy
 

Ähnlich wie The Application of Grammar Inference to Software Language Engineering (20)

C and Data structure lab manual ECE (2).pdf
C and Data structure lab manual ECE (2).pdfC and Data structure lab manual ECE (2).pdf
C and Data structure lab manual ECE (2).pdf
 
Decaf language specification
Decaf language specificationDecaf language specification
Decaf language specification
 
Fundamentals of computer programming by Dr. A. Charan Kumari
Fundamentals of computer programming by Dr. A. Charan KumariFundamentals of computer programming by Dr. A. Charan Kumari
Fundamentals of computer programming by Dr. A. Charan Kumari
 
C programing Tutorial
C programing TutorialC programing Tutorial
C programing Tutorial
 
2 EPT 162 Lecture 2
2 EPT 162 Lecture 22 EPT 162 Lecture 2
2 EPT 162 Lecture 2
 
Scala as a Declarative Language
Scala as a Declarative LanguageScala as a Declarative Language
Scala as a Declarative Language
 
dinoC_ppt.pptx
dinoC_ppt.pptxdinoC_ppt.pptx
dinoC_ppt.pptx
 
Tesseract. Recognizing Errors in Recognition Software
Tesseract. Recognizing Errors in Recognition SoftwareTesseract. Recognizing Errors in Recognition Software
Tesseract. Recognizing Errors in Recognition Software
 
18CSS101J PROGRAMMING FOR PROBLEM SOLVING
18CSS101J PROGRAMMING FOR PROBLEM SOLVING18CSS101J PROGRAMMING FOR PROBLEM SOLVING
18CSS101J PROGRAMMING FOR PROBLEM SOLVING
 
Lec-1c.pdf
Lec-1c.pdfLec-1c.pdf
Lec-1c.pdf
 
Applying Compiler Techniques to Iterate At Blazing Speed
Applying Compiler Techniques to Iterate At Blazing SpeedApplying Compiler Techniques to Iterate At Blazing Speed
Applying Compiler Techniques to Iterate At Blazing Speed
 
Perl
PerlPerl
Perl
 
OpenGurukul : Language : C Programming
OpenGurukul : Language : C ProgrammingOpenGurukul : Language : C Programming
OpenGurukul : Language : C Programming
 
Prog1-L2.pptx
Prog1-L2.pptxProg1-L2.pptx
Prog1-L2.pptx
 
Learn C
Learn CLearn C
Learn C
 
201801 CSE240 Lecture 07
201801 CSE240 Lecture 07201801 CSE240 Lecture 07
201801 CSE240 Lecture 07
 
Esoft Metro Campus - Programming with C++
Esoft Metro Campus - Programming with C++Esoft Metro Campus - Programming with C++
Esoft Metro Campus - Programming with C++
 
Survey of Program Transformation Technologies
Survey of Program Transformation TechnologiesSurvey of Program Transformation Technologies
Survey of Program Transformation Technologies
 
C++ lecture 01
C++   lecture 01C++   lecture 01
C++ lecture 01
 
Generic Programming seminar
Generic Programming seminarGeneric Programming seminar
Generic Programming seminar
 

Mehr von Facultad de Informática UCM

Mehr von Facultad de Informática UCM (20)

¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?
 
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
 
DRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersDRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation Computers
 
uElectronics ongoing activities at ESA
uElectronics ongoing activities at ESAuElectronics ongoing activities at ESA
uElectronics ongoing activities at ESA
 
Tendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmTendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura Arm
 
Formalizing Mathematics in Lean
Formalizing Mathematics in LeanFormalizing Mathematics in Lean
Formalizing Mathematics in Lean
 
Introduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingIntroduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented Computing
 
Computer Design Concepts for Machine Learning
Computer Design Concepts for Machine LearningComputer Design Concepts for Machine Learning
Computer Design Concepts for Machine Learning
 
Inteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroInteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuro
 
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
 
Fault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error Correction
 
Cómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoCómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intento
 
Automatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCAutomatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPC
 
Type and proof structures for concurrency
Type and proof structures for concurrencyType and proof structures for concurrency
Type and proof structures for concurrency
 
Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...
 
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
 
Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?
 
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
 
Challenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windChallenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore wind
 

Kürzlich hochgeladen

Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 

Kürzlich hochgeladen (20)

Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 

The Application of Grammar Inference to Software Language Engineering

  • 1. SLATE 2015, Madrid, Spain June 18-19, 2015 1/67 The Application of Grammar Inference to Software Language Engineering Marjan Mernik University of Maribor, Slovenia
  • 2. SLATE 2015, Madrid, Spain June 18-19, 2015 2/67 Outline of the Presentation • What is a Grammar Inference (GI)? • A short introduction to Software Language Engineering (SLE) • Some applications of GI to SLE and SE (CFG inference, Metamodel inference, Graph grammar inference, semantic inference) • Inside GI • Conclusion
  • 3. SLATE 2015, Madrid, Spain June 18-19, 2015 3/67 What is a Grammar Inference (GI)? • Grammatical inference is a process of learning the grammar from positive (and negative) language samples (sentences). • Grammatical inference attracts researchers from different fields such as pattern recognition, computational linguistic, natural language acquisition, software engineering, ...
  • 4. SLATE 2015, Madrid, Spain June 18-19, 2015 4/67 What is a Grammar Inference (GI)? • Given a sentence ps and a grammar G we can tell whether ps belongs to L(G) (ps ∈ L(G)). Such sentence is called positive sample. • A set of positive samples is denoted with S+. In similar manner we can defined set of negative samples S-. Those samples do not belong to L(G) and can no be derived from starting symbol S.
  • 5. SLATE 2015, Madrid, Spain June 18-19, 2015 5/67 What is a Grammar Inference (GI)? • Given a set S+ and S-, which might be also empty, the task of grammar inference is to find at least one grammar G such that S+⊆L(G) and S-⊆L (G). • A set of positive samples S+ of a L(G) is structurally complete if each grammar production is used in the generation of at least one sentence in S+.
  • 6. SLATE 2015, Madrid, Spain June 18-19, 2015 6/67 What is a Grammar Inference (GI)? print 5 print a where a=10 print b+1 where b=1 print a+b+2 where a=1, b=2 What computer language she used? Try out our newly developed grammar inference algorithm! What is a grammar of this language?
  • 7. SLATE 2015, Madrid, Spain June 18-19, 2015 7/67 A short introduction to SLE • Software Language Engineering (SLE) is a young engineering discipline with the aim of establishing a systematic and rigorous approach to the development, use, and maintenance of computer languages, which comprises specification, modeling and programming languages. • SLE is equally focused on General-Purpose Langauges (GPLs) and Domain-Specific Languages (DSLs)
  • 8. SLATE 2015, Madrid, Spain June 18-19, 2015 8/67 A short introduction to SLE Technicalliterature Existing implementations Customsurveys Expertadvice Currentandfuture requirements Terminology Concepts Commonalities Variations Syntax Semantics DSLDSL Possible existing implementations Requirements DSL DSL development phases
  • 9. SLATE 2015, Madrid, Spain June 18-19, 2015 9/67 Some applications of GI to SLE • A. Stevenson, J. R. Cordy. A Survey of grammatical inference in software engineering. Science of Computer Programming 96 (2014) 444-459. – Inference of GPLs – Inference of DSLs (grammar and metamodel based) – Inference of Visual Languages (based on graph grammars) – Inference of execution traces
  • 10. SLATE 2015, Madrid, Spain June 18-19, 2015 10/67 Some applications of GI to SLE • Inference of GPLs – Problem is still too complex to infer a GPL with several hundreds of production rules. – Several successful attempts have been developed to infer a dialect GPL grammar G’ from existing GPL grammar G. M. Di Penta, P. Lombardi, K. Taneja, L. Troiano. Search- Based Inference of Dialect Grammars. Soft Computing 12 (2008) 51-66. A. Dubej, P. Jalote, S. Aggarval. Learning context free grammar rules from a set of programs. IET Software 2 (2008) 223-240.
  • 11. SLATE 2015, Madrid, Spain June 18-19, 2015 11/67 Some applications of GI to SLE • Inference of GPLs – Restrictive assumption have been declared in these works: 1. G and G’ share the same set of non-terminals 2. T ⊆ T’ 3. (T’-T) consists only of keywords and operators 4. Production rules can be only added and not removed 5. Missing rules always begin with a keyword
  • 12. SLATE 2015, Madrid, Spain June 18-19, 2015 12/67 Some applications of GI to SLE • HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard. Embedding DSLS into GPLS: A Grammatical Inference Approach. Information Technology and Control , 2011, vol. 40, no. 4, pp. 307-315. • Our approach can be used also for syntax extensions and for DSL embedding – To embed domain-specific language (e.g, SQL) into another programming language (GPL or DSL)
  • 13. SLATE 2015, Madrid, Spain June 18-19, 2015 13/67 Some applications of GI to SLE • Initial grammar (ANSI C): 1. translation unit ::= external decl 2. translation unit ::= translation unit external decl 3. external decl ::= function denition 4. external decl ::= decl 6. function denition ::= declarator decl list compound stat 9. decl ::= decl specs init declarator list ; 10. decl ::= decl specs ; 11. decl list ::= decl 12. decl list ::= decl list decl 15. decl specs ::= type spec decl specs 27. type spec ::= int | long | ... 45. init declarator list ::= init declarator 46. init declarator list ::= init declarator list , init declarator 47. init declarator ::= declarator 64. enumerator ::= id 65. enumerator ::= id = const exp 67. declarator ::= direct declarator 68. direct declarator ::= id 69. direct declarator ::= ( declarator ) 70. direct declarator ::= direct declarator [ const exp ] 71. direct declarator ::= direct declarator [ ] 72. direct declarator ::= direct declarator ( param type list ) 73. direct declarator ::= direct declarator ( id list ) 74. direct declarator ::= direct declarator ( ) 88. id list ::= id 89. id list ::= id list , id 90. initializer ::= assignment exp 91. initializer ::= initializer list 93. initializer list ::= initializer 94. initializer list ::= initializer list , initializer 110. stat ::= labeled stat | exp stat | compound stat | selection stat 114. stat ::= iteration stat | jump stat 116. labeled stat ::= id : stat 117. labeled stat ::= case const exp : stat 118. labeled stat ::= default : stat 119. exp stat ::= exp ; 120. exp stat ::= ; 121. compound stat ::= decl list stat list 125. stat list ::= stat 126. stat list ::= stat list stat 127. selection stat ::= if ( exp ) stat 129. selection stat ::= switch ( exp ) stat 130. iteration stat ::= while ( exp ) stat 131. iteration stat ::= do stat while ( exp ) ; 132. iteration stat ::= for ( exp ; exp ; exp ) stat 140. jump stat ::= goto id ; | continue ; | break ; | return exp ; 145. exp ::= assignment exp 146. exp ::= exp , assignment exp 147. assignment exp ::= conditional exp 148. assignment exp ::= conditional exp assignment operator assignment exp 205. const ::= int const | char const | oat const
  • 14. SLATE 2015, Madrid, Spain June 18-19, 2015 14/67 Some applications of GI to SLE • Initial grammar (ANSI C): int main() { char str[][]; int i; printf("Students:"); for(i = 0; i < str.length; i++) { printf(str[i]); } return 0; } int main() { char str[][] = { SELECT Name FROM Students }; int i; printf("Students:"); for(i = 0; i < str.length; i++) { printf(str[i]); } return 0; } true positive sample false negative samples: int main() { char str[][] = { SELECT Name, Surname FROM Students, Professors }; int i; printf("Students and Professors:"); for(i = 0; i < str.length; i++) { printf(str[i]); } return 0; }
  • 15. SLATE 2015, Madrid, Spain June 18-19, 2015 15/67 Some applications of GI to SLE • Inferred Grammar: 1. translation unit ::= external decl 2. translation unit ::= translation unit external decl 3. external decl ::= function denition 4. external decl ::= decl 6. function denition ::= declarator decl list compound stat 9. decl ::= decl specs init declarator list ; 10. decl ::= decl specs ; 11. decl list ::= decl 12. decl list ::= decl list decl 15. decl specs ::= type spec decl specs 27. type spec ::= int | long | ... 45. init declarator list ::= init declarator 46. init declarator list ::= init declarator list , init declarator 47. init declarator ::= declarator 64. enumerator ::= id 65. enumerator ::= id = const exp 67. declarator ::= direct declarator NT1 68. direct declarator ::= id 69. direct declarator ::= ( declarator ) 70. direct declarator ::= direct declarator [ const exp ] 71. direct declarator ::= direct declarator [ ] 72. direct declarator ::= direct declarator ( param type list ) 73. direct declarator ::= direct declarator ( id list ) 74. direct declarator ::= direct declarator ( ) 88. id list ::= id 89. id list ::= id list , id 90. initializer ::= assignment exp 91. initializer ::= initializer list 93. initializer list ::= initializer 94. initializer list ::= initializer list , initializer 110. stat ::= labeled stat | exp stat | compound stat | selection stat 114. stat ::= iteration stat | jump stat 116. labeled stat ::= id : stat 117. labeled stat ::= case const exp : stat 118. labeled stat ::= default : stat 119. exp stat ::= exp ; 120. exp stat ::= ; 121. compound stat ::= decl list stat list 125. stat list ::= stat 126. stat list ::= stat list stat 127. selection stat ::= if ( exp ) stat 129. selection stat ::= switch ( exp ) stat 130. iteration stat ::= while ( exp ) stat 131. iteration stat ::= do stat while ( exp ) ; 132. iteration stat ::= for ( exp ; exp ; exp ) stat 140. jump stat ::= goto id ; | continue ; | break ; | return exp ; 145. exp ::= assignment exp 146. exp ::= exp , assignment exp 147. assignment exp ::= conditional exp 148. assignment exp ::= conditional exp assignment operator assignment exp 205. const ::= int const | char const | oat const 208. NT1 ::= = SELECT id NT2 FROM id NT2 | ϵ 210. NT2 ::= , id NT2 | ϵ
  • 16. SLATE 2015, Madrid, Spain June 18-19, 2015 16/67 Some applications of GI to SLE • Inference of DSLs – A DSL grammar is usually small enough that the inference process is feasible. – Inference of grammar-based DSLs. – Inference of metamodel-base DSLs. HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard, JAVED, Faizan. A memetic grammar inference algorithm for language learning. Applied Soft Computing, 2012, vol. 12, iss. 3, pp. 1006-1020. HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard. Improving grammar inference by a memetic algorithm. IEEE Transactions on Systems, Man, and Cybernetics - Part C, 2012, vol. 42, no. 5, pp. 692-703.
  • 17. SLATE 2015, Madrid, Spain June 18-19, 2015 17/67 Some applications of GI to SLE • 12 input samples of DESK language on which the algorithm was tested: 1. print a 2. print 3 3. print b + 14 4. print a + b + c 5. print a where b = 14 6. print 10 where d = 15 7. print 9 + b where b = 16 8. print 1 + 2 where id = 1 9. print a where b = 5, c = 4 10. print 21 where a = 6, b = 5 11. print 5 + 6 where a = 3, c = 14 12. print a + b + c where a = 4, b = 3, c = 2
  • 18. SLATE 2015, Madrid, Spain June 18-19, 2015 18/67 Some applications of GI to SLE Original grammar: 1. DESK ::= print E C 2. E ::= E + F 3. E ::= F 4. F ::= id 5. F ::= num 6. C ::= where Ds 7. C ::= ε 8. Ds ::= D 9. Ds ::= Ds , D 10. D ::= id = num Inferred grammar: 1: NT1 -> print NT3 NT5 2: NT2 -> + NT3 3: NT2 -> ε 4: NT3 -> num NT2 5: NT3 -> id NT2 6: NT4 -> , id = num NT4 7: NT4 -> ε 8: NT5 -> where id = num NT4 9: NT5 -> ε
  • 19. SLATE 2015, Madrid, Spain June 18-19, 2015 19/67 Some applications of GI to SLE DSL for hypertree description
  • 20. SLATE 2015, Madrid, Spain June 18-19, 2015 20/67 Some applications of GI to SLE Inferred grammar for hypertree description DSL
  • 21. SLATE 2015, Madrid, Spain June 18-19, 2015 21/67 Some applications of GI to SLE • As a model conforms to a metamodel in a similar manner to how a program conforms to a grammar, the metamodel inference can be defined as follows. • The set of all models that conform to a given metamodel MM will be called the language of the metamodel and denoted L(MM). Given a model instance m and a metamodel MM we can tell whether m conforms to MM (m ∈ L(MM)).
  • 22. SLATE 2015, Madrid, Spain June 18-19, 2015 22/67 Some applications of GI to SLE • A set of positive samples is denoted with S+. Conversely, a negative sample belongs to L(MM), which denotes a set of all models that do not conform to metamodel MM. A set of negative samples is denoted with S-. • A set of positive samples S+ of a metamodel MM is structurally complete if each metamodel element appears in at least one model in S+.
  • 23. SLATE 2015, Madrid, Spain June 18-19, 2015 23/67 Some applications of GI to SLE • Given a set of positive samples S+ and set of negative samples S-, which might be also empty, the task of metamodel inference is to find at least one metamodel MM such that S+⊆L(MM) and S-⊆L(MM).
  • 24. SLATE 2015, Madrid, Spain June 18-19, 2015 24/67 Some applications of GI to SLE • JAVED, Faizan, MERNIK, Marjan, GRAY, Jeffrey G., BRYANT, Barrett Richard. MARS: A Metamodel Recovery System Using Grammar Inference. Information and Software Technology, 2008, vol. 50, iss. 9-10, pp. 948-968. • Qichao Liu, Jeff Gray, Marjan Mernik, Barrett R. Bryant. Application of Metamodel Inference with Large-Scale Metamodels. Int. J. Software and Informatics 6(2): 201-231 (2012)
  • 25. SLATE 2015, Madrid, Spain June 18-19, 2015 25/67 Some applications of GI to SLE • ESML (Embedded System Modeling Language)
  • 26. SLATE 2015, Madrid, Spain June 18-19, 2015 26/67 Some applications of GI to SLE • Original ESML metamodel - Configuration viewpoint
  • 27. SLATE 2015, Madrid, Spain June 18-19, 2015 27/67 Some applications of GI to SLE • Inferred ESML metamodel - Configuration viewpoint
  • 28. SLATE 2015, Madrid, Spain June 18-19, 2015 28/67 Some applications of GI to SLE • Inference of Visual Languages (based on graph grammars) FÜRST, Luka, MERNIK, Marjan, MAHNIČ, Viljan. Graph grammar induction as a parser-controlled heuristic search process. AGTIVE’12, pp. 121-136. FÜRST, Luka, MERNIK, Marjan, MAHNIČ, Viljan. Converting metamodels to graph grammars: doing without advanced graph grammar features. Software and System Modeling (SoSym), 2013, Article in Press.
  • 29. SLATE 2015, Madrid, Spain June 18-19, 2015 29/67 Graph grammar inference Positive and negative samples for hydrocarbons with single and double bonds
  • 30. SLATE 2015, Madrid, Spain June 18-19, 2015 30/67 Graph grammar inference Inferred graph grammar
  • 31. SLATE 2015, Madrid, Spain June 18-19, 2015 31/67 Graph grammar inference Positive samples for flowcharts
  • 32. SLATE 2015, Madrid, Spain June 18-19, 2015 32/67 Graph grammar inference Inferred graph grammar
  • 33. SLATE 2015, Madrid, Spain June 18-19, 2015 33/67 Some applications of GI to SLE • Inference of execution traces – GI is well suited to analyses involving program execution because the events in execution traces are temporally ordered in the same way that tokens are spatially ordered. – GI can be used in the dynamic analysis of software. N. Walkinshaw, K. Bogdanov, M. Holcombe, S. Salahuddin. Improving dynamic software analysis by applying grammar inference principles. J. Softw. Maint. Evol. 20 (2008) 269- 290.
  • 34. SLATE 2015, Madrid, Spain June 18-19, 2015 34/67 Some applications of GI to SLE • Javier Canovas, Jordi Cabot, Jesus Lopez-Fernandez, Jesus Sanchez, Cuadrado, Esther Guerra, Juan De Lara. Engaging End-Users in the Collaborative Development of Domain- Specific Modelling Languages, CDVE 2013: 101-110.
  • 35. SLATE 2015, Madrid, Spain June 18-19, 2015 35/67 Some applications of GI to SLE • Javier Canovas, Jordi Cabot, Jesus Lopez-Fernandez, Jesus Sanchez, Cuadrado, Esther Guerra, Juan De Lara. Engaging End-Users in the Collaborative Development of Domain- Specific Modelling Languages. CDVE 2013: 101-110.
  • 36. SLATE 2015, Madrid, Spain June 18-19, 2015 36/67 Some applications of GI to SLE • Model evolution using metamodel inference
  • 37. SLATE 2015, Madrid, Spain June 18-19, 2015 37/67 Some applications of GI to SE • Applications to Software Engineering (SE) • Grammar-based systems • Marjan Mernik, Matej Crepinsek, Tomaz Kosar, Damijan Rebernak, Viljem Zumer. Grammar-Based Systems: Definition and Examples. Informatica 28(3): 245-255 (2004)
  • 38. SLATE 2015, Madrid, Spain June 18-19, 2015 38/67 Some applications of GI to SE • A grammar-based system is any system that uses a grammar and/or sentences produced by this grammar to solve various problems outside the domain of programming language definition and its implementation. The vital component of such a system is well structured and expressed with a grammar or with sentences produced by this grammar in an explicit or implicit manners.
  • 39. SLATE 2015, Madrid, Spain June 18-19, 2015 39/67 Some applications of GI to SE • Applications to Software Engineering (SE) • Wil van der Aalst. Process Mining: Making Sense of Processes Hidden in Big Event Data. Invited talk at FedCSIS 2013 • Sentence = Trace in event log • Grammar = Process model
  • 40. SLATE 2015, Madrid, Spain June 18-19, 2015 40/67 Inside GI • Several theoretical learning models. – Identification in the limit (allows the inference algorithm to converge on the target grammar given a sufficiently large quantity of sentences). – Teacher and queries (introduces a teacher who knows the target language and can answer particular types of queries from the learner) – Probably Approximately Correct (PAC) learning (doesn’t guarantee exact identification).
  • 41. SLATE 2015, Madrid, Spain June 18-19, 2015 41/67 Inside GI • Gold Theorem (1967) - it is impossible to identify any of the four classes of languages in the Chomsky hierarchy in the limit using only positive samples. • Using both negative and positive samples, the Chomsky hierarchy languages can be identified in the limit.
  • 42. SLATE 2015, Madrid, Spain June 18-19, 2015 42/67 Inside GI • Intuitively, Gold's theorem can be explained by recognizing the fact that the final generalization of positive samples would be an automation that accept all strings. • Singular use of positive samples results in an uncertainty as to when the generalization steps should be stopped. This implies the need for some restrictions or background knowledge on the generalization process.
  • 43. SLATE 2015, Madrid, Spain June 18-19, 2015 43/67 Inside GI • A lot of research has been done on extraction of context-free grammars, but the problem is still not solved sufficiently mainly due to immense search space.
  • 44. SLATE 2015, Madrid, Spain June 18-19, 2015 44/67 Inside GI
  • 45. SLATE 2015, Madrid, Spain June 18-19, 2015 45/67 Inside GI
  • 46. SLATE 2015, Madrid, Spain June 18-19, 2015 46/67 Inside GI
  • 47. SLATE 2015, Madrid, Spain June 18-19, 2015 47/67 Inside GI
  • 48. SLATE 2015, Madrid, Spain June 18-19, 2015 48/67 Inside GI
  • 49. SLATE 2015, Madrid, Spain June 18-19, 2015 49/67 Inside GI • Memetic algorithms are evolutionary algorithms with local search operator – use of evolutionary concepts (population, evolutionary operators) – improves the search for solutions with local search.
  • 50. SLATE 2015, Madrid, Spain June 18-19, 2015 50/67 Inside GI example n example 1 regular definitions ... initiali- zation local search mutation generali- zation selection evolutionary cycle found grammars parse positive examples MAGIc evaluate (LISA parser) - simple - Sequitur diff • Memetic Algorithm for Grammatical Inference
  • 51. SLATE 2015, Madrid, Spain June 18-19, 2015 51/67 Inside GI • Sequitur: http://sequitur.info/ • abcabdabcabd 0 → 1 1 1 → 2 c 2 d 2 → a b • p i w i=n, i=n // print id where id=n, id=n 0 → p 1 w 2, 2 1 → i 2 → 1 = n
  • 52. SLATE 2015, Madrid, Spain June 18-19, 2015 52/67 Inside GI print a where c=2 print 5+b where b = 10print id where id=num print num+id where id=num
  • 53. SLATE 2015, Madrid, Spain June 18-19, 2015 53/67 Inside GI Apply diff command! 1a2,3 > num > + print id where id=num print num+id where id=num What is the difference among two samples? print id where id=num print num+id where id=num But where to change the grammar?
  • 54. SLATE 2015, Madrid, Spain June 18-19, 2015 54/67 Inside GI Start with the grammar that parses first sample: print a where c=2 N1 ::= print N2 where id = num N2 ::= id Use information from LR(1) parsing on 2nd sample. Configurations returned from the LR(1) parser: Nx → α1 • α2 Ny → β • Nz → • γ
  • 55. SLATE 2015, Madrid, Spain June 18-19, 2015 55/67 Inside GI • Input samples: s1,s2,...,sn (true positive) s1,s2,...,sk,a1,...,am,sk+1,...sn (false negative) – difference: a1,...,am
  • 56. SLATE 2015, Madrid, Spain June 18-19, 2015 56/67 Inside GI • Nx → α1 • α2 – if Nx ::= α1 N1 α2 N1 ::= ai+1 ... am N1 ::= ε – if Nx ::= α1 N1 N1 ::= α2 N1 ::= ai+1 ... am – if change in this configuration can’t be made )FIRST(αs 21k ∈+ FOLLOW(Nx)s)FIRST(αs 1k21k ∈∧∉ ++ FOLLOW(Nx)s)FIRST(αs 1k21k ∉∧∉ ++
  • 57. SLATE 2015, Madrid, Spain June 18-19, 2015 57/67 Inside GI print a where c=2 print 5+b where b = 10 N1 → print • N2 where id = num N1 ::= print N2 where id = num N2 ::= id N1 ::= print N3 N2 where id = num N2 ::= id N3 ::= num + N3 ::= ε
  • 58. SLATE 2015, Madrid, Spain June 18-19, 2015 58/67 Inside GI Production: Nx ::= α1 Ny α2 Option Nx ::= α1 Nz α2 Nz ::= Ny Nz ::= ε But, how mutation is done?
  • 59. SLATE 2015, Madrid, Spain June 18-19, 2015 59/67 Inside GI Nx ::= α Ny Nx ::= Ny Ny Ny ::= α Ny ::= α Ny ::= β Ny ::= βWhat about generalization step? Nx ::= Ny Ny Nx ::= Ny Ny ::= α Ny ::= α Ny Ny ::= β Ny ::= β Ny Ny ::= ε
  • 60. SLATE 2015, Madrid, Spain June 18-19, 2015 60/67 Semantic inference a) Grammar G: E ::= T EE EE ::= + T EE EE ::= ε T ::= int b) Attributes and their types: int synthesized vals int inherited inVal c) Set of terminals T = {‘nonterminal.attribute’, ‘lexValue’} d) Set of functions F = {sum} Set of positive programs with associated meanings: (5, 5) (2+5, 7) (10+5+8, 23)
  • 61. SLATE 2015, Madrid, Spain June 18-19, 2015 61/67 Semantic inference // infered attribute grammars E ::= T EE { E[0].vals = EE[0].vals; EE[0].inVal = T[0].vals;} EE ::= + T EE { EE[0].vals = EE[1].vals; EE[1].inVal = sum(T[0].vals, EE[0].inVal);} EE ::= ε {EE[0].vals = EE[0].inVal;} T ::= int {T[0].vals = Integer.parseInt(#int.value());}
  • 62. SLATE 2015, Madrid, Spain June 18-19, 2015 62/67 Semantic inference
  • 63. SLATE 2015, Madrid, Spain June 18-19, 2015 63/67 Semantic inference Actually, the prototypical implementation found two additional attribute grammars, which differ in the semantics for the 2nd production EE ::= + T EE, but both are valid. The following two semantic rules for the 2nd production were found: a) EE ::= + T EE { EE[0].vals = sum(EE[1].vals, T[0].vals); EE[1].inVal = EE[0].inVal;} b) EE ::= + T EE { EE[0].vals = sum(EE[1].vals, EE[0].inVal); EE[1].inVal = T[0].vals;}
  • 64. SLATE 2015, Madrid, Spain June 18-19, 2015 64/67 Semantic inference
  • 65. SLATE 2015, Madrid, Spain June 18-19, 2015 65/67 Semantic inference
  • 66. SLATE 2015, Madrid, Spain June 18-19, 2015 66/67 Conclusion Hope that I convinced you that grammatical inference is interesting and useful. Yes, I will used in my current project on business process mining.
  • 67. SLATE 2015, Madrid, Spain June 18-19, 2015 67/67 Conclusion Collaborators: M. Črepinšek, D. Hrnčič1, B. Bryant2, A. Sprague2, J. Gray2, J. Faizan2, Q. Liu2 L. Fürst3, V. Mahnič3 1University of Maribor, Slovenia 2The University of Alabama at Birmingham, USA 3University of Ljubljana, Slovenia Sent comments/questions to: marjan.mernik@um.si