Austen x talk

  1. AustenX: a parser generator with some novel features. Presented by Matthew Goode, scratchy.org.nz
  2. Overview ● AustenX is a parser generator built using Java ● But target languages will extend beyond Java ● It is based on Parsing Expression Grammars (PEGs) and uses Packrat memoisation ● Provides extensions to PEGs, and also handles left recursion well (not so easy with PEG parsers), with an interesting solution to an interesting theoretical problem that has practical uses ● AustenX is built using a code-generator code generator tool, called SkeletonX, which is interesting in its own right
  3. Parser Generators ● Given a grammar, generate code to facilitate reading text files ● A Parsing Expression Grammar is a form of grammar that, unlike a context-free grammar, is unambiguous (it is not based on generation) ● Eg: ● 'Hello' ● 'A'+ ● 'A'* 'B'? ● 'A' / 'B' ● E = { E '+' E } / NUMBER
  4. Examples ● Hello = 'Hello' ● ManyAs = 'A'+ ● AsAndMaybeB = 'A'* 'B'? ● AorB = 'A' / 'B' ● BOrBA = 'B' / 'BA' ● E = { NUMBER { '+' NUMBER } * } / NUMBER ● AFollowedByB = 'A' & 'B' ● AnotFollowedByB = 'A' ! 'B'
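The BOrBA example is worth a closer look, because PEG choice is ordered: the first alternative that matches wins and later ones are never tried. The sketch below illustrates that behaviour only; `OrderedChoice` and `match` are hypothetical names, not part of AustenX.

```java
// A minimal sketch of PEG ordered choice over literal alternatives.
// OrderedChoice and match are illustrative names, not AustenX API.
class OrderedChoice {
    // Returns the number of characters consumed by the first alternative
    // that matches at pos, or -1 if none of them matches.
    static int match(String input, int pos, String... alternatives) {
        for (String alt : alternatives) {
            if (input.startsWith(alt, pos)) {
                return alt.length(); // first match wins; later alternatives are skipped
            }
        }
        return -1;
    }
}
```

With the input "BA", the choice 'B' / 'BA' consumes only the 'B', so the second alternative can never succeed; writing it as 'BA' / 'B' would consume both characters.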
  5. Recursive Descent ● PEGs are easy to translate to code ● Eg FunctionCall = ID '(' Arguments ')' 'A'*
       function readFunctionCall() {
         // each step either consumes input or resets the position and fails
         if (!readID()) { reset(); return fail }
         if (!consume('(')) { reset(); return fail }
         if (!readArguments()) { reset(); return fail }
         if (!consume(')')) { reset(); return fail }
         while (consume('A')) {}
         return success
       }
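The same rule can be sketched as runnable Java. This is a hand-written illustration, not AustenX output: the token representation (a list of strings), the ID pattern, and the definition of Arguments as zero or more numeric tokens are all assumptions made for the example.

```java
import java.util.List;

// Hypothetical recogniser for: FunctionCall = ID '(' Arguments ')' 'A'*
// Assumptions: tokens are plain strings, an ID is lower-case letters,
// and Arguments is zero or more numeric tokens.
class FunctionCallParser {
    private final List<String> tokens;
    private int pos = 0;

    FunctionCallParser(List<String> tokens) { this.tokens = tokens; }

    // Consume one token if it equals the expected text.
    private boolean consume(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equals(expected)) { pos++; return true; }
        return false;
    }

    private boolean readID() {
        if (pos < tokens.size() && tokens.get(pos).matches("[a-z]+")) { pos++; return true; }
        return false;
    }

    // Zero or more numeric tokens, so this rule always succeeds.
    private boolean readArguments() {
        while (pos < tokens.size() && tokens.get(pos).matches("[0-9]+")) pos++;
        return true;
    }

    boolean readFunctionCall() {
        int start = pos;
        if (!readID() || !consume("(") || !readArguments() || !consume(")")) {
            pos = start; // reset: a failed rule must not consume input
            return false;
        }
        while (consume("A")) { } // 'A'* never fails
        return true;
    }

    int position() { return pos; }
}
```

Saving the start position once and restoring it on failure plays the role of `reset()` in the pseudocode.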
  6. Redundant Calls ● Recursive descent repeats work, like a naive Fibonacci calculation ● Dynamic programming solution ● Create a table of Rules x Positions ● Starting at the end, calculate each Rule at each position ● Results at earlier positions only require results at later positions ● Packrat parsing ● Start at the beginning, only do resolution when requested, store the result
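A minimal sketch of the packrat idea, assuming a small right-recursive grammar chosen for illustration (Exp = Mult '+' Exp / Mult, Mult = Num '*' Mult / Num, Num = a single digit). The class and field names are hypothetical, and the memo table is a map keyed by rule name and position rather than a full Rules x Positions array.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical packrat recogniser; each rule returns the end position or FAIL.
class PackratDemo {
    static final int FAIL = -1;
    final String input;
    final Map<String, Integer> memo = new HashMap<>(); // "Rule@pos" -> end position
    int evaluations = 0;                                // rule bodies actually run

    PackratDemo(String input) { this.input = input; }

    // Exp = Mult '+' Exp / Mult
    int exp(int pos) {
        String key = "Exp@" + pos;
        Integer cached = memo.get(key);
        if (cached != null) return cached;
        evaluations++;
        int result = FAIL;
        int m = mult(pos);
        if (m != FAIL && m < input.length() && input.charAt(m) == '+') {
            int e = exp(m + 1);
            if (e != FAIL) result = e;
        }
        if (result == FAIL) result = mult(pos); // second alternative: served from the memo
        memo.put(key, result);
        return result;
    }

    // Mult = Num '*' Mult / Num
    int mult(int pos) {
        String key = "Mult@" + pos;
        Integer cached = memo.get(key);
        if (cached != null) return cached;
        evaluations++;
        int result = FAIL;
        int n = num(pos);
        if (n != FAIL && n < input.length() && input.charAt(n) == '*') {
            int m = mult(n + 1);
            if (m != FAIL) result = m;
        }
        if (result == FAIL) result = n; // second alternative: Num alone
        memo.put(key, result);
        return result;
    }

    // Num = a single digit
    int num(int pos) {
        return pos < input.length() && Character.isDigit(input.charAt(pos)) ? pos + 1 : FAIL;
    }
}
```

Results are computed only when a rule is requested at a position and then stored, so the retry of Mult in the second alternative of Exp costs a table lookup instead of a re-parse.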
  7. Example: 4 + 3 * 2 [Figure: inputs "4 + 3 * 2" and "A A A" shown against the rules Add, Mult, As, Exp, Hello, Stuff]
  8. Left Recursion ● Recursive Descent and Packrat parsing have problems with left recursion ● Eg ● E = { E '+' E } / Number ● Creates infinite recursion: function readE() { readE() ... }
  9. Solution ● 'Bubble up' resolutions ● Eg 1 + 2 ● First pass: there is no resolution for E yet, so E '+' E fails, but Num consumes the 1 ● Current best => E = Num ('1') ● Retry until no more gains ● E = ('1') + ('2')
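The retry loop can be sketched as follows. This is an illustration of the general "grow the current best" technique under simplifying assumptions (single-digit numbers, no whitespace, one rule), not the AustenX implementation; all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of bubbling up resolutions for E = { E '+' E } / Num.
class LeftRecursionDemo {
    // A resolution: where the match ended and a bracketed rendering of it.
    static final class Match {
        final int end;
        final String tree;
        Match(int end, String tree) { this.end = end; this.tree = tree; }
    }

    final String input;
    // Current best resolution of E at each position; null means "none yet".
    final Map<Integer, Match> best = new HashMap<>();

    LeftRecursionDemo(String input) { this.input = input; }

    Match readE(int pos) {
        if (best.containsKey(pos)) return best.get(pos); // re-entrant call sees current best
        best.put(pos, null);                              // first pass: no resolution
        while (true) {
            Match attempt = readBody(pos);
            Match current = best.get(pos);
            if (attempt == null || (current != null && attempt.end <= current.end)) break;
            best.put(pos, attempt);                       // a gain: keep it and retry
        }
        return best.get(pos);
    }

    // One attempt at the rule body, using whatever E currently resolves to.
    private Match readBody(int pos) {
        Match left = readE(pos);
        if (left != null && left.end < input.length() && input.charAt(left.end) == '+') {
            Match right = readE(left.end + 1);
            if (right != null) {
                return new Match(right.end, "(" + left.tree + "+" + right.tree + ")");
            }
        }
        if (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            return new Match(pos + 1, "(" + input.charAt(pos) + ")");
        }
        return null;
    }
}
```

For "1+2" the loop first settles on (1), then on ((1)+(2)), and stops when a further retry gains nothing. For "1+2+3" this sketch yields ((1)+((2)+(3))), the right-associative shape that the next slide identifies as a problem.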
  10. Problem ● Always right associative ● Eg E = { E '+' E } / { E '*' E } / Num ● Always resolves 1 * 2 + 3 to ((1) * ((2) + (3))) ● Eg E = { E '*' E } / { E '+' E } / Num ● Also resolves 1 * 2 + 3 to ((1) * ((2) + (3)))! ● The problem only occurs when a rule with left recursion is also right recursive
  11. Solution ● A process of reinterpreting/rewriting ● Recall 1 * 2 + 3, with E = { E '+' E } / { E '*' E } / Num ● Resolution at '2' is ((2) + (3)) ● Having resolved '(1)', note that '+' is higher priority than '*', so search the recursion to find a lower-priority resolution, eg (2) ● Now resolved to ((1) * (2)) ● Bubble up to get (((1) * (2)) + (3))
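One way to picture the rewriting step is as a transformation on the already-resolved right-hand side. The sketch below is an interpretation of the slide, not the AustenX algorithm: it assumes priority follows the order of alternatives (earlier is higher), and that only a strictly higher-priority operator on the right triggers the rewrite. `Reinterpret`, `Node`, `priority` and `combine` are hypothetical names.

```java
// Hypothetical sketch of reinterpreting a right-nested resolution.
class Reinterpret {
    static final class Node {
        final char op;          // '+' or '*', or 0 for a number leaf
        final Node left, right;
        final int value;
        Node(int value) { this.op = 0; this.left = null; this.right = null; this.value = value; }
        Node(char op, Node left, Node right) {
            this.op = op; this.left = left; this.right = right; this.value = 0;
        }
        @Override public String toString() {
            return op == 0 ? "(" + value + ")" : "(" + left + " " + op + " " + right + ")";
        }
    }

    // Assumed priority for E = { E '+' E } / { E '*' E } / Num: earlier alternative is higher.
    static int priority(char op) { return op == '+' ? 2 : 1; }

    // Combine left, op and the resolved right side. If the right side is headed by a
    // higher-priority operator, search down its left branch for a lower-priority
    // operand, attach there, and let the higher-priority operator bubble up.
    static Node combine(Node left, char op, Node right) {
        if (right.op != 0 && priority(right.op) > priority(op)) {
            return new Node(right.op, combine(left, op, right.left), right.right);
        }
        return new Node(op, left, right);
    }
}
```

Combining (1), '*' and the resolution ((2) + (3)) gives (((1) * (2)) + (3)), while combining (1), '+' and ((2) * (3)) leaves the right side untouched.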
  12. Back to AustenX ● Currently uses separate tokenisation ● A DFA (Lex-like) tokeniser is included ● Allows the left recursion discussed, both indirect and direct ● Allows selective memoisation ● Provides statistics on use ● Turns out memoisation (at least with tokenisation) is mostly not needed ● Has extensions to PEG
  13. Example grammar
      pattern Example2 {
        Add ( Example2:left PLUS Example2:right )
        Number ( NUMBER:value )
        ID ( ID:value )
      }
      pattern Example1( ID:name STRING Example2:first ID Example2 )
  14. Extended PEGs ● Easy split (AustenX uses '|' for 'or') ● Arguments = Argument / Comma ● Variables and queries
      pattern Fib [$x=0 $y=0 $next=0 $t=0 $l=2] (
        $x=INT $y=INT
        { $t = $x + $y  $next = INT  ($t==$next)  $x = $y  $y = $next  $l++ } *
        $l:length
      )
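Read informally, the Fib pattern consumes two integers and then keeps consuming integers for as long as each one equals the sum of the previous two, recording the count as `length`. The Java below is a hand-written equivalent under that reading; the exact semantics of AustenX variables and queries are an assumption here, and `FibPattern` is a hypothetical name.

```java
// Hypothetical hand-written equivalent of the Fib pattern, assuming the
// query ($t==$next) ends the repetition when it fails.
class FibPattern {
    // Returns the length of the longest Fibonacci-like prefix of the tokens,
    // or -1 if there are fewer than two integers to start from.
    static int match(int[] tokens) {
        if (tokens.length < 2) return -1;
        int x = tokens[0], y = tokens[1]; // $x=INT $y=INT
        int l = 2;                        // $l starts at 2
        int pos = 2;
        while (pos < tokens.length) {
            int t = x + y;                // $t = $x + $y
            int next = tokens[pos];       // $next = INT
            if (t != next) break;         // query ($t==$next) fails: stop repeating
            x = y;                        // $x = $y
            y = next;                     // $y = $next
            l++;                          // $l++
            pos++;
        }
        return l;                         // $l:length
    }
}
```

For the tokens 2 5 7 12 20 this reports a length of 4, since 7 + 12 is not 20.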
  15. Built-in GUI
  16. Future directions ● Better error handling ● Improved tokenisation, with scanner-free and binary modes ● New language targets ● Support for indentation-sensitive languages ● Formal (or at least informal) write-up of ordered left recursion
  17. SkeletonX ● AustenX is a code generator ● SkeletonX is a tool for making code generators. It is a code generator code generator. ● Not just a template engine! ● It understands (a subset of) Java ● Currently going through many iterations to get a version that uses itself (not quite there yet) ● Many headaches over how scope should work
  18. SkeletonX example 1 ● Define a design (a hierarchical data structure)
      define Example {
        A (String name) {
          B (int value, C cRef)
        }
        C (String cName)
      }
  19. SkeletonX example 2
      public class Main {
        @link A {
          public AClass doStuff[doStuff,$name]() {
            return new AClass(@link B $value);
          }
        }
      }
      @link A {
        public class AClass[$name, Class] {
          @constructor ( @link B int value) { }
        }
      }
  20. SkeletonX example 3 ● In user code:
      DesignRoot r;
      CBlock c = r.addCBlock("Simple");
      r.addABlock("First").addBBlock(42, c);
      ABlock a2 = r.addABlock("Second");
      a2.addBBlock(64, c);
      a2.addBBlock(35, c);
  21. SkeletonX example 4
      public class Main {
        public FirstClass doStuffFirst() { return new FirstClass(42); }
        public SecondClass doStuffSecond() { return new SecondClass(64, 35); }
      }
      public class FirstClass {
        public FirstClass(int value) { }
      }
      public class SecondClass {
        public SecondClass(int value, int value2) { }
      }
  22. Other projects ● Munky ● A unified language for making web apps that compiles to HTML/JavaScript/PHP/SQL ● Also, a related Java-based framework ● Very cool. ● At some point, a game-making tool designed for young children in a classroom setting (eg, centralised storage/sharing)
  23. HELP! (Conclusion) ● Lots of things to do ● Lots of projects ● Write paper on AustenX ● Not much money ● Looking for part-time work, or funding ● Also, would love to have access to journals again... scratchy.org.nz
