9. WHAT'S LOLCODE?
1 Oh hai. In teh beginnin
1 In the beginning, some Guy
Ceiling Cat maded teh skiez
made an Image of a cat. And
An da Urfs, but he did not
the picture was without
eated dem.
caption, and void.
2 Da Urfs no had shapez An
2 And said Guy did make a
haded dark face, An Ceiling
caption unto his Image. And
Cat rode invisible bike over
it was funny, and the Guy did
teh waterz.
call it good.
3 At start, no has lyte. An
3 And this guy's Image did
Ceiling Cat sayz, i can haz
become an internet meme, all
lite? An lite wuz.
Web 2.0-ish and stuff.
4 An Ceiling Cat sawed teh
4 There was much rejoicing.
lite, to seez stuffs...
10. WHAT'S LOLCODE
A lolcat is an image combining a photograph, most
frequently a cat, with a humorous and idiosyncratic caption
in (often) broken English—a dialect which is known as
quot;lolspeakquot;, or quot;kittehquot;. The name quot;lolcatquot; is a compound word
of quot;LOLquot; and quot;catquot;.[1] Another name is cat macro, being a
type of image macro.[2] Lolcats are created for photo
sharing imageboards and other internet forums. Lolcats are
similar to
other anthropomorphic animal-based
image macros such as the O RLY?
owl.[3]
16. SO, LOLCODE?
Adam Lindsay had a revelation. From Ceiling Cat.
quot;This is a love letter to very clever people who
are slightly bored. I had no idea there were so
many of us out there.quot;
- FAQ, lolcode.com
23. I CAN SPEEK LOLCODE
Data types
NUMBR: Integer values
NUMBAR: Floating point values
YARN: String values
NOOB: Null values
TROOF: Boolean values (WIN / FAIL)
24. I CAN SPEEK LOLCODE
Operators
Arithmetic
SUM OF x AN y, DIFF OF x AN y
PRODUKT OF x AN y, QUOSHUNT OF x AN y
MOD OF x AN y
BIGGR OF x AN y, SMALLR OF x AN y
Boolean
BOTH OF x AN y
EITHER OF x AN y
WON OF x AN y
ALL OF x AN y [AN ...] MKAY
ANY OF x AN y [AN ...] MKAY
NOT x
25. I CAN SPEEK LOLCODE
Operators (cont.)
Comparison
BOTH SAEM x AN y
WIN iff x == y
DIFFRINT x AN y
FAIL iff x == y
Concatenation
SMOOSH x y z p d q ... MKAY
Concatenates infinitely many YARNs
Casting
MAEK x A <type>
x IS NOW A <type>
26. I CAN SPEEK LOLCODE
Comparison
BOTH SAEM ANIMAL AN quot;CATquot;, O RLY?
YA RLY, VISIBLE quot;J00 HAV A CATquot;
MEBBE BOTH SAEM ANIMAL AN quot;MAUSquot;
VISIBLE quot;JOO HAV A MAUS? WTF?quot;
NO WAI, VISIBLE quot;J00 SUXquot;
OIC
27. I CAN SPEEK LOLCODE
Case
COLOR, WTF?
OMG quot;Rquot;
VISIBLE quot;RED FISHquot;
GTFO
OMG quot;Yquot;
VISIBLE quot;YELLOW FISHquot;
OMG quot;Gquot;
OMG quot;Bquot;
VISIBLE quot;FISH HAS A FLAVORquot;
GTFO
OMGWTF
VISIBLE quot;FISH IS TRANSPARENTquot;
OIC
28. I CAN SPEEK LOLCODE
Loops
IM IN YR <label> <operation> YR <variable>
[TIL|WILE <expression>]
<code block>
IM OUTTA YR <label>
IT variable
The default when nothing else is specified
Similar to Perl's $_
29. I CAN SPEEK PL/LOLCODE
PL/LOLCODE-specific stuff
VISIBLE == PL/pgSQL RAISE
Alt. VISIBLE <level> quot;Messagequot;
FOUND YR <val> == PL/pgSQL RETURN
GIMMEH <var> OUTTA DATABUKKIT quot;queryquot;
Database interface (SPI)
Function arguments are named LOL1, LOL2, etc.
automatically
PL/LOLCODE isn't smart enough yet to let you
give them arbitrary names
30. I CAN SPEEK PL/LOLCODE
PL/LOLCODE-specific stuff
VISIBLE == PL/pgSQL RAISE
Alt. VISIBLE <level> quot;Messagequot;
FOUND YR <val> == PL/pgSQL RETURN
GIMMEH <var> OUTTA DATABUKKIT quot;queryquot;
Database interface (SPI)
LIAR LIAR PANTZ
Function arguments are named LOL1, LOL2, etc.
automatically
ON FIRE!!!1
PL/LOLCODE isn't smart enough yet to let you
give them arbitrary names
31. I CAN SPEEK PL/LOLCODE
PL/LOLCODE-specific stuff
VISIBLE == PL/pgSQL RAISE
Alt. VISIBLE <level> quot;Messagequot;
FOUND YR <val> == PL/pgSQL RETURN
GIMMEH <var> OUTTA DATABUKKIT quot;queryquot;
Database interface (SPI)
Function arguments are named LOL1, LOL2, etc.
automatically
PL/LOLCODE isn't smart enough yet to let you
give them arbitrary names
As of a few nights ago, this is no longer true.
W00T!!1
32. I CAN SPEEK PL/LOLCODE
HAI
I HAS A X ITZ SMOOSH quot;MONORAILquot; quot;KITTYquot; MKAY
I HAS A Y ITZ 3
I HAS A Z ITZ BIGGR OF Y AN 15
BOTH SAEM Z AN Y, O RLY?
YA RLY, VISIBLE quot;SUMTHINZ RONG!!quot;
NO WAI, VISIBLE quot;THX. CAN I HAZ CHEEZBURGR?quot;
OIC
IM IN YR MYLOOP
Y R DIFF OF Y AN 1
DIFFRINT Y AN BIGGR OF Y AN 5, O RLY?
YA RLY, GTFO
OIC
IM OUTTA YR MYLOOP
KTHXBYE
34. ANATUMMY OF A PL
Function call handler
plpgsql_call_handler, plperl_call_handler, etc.
This function is responsible for executing each
function call for a particular language
CREATE LANGUAGE
Must be written in a compiled language
Must take language_handler as a parameter
Function validator
plpgsql_validator, plperl_validator
Optional
Validates functions when they're created
Takes an OID argument
TRUSTED vs. UNTRUSTED
35. ANATUMMY OF A PL
Where's the interpreter?
Some PLs come with their own interpreter
PL/pgSQL, SQL/PSM, PL/LOLCODE, PL/Proxy
Others use an interpreter from a shared library
PL/Perl, PL/Python, PL/R, most everything else
Some do other strange things
PL/J
Talks to a Java virtual machine via RMI calls
PLs with built-in interpreters can use PostgreSQL's
internal data types to avoid conversion overhead
36. ANATUMMY OF A PL
Where's the interpreter?
Some PLs come with their own interpreter
PL/pgSQL, SQL/PSM, PL/LOLCODE, PL/Proxy
Others use an interpreter from a shared library
PL/Perl, PL/Python, PL/R, most everything else
Some do other strange things
PL/J
Talks to a Java virtual machine via RMI calls
PLs with built-in interpreters can use PostgreSQL's
internal data types to avoid conversion overhead
PL/LOLCODE isn't that smart yet
37. ANATUMMY OF A PL
Compilation
Commonly a function will be quot;compiledquot; the first time
it is run in a session
The function call handler looks up argument and
return types only once
Sometimes keeps the parse tree in memory
Avoids having to re-parse the function next time
it's called
38. ANATUMMY OF A PL
Compilation
Commonly a function will be quot;compiledquot; the first time
it is run in a session
The function call handler looks up argument and
return types only once
Sometimes keeps the parse tree in memory
Avoids having to re-parse the function next time
it's called
PL/LOLCODE isn't that smart yet
39. CHEEZBURGER
SPEEK LOLCODE
FUNKSHUN KALL HANDLRZ
ANATUMMY OF A PL
40. FUNKSHUN KALL HANDLRZ
Job of the function handler
Figure out if it's a TRIGGER function
Determine argument values
Determine return type
Find the function source
...alternatively, find a pre-stored parse tree
Execute the function
Return the proper values
41. FUNKSHUN KALL HANDLRZ
PG_MODULE_MAGIC;
Datum pl_lolcode_call_handler
(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(pl_lolcode_call_handler);
Datum
pl_lolcode_call_handler(PG_FUNCTION_ARGS)
{
/* ... */
}
42. FUNKSHUN KALL HANDLRZ
Get function information
Form_pg_proc from server/catalog/pg_proc.h
Form_pg_proc procStruct; HeapTuple procTup;
procTup = SearchSysCache(PROCOID, ObjectIdGetDatum(fcinfo-
>flinfo->fn_oid), 0, 0, 0);
if (!HeapTupleIsValid(procTup)) elog(ERROR, quot;Cache lookup
failed for procedure %uquot;, fcinfo->flinfo->fn_oid);
procStruct = (Form_pg_proc) GETSTRUCT(procTup);
ReleaseSysCache(procTup);
43. FUNKSHUN KALL HANDLRZ
Get function source
procsrcdatum = SysCacheGetAttr(PROCOID,
procTup, Anum_pg_proc_prosrc, &isnull);
if (isnull) elog(ERROR, quot;Function source is
nullquot;);
proc_source = DatumGetCString
(DirectFunctionCall1(textout, procsrcdatum));
44. FUNKSHUN KALL HANDLRZ
Get return type
typeTup = SearchSysCache(TYPEOID, ObjectIdGetDatum
(procStruct->prorettype), 0, 0, 0);
if (!HeapTupleIsValid(typeTup)) elog(ERROR, quot;Cache lookup
failed for type %uquot;, procStruct->prorettype);
typeStruct = (Form_pg_type) GETSTRUCT(typeTup);
resultTypeIOParam = getTypeIOParam(typeTup);
fmgr_info_cxt(typeStruct->typinput, &flinfo,
TopMemoryContext);
45. FUNKSHUN KALL HANDLRZ
Get arguments and their types
for (i = 0; i < procStruct->pronargs; i++)
{
snprintf(arg_name, 255, quot;LOL%dquot;, i+1);
lolDeclareVar(arg_name);
LOLifyDatum(fcinfo->arg[i], fcinfo->argnull[i],
procStruct->proargtypes.values[i], arg_name);
}
LOLifyDatum() gets the typeStruct for the argument
type, gets the argument value as a C string, and sticks it
in PL/LOLCODE's variable table
46. FUNKSHUN KALL HANDLRZ
Pass the function source to the interpreter
pllolcode_yy_scan_string(proc_source);
pllolcode_yyparse(NULL);
pllolcode_yylex_destroy();
More on this later
48. FUNKSHUN KALL HANDLRZ
Note the function manager interface (fmgr.[h|c])
InputFunctionCall
Calls a previously determined datatype input
function
OutputFunctionCall
Opposite of InputFunctionCall
DirectFunctionCall[1..9]
Call a specific, named function with various
numbers of arguments
Require FmgrInfo structures to describe the
functions to be called
Read through fmgr.h and fmgr.c for more details
49. CHEEZBURGER
SPEEK LOLCODE
FUNKSHUN KALL HANDLRZ
MAEK UN INTURPRETR
ANATUMMY OF A PL
50. MAEK UN INTURPRETR
Don't write the parser on your own.
It's painful
It's a lot of work
It's easy to do wrong
Someone else already has
Writing the interpreter on your own is ok, but use a tool
I wrote an LOLCODE interpreter to ...
Learn how
Be sure it was licensed the way I wanted it
licensed (BSD)
51. MAEK UN INTURPRETR
Tools for writing parsers and interpreters
There are a fair number of them
ANTLR
Lex + Yacc
Flex + Bison
Parsers, language definitions, etc. are complex
business
Top-down vs. Bottom-up, Associativity rules...
I chose Flex + Bison, because that's what
PostgreSQL uses
I've no particular expertise to choose based on
merits
I forgot most of my formal languages class :)
52. MAEK UN INTURPRETR
Parser generation tools require a language definition
Generally in a variant of BNF (Backus-Naur Form)
<symbol1> ::= <expression>
<postal-address> ::= <name-part> <street-address> <zip-part>
<name-part> ::= <personal-part> <last-name> <opt-jr-part> <EOL>
| <personal-part> <name-part>
<personal-part> ::= <first-name> | <initial> quot;.quot;
<street-address> ::= <opt-apt-num> <house-num> <street-name> <EOL>
<zip-part> ::= <town-name> quot;,quot; <state-code> <ZIP-code> <EOL>
<opt-jr-part> ::= quot;Sr.quot; | quot;Jr.quot; | <roman-numeral> | quot;quot;Thanks, Wikipedia
53. MAEK UN INTURPRETR
Flex + Bison == two parts
Flex = lexer
Divides input into a set of labeled tokens
Bison = parser
Given a stream of tokens, interpret them and act
accordingly
55. MAEK UN INTURPRETR
State machine allows for processing blocks
OBTW { BEGIN LOL_BLOCKCMT; }
<LOL_BLOCKCMT>TLDR { BEGIN 0; lol_lex_debug(quot;yylex:
block commentquot;); }
<LOL_BLOCKCMT>n { lineNumber++; }
<LOL_BLOCKCMT>. {}
56. MAEK UN INTURPRETR
Some code to make memory allocation work
#define YYSTACK_FREE pfree
#define YYSTACK_MALLOC palloc
#define YYFREE pfree
#define YYMALLOC palloc
...
%option noyyalloc noyyrealloc noyyfree
PostgreSQL uses memory contexts, so the parser has to
use them too.
57. MAEK UN INTURPRETR
Flex makes a big stream of tokens
Bison turns them into something useful
Define the tokens:
%token HAI KTHXBYE EXPREND
%token YARN IDENTIFIER NUMBR NUMBAR
%token TROOFWIN TROOFFAIL TROOF
Define grammar and associated actions
%%
lol_program:
HAI EXPREND lol_cmdblock KTHXBYE EXPREND
{ executeProgram($3); }
58. MAEK UN INTURPRETR
Bison's job (in this case) is to build a parse tree
We don't actually execute anything until the tree is
built
In some future day, we'll save the parse tree after
the first execution, and avoid re-parsing each time a
function is called
Once the tree is built, we'll pass it to a function that will
execute it
Specifically, the data structure is a linked list where
each node can point to the next node in the list as well as
to a new list
This is similar to the structure PL/pgSQL uses
59. MAEK UN INTURPRETR
typedef struct lolNodeList {
int length;
lolNode *head;
lolNode *tail;
} lolNodeList;
struct lolNode {
NodeType node_type;
struct lolNode *next;
/* list for sub-nodes */
lolNodeList *list;
NodeVal nodeVal;
};
60. MAEK UN INTURPRETR
Each lolNode is a particular statement
VISIBLE
ORLY
IHASA
...
Also, constants and variables are nodes
YARN
NUMBAR
...
How each node uses the members of the lolNode
struct depends on the node
61. MAEK UN INTURPRETR
TROOF node
Really simple -- handles a TROOF (Boolean) constant
NodeType node_type = TROOF
NodeVal nodeVal = WIN or FAIL
When executed, this node simply sets IT to the WIN
or FAIL value stored in nodeVal
62. MAEK UN INTURPRETR
VISIBLE nodes
Remember, VISIBLE rasies an exception
Structure members:
NodeType node_type = VISIBLE
lolNodeList *list = pointer to a list of nodes
containing:
A constant string (a 1-node list)
An expression evaluating to a string (possibly
many nodes)
NodeVal nodeVal = a number representing the
message level (NOTICE, WARNING, etc.)
When executed, this node runs the list, which puts a
value in IT, and raises IT as an exception
63. MAEK UN INTURPRETR
SUMOF node
SUM OF x AN y = x + y
Structure members:
lolNodeList *list == a two-element list, where each
node represents an operand
The result is returned in IT
64. MAEK UN INTURPRETR
ORLY node
O RLY? == if, YA RLY == then, MEBBE == else if, NO
WAI == else
lolNodeList *list points to a set of nodes, one per YA
RLY, MEBBE, and NO WAI block.
Each YA RLY, MEBBE, and NO WAI block has a
sub-list of its own, containing the instructions in
the block
MEBBE blocks also contain a list of nodes
containing the MEBBE expression
66. MAEK UN INTURPRETR
Token definitions
These describe the data Flex generates
%token FOUNDYR IHASA ITZ R
%token SUMOF DIFFOF PRODUKTOF QUOSHUNTOF MODOF BIGGROF
SMALLROF AN
%token BOTHOF EITHEROF WONOF NOT ALLOF ANYOF MKAY
%token BOTHSAEM DIFFRINT
These will turn into #define statements (e.g. #define
DIFFRINT 123)
68. MAEK UN INTURPRETR
A definition of the data type used in Bison's stack
%union {
int numbrVal;
double numbarVal;
char *yarnVal;
struct lolNodeList *nodeList;
struct lolNode *node;
}
This is pretty much always a union
69. MAEK UN INTURPRETR
Type declarations to map various declared expressions
to union members
%type <nodeList> lol_program lol_cmdblock lol_orly_block
%type <node> lol_const lol_var lol_cmd lol_xaryoprgroup
%type <yarnVal> YARN
%type <yarnVal> IDENTIFIER
70. MAEK UN INTURPRETR
As the parser sees token streams it can interpret as a
particular object in the grammar, it runs the code
associated with that object, using $1, $2, etc. to
identify objects on the stack
lol_expression:
...
| IHASA lol_var ITZ lol_expression
{
$$ = lolMakeNode
(DECL, ($2)->nodeVal, lolMakeList($4));
}
71. MISELANEEUS
DEVELOPMENT
ENVIRONMENT
INTRO TO SPI
72. SPI
GIMMEH <var> OUTTA DATABUKKIT quot;<query>quot;
Note that this only returns one value
PL/LOLCODE uses a special parse tree node for SPI
The expression that generates the query is stored in
the node's sub-list
Executing that list puts the text of the query in IT
SPI sends that query to the server for processing
The first column of the first row is stuck in the
variable in the GIMMEH statement
SPIval = SPI_getvalue(SPI_tuptable->vals[0], SPI_tuptable->tupdesc,
1);
LOLifyString(SPIval, SPI_gettypeid(SPI_tuptable->tupdesc, 1), node-
>nodeVal.yarnVal);
73. Datums
Everything is a Datum
What a Datum actually is is architecture-dependant
sizeof(Datum) == sizeof(long) >= sizeof(void *) >= 4
Different data types are stored in Datums in
different ways
You need to know what data type the datum is supposed
to store
SPI_tuptable is a global pointer to a structure that
describes the last tuple returned by SPI
Based on a Datum and knowledge of the type it
represents, you can get useful values from it
There are lots of functions, macros, etc. for processing
Datums
74. Development Environment
Configure with --enable-debug, --enable-cassert
--enable-debug
Include debugging symbols in final library
This allows things like gdb to know where in the
program you are, and tell you about it
Disables optimizations
Makes executables bigger
--enable-cassert
Adds (in)sanity checks
Makes things slower
More likely to tell you if you do something really
stupid
cf. memory context problems