What you forgot from your Computer Science Degree
Stephen Darlington
Wandle Software Limited
Okay, not all of it.
Requirements
• Parse a string. Convert all occurrences of HTML escape characters into their Unicode equivalent
• "If you see '&lt;' convert it to '<'"
How Google Did It
static HTMLEscapeMap gAsciiHTMLEscapeMap[] = {
    // A.2.2. Special characters
    { @"&quot;", 34 },
    { @"&amp;", 38 },
    { @"&apos;", 39 },
    { @"&lt;", 60 },
    ...
    { @"&hearts;", 9829 },
    { @"&diams;", 9830 }
};
https://code.google.com/p/google-toolbox-for-mac/source/browse/trunk/Foundation/GTMNSString%2BHTML.m
How Google Did It
for (unsigned i = 0; i < sizeof(gAsciiHTMLEscapeMap) / sizeof(HTMLEscapeMap); ++i) {
    if ([escapeString isEqualToString:gAsciiHTMLEscapeMap[i].escapeSequence]) {
        [finalString replaceCharactersInRange:escapeRange
                                   withString:[NSString stringWithCharacters:&gAsciiHTMLEscapeMap[i].uchar length:1]];
        break;
    }
}
Yuck
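The cost of that loop is easier to see in a stripped-down form. The following is a minimal C sketch, not the Google Toolbox code: the names `escape_entry`, `escape_map` and `lookup_entity` are hypothetical, the table holds only the six entries visible on the slide, and the `comparisons` counter exists purely to show how many string comparisons one lookup takes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the escape table: entity text plus code point. */
typedef struct {
    const char *escape_sequence;
    unsigned code_point;
} escape_entry;

static const escape_entry escape_map[] = {
    { "&quot;", 34 },
    { "&amp;", 38 },
    { "&apos;", 39 },
    { "&lt;", 60 },
    { "&hearts;", 9829 },
    { "&diams;", 9830 },
};

/* Linear scan: compare the candidate with each entry until one matches.
   Returns the code point, or 0 when the entity is not in the table.
   If comparisons is non-NULL it is incremented once per string comparison. */
static unsigned lookup_entity(const char *candidate, size_t *comparisons)
{
    assert(candidate != NULL);
    size_t count = sizeof escape_map / sizeof escape_map[0];
    for (size_t i = 0; i < count; ++i) {
        if (comparisons != NULL)
            ++*comparisons;
        if (strcmp(candidate, escape_map[i].escape_sequence) == 0)
            return escape_map[i].code_point;
    }
    return 0;
}
```

Entities near the end of the table, and strings that are not entities at all, pay for a comparison against every entry before them, so the work per lookup grows with the size of the table.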
“flex is a tool for generating scanners. A scanner is a program which recognizes lexical patterns in text. The flex program [looks for a] description of a scanner to generate. The description is in the form of pairs of regular expressions and C code, called rules. flex generates as output a C source file”
Lexical Analysis With Flex, Introduction
http://flex.sourceforge.net/manual/Introduction.html#Introduction
Lexer Description
&amp; { return WSL_ENTITY_amp; }
&gt; { return WSL_ENTITY_gt; }
&lt; { return WSL_ENTITY_lt; }
&quot; { return WSL_ENTITY_quot; }
&apos; { return WSL_ENTITY_apos; }
&AElig; { return WSL_ENTITY_AElig; }
...
&#[0-9]+; { return WSL_ENTITY_NUMBER; }
[^&]+ { return WSL_ENTITY_NOMATCH; }
. { return WSL_ENTITY_NOMATCH; }
Constants
#define WSL_ENTITY_NOMATCH -1
#define WSL_ENTITY_NUMBER -2
#define WSL_ENTITY_amp 38 // # ampersand
#define WSL_ENTITY_gt 62 // # greater than
#define WSL_ENTITY_lt 60 // # less than
#define WSL_ENTITY_quot 34 // # double quote
...
Main loop
while ((expression = WSLlex(scanner))) {
    switch (expression) {
        case WSL_ENTITY_NOMATCH:
            [output appendFormat:@"%@", [NSString stringWithCString:WSLget_text(scanner)
                                                           encoding:NSISOLatin1StringEncoding]];
            break;
        case WSL_ENTITY_NUMBER:
            expression = atoi(&WSLget_text(scanner)[2]);
            // fall through so expression is added to string
        default:
            [output appendFormat:@"%C", (unsigned short) expression];
            break;
    }
}
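The shape of this loop can be tried out without flex or Foundation. The sketch below is plain C under several assumptions: `next_token` is a hand-written stand-in for the generated `WSLlex` (flex would emit a table-driven scanner instead), it knows only four named entities, `TOKEN_NOMATCH` and `TOKEN_NUMBER` mirror the two negative constants, and every decoded value is assumed to fit in a single byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOKEN_NOMATCH (-1)
#define TOKEN_NUMBER  (-2)

/* Hypothetical stand-in for the generated scanner. Reads one token at cursor,
   stores its length in *length and returns 0 at end of input, TOKEN_NOMATCH
   for plain text, TOKEN_NUMBER for "&#NNN;", or the code point of a named
   entity. */
static int next_token(const char *cursor, size_t *length)
{
    static const struct { const char *name; int code; } named[] = {
        { "&amp;", 38 }, { "&gt;", 62 }, { "&lt;", 60 }, { "&quot;", 34 },
    };
    if (*cursor == '\0')
        return 0;
    if (*cursor != '&') {                       /* rule: [^&]+ */
        *length = strcspn(cursor, "&");
        return TOKEN_NOMATCH;
    }
    for (size_t i = 0; i < sizeof named / sizeof named[0]; ++i) {
        size_t n = strlen(named[i].name);
        if (strncmp(cursor, named[i].name, n) == 0) {
            *length = n;
            return named[i].code;
        }
    }
    if (cursor[1] == '#' && isdigit((unsigned char)cursor[2])) {
        size_t n = 2;                           /* rule: &#[0-9]+; */
        while (isdigit((unsigned char)cursor[n]))
            ++n;
        if (cursor[n] == ';') {
            *length = n + 1;
            return TOKEN_NUMBER;
        }
    }
    *length = 1;                                /* rule: .  (a lone ampersand) */
    return TOKEN_NOMATCH;
}

/* Same structure as the main loop: copy plain text, emit one character per
   entity. The caller must provide at least strlen(in) + 1 bytes in out. */
static void decode_entities(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t length = 0;
    int token;
    while ((token = next_token(in, &length)) != 0) {
        switch (token) {
        case TOKEN_NOMATCH:
            memcpy(out, in, length);
            out += length;
            break;
        case TOKEN_NUMBER:
            token = atoi(in + 2);               /* skip the "&#" prefix */
            /* fall through so the number is written as a character */
        default:
            *out++ = (char)token;
            break;
        }
        in += length;
    }
    *out = '\0';
}
```

Decoding `a &lt; b &amp;&amp; c &#65; &` with this sketch yields `a < b && c A &`: the trailing ampersand matches no entity rule, so it is copied through unchanged, just as the catch-all `.` rule does in the lexer description.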
Ziggity-Zaggity Zooooom!
Benefits
• Right tool for the right job
• Consistent performance
• Xcode knows about Flex (with some caveats) so simple to integrate
• Flex has various flags to optimise performance, for example -Cf is much faster but uses lots more memory
Further information
• WSLHTMLEntities is on GitHub (https://github.com/sdarlington/WSLHTMLEntities)
• Flex documentation (http://flex.sourceforge.net/manual/)
• "Introduction to Compiling Techniques," J P Bennett
Stephen Darlington
Wandle Software Limited
@sdarlington
@wandlesoftware
http://www.zx81.org.uk/
http://www.wandlesoftware.com/
Apps:

Yummy / www.cut / Rootn Tootn / CameraGPS
