1
DH 2013, Nebraska
Practical Interoperability:
The Map of Early Modern London
and Internet Shakespeare Editions
Janelle Jenstad and Martin Holmes
University of Victoria
mapoflondon.uvic.ca
2
Loose couplings
3
MoEML and ISE
● On UVic servers
● Overlapping teams
● Mutual need
4
The Map of Early Modern London
● Maps streets, sites and
boundaries of London 1560-1640
● Interface based on Agas Map
● Includes (1) gazetteer, (2)
encyclopedia of London people
and places, (3) library of primary
source texts, (4) edition of A
Survey of London
● Pure TEI XML throughout
5
Mapping toponyms
6
Library
7
Dramatic extracts
8
Internet Shakespeare Editions
● Open-source digital anthology
● Also hosts and incubates Queen's
Men's Editions and Digital
Renaissance Editions
● Goal: all plays of Shakespeare and
contemporaries, 1500-1640
● SGML and non-standard XML :-(
9
Frequency of toponyms
The London locations in Richard III on the Agas Map, sized
according to the number of references to them.
10
Research questions
● How typical is Shakespeare's invocation of
London?
● How do his characters move through the urban
environment?
● What is the relationship between London and
the court?
● How does this vision compare to other
playwrights and to historians?
11
Integration? Interoperability?
Interchange?
● Is it reasonable to ask editors to revisit their
"finished" work?
● Can we overcome the significant programming
differences?
12
Initial assumptions
● First rule of collaboration: You're on your own.
● The ISE agenda is not MoEML's agenda.
● MoEML can't ask the ISE editors to tag their
texts.
● MoEML can't depend on the ISE programmers to
implement things for us.
● We must beware of making features on our site
dependent on functionality on theirs.
13
Loose coupling
● We take the ISE texts and tag them.
● We generate sets of links based on through
line numbers (TLNs).
● We store the links in our database.
● We only depend on the fact that links to TLNs
on their website work.
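
The loose coupling described above could be sketched roughly as follows. This is a minimal illustration, not MoEML's actual code: the URL pattern in `ISE_URL_TEMPLATE`, the `PlaceRef` record, the play code `R3_M`, and the `links` table layout are all assumptions invented for the example. Only the `CHEA2` identifier style and the reliance on TLNs come from the slides.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical URL pattern; the real ISE address scheme is not given in the slides.
ISE_URL_TEMPLATE = "https://ise.example.org/doc/{play}/#tln-{tln}"


@dataclass(frozen=True)
class PlaceRef:
    play: str    # hypothetical short code for an ISE play text
    tln: int     # through line number of the reference
    mol_id: str  # MoEML location identifier, e.g. "CHEA2"


def build_link(ref):
    """Return a URL pointing at the TLN of this reference on the ISE site."""
    return ISE_URL_TEMPLATE.format(play=ref.play, tln=ref.tln)


def store_links(conn, refs):
    """Store one row per reference in the local (MoEML-side) database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links "
        "(mol_id TEXT, play TEXT, tln INTEGER, url TEXT)"
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?, ?, ?)",
        [(r.mol_id, r.play, r.tln, build_link(r)) for r in refs],
    )
    conn.commit()


def links_for_place(conn, mol_id):
    """Return all stored URLs for one location, ordered by play and TLN."""
    rows = conn.execute(
        "SELECT url FROM links WHERE mol_id = ? ORDER BY play, tln", (mol_id,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # The TLN values here are made up for the demonstration.
    store_links(conn, [PlaceRef("R3_M", 1500, "CHEA2"), PlaceRef("R3_M", 20, "CHEA2")])
    print(links_for_place(conn, "CHEA2"))
```

The point of the design is that the only thing the sketch needs from the other site is a stable address for each TLN; everything else lives on the linking side.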
14
The (rather unrealistic) plan
15-23
Manual tagging and NER
24
Results
● 4 plays:
– Richard II and Richard III (modern spelling)
– Henry VIII and Henry VI Part 2 (old spelling)
● 495 placenames marked up
● 95 linked to Map of Early Modern London
placeography
25
Performance of NER engine
26
Difficulties for NER tagger
● <stage>Enter Yorke, Salisbury, and
Warwick.</stage>
● "Was not your husband / In Margaret's battle at
St Albans slain?"
● Spelling variation ("Tower" versus "Towre")
● Capitalization is unhelpful in old-spelling texts.
● Short utterances confuse it:
– Queen Margaret: Richard.
– Richard: <LOC>Ha</LOC>?
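
One way to picture the spelling-variation and capitalization problems is a gazetteer lookup that maps every known variant to a single identifier. The sketch below is only an illustration under stated assumptions: the `GAZETTEER` entries, the placeholder identifier `TOWER-ID`, and the variant "Cheapeside" are invented for the example (only "Tower"/"Towre" and `CHEA2` appear in the slides), and it is a plain dictionary match, not the NER engine used in the project.

```python
import re

# Hypothetical gazetteer: lowercased variant spelling -> location identifier.
GAZETTEER = {
    "tower": "TOWER-ID",     # placeholder id, not a real MoEML identifier
    "towre": "TOWER-ID",
    "cheapside": "CHEA2",
    "cheapeside": "CHEA2",   # assumed variant, for illustration only
}

WORD = re.compile(r"[A-Za-z]+")


def find_places(line):
    """Return (surface form, identifier) for each word found in the gazetteer."""
    hits = []
    for match in WORD.finditer(line):
        word = match.group(0)
        # Lowercasing means capitalization is ignored rather than used as a clue.
        ident = GAZETTEER.get(word.lower())
        if ident is not None:
            hits.append((word, ident))
    return hits


if __name__ == "__main__":
    # An invented line, not a quotation from any play.
    print(find_places("Send him to the Towre"))
    print(find_places("Ha?"))
```

A lookup like this handles variant spellings and ignores capitalization, but it cannot tell a place from a person who is named after one ("Yorke", "Warwick"), which is the harder problem listed above.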
27
The showstopper problem
● Henry VI Part 2:
– 210 placenames in the text
– tagger tagged 109 places, of which 106 were correct
– 29 of these were "England" and 38 "France"
– Among placenames missed:
● 48 were in Britain
● 20 of these were key London locations (Bedlam,
Southwark, London Bridge & Smithfield)
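
The figures on this slide translate into the usual precision and recall measures. The short calculation below is a sketch that uses only the counts quoted above. The variable names are illustrative, the F1 score is an extra summary figure that the slides do not report, and the country share assumes that "of these" refers to the 106 correct tags.

```python
total_placenames = 210  # placenames in the text of Henry VI Part 2
tagged = 109            # spans the tagger marked as places
correct = 106           # tagged spans that really were places
england, france = 29, 38

precision = correct / tagged            # how much of the tagger's output is right
recall = correct / total_placenames     # how much of the real set it found
f1 = 2 * precision * recall / (precision + recall)
missed = total_placenames - correct
country_share = (england + france) / correct  # assumption: share of correct tags

print(f"precision     = {precision:.1%}")      # 97.2%
print(f"recall        = {recall:.1%}")         # 50.5%
print(f"F1            = {f1:.1%}")             # 66.5%
print(f"missed        = {missed}")             # 104
print(f"England+France = {country_share:.1%}") # 63.2%
```

On these numbers the tagger is almost always right when it tags something, but it finds only about half of the placenames, and roughly two thirds of what it does find is "England" or "France".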
28
Is it worth using NER for
this?
● No.
● Possibly.
– It can function as a check on manual tagging.
● Yes.
– 75 "city plays" are eventually coming...
29
Happy endings
● Second rule of collaboration: nobody wants to
be left out.
● Now the ISE editors have seen how we're
linking to their plays, they want to tag
placenames for themselves.
● We'll just be able to harvest their tags for
MoEML.
30
Functional interoperability
ISE guidelines:
Internal Links to MoEML's London Locations
We are moving towards interoperability with The Map of Early
Modern London (MoEML). If your play includes references to
London locations, you will identify each London location using
the ilink element and the unique MoEML identifier for the
location....
31
ISE guidelines, cont.
The purpose of this tagging is two-fold: (1) it
will allow us to visualize the London locations in
a play using a MoEML map in the
ISE/DRE/QME environment, and (2) it will
allow MoEML to import London references in
ISE/DRE/QME plays into its database of
literary references (with a link back to the ISE).
32
ISE/DRE/QME tagging
<ilink component="geo"
href="mol:CHEA2">Cheapside</ilink>
ISE will have various instructions in its "geo"
component (England, France, Europe, London, stage
geometry)
All we need is the mol:XXXX# and the TLN
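
Harvesting those two pieces of information could look something like the sketch below. It rests on several assumptions that the slides do not confirm: that the play text parses as well-formed XML (the ISE texts are described earlier as SGML and non-standard XML, so real input may need preprocessing), and that TLNs are recorded as `<tln n="…"/>` milestones, which is a hypothetical encoding chosen for the example. The sample text is invented; only the `ilink` element with `component="geo"` and `href="mol:CHEA2"` comes from the slide.

```python
import xml.etree.ElementTree as ET

# Invented sample; the <tln n="..."/> milestone is a hypothetical encoding.
SAMPLE = """<scene>
  <tln n="101"/>Walking down <ilink component="geo" href="mol:CHEA2">Cheapside</ilink> today
  <tln n="102"/>no places here
</scene>"""


def harvest(xml_text):
    """Return (MoEML id, TLN, surface text) for each geo ilink with a mol: href."""
    current_tln = None
    found = []
    # iter() walks the tree in document order, so the most recent <tln>
    # milestone seen is the line the ilink belongs to.
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag == "tln":
            current_tln = int(elem.get("n"))
        elif elem.tag == "ilink" and elem.get("component") == "geo":
            href = elem.get("href", "")
            if href.startswith("mol:"):
                mol_id = href[len("mol:"):]
                found.append((mol_id, current_tln, (elem.text or "").strip()))
    return found


if __name__ == "__main__":
    print(harvest(SAMPLE))
```

Other uses of the "geo" component (for example stage geometry) would simply be skipped by the `mol:` prefix check.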
33
Should we continue to use NER?
ISE wants to use tags only in modern critical
editions.
ISE editions of 1 Henry IV, 2 Henry IV, and
Henry V are “done.”
500+ plays in DRE

Editor's notes

  1. There is an obvious convergence between the two projects, and we imagine many benefits from interoperability. The gazetteer and mapping features in MoEML would significantly enhance the critical apparatus of the plays, while tying MoEML's placeography into the works of Shakespeare and his contemporaries would reinforce links between the physical geography and the literature.
  2. We cannot simply ask or expect the editors of ISE plays to tag all the placenames for us. They're too busy with other stuff, and they can't see the payoff.
  3. Our original plan involved the creation of a detailed London gazetteer, including all the variant spellings of placenames we know from our own texts, along with a training set of manually tagged plays, to serve as input to the NER process.
  4. Between the first and second plays, I improved the gazetteer substantially by importing a lot of non-London content; and for each play after the first, there's a larger training set, leading to better results. While precision is remarkably good (over 95% for the last two), recall is very low, and improving only slowly. Note: NER did find several placenames I'd missed in the tagging of the plays.
  5. Places are people throughout the history plays. Syntax is frequently convoluted. Spelling in the old-spelling plays is inconsistent within the play. Nouns are frequently capitalized, so capitalization is not a useful clue for the NER engine as it is with modern texts.
  6. The point here is that the placenames we are most interested in are precisely the ones the tagger is least good at finding. It even missed "London" in one case. Note also, though, that despite finding "England" 29 times, it contrived to miss it 14 times.
  7. No, because it's hopeless at the very thing we care about most; and we only have 10 plays in our Shakespeare set. Possibly, because it's slowly getting better, and although we have only 10 Shakespeare plays, we have up to 75 "city plays" coming in the future from Digital Renaissance Editions (out of 500 they intend to tag). Yes, because NER did catch a few instances of placenames I'd missed.