Enhancing and Extending the Digital Study of
Intertextuality (pt. 2)
Matteo Romanello (DAI/KCL) @mr56k
“Making Meaning from Data” DCA Panel @ SCS annual meeting, New Orleans, January 11, 2015
Digital Study of Intertextuality
Enhancing and Extending the Digital Study of Intertextuality:
1. finding new possible parallel passages
2. creating a systematic index of already studied parallels
The Classicist’s Toolkit
Tool                          Accuracy  Granularity  Coverage
Indexes Locorum               +         +            –
Library Catalogues            +         –            –
Full-text Search              –         –            +
Automatic Citation Indexing   +/–       +            +
Citation Extraction, Step 1: Named Entity Recognition
Current accuracy (F1-score): 73.88%
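For readers unfamiliar with the metric quoted on these slides: the F1-score is the harmonic mean of precision and recall. The slides report only the final percentages, so the counts in this minimal sketch are hypothetical and serve just to show the arithmetic.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (not taken from the slides):
# 8 correct entities, 2 spurious ones, 4 missed ones.
example = f1_score(true_positives=8, false_positives=2, false_negatives=4)
```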
Citation Extraction, Step 2: Relation Detection
Current accuracy (F1-score): 92.60%
Citation Extraction, Step 3: Disambiguation
Current accuracy (F1-score): 73.05%
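The three steps can be pictured with a toy, rule-based sketch. This is not the system presented in the talk: the abbreviation table, the regular expression, and all function names are assumptions made for illustration, and simple rules stand in for whatever models the actual extractor uses.

```python
import re

# Illustrative abbreviation table (an assumption, not the author's knowledge base).
WORKS = {"Hom. Il.": "Homer, Iliad", "Verg. Aen.": "Virgil, Aeneid"}

# A passage scope such as "1.1-10" or "6.847".
SCOPE_RE = re.compile(r"\d+(?:\.\d+)*(?:-\d+)?")


def recognize_entities(text):
    """Step 1: return (kind, start, surface) tuples sorted by position."""
    entities = []
    for abbreviation in WORKS:
        for match in re.finditer(re.escape(abbreviation), text):
            entities.append(("work", match.start(), match.group()))
    for match in SCOPE_RE.finditer(text):
        entities.append(("scope", match.start(), match.group()))
    return sorted(entities, key=lambda entity: entity[1])


def detect_relations(entities):
    """Step 2: attach each scope to the closest preceding work mention."""
    relations, current_work = [], None
    for kind, _, surface in entities:
        if kind == "work":
            current_work = surface
        elif current_work is not None:
            relations.append((current_work, surface))
    return relations


def disambiguate(relations):
    """Step 3: resolve each abbreviation to a canonical work label."""
    return [(WORKS[work], scope) for work, scope in relations]


text = "cf. Hom. Il. 1.1-10 and 2.5; see also Verg. Aen. 6.847"
citations = disambiguate(detect_relations(recognize_entities(text)))
```

Here `citations` holds one (work, scope) pair per cited passage; the second scope, "2.5", is linked to the Iliad only because step 2 carries the last seen work forward.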
Mining Citations from APh and JSTOR
APh
  80 volumes
  8% of vol. 75 (2004)
  366 abstracts
  26k tokens
  380 citations
JSTOR
  Classics
  1,456 journals
  171k articles
  327m tokens
  Materiali e Discussioni (29 yrs: 1978-2006)
    669 articles
    5.6m tokens
    40k citations
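One way to put the two corpora on a common scale is citations per 1,000 tokens. The sketch below is only arithmetic on the rounded figures from this slide, and it assumes that the 380 citations and 26k tokens describe the same APh sample and that the 40k citations and 5.6m tokens both refer to Materiali e Discussioni. The function name is made up for illustration.

```python
def citations_per_1k_tokens(citations, tokens):
    """Citation density: number of citations per 1,000 tokens."""
    return 1000 * citations / tokens


# Rounded figures as given on the slide.
aph_density = citations_per_1k_tokens(380, 26_000)
md_density = citations_per_1k_tokens(40_000, 5_600_000)
```

Because the inputs are rounded ("26k", "40k", "5.6m"), the resulting densities are rough indications at best.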
From Index to Network
From Texts to Network
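The slide titles do not say how the network is built. One plausible reading, sketched here under that assumption, is a co-citation network: two passages are linked whenever the same publication cites both, and the edge weight counts how many publications do so. The record format and all identifiers are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical extraction output: (publication id, cited passage).
records = [
    ("article-1", "Hom. Il. 1.1"),
    ("article-1", "Verg. Aen. 1.1"),
    ("article-2", "Hom. Il. 1.1"),
    ("article-2", "Verg. Aen. 1.1"),
    ("article-2", "Hom. Od. 1.1"),
]


def cocitation_edges(records):
    """Map each unordered passage pair to the number of publications citing both."""
    cited_by_publication = defaultdict(set)
    for publication, passage in records:
        cited_by_publication[publication].add(passage)
    edges = Counter()
    for passages in cited_by_publication.values():
        # Sorting gives every pair a single canonical orientation.
        for first, second in combinations(sorted(passages), 2):
            edges[(first, second)] += 1
    return edges


edges = cocitation_edges(records)
```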
APh: Micro-level
APh: Macro-level
APh: Meso-level
JSTOR: diachronic trends in MD (1978-2006)
Diachronic trends of 5 most cited authors in MD (1978-2006)
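A diachronic trend of this kind can be derived from a list of (year, cited author) pairs. The following is a minimal sketch with invented toy data; the input format and the function name are assumptions, not the author's code.

```python
from collections import Counter


def top_author_trends(citations, n=5):
    """Return {author: [(year, count), ...]} for the n most cited authors."""
    totals = Counter(author for _, author in citations)
    top_authors = [author for author, _ in totals.most_common(n)]
    per_year = Counter(citations)  # (year, author) -> number of citations
    years = sorted({year for year, _ in citations})
    return {
        author: [(year, per_year[(year, author)]) for year in years]
        for author in top_authors
    }


# Invented toy data, one tuple per extracted citation.
toy_citations = [
    (1978, "Homer"), (1978, "Homer"), (1978, "Virgil"),
    (1979, "Homer"), (1979, "Ovid"), (1979, "Virgil"), (1979, "Virgil"),
]
trends = top_author_trends(toy_citations, n=2)
```

Years in which an author is never cited appear with a count of 0, so each series covers the same time axis.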
Digital Study of Intertextuality: Future Work
Combination of:
  discovery of new possible parallel passages
  systematic index of already studied parallels
Scenario:
  use of the parallel frequency for ranking automatically identified candidates
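The scenario can be sketched as follows, assuming "parallel frequency" means how often a pair of passages is already attested in the index of studied parallels. The slide does not define the term, so this reading, the data, and the function name are all assumptions.

```python
from collections import Counter


def rank_candidates(candidates, index_entries):
    """Order candidate parallels by how often the index already records them."""
    # frozenset makes the pair unordered: (A, B) and (B, A) count as the same parallel.
    frequency = Counter(frozenset(pair) for pair in index_entries)
    return sorted(
        candidates,
        key=lambda pair: frequency[frozenset(pair)],
        reverse=True,
    )


# Hypothetical index of parallels already discussed in secondary literature.
index_entries = [
    ("Verg. Aen. 1.1", "Hom. Il. 1.1"),
    ("Hom. Il. 1.1", "Verg. Aen. 1.1"),
    ("Verg. Aen. 6.847", "Hom. Od. 11.1"),
]
# Hypothetical output of an automatic parallel-passage detector.
candidates = [
    ("Hom. Od. 11.1", "Verg. Aen. 6.847"),
    ("Hom. Il. 1.1", "Verg. Aen. 1.1"),
    ("Hom. Il. 2.1", "Verg. Aen. 2.1"),
]
ranked = rank_candidates(candidates, index_entries)
```

Candidates that never occur in the index sort last; depending on the research question, those unattested pairs may be the most interesting ones, so the sort order could equally be reversed.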
Thank you for your attention!
Questions?
matteo.romanello@gmail.com
https://twitter.com/mr56k
Some Links:
http://phd.mr56k.info/data/viz/macro.html
http://phd.mr56k.info/data/viz/meso.html
http://phd.mr56k.info/data/viz/micro.html
https://github.com/mromanello/APh_Corpus
https://github.com/mromanello/CRefEx

Enhancing and Extending the Digital Study of Intertextuality (pt. 2): Revealing Patterns of Intertextuality in Corpora of Secondary Literature
