Text and data mining – the way forward
www.alpsp.org 30 June 2016
Agenda
• What is TDM?
• What are the issues?
• What are the solutions?
What is TDM?
• NOT a simple internet search
• Text: human readable to structured data
• Data: software to identify trends, connections
• NOT new
– Financial and business industries
– More recently, interest from scholarly research
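A minimal sketch of the two steps named above: turning human-readable text into structured data, then using software to find connections. The sample sentences and term list are hypothetical stand-ins for real content and a real vocabulary, and a real mining project would use far richer tools than this.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample sentences standing in for article text.
SENTENCES = [
    "Aspirin reduces fever and inflammation.",
    "Ibuprofen reduces fever and inflammation.",
    "Aspirin may interact with ibuprofen.",
]

# Hypothetical vocabulary of terms to look for.
TERMS = {"aspirin", "ibuprofen", "fever", "inflammation"}


def extract_terms(sentence):
    """Text step: reduce one sentence to a sorted list of known terms."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sorted(set(words) & TERMS)


def co_occurrences(sentences):
    """Data step: count how often each pair of terms shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        for pair in combinations(extract_terms(sentence), 2):
            counts[pair] += 1
    return counts


records = [extract_terms(s) for s in SENTENCES]
pairs = co_occurrences(SENTENCES)
```

Here `records` is the structured form of the text, and `pairs.most_common()` ranks the term pairs that appear together most often, which is one simple way to surface a connection across many documents.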
What do miners need?
• Content
– Access to, or copy of
• Software
– To convert content (where required)
– Tools
• Skills/knowledge
Issues? (publisher perspective)
• Copyright
– Is this really what’s holding TDM back?
• Do researchers really know what the opportunities are,
and how to take advantage of them?
• What are the technical challenges that need to be
overcome?
PRC study
• TDM survey of researchers (520 respondents)
– global
– cross-discipline
– wide age range
– primarily academic institutions
• 69% report never having used the technique
• a further 7% tried it once, but have not used it again
• 11% use it regularly
PRC study
• Two thirds of respondents interested in learning more
– consider useful for
• literature review
• extracting new facts
• finding hidden links
PRC study
• Existing users
– Advantages
• time-saving
• perform tasks they could not otherwise do
– Where
• Equally using offline or online content
– Using
• Open source, off-the-shelf or other services
Issues? (publisher perspective)
• TDM can involve copying lots of content
• Different legal systems
– UK – exception
– EU – likely exception
– Other countries considering
TDM and legislation
• TDM is happening now…
…and has been for some years
• Licensed for commercial use (pharma, financial,
business)
• What effect did the exception in the UK have?
Effect of UK exception (introduced October 2014)
• where user has lawful access to content
• for computational analysis
• research for non-commercial purpose
• Guidance notes make clear that:
– Publishers/content providers are able to provide
reasonable measures to maintain their network
security or stability
Effect of UK exception (introduced October 2014)
Why is there a call to keep this non-commercial?
• Commercial?
– Already licensing and support for commercial mining
in finance, pharma and other businesses
• Protection of content
– Hackers, sharing of usernames and passwords
– Human-readable experience
Effect of UK exception (introduced October 2014)
• Publishers Licensing Society survey
– Few requests being made
– Access being facilitated, unless project outcomes will
directly compete with publisher product
Year   % publishers approached   Total TDM requests
2013   16%                       79
2014   15%                       91
2015   14%                       84
Copyright and TDM
– In general, publishers working to facilitate TDM
– Many now have statements to support mining
• http://olabout.wiley.com/WileyCDA/Section/id-826542.html
• http://www.springer.com/gp/rights-permissions/springer-s-text-and-data-mining-policy/29056
• https://www.elsevier.com/about/company-information/policies/text-and-data-mining
• http://www.microbiologyresearch.org/authors/editorial-policies
– But what happens when there is uncertainty?
PLSClear
– don’t have legal access?
– request permission centrally (not just for TDM)
Technical challenges
• Technical challenges
– facilitating access to bona fide miners
• Prevent theft of content
• Protecting platforms
– Lack of standardization
• Content format
• Interfaces with researchers' preferred mining tools
– Access to both subscribed and non-subscribed
content
Overcoming technical challenges
• RightFind™ XML for mining
– Produces standardized data
• Crossref text and data mining services
– DOIs
– Metadata API
– Delivers full text from publisher sites
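As a rough illustration of the Crossref route, the sketch below builds a metadata API request for a DOI and picks out any full-text links flagged for text mining. It assumes the public Crossref REST API at `api.crossref.org` and its `link` / `intended-application` fields. The sample record and its URLs are hypothetical, and access to the linked full text still depends on the miner's entitlements with each publisher.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public Crossref REST API.
API_ROOT = "https://api.crossref.org/works/"


def metadata_url(doi):
    """Build the metadata API URL for one DOI."""
    return API_ROOT + quote(doi, safe="/")


def text_mining_links(work):
    """Return the full-text URLs a publisher has flagged for text mining."""
    return [
        link["URL"]
        for link in work.get("link", [])
        if link.get("intended-application") == "text-mining"
    ]


def fetch_work(doi):
    """Fetch one work record from the API (requires network access)."""
    with urlopen(metadata_url(doi)) as response:
        return json.load(response)["message"]


# Hypothetical record, trimmed to the fields used above.
SAMPLE_WORK = {
    "DOI": "10.5555/example",
    "link": [
        {
            "URL": "https://publisher.example/fulltext.xml",
            "content-type": "application/xml",
            "intended-application": "text-mining",
        },
        {
            "URL": "https://publisher.example/article.html",
            "content-type": "text/html",
            "intended-application": "similarity-checking",
        },
    ],
}

links = text_mining_links(SAMPLE_WORK)
```

In practice a miner would call `fetch_work` for each DOI of interest, pass the result to `text_mining_links`, and then request the full text from the publisher site at a rate the platform can sustain.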
Objectives n learning outcoms - MD 20240404.pptxObjectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptx
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdf
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
Shark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsShark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristics
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their uses
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
6 ways Samsung’s Interactive Display powered by Android changes the classroom
6 ways Samsung’s Interactive Display powered by Android changes the classroom6 ways Samsung’s Interactive Display powered by Android changes the classroom
6 ways Samsung’s Interactive Display powered by Android changes the classroom
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 

Text and Data Mining - The Way Forward for Scholarly Research

  • 1. Text and data mining – the way forward www.alpsp.org 30 June 2016
  • 2. Agenda • What is TDM? • What are the issues? • What are the solutions?
  • 3. What is TDM?
  • 6. What is TDM? • NOT a simple internet search
  • 7. What is TDM? • NOT a simple internet search • Text: human readable to structured data • Data: software to identify trends, connections
  • 8. What is TDM? • NOT a simple internet search • Text: human readable to structured data • Data: software to identify trends, connections • NOT new – Financial and business industries – More recently, interest from scholarly research
  • 9. What do miners need? • Content – Access to, or copy of
  • 10. What do miners need? • Content – Access to, or copy of • Software – To convert content (where required) – Tools
  • 11. What do miners need? • Content – Access to, or copy of • Software – To convert content (where required) – Tools • Skills/knowledge
  • 12. Issues? (publisher perspective) • Copyright – Is this really what’s holding TDM back?
  • 13. Issues? (publisher perspective) • Copyright – Is this really what’s holding TDM back? • Do researchers really know what the opportunities are, and how to take advantage of them? • What are the technical challenges that need to be overcome?
  • 14. PRC study • TDM survey of researchers (520 respondents) – global – cross-discipline – wide age range – primarily academic institutions
  • 15. PRC study • TDM survey of researchers (520 respondents) – global – cross-discipline – wide age range – primarily academic institutions • 69% report never using the technique
  • 16. PRC study • TDM survey of researchers (520 respondents) – global – cross-discipline – wide age range – primarily academic institutions • 69% report never using the technique • further 7% tried once, but haven’t used again
  • 17. PRC study • TDM survey of researchers (520 respondents) – global – cross-discipline – wide age range – primarily academic institutions • 69% report never using the technique • further 7% tried once, but haven’t used again • 11% use regularly
  • 18. PRC study • Two thirds of respondents are interested in learning more – consider it useful for • literature review • extracting new facts • finding hidden links
  • 19. PRC study • Existing users – Advantages • time-saving • perform tasks they could not otherwise do – Where • Equally using offline or online content – Using • Open source, off-the-shelf or other services
  • 20. Issues? (publisher perspective) • TDM can involve copying lots of content • Different legal systems – UK – exception – EU – likely exception – Other countries considering © jenniferhardie.com
  • 21. TDM and legislation • TDM is happening now… …and has been for some years • Licensed for commercial use (pharma, financial, business)
  • 22. TDM and legislation • TDM is happening now… …and has been for some years • Licensed for commercial use (pharma, financial, business) • What effect did the exception in the UK have?
  • 23. Effect of UK exception (introduced October 2014) • where user has lawful access to content • for computational analysis • research for non-commercial purpose
  • 24. Effect of UK exception (introduced October 2014) • where user has lawful access to content • for computational analysis • research for non-commercial purpose • Guidance notes make clear that: – Publishers/content providers are able to apply reasonable measures to maintain their network security or stability
  • 25. Why is there a call to keep this non-commercial? • Commercial? – Already licensing and support for commercial mining in finance, pharma and other businesses • Protection of content – Hackers, sharing of usernames and passwords – Human-readable experience
  • 26. Effect of UK exception (introduced October 2014) • Publishers Licensing Society survey – Few requests being made – Access being facilitated, unless project outcomes will directly compete with publisher product
      Year   % publishers approached   Total TDM requests
      2013   16%                       79
      2014   15%                       91
      2015   14%                       84
  • 27. Copyright and TDM – In general, publishers working to facilitate TDM
  • 28. Copyright and TDM – In general, publishers working to facilitate TDM – Many now have statements to support mining • http://olabout.wiley.com/WileyCDA/Section/id-826542.html • http://www.springer.com/gp/rights-permissions/springer-s-text-and-data-mining-policy/29056 • https://www.elsevier.com/about/company-information/policies/text-and-data-mining • http://www.microbiologyresearch.org/authors/editorial-policies
  • 29. Copyright and TDM – In general, publishers working to facilitate TDM – Many now have statements to support mining • http://olabout.wiley.com/WileyCDA/Section/id-826542.html • http://www.springer.com/gp/rights-permissions/springer-s-text-and-data-mining-policy/29056 • https://www.elsevier.com/about/company-information/policies/text-and-data-mining • http://www.microbiologyresearch.org/authors/editorial-policies – But what happens when there is uncertainty?
  • 30. PLSClear – don’t have legal access? – request permission centrally (not just for TDM)
  • 31. Technical challenges • Technical challenges – facilitating access to bona fide miners • Prevent theft of content • Protecting platforms
  • 32. Technical challenges • Technical challenges – facilitating access to bona fide miners • Prevent theft of content • Protecting platforms – Lack of standardization • Content format • Interfaces with researchers' preferred mining tools
  • 33. Technical challenges • Technical challenges – facilitating access to bona fide miners • Prevent theft of content • Protecting platforms – Lack of standardization • Content format • Interfaces with researchers' preferred mining tools – Access to both subscribed and non-subscribed content
  • 34. Overcoming technical challenges • RightFind™ XML for mining – Produces standardized data
  • 35. Overcoming technical challenges • Crossref text and data mining services – DOIs – Metadata API – Delivers full text from publisher sites
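
The Crossref route on slide 35 (DOI, then metadata API, then full text on the publisher site) can be sketched roughly as follows. This is a minimal illustration, not the presenter's workflow: it assumes the public Crossref REST API at api.crossref.org, whose work records may carry a "link" list with an "intended-application" field. The sample record, its DOI and its example.org URLs are hypothetical, and access to subscribed full text still depends on the miner's entitlements with each publisher.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works/"  # public Crossref REST API


def fetch_work(doi: str, mailto: str = "you@example.org") -> dict:
    """Fetch the Crossref metadata record for one DOI (needs network access).

    The mailto address is a placeholder; Crossref asks polite users to
    identify themselves this way.
    """
    url = (CROSSREF_WORKS + urllib.parse.quote(doi)
           + "?mailto=" + urllib.parse.quote(mailto))
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["message"]


def text_mining_links(work: dict) -> list[tuple[str, str]]:
    """Return (content_type, url) pairs the publisher flagged for text mining."""
    links = []
    for link in work.get("link", []):
        if link.get("intended-application") == "text-mining":
            links.append((link.get("content-type", "unspecified"), link["URL"]))
    return links


# Hypothetical record shaped like a Crossref response, so the sketch runs offline.
SAMPLE_WORK = {
    "DOI": "10.5555/12345678",
    "link": [
        {"URL": "https://example.org/fulltext/12345678.xml",
         "content-type": "application/xml",
         "intended-application": "text-mining"},
        {"URL": "https://example.org/fulltext/12345678.pdf",
         "content-type": "application/pdf",
         "intended-application": "similarity-checking"},
    ],
}

if __name__ == "__main__":
    for content_type, url in text_mining_links(SAMPLE_WORK):
        print(content_type, url)
```

For a live lookup, text_mining_links(fetch_work(doi)) would replace the sample record. Records without publisher-supplied links simply yield an empty list.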