Defragmentation:
Maximising the Use of
Existing Knowledge
Jan Velterop — APE 2015 — Berlin 21 January 2015
Open Access…
…is not the goal
It is a means
to reach the goal
And the goal is…?
Maximal usefulness of existing
scientific research results in order to
achieve:
efficient, fast, and effective new
knowledge creation and discovery
i.e. highest
possible return on
public investment
optimal dissemination…
…of knowledge
The ultimate goal, to which
Open Access is merely a
means, may not be widely
understood – by publishers
The ultimate goal, to which
Open Access is merely a
means, may not be widely
understood
That may be why there are
a lot of different
interpretations of what
Open Access actually is
(in spite of the clear definition given in the
Budapest Open Access Initiative)
The fact that not all published
research is accessible to all
researchers leads to ‘lamp
post research’
Lamp post research
Looking merely at the literature that
one can access – which is not
necessarily the literature that is
potentially important to one’s research
Lamp post research:
Publicatarrh
&
Datarrhoea
[Chart: number of abstracts in PubMed, in year and cumulative; labelled value: 11,135,542]
…averaging more than 2 abstracts
added every minute in 2014…
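To get a feel for that rate, a back-of-the-envelope conversion helps: two abstracts a minute, kept up for a 365-day year, comes to a little over a million abstracts. The sketch below only does that arithmetic. The function names are made up for illustration, and the figure of 2 per minute is the slide's lower bound, not an exact count.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def abstracts_per_year(per_minute: float) -> float:
    """Convert a sustained per-minute rate into a yearly total."""
    return per_minute * MINUTES_PER_YEAR


def abstracts_per_minute(per_year: float) -> float:
    """Convert a yearly total back into an average per-minute rate."""
    return per_year / MINUTES_PER_YEAR


# "More than 2 abstracts every minute" implies more than this many per year.
print(abstracts_per_year(2))
```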
On the impossibility of being expert
341 doi: http://dx.doi.org/10.1136/bmj.c6815 (Published 14 Dece
More scientific and medical papers are being
published now than ever before.
Authors Alan G Fraser and Frank D Dunstan
think that new strategies are needed to deal with
this avalanche of information
new strategies are needed
How does a researcher
decide what’s ‘relevant’
anyway?
How are we filtering or
choosing?
Possible
solutions?
Every problem has its solution
Possible solutions?
Publish fewer articles
Don’t be ridiculous!
Find better ways to decide what’s
truly relevant
Now you’re talking!
First
create an overview…
…only then
start digging
We need the equivalent of aerial surveys
— ‘knowledge drones’? —
Some of my professors were already known as
‘knowledge drones’ :-)
How might we create overviews?
Getting the picture from a large number of data points
‘Whole-o-gram’
Getting a better picture from even more data points
Homing in
on detail
It’s not just about finding
information
It’s also – and possibly more –
about the value & power of
‘recombinant knowledge’
Saving significant time-to-knowledge
After analysis in BRAIN: 4 minutes
Arriving at this conclusion (review in Frontiers in Immunology)
after reading 221 papers: weeks
“Chronic immune activation is the primary driver in HIV pathogenesis”
What stands in the way?
different…
• publishers
• journals
• platforms
• licences
• formats
• silos
• languages
First of all: fragmentation
And also, of course: (lack of) access
Not to the whole article…but to the data
and assertions buried in them
Plenty of initiatives to find stuff:
• PubChase – Open Access Biomedical Journal
Reference Library
• Paperity
• SciLit – Database of Scientific and Scholarly
Literature
• Google Scholar
• Et cetera
Some go further:
• Europe PubMed Central – offering semantic
tools
Not many ‘true’ open access:
• all full-text articles in PubMed Central: 3,087,430 (100%)
• all articles with CC licences: 366,973 (11.9%)
• all articles with CC-BY licences: 270,114 (8.7%)
Source: Europe PMC, 19 December 2014
“The majority of articles in PMC are subject to traditional copyright restrictions”
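The percentages quoted for Europe PMC can be recomputed from the three counts on the slide (3,087,430 full-text articles, 366,973 with a CC licence, 270,114 with CC-BY). A minimal sketch, with constant and function names chosen here purely for illustration:

```python
# Counts as quoted on the slide (Europe PMC, 19 December 2014).
FULL_TEXT_IN_PMC = 3_087_430
CC_LICENSED = 366_973
CC_BY_LICENSED = 270_114


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)


print(share(CC_LICENSED, FULL_TEXT_IN_PMC))     # share with any CC licence
print(share(CC_BY_LICENSED, FULL_TEXT_IN_PMC))  # share with CC-BY
# Whatever is left has no CC licence at all.
print(share(FULL_TEXT_IN_PMC - CC_LICENSED, FULL_TEXT_IN_PMC))
```

The last line gives the share of articles without any CC licence, which is the portion the quoted sentence about traditional copyright restrictions refers to.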
What we need is information
extracted from as many articles as
possible
The more we have, the ‘sharper’
the knowledge picture
Fragmentation and lack of access are
encumbrances to seamless knowledge-
pattern-analyses and themed collection
building (e.g. of graphs)…
…which are fast becoming an absolute
necessity due to the vast amounts of
published material, growing every year,
and, of course, in the aggregate
“As the rate of publishing accelerates,
the need for computational support to
work out which articles to read, and how
to interpret, reproduce and validate the
claims they contain is growing.”
Quote from ‘Lazarus’:
http://www.bbsrc.ac.uk/pa/grants/AwardDetails.aspx?FundingReference=BB/L005298/1
Traditional publications are aimed at
consumption by humans;
“stories that persuade with data”*
Not easily amenable to
machine-processing
* Anita de Waard, Elsevier
In the life-science literature, we typically find:
• drug-like molecules represented as illustrations;
• biochemical properties as tables or graphs;
• protein/DNA sequences buried amongst text;
• references and citations with arcane formats;
• other objects of biological interest being given
ambiguous names.
And, horrors like this (from PLOS, h/t Peter Murray-Rust):
+ (plus underscored) isn’t the same as ± (plus-minus)!
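Why this matters to a machine: an underlined "+" is still the character U+002B once the styling is lost, whereas a real plus-minus sign is U+00B1. The sketch below is an illustration only; the regular expression, the function name and the sample values are invented here and are not taken from the PLOS example.

```python
import re

# Matches "<number> ± <number>", where ± is U+00B1 PLUS-MINUS SIGN.
MEAN_WITH_UNCERTAINTY = re.compile(
    r"(\d+(?:\.\d+)?)\s*\u00b1\s*(\d+(?:\.\d+)?)"
)


def parse_uncertainty(text):
    """Return (value, uncertainty) if text contains 'x ± y', else None."""
    match = MEAN_WITH_UNCERTAINTY.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Invented sample strings: the first uses the real sign, the second a plain
# plus, which is what an underlined "+" becomes as plain text.
print(parse_uncertainty("5.2 \u00b1 0.3"))
print(parse_uncertainty("5.2 + 0.3"))
```

With the plain plus, the parser finds no uncertainty at all, and a naive reader of the text would see an addition instead.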
This creates the need to:
• re-type figures from tables;
• chase citations through digital
libraries;
• redraw molecules by hand;
• et cetera.
tedious, error-prone, wasteful
scientists should be able to use
their precious time better
Utopia Documents
Via UD (http://utopiadocs.com), LAZARUS ‘resurrects’ knowledge from being
buried in articles:
• entities (‘concepts’, incl. synonyms, e.g. proteins)
• phrases, statements, assertions (e.g. triples)
• molecules (incl. Markush structure groups)
• graphs
• tables
These are captured – with their provenance, e.g.
DOI – in a ‘Knowledge Graph’ of their relationships
When assertions are captured, they are compared to
the Knowledge Graph and labelled as ‘new’ (to the
Graph) or ‘already found earlier’
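The capture-and-compare step could be sketched roughly as follows. This is not Lazarus's actual implementation, which the slides do not describe; the class, the method name and the DOIs are placeholders invented for illustration. Each assertion is stored as a triple together with the DOIs it was found in, and the return value says whether the triple was new to the graph.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class KnowledgeGraph:
    """Toy store of assertions with their provenance (hypothetical design)."""

    def __init__(self) -> None:
        # Maps each triple to the list of DOIs in which it was found.
        self.provenance: Dict[Triple, List[str]] = {}

    def add(self, triple: Triple, doi: str) -> str:
        """Record the triple and label it relative to the graph so far."""
        is_new = triple not in self.provenance
        sources = self.provenance.setdefault(triple, [])
        if doi not in sources:
            sources.append(doi)
        return "new" if is_new else "already found earlier"


graph = KnowledgeGraph()
assertion = ("VHL protein", "binds to", "HIF-α")
# Placeholder DOIs, not real articles.
print(graph.add(assertion, "10.0000/placeholder-a"))
print(graph.add(assertion, "10.0000/placeholder-b"))
```

Keeping every DOI, rather than only the first, means an 'already found earlier' assertion still gains supporting provenance each time it is seen again.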
“Lazarus to harness the crowd reading life-
science articles to resurrect the swathes of
legacy data buried in charts, tables, diagrams
and free-text, to liberate processable data into a
shared resource that benefits the community.”
“…activities currently carried out anyway by
individuals for their own purposes (annotating,
cross-referencing articles with databases,
organising collections of articles).”
VHL protein binds to HIF-α which is ubiquitinated and tagged for degradation in the proteasome.
These ‘assertions’ form the ‘knowledge
profile’ of an article, and are added to a
growing ‘knowledge graph’ which can
be analysed for trends, clusters, areas
of intensive activity, et cetera.
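As a rough illustration of that kind of analysis, the sketch below builds the 'knowledge profile' of one article and counts how often each entity occurs across all captured assertions, as a crude proxy for areas of intensive activity. The article identifiers are placeholders, and the triples are hand-written readings of the VHL example sentence, not the output of any real extraction tool.

```python
from collections import Counter
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# (article id, assertion) pairs; ids are placeholders.
assertions: List[Tuple[str, Triple]] = [
    ("article-A", ("VHL protein", "binds to", "HIF-α")),
    ("article-A", ("HIF-α", "is tagged for degradation in", "proteasome")),
    ("article-B", ("VHL protein", "binds to", "HIF-α")),
]


def knowledge_profile(pairs: List[Tuple[str, Triple]], article: str) -> Set[Triple]:
    """All assertions captured from one article."""
    return {triple for source, triple in pairs if source == article}


def entity_activity(pairs: List[Tuple[str, Triple]]) -> Counter:
    """Count every appearance of an entity as subject or object."""
    counts: Counter = Counter()
    for _source, (subject, _predicate, obj) in pairs:
        counts[subject] += 1
        counts[obj] += 1
    return counts


print(knowledge_profile(assertions, "article-A"))
print(entity_activity(assertions).most_common(1))
```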
Some other initiatives to bring
the open literature together so
that it can be used for large
scale semantic analyses:
libraccess.org
The goal of Libraccess is to
aggregate, de-duplicate, clean and
index scientific resources in open
access repositories, from
all countries, from all disciplines,
and make them available to all,
through a website and with APIs.
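What 'aggregate, de-duplicate, clean and index' might look like at its very simplest is sketched below. This is an assumption-laden toy, not Libraccess's pipeline or API: records are plain dictionaries, duplicates are detected by normalised DOI only, and the index is a word-to-DOI lookup on titles. The first DOI is the one cited earlier in these slides; the second is a placeholder.

```python
from typing import Dict, List, Set

DOI_PREFIXES = (
    "https://doi.org/", "http://doi.org/",
    "https://dx.doi.org/", "http://dx.doi.org/", "doi:",
)


def normalise_doi(doi: str) -> str:
    """Clean a DOI: trim, lower-case (DOIs are case-insensitive), drop resolver prefix."""
    cleaned = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if cleaned.startswith(prefix):
            return cleaned[len(prefix):]
    return cleaned


def deduplicate(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record seen for each normalised DOI."""
    seen: Set[str] = set()
    unique: List[Dict[str, str]] = []
    for record in records:
        doi = normalise_doi(record["doi"])
        if doi not in seen:
            seen.add(doi)
            unique.append({"doi": doi, "title": record["title"].strip()})
    return unique


def build_index(records: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each lower-cased title word to the DOIs whose titles contain it."""
    index: Dict[str, Set[str]] = {}
    for record in records:
        for word in record["title"].lower().split():
            index.setdefault(word, set()).add(record["doi"])
    return index


# The same article harvested twice from different repositories, plus a placeholder.
harvested = [
    {"doi": "http://dx.doi.org/10.1136/bmj.c6815", "title": "On the impossibility of being expert"},
    {"doi": "10.1136/BMJ.C6815", "title": "On the impossibility of being expert "},
    {"doi": "10.0000/placeholder", "title": "A placeholder record"},
]
unique = deduplicate(harvested)
index = build_index(unique)
print(len(unique), sorted(index["expert"]))
```

A real aggregator would also need fuzzy matching for records without DOIs, which this sketch leaves out.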
Research Pad
Open Access Journal Reference Library
(www.researchpad.co)
Converting all that’s open (CC-BY) into ePub format
for tablets and smartphones.
What I find most interesting, however, is their plan*
to make the whole body of all literature that’s openly
accessible available in XML for semantic analysis†
* being worked on as we speak, they confirmed to me
† I hope they will add the ‘knowledge profiles’ of paywalled
articles created by Lazarus
• Build collection of favourites
• Read full text
• Share with others
• Inspect metrics
sales@newgen.co technical inquiries: patrick@newgen.co
Thank you
Jan Velterop — APE 2015 — Berlin 21 January 2015
velterop@me.com

OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 

Optimising the use of existing knowledge

  • 1. Defragmentation: Maximising the Use of Existing Knowledge Jan Velterop — APE 2015 — Berlin 21 January 2015
  • 4. It is a means to reach the goal
  • 5. And the goal is…?
  • 6. Maximal usefulness of existing scientific research results in order to achieve: efficient, fast, and effective new knowledge creation and discovery i.e. highest possible return on public investment
  • 8. The ultimate goal, to which Open Access is merely a means, may not be widely understood (by publishers)
  • 9. That may be why there are a lot of different interpretations of what Open Access actually is (in spite of the clear definition given in the Budapest Open Access Initiative)
  • 10. The fact that not all published research is accessible to all researchers leads to ‘lamp post research’
  • 12. Lamp post research: looking merely at the literature that one can access, which is not necessarily the literature that is potentially important to one’s research
  • 14. [Chart: number of abstracts in PubMed, in year and cumulative] 11,135,542
  • 15. [Chart: number of abstracts in PubMed, in year and cumulative] …averaging more than 2 abstracts added every minute in 2014…
  • 17. On the impossibility of being expert. 341 doi: http://dx.doi.org/10.1136/bmj.c6815 (Published 14 Dece…) More scientific and medical papers are being published now than ever before. Authors Alan G Fraser and Frank D Dunstan think that new strategies are needed to deal with this avalanche of information.
  • 18. How does a researcher decide what’s ‘relevant’ anyway?
  • 19. How are we filtering or choosing?
  • 23. Possible solutions? Publish fewer articles: don’t be ridiculous! Find better ways to decide what’s truly relevant: now you’re talking!
  • 27. We need the equivalent of aerial surveys — ‘knowledge drones’? — Some of my professors were already known as ‘knowledge drones’ :-)
  • 28. How might we create overviews?
  • 29. Getting the picture from a large number of data points ‘Whole-o-gram’
  • 30. Getting a better picture from even more data points
  • 32. It’s not just about finding information. It’s also, and possibly more, about the value and power of ‘recombinant knowledge’
  • 33. Saving significant time-to-knowledge. Conclusion: “Chronic immune activation is the primary driver in HIV pathogenesis”. Arriving at this conclusion (review in Frontiers Immunology) after reading 221 papers: 5 weeks. After analysis in BRAIN: 4 minutes.
  • 34. What stands in the way? First of all: fragmentation (different publishers, journals, platforms, licences, formats, silos, languages). And also, of course: access (lack of). Not to the whole article… but to the data and assertions buried in them
  • 35. Plenty of initiatives to find stuff: PubChase – Open Access Biomedical Journal Reference Library; Paperity; SciLit – Database of Scientific and Scholarly Literature; Google Scholar; et cetera. Some go further: Europe PubMed Central, offering semantic tools
  • 36. Not many ‘true’ open access articles (Europe-PMC, 19 December 2014): all full-text articles in PubMedCentral: 3,087,430 (100%); all articles with CC-licences: 366,973 (11.9%); all articles with CC-BY licences: 270,114 (8.7%). “The majority of articles in PMC are subject to traditional copyright restrictions”
  • 37. What we need is information extracted from as many articles as possible. The more we have, the ‘sharper’ the knowledge picture
  • 38. Fragmentation and lack of access are encumbrances to seamless knowledge-pattern analyses and themed collection building (e.g. of graphs)… …which are fast becoming an absolute necessity due to the vast amounts of published material, growing every year, and, of course, in the aggregate
  • 39. “As the rate of publishing accelerates, the need for computational support to work out which articles to read, and how to interpret, reproduce and validate the claims they contain is growing.” Quote from ‘Lazarus’: http://www.bbsrc.ac.uk/pa/grants/AwardDetails.aspx?FundingReference=BB/L005298/1
  • 40. Traditional publications are aimed at consumption by humans: “stories that persuade with data” (Anita de Waard, Elsevier). Not easily amenable to machine processing
  • 41. In the life-science literature, we typically find: drug-like molecules represented as illustrations; biochemical properties as tables or graphs; protein/DNA sequences buried amongst text; references and citations with arcane formats; other objects of biological interest being given ambiguous names. And horrors like this (from PLOS, h/t Peter Murray-Rust): + (plus underscored) isn’t the same as ± (plus-minus)!
  • 42. This creates the need to: re-type figures from tables; chase citations through digital libraries; redraw molecules by hand; et cetera. Tedious, error-prone, wasteful: scientists should be able to use their precious time better
  • 43. Via Utopia Documents (UD, http://utopiadocs.com), LAZARUS ‘resurrects’ knowledge from being buried in articles: entities (‘concepts’, incl. synonyms, e.g. proteins); phrases, statements, assertions (e.g. triples); molecules (incl. Markush structure groups); graphs; tables
  • 44. Entities (‘concepts’, incl. synonyms, e.g. proteins); phrases, statements, assertions (e.g. triples); molecules (incl. Markush structure groups); graphs; tables. These are captured, with their provenance (e.g. DOI), in a ‘Knowledge Graph’ of their relationships. When assertions are captured, they are compared to the Knowledge Graph and labelled as ‘new’ (to the Graph) or ‘already found earlier’
  • 45. “Lazarus to harness the crowd reading life-science articles to resurrect the swathes of legacy data buried in charts, tables, diagrams and free-text, to liberate processable data into a shared resource that benefits the community.” “…activities currently carried out anyway by individuals for their own purposes (annotating, cross-referencing articles with databases, organising collections of articles).”
  • 47. VHL protein binds to HIF-α which is ubiquitinated and tagged for degradation in the proteasome.
  • 50. These ‘assertions’ form the ‘knowledge profile’ of an article, and are added to a growing ‘knowledge graph’ which can be analysed for trends, clusters, areas of intensive activity, et cetera.
  • 51. Some other initiatives to bring the open literature together so that it can be used for large scale semantic analyses:
  • 52. libraccess.org: the goal of Libraccess is to aggregate, de-duplicate, clean and index scientific resources in open access repositories, from all countries, from all disciplines, and make them available to all, through a website and with APIs.
  • 53. Research Pad Open Access Journal Reference Library (www.researchpad.co)
  • 54. Converting all that’s open (CC-BY) into ePub format for tablets and smartphones. What I find most interesting, however, is their plan (being worked on as we speak, they confirmed to me) to make the whole body of all literature that’s openly accessible available in XML for semantic analysis. I hope they will add the ‘knowledge profiles’ of paywalled articles created by Lazarus
  • 55. Build collection of favourites; read full text; share with others; inspect metrics
  • 57. Thank you Jan Velterop — APE 2015 — Berlin 21 January 2015 velterop@me.com
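
The assertion-capture step described on slide 44 (assertions stored with their provenance and labelled ‘new’ or ‘already found earlier’) might be sketched roughly as follows. This is an illustration only, not how Lazarus or Utopia Documents is actually implemented: the `Assertion` and `KnowledgeGraph` classes and the DOIs are hypothetical placeholders, and the example triple is taken from the assertion on slide 47.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    """A subject-predicate-object triple (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class KnowledgeGraph:
    """Toy graph: maps each assertion to the sources it was found in."""
    provenance: dict = field(default_factory=dict)

    def capture(self, assertion: Assertion, source: str) -> str:
        # Label the assertion by comparing it to what the graph already holds.
        label = "already found earlier" if assertion in self.provenance else "new"
        # Record the provenance (e.g. a DOI) in either case.
        self.provenance.setdefault(assertion, set()).add(source)
        return label


graph = KnowledgeGraph()
binds = Assertion("VHL protein", "binds to", "HIF-α")

# The DOIs below are made-up placeholders, not real articles.
first = graph.capture(binds, "doi:10.0000/example-1")
second = graph.capture(binds, "doi:10.0000/example-2")
print(first, "|", second)
```

A real system would also have to resolve synonyms and normalise entity names before comparing assertions; exact matching on strings is the simplest possible assumption here.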