1. Big data perspectives on building open semantic data services for life sciences
Open PHACTS Foundation
bryn@openphactsfoundation.org
2. Why is it so hard to…
Competitors?
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known Pathways?
Working On Now?
Connections to disease?
Expressed in right cell type?
IP?
3. Public Domain Drug Discovery Data
Literature
PubChem
Genbank
Patents
Databases
Downloads
Data Analysis
Data Integration
Firewalled Databases
4. Public Domain Drug Discovery Data: The Current Situation
Pfizer
AZ
Roche
n
7. [Architecture diagram]
Content sources: Public Content, Commercial, Public Ontologies, User Annotations
Source labels in the diagram: Nanopub, Db, VoID (repeated per source)
Core Platform: Data Cache (Virtuoso Triple Store), Semantic Workflow Engine, Linked Data API (RDF/XML, TTL, JSON)
Services: Domain Specific Services; Identity Resolution Service; Chemistry Registration, Normalisation & Q/C; Identifier Management Service; Indexing
Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”
Apps
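The identity resolution step in the diagram accepts inputs as varied as a database accession, an enzyme-style code, a compound ID and a free-text name. The following is a minimal sketch of that idea, assuming a small in-memory lookup table. The `example.org` URIs, the `ALIASES` table and the `resolve` helper are all hypothetical placeholders, not the platform's actual service or data, and the table makes no claim about what any of these identifiers really denotes.

```python
from typing import Optional

# Hypothetical placeholder table: each key is an input as a user might type it,
# each value is a made-up concept URI (example.org is not a real resolver).
ALIASES = {
    "p12374": "http://example.org/concept/1",
    "ec2.43.4": "http://example.org/concept/2",
    "cs4532": "http://example.org/concept/3",
    "adenosine receptor 2a": "http://example.org/concept/4",
}


def normalise(text: str) -> str:
    """Lower-case the input, strip straight/curly quotes, and collapse whitespace."""
    cleaned = text.strip().strip("\"'“”")
    return " ".join(cleaned.lower().split())


def resolve(text: str) -> Optional[str]:
    """Return the concept URI for an identifier or name, or None if unknown."""
    return ALIASES.get(normalise(text))


if __name__ == "__main__":
    for query in ["P12374", "“Adenosine  receptor 2a”", "unknown-id"]:
        print(query, "->", resolve(query))
```

Normalising before lookup is what lets differently typed inputs (quoted names, mixed case, stray spaces) reach the same entry; a real service would back this with curated mappings rather than a hard-coded dictionary.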
8. The Open PHACTS Discovery Platform
• Cloud-based, “production”-level system. Secure & private
• Guided by business questions
• Uses Semantic Web technology but provides a simple RESTful API for everyone else
http://dx.doi.org/10.1016/j.websem.2014.03.003
http://dx.doi.org/10.1016/j.drudis.2013.05.008
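The platform slide describes a plain REST interface in front of the semantic store, and the architecture slide lists RDF/XML, TTL and JSON as the API's output formats. The following sketches what a client-side request might look like under those assumptions. The base URL, the `/compound` path and the `uri` parameter name are hypothetical stand-ins rather than the documented Open PHACTS API; only the media types are standard.

```python
from urllib.parse import urlencode

# Standard media types for the three formats named on the architecture slide.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "ttl": "text/turtle",
    "json": "application/json",
}

# Hypothetical base URL; substitute the real endpoint of the service you use.
BASE_URL = "https://api.example.org"


def build_request(path: str, concept_uri: str, fmt: str = "json"):
    """Return (url, headers) for a GET request; performs no network I/O."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"uri": concept_uri})  # percent-encodes the concept URI
    url = f"{BASE_URL}/{path.lstrip('/')}?{query}"
    headers = {"Accept": MEDIA_TYPES[fmt]}
    return url, headers


if __name__ == "__main__":
    url, headers = build_request("/compound", "http://example.org/concept/3", "ttl")
    print(url)
    print(headers)
```

The point of the sketch is that a caller only handles a URL and an `Accept` header; no SPARQL or RDF tooling is needed on the client side unless the caller asks for an RDF serialisation.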
11. Why is it so hard to…
Competitors?
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known Pathways?
Working On Now?
Connections to disease?
Expressed in right cell type?
IP?
12. Information/Data Tombs...
Internal and external
Built to manage content
Built to meet primary use-case
Tailored indexes
Tailored GUIs
Unique language & metadata
Poor interoperability/integration
Proliferation of PowerPoint, documents, Excel, etc.
Many suppliers of systems and content in a single workflow
Literature, Patents, News, Pipeline, SAR, CSRs, Safety, In vivo, etc.
17. The Standards Value Chain is disconnected…
Target ID, Hit ID, Lead ID, Lead Opt, Phase I, Phase II, Phase III
COSTART
UMLS
MedDRA
ICD9 to ICD10
SNOMED CT
Not meant to be exhaustive!
18. Bioscience and the 4Vs of big data
Big Data
Variety
Velocity
Volume
Veracity
19. info@openphactsfoundation.org @Open_PHACTS
Open PHACTS Practical Semantics
Acknowledgements
GlaxoSmithKline – Coordinator
Universität Wien – Managing entity
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
Novartis
Merck Serono
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Janssen
Esteve
Almirall
OpenLink
SciBite
The Open PHACTS Foundation
Spanish National Cancer Research Centre
University of Manchester
Maastricht University
AQnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
AstraZeneca
Pfizer
Editor's notes
6
It's about enabling the community too
Why are these there?
What is inside?
Who built them?
How can I see what is inside?
What can I learn?