Extraction of structural information from ChemDraw CDX files: easy, or an underestimated, difficult challenge?
Josef Eiblmaier, Hans Kraut, Sascha Hausberg, Peter Loew

ICIC 2013 Vienna, October 13 – 16

InfoChem GmbH © 2013

Dr. Josef Eiblmaier

Outline

» ChemDraw files: Relevance and the Challenge
» Approach
» Projects
» InfoChem ChemProspector
» Wiley Smart Article
» Thieme Science of Synthesis Update / Pharmaceutical Substances
» Conclusion / Outlook

© cora / PIXELIO, www.pixelio.de


Patents, Journal Articles and MRWs: a Buried Treasure?

» Chemical structures (images)
» Chemical names/fragments (text)
» Markush structures (text, images, CDX)
» Chemical structures (CDX files)
» Reactions (CDX files)


Manuscript → Article → Database …

» Publishing
» Manuscript submission
» Manual indexing
» Database production, e.g. SciFinder, Reaxys, SPRESI, eEROS, ...


CDX Scheme vs. Database Record

» ChemDraw file
• Purpose: presentation / publishing, no search
• Unstructured
• Structures: no strict rules
• General rules: none

» Database
• Purpose: search / retrieval
• Structured
• Structures: strict rules
• Database rules: strict

Roles in a database record: Reactant, Product, Reagent, Solvent, Catalyst
Labels in the example scheme: LiOH; H2O, THF; Pd(OAc)2; Cl-Co2Et, Et3N; Acetone, H2O; SOCl2

Source: Thieme Pharmaceutical Substances, Ticagrelor (in production)
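
To make the contrast concrete, here is a minimal Python sketch of what a structured record for one reaction step might look like once the free-text arrow labels of a scheme have been sorted by role. `ROLE_LOOKUP`, `ReactionRecord` and `assign_roles` are hypothetical names introduced for illustration; they are not InfoChem's data model, and the role assignments in the lookup are assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical lookup: maps a condition label as drawn in a scheme to a role.
# The role assignments here are illustrative assumptions only.
ROLE_LOOKUP = {
    "LiOH": "reagent",
    "SOCl2": "reagent",
    "Et3N": "reagent",
    "THF": "solvent",
    "H2O": "solvent",
    "Acetone": "solvent",
    "Pd(OAc)2": "catalyst",
}


@dataclass
class ReactionRecord:
    """One reaction step with explicitly typed components."""
    reactants: list = field(default_factory=list)
    products: list = field(default_factory=list)
    reagents: list = field(default_factory=list)
    solvents: list = field(default_factory=list)
    catalysts: list = field(default_factory=list)
    unresolved: list = field(default_factory=list)  # labels needing review


def assign_roles(condition_text: str) -> ReactionRecord:
    """Split a comma-separated arrow label and sort each part by role."""
    record = ReactionRecord()
    for label in (part.strip() for part in condition_text.split(",")):
        if not label:
            continue
        role = ROLE_LOOKUP.get(label)
        if role == "reagent":
            record.reagents.append(label)
        elif role == "solvent":
            record.solvents.append(label)
        elif role == "catalyst":
            record.catalysts.append(label)
        else:
            # Unknown label: keep it visible instead of guessing a role.
            record.unresolved.append(label)
    return record


record = assign_roles("LiOH, H2O, THF")
print(record.reagents, record.solvents, record.unresolved)
```

A label that is missing from the lookup, such as `Cl-Co2Et`, ends up in `unresolved` rather than being silently dropped, which mirrors the need for manual review of anything the rules do not cover.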


CDX Scheme Processing: what does that mean?

CDX scheme → ICSchemeProcessor →
» Chemical structures (SD files)
» Reactions (RD files)
» Conditions (RD files)

Labels in the example scheme: Reagent, Solvent, Catalyst; LiOH; H2O, THF; Pd(OAc)2; Cl-Co2Et, Et3N; Acetone, H2O; SOCl2

Source: Thieme Pharmaceutical Substances, Ticagrelor (in production)


But: CDX files are often an optical illusion!
Authors are very inventive in pursuit of a 'perfect' layout.
Appearances are deceiving!
» Usage of graphical symbols
• Polymer supports
• Heteroatoms

C Grid:


Optical illusions 2
» Unresolvable labels
• Labels not defined
• Element symbols used as R-group labels

• Ambiguous fragment labels (e.g. molecular formula)
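
The label problems listed above can be illustrated with a small Python sketch that classifies an atom label before any structure is generated. Everything in it is an assumption made for illustration: the element set is a deliberately small subset, the alias dictionaries and their SMILES-like fragment values are hypothetical, and `classify_label` is not part of any InfoChem software.

```python
# A few element symbols that are also popular as generic-group labels.
# Illustrative subset, not a complete periodic table.
ELEMENT_SYMBOLS = {"H", "B", "C", "N", "O", "F", "P", "S", "W", "Y", "U", "V", "I", "K"}

# Hypothetical alias dictionaries; a source-specific one takes precedence.
GENERIC_ALIASES = {"Me": "C", "Et": "CC", "Ph": "c1ccccc1", "OMe": "OC"}
SOURCE_ALIASES = {"Ts": "S(=O)(=O)c1ccc(C)cc1"}


def classify_label(label, defined_rgroups, source_aliases=SOURCE_ALIASES):
    """Return (status, value) for an atom label found in a drawing."""
    if label in defined_rgroups:
        # The scheme defines this label, even if it looks like an element.
        if label in ELEMENT_SYMBOLS:
            return "rgroup_element_clash", defined_rgroups[label]
        return "rgroup", defined_rgroups[label]
    if label in source_aliases:
        return "alias", source_aliases[label]
    if label in GENERIC_ALIASES:
        return "alias", GENERIC_ALIASES[label]
    if label in ELEMENT_SYMBOLS:
        return "element", label
    # Neither defined in the scheme nor known as an alias or element.
    return "undefined", None


# "Y" is yttrium by symbol, but this scheme defines it as a variable group.
defined = {"R1": ["Me", "Et"], "Y": ["O", "NH"]}
for label in ("Y", "R1", "Ts", "N", "X"):
    print(label, classify_label(label, defined))
```

Checking the scheme's own definitions first means a drawn `Y` is treated as a variable group when the author defined it, and as an element only when no definition exists; labels that match nothing are reported instead of being interpreted.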


Optical illusions 3
» Variable points of attachment


Optical illusions 4
» Reaction arrows / forked arrows / brackets


Approach

© Gerd Altmann / PIXELIO, www.pixelio.de


Approach
» The algorithmic approach:
• a set of rules is applied in the software (generic, not project-specific); the software should recognize all cases that might occur
• project- (title-) specific rules require that drawing conventions do not change; otherwise further development is necessary
• manual post-correction is required (cost- and time-intensive)
• the problem is open-ended: unprecedented issues cannot be handled

» The templating approach:
• software is developed to recognize a defined set of problems (PS)
• all content must be manually pre-templated (cost-intensive) according to the capabilities of the software

» The hybrid approach:
• depending on the source, the focus can be placed on either approach
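
A hybrid of the two approaches can be sketched as an ordered chain of rules with a manual fallback. The Python below is only an illustration under stated assumptions: the feature dictionaries, the two toy rules and the `process` function are hypothetical and do not describe how ICSchemeProcessor is built.

```python
from typing import Callable, Optional

# A rule inspects one scheme feature and returns a resolved value,
# or None when it does not apply. Both rules below are toy examples.
Rule = Callable[[dict], Optional[str]]


def rule_repeating_group(feature: dict) -> Optional[str]:
    """Algorithmic rule: expand a repeating unit such as (CH2)3."""
    if feature.get("kind") == "repeat":
        return feature["unit"] * feature["count"]
    return None


def rule_template_marker(feature: dict) -> Optional[str]:
    """Templating rule: pre-templated content carries an agreed marker."""
    if feature.get("kind") == "templated":
        return feature["value"]
    return None


def process(features, rules):
    """Apply rules in order; collect anything no rule could handle."""
    resolved, manual_queue = [], []
    for feature in features:
        for rule in rules:
            result = rule(feature)
            if result is not None:
                resolved.append(result)
                break
        else:
            # No rule matched: route to manual post-correction.
            manual_queue.append(feature)
    return resolved, manual_queue


features = [
    {"kind": "repeat", "unit": "CH2", "count": 3},
    {"kind": "templated", "value": "R1=Me"},
    {"kind": "polymer_bead"},  # graphical symbol with no rule yet
]
resolved, manual_queue = process(features, [rule_repeating_group, rule_template_marker])
print(resolved)
print(manual_queue)
```

Shifting the focus between the two approaches then amounts to changing which rules are in the list and how much of the content arrives already templated; whatever remains in `manual_queue` is the cost of unprecedented cases.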

Templating

» Templating: Guidelines for authors and typesetters
• Syntax definitions for tables, R-groups etc.
• Syntax rules for captions
• Reaction arrangement, forked arrows

• Rules for reaction conditions (reactants, catalysts, solvents, yields, temperature)


Examples:
» Algorithmic detection of features
» Resolution of repeating groups
» Enumeration of R-groups
» Resolution of aliases/labels
• source specific alias databases
• continuously extended

» Table enumeration
• compound enumeration
• reaction factual data: caption / yield

» Variable points of attachment
» Forked arrows
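
Table enumeration, as listed above, can be sketched in a few lines of Python: a generic structure with R-group placeholders is filled in once per table row, and the row's factual data (here a yield) is carried along. This is a simplified illustration, assuming a SMILES-like template with `{R1}`-style placeholders and substituents given as SMILES fragments; plain string substitution stands in for the real structure handling, and the table values are invented for the example.

```python
def enumerate_table(template: str, rows: list) -> list:
    """Fill R-group placeholders in a template once per table row."""
    records = []
    for row in rows:
        structure = template
        for name, value in row["rgroups"].items():
            # Braces delimit the placeholder, so {R1} never matches {R10}.
            structure = structure.replace("{" + name + "}", value)
        records.append(
            {
                "entry": row["entry"],
                "structure": structure,
                "yield_percent": row.get("yield_percent"),
            }
        )
    return records


# Hypothetical generic amide R1-C(=O)-NH-R2 and a two-row substituent table.
template = "{R1}C(=O)N{R2}"
rows = [
    {"entry": "a", "rgroups": {"R1": "C", "R2": "C"}, "yield_percent": 85},
    {"entry": "b", "rgroups": {"R1": "c1ccccc1", "R2": "CC"}, "yield_percent": 72},
]

records = enumerate_table(template, rows)
for rec in records:
    print(rec["entry"], rec["structure"], rec["yield_percent"])
```

Each row yields one specific compound record, so a scheme with one generic drawing and an n-row table becomes n searchable structures with their associated data.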


Projects


Successful Application of CDX Processing:
Chemistry Enrichment Workflow* (Wiley Smart Article)

*Reinhard Neudert: Enhancing the User Experience for Wiley Chemistry Content, ICIC 2012, Berlin, October 14 – 17

Templating*

Author's CDX file → Templating (per CDX-Templating Guidelines (Structures)) → CDX template → ICSchemeProcessor → Enumerated structures

*Reinhard Neudert: Enhancing the User Experience for Wiley Chemistry Content, ICIC 2012, Berlin, October 14 – 17


Workflow Science of Synthesis Update

[Figure: workflow diagram with example reaction schemes. Labelled elements: ICSchemeProcessor; Scheme Error Report; Correct / extend process; CDX Templating Guidelines (Reactions); Scheme correction not possible; Manual data entry]

Sample Pharmaceutical Substances Update

Source: Thieme Pharmaceutical Substances, Abiraterone

Conclusion

» As much algorithmic processing as possible is desirable
• generic: can be applied to other contents as well
• cheaper (humans cost!)

» 100% conversion (without human interaction) never possible
» Solutions are project / source specific
» Relevance of automatic extraction will continuously increase
» Authors / Publishers play an essential role in a successful conversion


Acknowledgements
» Wiley
• Michael Forster
• Reinhard Neudert
» Thieme
• Guido Herrmann
• Rolf Hoppe
• Klaus Köberlein
» InfoChem
• Hans Kraut, Sascha Hausberg, Thomas Menke, Manuela Rauh, Fanny Irlinger, Huyen Ngyen, Dagmar Kunzmann


© Thomas Link / Flickr

Thank you!

Questions?

InfoChem GmbH © 2013

ICIC 2013 Vienna, October 13 – 16

Dr. Josef Eiblmaier

Weitere ähnliche Inhalte

Andere mochten auch

ICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian RadestockICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian RadestockDr. Haxel Consult
 
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryDr. Haxel Consult
 
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...Dr. Haxel Consult
 
New Product Introductions - GenomeQuest Life Sciences
New Product Introductions - GenomeQuest Life SciencesNew Product Introductions - GenomeQuest Life Sciences
New Product Introductions - GenomeQuest Life SciencesDr. Haxel Consult
 
ICIC 2014 Panel: Mobile Apps for Patent Searchers
ICIC 2014 Panel: Mobile Apps for Patent SearchersICIC 2014 Panel: Mobile Apps for Patent Searchers
ICIC 2014 Panel: Mobile Apps for Patent SearchersDr. Haxel Consult
 
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBDr. Haxel Consult
 
ICIC 2014 New Product Introduction Averbis
ICIC 2014 New Product Introduction AverbisICIC 2014 New Product Introduction Averbis
ICIC 2014 New Product Introduction AverbisDr. Haxel Consult
 
ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...
ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...
ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...Dr. Haxel Consult
 
Knowledge Manager - Fit for the Future
Knowledge Manager - Fit for the Future Knowledge Manager - Fit for the Future
Knowledge Manager - Fit for the Future Dr. Haxel Consult
 
ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...
ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...
ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...Dr. Haxel Consult
 
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallDr. Haxel Consult
 
ICIC 2013 Conference Proceedings Krishna Molecular Connections
ICIC 2013 Conference Proceedings Krishna Molecular ConnectionsICIC 2013 Conference Proceedings Krishna Molecular Connections
ICIC 2013 Conference Proceedings Krishna Molecular ConnectionsDr. Haxel Consult
 
ICIC 2014 New Product Introduction Infotrieve
ICIC 2014 New Product Introduction InfotrieveICIC 2014 New Product Introduction Infotrieve
ICIC 2014 New Product Introduction InfotrieveDr. Haxel Consult
 
ICIC 2014 New Product Introduction Gridlogisc
ICIC 2014 New Product Introduction GridlogiscICIC 2014 New Product Introduction Gridlogisc
ICIC 2014 New Product Introduction GridlogiscDr. Haxel Consult
 
ICIC 2014 New Product Introduction Questel
ICIC 2014 New Product Introduction QuestelICIC 2014 New Product Introduction Questel
ICIC 2014 New Product Introduction QuestelDr. Haxel Consult
 
ICIC 2013 New Product Introductions BizInt
ICIC 2013 New Product Introductions BizIntICIC 2013 New Product Introductions BizInt
ICIC 2013 New Product Introductions BizIntDr. Haxel Consult
 

Andere mochten auch (16)

ICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian RadestockICIC 2013 Conference Proceedings Sebastian Radestock
ICIC 2013 Conference Proceedings Sebastian Radestock
 
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
 
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...
 
New Product Introductions - GenomeQuest Life Sciences
New Product Introductions - GenomeQuest Life SciencesNew Product Introductions - GenomeQuest Life Sciences
New Product Introductions - GenomeQuest Life Sciences
 
ICIC 2014 Panel: Mobile Apps for Patent Searchers
ICIC 2014 Panel: Mobile Apps for Patent SearchersICIC 2014 Panel: Mobile Apps for Patent Searchers
ICIC 2014 Panel: Mobile Apps for Patent Searchers
 
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
 
ICIC 2014 New Product Introduction Averbis
ICIC 2014 New Product Introduction AverbisICIC 2014 New Product Introduction Averbis
ICIC 2014 New Product Introduction Averbis
 
ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...
ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...
ICIC 2014 Tracking of the Mode of Action Landscape in Breast Cancer using Rep...
 
Knowledge Manager - Fit for the Future
Knowledge Manager - Fit for the Future Knowledge Manager - Fit for the Future
Knowledge Manager - Fit for the Future
 
ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...
ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...
ICIC 2014 Patent Landscape Analysis as a Tool for Public Policies Adjustment:...
 
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
 
ICIC 2013 Conference Proceedings Krishna Molecular Connections
ICIC 2013 Conference Proceedings Krishna Molecular ConnectionsICIC 2013 Conference Proceedings Krishna Molecular Connections
ICIC 2013 Conference Proceedings Krishna Molecular Connections
 
ICIC 2014 New Product Introduction Infotrieve
ICIC 2014 New Product Introduction InfotrieveICIC 2014 New Product Introduction Infotrieve
ICIC 2014 New Product Introduction Infotrieve
 
ICIC 2014 New Product Introduction Gridlogisc
ICIC 2014 New Product Introduction GridlogiscICIC 2014 New Product Introduction Gridlogisc
ICIC 2014 New Product Introduction Gridlogisc
 
ICIC 2014 New Product Introduction Questel
ICIC 2014 New Product Introduction QuestelICIC 2014 New Product Introduction Questel
ICIC 2014 New Product Introduction Questel
 
ICIC 2013 New Product Introductions BizInt
ICIC 2013 New Product Introductions BizIntICIC 2013 New Product Introductions BizInt
ICIC 2013 New Product Introductions BizInt
 

Mehr von Dr. Haxel Consult

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementDr. Haxel Consult
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...Dr. Haxel Consult
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...Dr. Haxel Consult
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...Dr. Haxel Consult
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...Dr. Haxel Consult
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...Dr. Haxel Consult
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...Dr. Haxel Consult
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...Dr. Haxel Consult
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...Dr. Haxel Consult
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...Dr. Haxel Consult
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...Dr. Haxel Consult
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterDr. Haxel Consult
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCDr. Haxel Consult
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...Dr. Haxel Consult
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...Dr. Haxel Consult
 

Mehr von Dr. Haxel Consult (20)

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance Center
 
AI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IPAI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IP
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOC
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
 

Kürzlich hochgeladen

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
ICIC 2013 Conference Proceedings Josef Eiblmaier Infochem

  • 1. Extraction of structural information from ChemDraw CDX files: easy, or an underestimated, difficult challenge? Josef Eiblmaier, Hans Kraut, Sascha Hausberg, Peter Loew. InfoChem GmbH © 2013, ICIC 2013 Vienna, October 13 – 16, Dr. Josef Eiblmaier
  • 2. Outline » ChemDraw files: Relevance and the Challenge » Approach » Projects » InfoChem ChemProspector » Wiley Smart Article » Thieme Science of Synthesis Update / Pharmaceutical Substances » Conclusion / Outlook (Image: © cora / PIXELIO, www.pixelio.de)
  • 3. Patents, Journal Articles and MRWs: a Buried Treasure? Chemical structures (images); chemical names/fragments (text); Markush structures (text, images, CDX); chemical structures (CDX files); reactions (CDX files)
  • 4. Manuscript → Article → Database … Publishing; manuscript submission; manual indexing; database production, e.g. SciFinder, Reaxys, SPRESI, eEROS, ...
  • 5. CDX Scheme vs. Database Record. ChemDraw file: purpose is presentation / publishing, no search; unstructured; structures: no strict rules; general rules: none. Database: purpose is search / retrieval; structured; structures: strict rules; database rules: strict. Record fields: Reactant, Product, Reagent, Solvent, Catalyst. Conditions shown in the example scheme: LiOH; H2O, THF; Pd(OAc)2; Cl-Co2Et, Et3N; Acetone, H2O; SOCl2. Source: Thieme Pharmaceutical Substances, Ticagrelor (in production)
  • 6. CDX Scheme Processing, what does that mean? ICSchemeProcessor: chemical structures (SD files); reactions (RD files); conditions (RD files). Condition fields: Reagent, Solvent, Catalyst. Conditions shown in the example scheme: LiOH; H2O, THF; Pd(OAc)2; Cl-Co2Et, Et3N; Acetone, H2O; SOCl2. Source: Thieme Pharmaceutical Substances, Ticagrelor (in production)
  • 7. But: CDX files are often an optical illusion! Authors are very inventive in pursuit of a 'perfect' layout! Appearances are deceiving! » Usage of graphical symbols • Polymer supports • Heteroatoms
  • 8. Optical illusions 2 » Unresolvable labels • Labels not defined • Element symbols used as R-group labels • Ambiguous fragment labels (e.g. molecular formula)
  • 9. Optical illusions 3 » Variable points of attachment
  • 10. Optical illusions 4 » Reaction arrows / forked arrows / brackets
  • 11. Approach (Image: © Gerd Altmann / PIXELIO, www.pixelio.de)
  • 12. Approach
      » The algorithmic approach:
        • A set of rules is applied in the software (generic, not project-specific). The software should recognize all cases that might occur!
        • Project- (title-) specific rules require that drawing conventions do not change; otherwise further development is necessary
        • Manual post-correction is required (cost- and time-intensive)
        • The problem is infinite; unprecedented issues cannot be handled
      » The templating approach:
        • Software is developed to recognize a defined set of problems (PS)
        • All content must be manually pre-templated (cost-intensive) according to the capabilities of the software
      » The hybrid approach:
        • Depending on the source, the focus can be placed on either approach
  • 13. Templating
      » Templating: Guidelines for authors and typesetters
        • Syntax definitions for tables, R-groups etc.
        • Syntax rules for captions
        • Reaction arrangement, forked arrows
        • Rules for reaction conditions (reactants, catalysts, solvents, yields, temperature)
  • 14. Examples:
      » Algorithmic detection of features
      » Resolution of repeating groups
      » Enumeration of R-groups
      » Resolution of aliases/labels
        • Source-specific alias databases
        • Continuously extended
      » Table enumeration
        • Compound enumeration
        • Reaction factual data: caption/yield
      » Variable points of attachment
      » Forked arrows
  • 15. Projects
  • 16. Successful Application of CDX Processing: Chemistry Enrichment Workflow* (Wiley Smart Article). *Reinhard Neudert: Enhancing the User Experience for Wiley Chemistry Content, ICIC 2012, October 14 – 17, Berlin
  • 17. Templating*. Diagram labels: Author's CDX file; CDX template; Templating; Enumerated structures; ICSchemeProcessor; CDX-Templating Guidelines (Structures). *Reinhard Neudert: Enhancing the User Experience for Wiley Chemistry Content, ICIC 2012, October 14 – 17, Berlin
  • 18. Workflow Science of Synthesis Update. [Workflow diagram with example reaction schemes. Labels: ICSchemeProcessor; Scheme Error Report; Correct / extend process; CDX-Templating Guidelines (Reactions); Scheme correction not possible; Manual data entry]
  • 19. Sample Pharmaceutical Substances Update. Source: Thieme Pharmaceutical Substances, Abiraterone
  • 20. Conclusion
      » As much algorithmic processing as possible is desirable
        • Generic: can be applied to other content as well
        • Cheaper (humans cost!)
      » 100% conversion (without human interaction) is never possible
      » Solutions are project- / source-specific
      » The relevance of automatic extraction will continuously increase
      » Authors / publishers play an essential role in a successful conversion
  • 21. Acknowledgements
      » Wiley
        • Michael Forster
        • Reinhard Neudert
      » Thieme
        • Guido Herrmann
        • Rolf Hoppe
        • Klaus Köberlein
      » InfoChem
        • Hans Kraut, Sascha Hausberg, Thomas Menke, Manuela Rauh, Fanny Irlinger, Huyen Ngyen, Dagmar Kunzmann
  • 22. Thank you! (Image: © Thomas Link / Flickr)
  • 23. Questions?
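
The enumeration of R-groups and the resolution of aliases/labels listed on slide 14 can be illustrated with a minimal sketch. It is not InfoChem's ICSchemeProcessor: the alias table, the function names, and the use of SMILES string templates with {R1}-style placeholders are all assumptions made for illustration. Labels that cannot be resolved are collected separately, in the spirit of the scheme error report on slide 18, rather than being guessed.

```python
from itertools import product

# Hypothetical alias table mapping drawn labels to SMILES fragments.
# A real system would use source-specific alias databases (slide 14).
ALIASES = {"Me": "C", "Et": "CC", "Ph": "c1ccccc1", "OMe": "OC"}


def resolve_label(label, aliases=ALIASES):
    """Return the SMILES fragment for a label, or None if it is undefined."""
    return aliases.get(label)


def enumerate_rgroups(scaffold, rgroup_table, aliases=ALIASES):
    """Expand a scaffold template against a table of R-group labels.

    scaffold     -- SMILES-like template with placeholders such as {R1}
    rgroup_table -- dict mapping placeholder name to a list of labels
    Returns (structures, errors): fully resolved strings, and the label
    combinations that contain at least one unresolvable label.
    """
    names = sorted(rgroup_table)
    structures, errors = [], []
    for combo in product(*(rgroup_table[name] for name in names)):
        fragments = [resolve_label(label, aliases) for label in combo]
        if None in fragments:
            # Undefined label: report it for manual handling.
            errors.append(dict(zip(names, combo)))
            continue
        structures.append(scaffold.format(**dict(zip(names, fragments))))
    return structures, errors


if __name__ == "__main__":
    table = {"R1": ["Me", "Ph"], "R2": ["Et", "X"]}
    structures, errors = enumerate_rgroups("{R1}C(=O)O{R2}", table)
    print(structures)
    print(errors)
```

For the scaffold `{R1}C(=O)O{R2}` with R1 in (Me, Ph) and R2 in (Et, X), the sketch yields two resolved structures and reports the two combinations that contain the undefined label X, which would then go to manual correction.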