Integration Techniques for ELNs


                         Simon Coles
                          Co-founder & CTO
Integration Techniques for ELNs

    •   My background
    •   Why do we need to integrate ELNs?
    •   What kinds of integration do we need to do?
    •   What prerequisites are there?
    •   Some examples of technologies and techniques
    •   Summary


    • You can download copies of this presentation from
        our web site




    http://www.amphora-research.com/
My background

    • MEng in Information Systems Engineering
    • First “ELN” was a consulting project for Kodak
         • Started in 1996
         • Completely electronic, fully integrated
         • Thousands of users, worldwide
    • This grew into Amphora
    • Merged with PatentPad in 2003
         • Paper or electronic records according to legal
             preference
         •   Scientists still get an “Electronic” system
         •   Partner with a wide variety of “ELN” vendors
    • Member of CENSA, working on long term
       records, serving on Steering Team



Experience

    • Primarily in ELNs for discovery
         • Where patents are a major concern
         • I am sure some of this is relevant to regulated areas,
             but that’s not my focus
    • Work a lot with other “ELN” vendors
         • Seldom do you buy one system
         • Which means we end up seeing a lot of integration!
    • In a variety of industries, all sizes of deployment
         • Pharma
         • Biotech
         • Chemicals
    • Customers around the world, offices in the US &
       the UK



What’s an ELN?

    • The term “ELN” is now used to describe a wide
       variety of systems
         • Science specific
               • Reaction planning tools, Cheminformatics
                    databases, structure drawing tools
               •    Analysis packages, LIMS
               •    Workflow tools
         • General
               • Knowledge/Document Management
               • Scientific data management
         • Laptop/Tablet computers




Observations

    • The term “ELN”
         • Is so ambiguous it can mean almost anything
             (especially to a marketing person)
         •   Doesn’t help us much from a systems architecture
             perspective
    • A company is unlikely to have just one system that
        could be called an “ELN”
    •   Those ELNs will need to integrate with your
        existing & future systems
    •   Your needs will change with time, so you need to
        be able to protect your investment
         • In data
         • In tools
         • In processes

Deconstructing “ELN”

    • At first sight, a successful ELN project can look very
        complex
    •   ELN functionality can be split into two dimensions
         • Some aspects are common to everyone
         • Other requirements are specific to a particular group of
            scientists
    • Splitting out the functionality into these dimensions really
        helps to keep you sane


                         [Diagram: “Broad” aspects (Security, Collaboration,
                         Patent Protection etc.) above niche systems A, B, C, D]

Benefits

    • The corporate functions (Legal, Records, etc.) can
        buy/provide a system that provides a service to
        the niche-specific systems
         • Meet corporate requirements for records etc.
         • Provide cross-discipline collaboration
    • The individual niches can buy/find systems to
        support their specific needs
         •   Leverage existing investments
         •   Justified according to the benefits they bring
         •   Removes any need to balance competing requirements
         •   Reduce the need
    • Systems can be acquired/purchased in a phased
        approach tailored to the needs & requirements of
        the business
    •   Life is a lot less stressful

Different levels of abstraction


    • The “Experiment” is generally the boundary between Broad and Deep
        systems

    [Diagram: “Broad” aspects above niche systems A, B, C, D; levels of
    abstraction listed alongside: Projects, Experiments, Reports, Raw Data]




Types of integration

     • The Broad/Deep boundary is often exposed as network-level services,
        which are relatively standardized
     • Integrations between different niche systems are generally custom

     [Diagram: “Broad” aspects above niche systems A, B, C, D]




What prerequisites are there?

     • From your ELN product(s)
          • Open Interfaces
          • Open Data
     • Plumbing
          • Various technologies, some simple, some more complex
          • Expertise - often in-house, sometimes consultants

     • Good news - the Open Source movement is really
        helpful
          • Tools & techniques
          • Drive for openness

     • Remember: you need to ask your vendor for all of
        the “Open” stuff before you sign the order

Open Interfaces

     • What’s an “Interface”?
          •   Where one system “prods” another to do something
          •   Or get some information out
          •   Or put some information in
          •   Generally some data is passed back & forth
     • What’s “open”?
          • Something you can use without undue burden or
              barrier
          •   This covers both commercial and technical aspects
          •   Concerns are very similar to those involved with Open
              Data




Open Data

     • This is currently a bit of a blind spot for
         purchasers of IT systems
     •   Unfortunately, Open Data is absolutely critical
          • For long term records
          • For your ability to build up an integrated system
          • To protect your IP (partly from a patent perspective, but
              mainly from a re-use aspect)
          •   To maintain a balanced relationship with your vendors
     • This absolutely needs to be part of the ELN
         purchasing process




“Good” (open) file formats

     • Publicly documented
     • Legally unencumbered
          • No patents, copyright concerns etc.
          • Any patents or copyright must be in the public domain
     • Ideally, self documenting (XML is a good start)
     • Degrade gracefully
          • If you can’t read the data, at least you can see a picture
     • Based on more open, primitive formats where
         possible
     •   At least two implementations of readers, one of
         which is Open Source
     •   Widely used (W3C or IETF standards are good
         signs)


Data formats for the long term

     • Good
          •   For text: Plain ASCII, Unicode, HTML, possibly RTF
          •   For graphics: PNG, SVG
          •   For structured data: XML
          •   To preserve appearance: PDF
     • Worry about
          • Storing files in databases
                • The database file format is probably undocumented
                • Store objects on the file system and use the
                     database to point to them
          • Anything that is proprietary - there’s no excuse for it,
              and it dramatically increases your risk
          •   Binary files generally
          •   Mixing content in files (e.g. embedding XML in PDF)
          •   Proprietary digital signatures
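
The advice to "store objects on the file system and use the database to point to them" can be sketched roughly as follows. This is only an illustration under stated assumptions: the `records` table, its columns, the directory layout and the SHA-256 integrity check are all invented for the example and are not taken from any particular ELN product.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the index database (hypothetical schema: one row per stored file)."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(name TEXT PRIMARY KEY, sha256 TEXT, path TEXT)"
    )
    return db


def store_record(db: sqlite3.Connection, root: Path, name: str, data: bytes) -> Path:
    """Write the bytes to the file system; the database only holds a pointer."""
    digest = hashlib.sha256(data).hexdigest()
    path = root / digest[:2] / f"{digest}_{name}"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    db.execute(
        "INSERT INTO records (name, sha256, path) VALUES (?, ?, ?)",
        (name, digest, str(path)),
    )
    db.commit()
    return path


def load_record(db: sqlite3.Connection, name: str) -> bytes:
    """Follow the pointer, read the file, and check it against the stored hash."""
    row = db.execute(
        "SELECT sha256, path FROM records WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sha256, path = row
    data = Path(path).read_bytes()
    if hashlib.sha256(data).hexdigest() != sha256:
        raise ValueError(f"{path} no longer matches its recorded hash")
    return data
```

Because the content lives in ordinary files, it stays readable with any tool even if the database or its file format is lost; the database can in principle be rebuilt from the files.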

IP concerns & data formats

     • Companies have always used Proprietary Data
         Formats as a competitive weapon
     •   Companies are waking up to the use of IP tools
         (licenses, patents, copyrights) to reinforce their
         control over data formats
     •   Just because a format is published doesn’t mean it
         is open
          • The Microsoft Office XML formats are a particularly
              bad example
                • Right now it looks positively radioactive
                • They’re being very careful what they say which
                     indicates to me they’re planning something
                •    http://www.groklaw.net/article.php?story=20050330133833843
                •    (see section: 4. Dissecting Microsoft’s “Patent License”)

Standards

     • There are so many to choose from!
     • Two key ways of generating “Standards”
          • De Facto - dominant supplier/format
          • De Jure - committee based
     • Who gets to “bless” a standard?
     • What makes a “good standard”?
          • De Jure process has difficulty keeping up with the real
              world
          •   De Facto process has risk of lock-in
     • Pragmatic approach
          • Expect your suppliers to use open file formats
          • If there is an acceptable standard, use it
          • Make sure you are using the right kind of format for
              each purpose


Technologies and techniques

     • There are a wide variety of tools you can use to
        integrate IT systems
          •   Tight Vs Loose coupling
          •   Synchronous Vs Asynchronous
          •   Text Vs Binary
          •   Proprietary Vs Open
          •   Simple Vs Complex
     • As a rule
          • Loose is cheaper than Tight coupling
          • Asynchronous is easier to manage than
              Synchronous
          •   Text is easier to work with, and more flexible than
              Binary
          •   Open interfaces are always better than Proprietary
          •   Simple approaches are better than Complex ones
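
As one possible illustration of the loose, asynchronous, text-based end of these trade-offs, here is a minimal sketch of two systems swapping XML files through a shared folder. The folder names, the `record` and `title` elements and both function names are assumptions made up for this example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def publish(outbox: Path, record_id: str, title: str) -> Path:
    """Producer: write an XML message, then rename it into place in one step."""
    root = ET.Element("record", id=record_id)
    ET.SubElement(root, "title").text = title
    outbox.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=outbox, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        ET.ElementTree(root).write(handle, encoding="utf-8", xml_declaration=True)
    final = outbox / f"{record_id}.xml"
    os.replace(tmp_name, final)  # the consumer never sees a half-written file
    return final


def consume(outbox: Path, done: Path) -> list[tuple[str, str]]:
    """Consumer: process whatever is waiting, whenever it next happens to run."""
    done.mkdir(parents=True, exist_ok=True)
    seen = []
    for path in sorted(outbox.glob("*.xml")):
        record = ET.parse(path).getroot()
        seen.append((record.get("id"), record.findtext("title")))
        os.replace(path, done / path.name)  # move aside so it is handled once
    return seen
```

Neither side needs the other to be running, and a stuck message can be inspected in a text editor, which is much of what makes this style cheap to operate.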

Considerations when picking tools

     • Use stable interfaces
          • Get a commitment from the vendor about what they’ll
              keep stable across version upgrades
     •   Use public, documented interfaces
     •   Sample code is really really useful
     •   Pick language-neutral interfaces where possible
     •   Platform-neutrality
          • Don’t worry (too much) about locking yourself into
              Windows on the client
          •   But if you lock yourself to Windows on the server, it is
              going to hurt




Glue Languages

     • There are a number of really useful “Glue”
        languages around
          • Python (and Jython, and other relatives)
          • Perl (although I have some concerns about
              maintainability)
          •   Groovy, Beanshell, etc.
     • All of them
          • Play well with XML, http, SOAP etc.
          • Play well with OLE
          • Are cross platform
     • My personal preference is Python
          • You can learn it in a matter of hours
          • You can read other people’s code
          • It does everything I need it to do
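
To give a flavour of what "glue" code tends to look like, here is a small sketch that turns an XML export from one system into CSV for another, using only the Python standard library. The `experiment`, `scientist` and `title` names are hypothetical; a real export would have its own vocabulary.

```python
import csv
import io
import xml.etree.ElementTree as ET


def experiments_to_csv(xml_text: str) -> str:
    """Flatten <experiment> elements (assumed layout) into CSV rows."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "scientist", "title"])
    for exp in root.iter("experiment"):
        writer.writerow(
            [exp.get("id"), exp.findtext("scientist"), exp.findtext("title")]
        )
    return out.getvalue()
```

The whole job fits in a dozen readable lines, which is the main argument for keeping a glue language in the toolbox.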

Cool stuff

     • SOAP/Web Services
          • Valuable in many areas
          • But don’t treat it as a religion
          • There are lighter alternatives which bring most of the
              benefits for much less effort
          •   The whole WS-* effort seems to have got out of control
     • REST (XML over http) - a lighter alternative to
         SOAP
     •   File swapping (generally, in XML)
     •   HTTP GET/POST
          • Wonderfully easy to debug!
          • Very flexible
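
A plain "XML over HTTP POST" call needs very little machinery. The sketch below builds such a request with Python's standard `urllib.request`; the host `eln.example.invalid`, the `/records` path and the element names are placeholders invented for the example, not any vendor's actual interface.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_submission(base_url: str, record_id: str, summary: str) -> urllib.request.Request:
    """Build (but do not send) a POST carrying a small XML document."""
    doc = ET.Element("record", id=record_id)
    ET.SubElement(doc, "summary").text = summary
    body = ET.tostring(doc, encoding="utf-8")
    return urllib.request.Request(
        url=f"{base_url}/records",  # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> tuple[int, bytes]:
    """Send the request and return the HTTP status code and response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

Since the request body is just text, it can be printed, saved, or replayed from a command-line HTTP client when something goes wrong, which is where the "easy to debug" benefit comes from.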



Nice things to see

     • Integration points exposed as stable URLs
          • For example, in our PatentSafe product we have
              committed to stable URL formats to
                • Submit a record via http (content & metadata)
                • Get a record for display to the user
          • These can be used by other systems
          • And also embedded in Word documents...
     • Lack of wheel re-invention
          • e.g. LDAP is The One True place for user information
          • e.g. RSS/Atom is The One True alerting mechanism
     • Example code
          • In multiple languages
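
Reusing a standard alerting mechanism means the consuming side is only a few lines long. As a sketch, the function below picks out the entries of an Atom feed that have not been seen before. It assumes the Atom 1.0 namespace; the feed contents and the `new_entries` name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 feeds.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def new_entries(feed_xml: str, seen_ids: set[str]) -> list[dict[str, str]]:
    """Return entries whose <id> is not in seen_ids, and remember them."""
    feed = ET.fromstring(feed_xml)
    fresh = []
    for entry in feed.findall("atom:entry", ATOM):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM)
        if entry_id and entry_id not in seen_ids:
            link = entry.find("atom:link", ATOM)
            fresh.append({
                "id": entry_id,
                "title": entry.findtext("atom:title", default="", namespaces=ATOM),
                "href": link.get("href", "") if link is not None else "",
            })
            seen_ids.add(entry_id)
    return fresh
```

Any off-the-shelf feed reader can consume the same feed, so the alerting does not depend on custom client software.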



Here be dragons

     • OLE - sometimes it is unavoidable (e.g. UI stuff),
        but avoid it when you can
          •   Tight coupling
          •   Buggy
          •   Proprietary
          •   Reduces your platform options
          •   File format issues are awful
          •   Version-to-version compatibility is “interesting”
     • Direct database access
          • Tight coupling
          • Difficult to guarantee system integrity
          • If you wrote both systems you might want to do this



Open Source

     • Definitely one to watch
     • Not the “Free” lunch you might think, but a
         pragmatic business tool
     •   Examples
          •   Linux
          •   Postgres
          •   JBoss, Tomcat etc.
          •   Ghostscript

     • Open Source is part of everyone’s infrastructure
     • Make sure you can run your systems on a variety of
         platforms



Why?

     • Good for records
          • Gives you top-to-bottom control
     • Good for TCO
          • We’re finding the Open Source infrastructure easier to
              set up and more reliable than proprietary alternatives
     • Enables a better solution
          • Transparent systems mean you can do things the
              original designers didn't think of
          •   This is especially important for ELNs




Other stuff to watch

     • XML generally (what did we ever do without it?)
     • Jabber (as a computer messaging and IM framework)
     • Portals & Portlets
          • Especially JSR 168, WSRP
          • Remember you may well want to portalize any useful application
     • AJAX
          • Google is my hero
          • You can build usable, functional Web Applications
          • If you haven’t seen GMail I can send you an “invite”
     • VMWare - virtualize your world
          • Wow
          • Great for server consolidation, great for testing, great for
             development
     • Wikis
          • Beginning to turn into a lightweight application
              environment


Trends to watch

     • File format nasties
     • Closed/Private interfaces
          • Unlikely to be stable
     • DMCA and other copyright legislation




Summary

     • You’ll be assembling an “ELN System” from a
         series of components
          • Some you have, some you’ll build, some you’ll buy

     • Get the open stuff before you sign the deal
          • Open, documented, stable interfaces
          • Open file formats

     • Use open, loosely coupled approaches where
         possible
     •   If you can, keep the capability to own the
         integration issues in-house



Contact information

     •   Web site: http://www.amphora-research.com
     •   EMail: simonc@amphora-research.com
     •   Phone (US): (513) 697 4764
     •   Phone (UK): +44 (0)845 2300160 x2001
     •   AIM: simoncoles@mac.com
     •   Skype: sjcoles





2005 07 19 IVT Integration Techniques

  • 1. Integration Techniques for ELNs Simon Coles Co-founder & CTO
  • 2. Integration Techniques for ELNs • My background • Why do we need to integrate ELNs? • Why kinds of integration do we need to do? • What prerequisites are there? • Some examples of technologies and techniques • Summary • You can download copies of this presentation from our web site http://www.amphora-research.com/ 2
  • 3. My background • MEng in Information Systems Engineering • First “ELN” was a consulting project for Kodak • Started in 1996 • Completely electronic, fully integrated • Thousands of users, worldwide • This grew into Amphora • Merged with PatentPad in 2003 • Paper or electronic records according to legal preference • Scientists still get an “Electronic” system • Partner with a wide variety of “ELN” vendors • Member of CENSA, working on long term records, serving on Steering Team http://www.amphora-research.com/ 3
  • 4. Experience • Primarily in ELNs for discovery • Where patents are a major concern • I am sure some of this is relevant to regulated areas, but that’s not my focus • Work a lot with other “ELN” vendors • Seldom do you buy one system • Which means we end up seeing a lot of integration! • In a variety of industries, all sizes of deployment • Pharma • Biotech • Chemicals • Customers around the world, offices in the US & the UK http://www.amphora-research.com/ 4
  • 5. What’s an ELN? • The term “ELN” is now used to described a wide variety of systems • Science specific • Reaction planning tools, Cheminformatics databases, structure drawing tools • Analysis packages, LIMS • Workflow tools • General • Knowledge/Document Management • Scientific data management • Laptop/Tablet computers http://www.amphora-research.com/ 5
  • 6. Observations • The term “ELN” • Is so ambiguous it can mean almost anything (especially to a marketing person) • Doesn’t help us much from a systems architecture perspective • A company is unlikely to have just one system that could be called an “ELN” • Those ELNs will need to integrate with your existing & future systems • Your needs will change with time, so you need to be able to protect your investment • In data • In tools • In processes http://www.amphora-research.com/ 6
  • 7. Deconstructing “ELN” • At first sight an ELN project success can look very complex • ELN functionality can be split into two dimensions • Some aspects are common to everyone • Other requirements are specific to a particular group of scientists • Splitting out the functionality into these dimensions really helps to keep you sane “Broad” aspects Security, Collaboration, Patent Protection etc. A B C D http://www.amphora-research.com/ 7
  • 8. Benefits • The corporate functions (Legal, Records, etc.) can buy/provide a system that provides a service to the niche-specific systems • Meet corporate requirements for records etc. • Provide a cross-discipline collaboration • The individual niches can buy/find systems to support their specific needs • Leverage existing investments • Justified according to the benefits they bring • Removes any need to balance competing requirements • Reduce the need • Systems can be acquired/purchased in a phased approach tailored to the needs & requirements of the business • Life is a lot less stressful http://www.amphora-research.com/ 8
  • 9. Different levels of abstraction The “Experiment” is generally the boundary between Broad Vs Deep systems “Broad” aspects Projects Experiments Reports Raw Data A B C D http://www.amphora-research.com/ 9
  • 10. Types of integration Broad/Deep boundary “Broad” aspects is often exposed as network-level services which are relatively standardized A B C D Integrations between different niche systems is generally custom http://www.amphora-research.com/ 10
  • 11. What prerequisites are there? • From your ELN product(s) • Open Interfaces • Open Data • Plumbing • Various technologies, some simple, some more complex • Expertise - often in-house, sometimes consultants • Good news - the Open Source movement is really helpful • Tools & techniques • Drive for openness • Remember: you need to ask your vendor for all of the “Open” stuff before you sign the order http://www.amphora-research.com/ 11
  • 12. Open Interfaces • What’s an “Interface”? • Where one system “prods” another to do something • Or get some information out • Or put some information in • Generally some data is passed back & forth • What’s “open”? • Something you can use without undue burden or barrier • This covers both commercial and technical aspects • Concerns are very similar to those involved with Open Data http://www.amphora-research.com/ 12
  • 13. Open Data • This is currently a bit of a blind spot for purchasers of IT systems • Unfortunately, Open Data is absolutely critical • For long term records • For your ability to build up an integrated system • To protect your IP (partly from a patent perspective, but mainly from a re-use aspect) • To maintain a balanced relationship with your vendors • This absolutely needs to be part of the ELN purchasing process http://www.amphora-research.com/ 13
  • 14. “Good” (open) file formats • Publicly documented • Legally unencumbered • No patents, copyright concerns etc. • Any patents or copyright must be in the public domain • Ideally, self documenting (XML is a good start) • Degrade gracefully • If you can’t the data, at least you can see a picture • Based on more open, primitive formats where possible • At least two implementations of readers, one of which is Open Source • Widely used (W3C or IETF standards are good signs) http://www.amphora-research.com/ 14
  • 15. Data formats for the long term • Good • For text: Plain ASCII, Unicode, HTML, possibly RTF • For graphics: PNG, SVG • For structured data: XML • To preserve appearance: PDF • Worry about • Storing files in databases • The database file format is probably undocumented • Store objects on the file system and use the database to point to them • Anything that is proprietary - there’s no excuse for it, and it dramatically increases your risk • Binary files generally • Mixing content in files (e.g. embedding XML in PDF) • Proprietary digital signatures http://www.amphora-research.com/ 15
  • 16. IP concerns & data formats • Companies have always used Proprietary Data Formats as a competitive weapon • Companies are waking up to the use of IP tools (licenses, patents, copyrights) to reinforce their control over data formats • Just because a format is published doesn’t mean it is open • The Microsoft Office XML formats are a particularly bad example • Right now it looks positively radioactive • They’re being very careful what they say which indicates to me they’re planning something • http://www.groklaw.net/article.php? story=20050330133833843 • (see section: 4. Dissecting Microsoft’s “Patent License”) http://www.amphora-research.com/ 16
  • 17. Standards • There are so many to choose from! • Two key ways of generating “Standards” • De Facto - dominant supplier/format • De Jure - committee based • Who gets to “bless” a standard? • What makes a “good standard” • De Jure process has difficulty keeping up with the real world • De Facto process has risk of lock-in • Pragmatic approach • Expect your suppliers to use open file formats • If there is an acceptable standard, use it • Make sure you are using the right kind of format for each purpose http://www.amphora-research.com/ 17
  • 18. Technologies and techniques • There are a wide variety of tools you can use to integrate IT systems • Tight Vs Loose coupling • Synchronous Vs Asynchronous • Text Vs Binary • Proprietary Vs Open • Simple Vs Complex • As a rule • Loose is cheaper than Tight coupling • Asynchronous is easier to manage than Synchronous • Text is easier to work with, and more flexible than Binary • Open interfaces are always better than Proprietary • Simple are better Complex approaches http://www.amphora-research.com/ 18
  • 19. Considerations when picking tools • Use stable interfaces • Get a commitment from the vendor about what they’ll keep stable across version upgrades • Use public, documented interfaces • Sample code is really really useful • Pick language-neutral interfaces where possible • Platform-neutrality • Don’t worry (too much) about locking yourself into Windows on the client • But if you lock yourself into Windows on the server, it is going to hurt
  • 20. Glue Languages • There are a number of really useful “Glue” languages around • Python (and Jython, and other relatives) • Perl (although I have some concerns about maintainability) • Groovy, Beanshell, etc. • All of them • Play well with XML, http, SOAP etc. • Play well with OLE • Are cross platform • My personal preference is Python • You can learn it in a matter of hours • You can read other people’s code • It does everything I need it to do
  • 21. Cool stuff • SOAP/Web Services • Valuable in many areas • But don’t treat it as a religion • There are lighter alternatives which bring most of the benefits for much less effort • The whole WS-* effort seems to have got out of control • REST (XML over http) - a lighter alternative to SOAP • File swapping (generally, in XML) • HTTP GET/POST • Wonderfully easy to debug! • Very flexible
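A minimal sketch of the "XML over http" style, using only the Python standard library. The toy server, the `/records` path, and the XML element names are invented for the example and do not describe any real product's interface.

```python
import threading
import urllib.request
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # titles of the records the toy server has accepted


class RecordHandler(BaseHTTPRequestHandler):
    """Toy receiving system: accepts an XML record via HTTP POST."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        root = ET.fromstring(self.rfile.read(length))
        received.append(root.findtext("title"))
        body = b"<status>stored</status>"
        self.send_response(201)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), RecordHandler)  # port 0: use any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Sending system: an ordinary HTTP POST with an XML body.
request = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/records",
    data=b"<record><title>Run 1</title></record>",
    headers={"Content-Type": "application/xml"},
    method="POST",
)
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # bypass any proxy settings
with opener.open(request) as response:
    status = response.status
    reply = response.read().decode("utf-8")

server.shutdown()
server.server_close()
```

Both the request and the reply are plain text, so the exchange can be reproduced and inspected with a browser or any command-line HTTP client, which is much of what makes this style easy to debug.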
  • 22. Nice things to see • Integration points exposed as stable URLs • For example, in our PatentSafe product, we have committed to stable URL formats to • Submit a record via http (content & metadata) • Get a record for display to the user • These can be used by other systems • And also embedded in Word documents... • Lack of wheel re-invention • e.g. LDAP is The One True place for user information • e.g. RSS/Atom is The One True alerting mechanism • Example code • In multiple languages
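To illustrate RSS/Atom as an alerting mechanism, the sketch below reads an Atom feed and reports entries the consumer has not seen before. The feed content, the `eln.example.com` URLs, and the `new_entries` helper are hypothetical; a real consumer would fetch the feed over http rather than use an inline string.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # Atom 1.0 namespace

# Hypothetical feed, inlined so the example is self-contained.
feed_xml = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New records</title>
  <entry>
    <title>Run 1 signed</title>
    <link href="http://eln.example.com/records/1"/>
  </entry>
  <entry>
    <title>Run 2 signed</title>
    <link href="http://eln.example.com/records/2"/>
  </entry>
</feed>"""


def new_entries(feed_text, seen_links):
    """Return (title, link) pairs for entries whose link has not been seen before."""
    root = ET.fromstring(feed_text)
    fresh = []
    for entry in root.findall("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM).get("href")
        if link not in seen_links:
            fresh.append((entry.findtext("atom:title", namespaces=ATOM), link))
    return fresh


alerts = new_entries(feed_xml, {"http://eln.example.com/records/1"})
```

Each entry's link is itself a URL, so an alert can point the reader straight at the record it describes.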
  • 23. Here be dragons • OLE - sometimes it is unavoidable (e.g. UI stuff), but avoid it when you can • Tight coupling • Buggy • Proprietary • Reduces your platform options • File format issues are awful • Version-to-version compatibility is “interesting” • Direct database access • Tight coupling • Difficult to guarantee system integrity • If you wrote both systems you might want to do this
  • 24. Open Source • Definitely one to watch • Not the “Free” lunch you might think, but a pragmatic business tool • Examples • Linux • Postgres • JBoss, Tomcat etc. • Ghostscript • Open Source is part of everyone’s infrastructure • Make sure you can run your systems on a variety of platforms
  • 25. Why? • Good for records • Gives you top-to-bottom control • Good for TCO • We’re finding the Open Source infrastructure easier to set up and more reliable than proprietary alternatives • Enables a better solution • Transparent systems mean you can do things the original designers didn’t think of • This is especially important for ELNs
  • 26. Other stuff to watch • XML generally (what did we ever do without it?) • Jabber (as a computer messaging and IM framework) • Portals & Portlets • Especially JSR168, WSRP • Remember you may well want to portalize any useful application • AJAX • Google is my hero • You can build usable, functional Web Applications • If you haven’t seen GMail I can send you an “invite” • VMWare - virtualize your world • Wow • Great for server consolidation, great for testing, great for development • Wikis • Beginning to turn into a lightweight application environment
  • 27. Trends to watch • File format nasties • Closed/Private interfaces • Unlikely to be stable • DMCA and other copyright legislation
  • 28. Summary • You’ll be assembling an “ELN System” from a series of components • Some you have, some you’ll build, some you’ll buy • Get the open stuff before you sign the deal • Open, documented, stable interfaces • Open file formats • Use open, loosely coupled approaches where possible • If you can, keep the capability to own the integration issues in-house
  • 29. Contact information • Web site: http://www.amphora-research.com • EMail: simonc@amphora-research.com • Phone (US): (513) 697 4764 • Phone (UK): +44 (0)845 2300160 x2001 • AIM: simoncoles@mac.com • Skype: sjcoles