SlideShare ist ein Scribd-Unternehmen logo
1 von 30
Maori Ito @ NIBIO
Life Science Database
Cross Search and Metadata
Database integrate collaboration
among 4 ministries with NBDC
• Database Catalog
• Life Science Database Cross Search
• Life Science Database Archive
• Database Reconstructive Integration
Why Cross Search?
• Easy to use


• Accustomed to use


• Appropriate for comparing various kinds of
  databases
Sagace
• Search for Biomedical Data & Resources
  in Japan
Bad Skeptical Reputations for
Search Results…
• Useless…
• Slow….
• What is the advantage?
hat is the most Importa
thing in cross search ?
Simple Answers

•Speed and Accuracy
Mechanism of Search Engine
1. Crawling
2. Indexing
3. Query Processing
4. Scoring
Crawling
• Crawl databases and pages by program




                                     Program
Indexing
     • Split data convenient size and store
       own server
External Data




Internal Server
Query Processing and Scoring
In case of Hyper Estraier (Search
    System)
               NIBIO      AgriTogo




                                     Collaborate by
                                       using P2P
NBDC / DBCLS           MEDALS
                                      architecture
                                           Under
                                           Comtemplation
                       JCGGDB


                                                           12
Back to the simple answers to
     improvement
• Speed (Thanks to Johan-san ,Mizuguchi-san and
 many collaborators)
  1. Relax limits on access of DBCLS
  (Use a liggle ingenuity in css and images)
• Accuracy                                        NIBIO




                                               NBDC / DBCLS
How to improve accuracy?
• What is accuracy for life science database
  cross search?
• What is accuracy for life science
  specialist?
• In general, developers emphasize search
  algorithms and scorings.
• However, general results and methods for
  cross search may not suitable for life
  science specialists..?
• Data (Index files) from life science
  databases are sometimes difficult to
  understand immediately.
• It’s hard to make each crawler program for
  each database and maintenance it.
• (We have no extra …. to make proper
  search page like entrez et al….)
To Improve Accuracy
• Manually select Databases
• Assigned weights to crawled databases for
  improving the ranking system
Metadata!
  • One way to solve these problems




  Difficult to
 understand
     data
immediately
If metadata are added data…
                                 Data




Metadata
  Disease:Epithelial adenoma
  Species:Mouse
  Keywords:DNA sequence
  Last Modified:2013-01-19
Easy to understand for users
• It can be a guide to improve user experience.




                                   Image
Easy to understand for crawlers
          Metadata
             Disease:Epithelial adenoma
             Species:Mouse
             Keywords:DNA sequence
             Last Modified:2013-01-19
How to use it?
  • Mark up data by microdata like a tag
Image
                                     Title                    ID


                                                 Last Modified




                       http://www.pdbj.org/emnavi/emnavi_detail.php?id=1556&lang=en
Is it a practical suggestion?
• Google, Yahoo! and Bing decided to use microdata to
  show search results more valuable.
• Some vocabularies have already applied to search
  results.
• E.g.
Schema.org
• Provide a collection of schemas (htm tags)
• Bing, Google, Yahoo! and Yandex rely on
  this markup to improve the display of search
  results, making it easier for people to find
  the right web pages. (quoted by schema.org)
• We proposed “schema.org” extensions for
  “BiologicalDatabaseEntry” and “Biological
  Database”.
• Schema.org proposals :
 http://www.w3.org/wiki/WebSchemas/SchemaDot
 OrgProposals
Properties for
    BiologicalDatabaseEntry

entryID     additionalType        dateCreated
isEntryof   description           dateModified
taxon       image                 keywords
seeAlso     url                   provider
reference   alternativeHeadline   breadcrumb
name        inLanguage
Related Link for our proposal
• WebSchemas proposal ‘Biological
  Databases’ for schema.org
  – http://www.w3.org/wiki/WebSchemas/BioData
    bases
• Discussions at BioHackathon
  – https://github.com/dbcls/bh12/wiki/Schema.org
    -extension
• Discussions at BH12.12 (Japanese only)
  – http://wiki.lifesciencedb.jp/mw/index.php/BH12
    .12/schema.org
How to markup ?
                                                    Declaration
<div itemscope itemtype=“http://schema.org/BiologicalDatabaseEntry”>
ID
 <span itemprop="entryID">1556</span>
Specied
<span itemprop="taxon" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
 <span itemprop="name">Bacillus subtilis</span>
</span>
Deposition:
 <span itemprop="dateCreated">2008-09-08</span>
Last update:
 <span itemprop="dateModified">2012-10-24</span>

</div>
                                          Specify Property and
                                         markup with normal tag
And then
• Crawl these microdata              At Present




• Reflect Search Results           Image




                            Within the fiscal year
                           (Preparation to reflect)
Ask for your help
• If this approach have some efforts, there are
  may be chances to reflect major search
  engines.
• Please markup your own site or database
  and give me feedback.
• If you have any suggestions or comments,
  please let me know.
Future Perspective
• Focus on Accuracy continuously
• Microdata
  – Discuss many scientists and finalize the
    proposal of schema.org extension
  – Boost numbers of databases
  – Make support tools to mark up microdata
• Add appropriate data from high-quality
  databases
Thank you for
listening!

Weitere ähnliche Inhalte

Was ist angesagt?

Adding valuethroughdatacuration
Adding valuethroughdatacurationAdding valuethroughdatacuration
Adding valuethroughdatacurationAPLICwebmaster
 
Federated Search: The Good, The Bad And The Ugly
Federated Search: The Good, The Bad And The UglyFederated Search: The Good, The Bad And The Ugly
Federated Search: The Good, The Bad And The Uglydorishelfer
 
Schema.org extension for biological database @ Biohackathon2013
Schema.org extension for biological database @ Biohackathon2013Schema.org extension for biological database @ Biohackathon2013
Schema.org extension for biological database @ Biohackathon2013Maori Ito
 
Deep Dive: Security Trimming in Fusion
Deep Dive: Security Trimming in FusionDeep Dive: Security Trimming in Fusion
Deep Dive: Security Trimming in FusionLucidworks
 
Landing Pages - Joe Hourcle - RDAP12
Landing Pages - Joe Hourcle - RDAP12Landing Pages - Joe Hourcle - RDAP12
Landing Pages - Joe Hourcle - RDAP12ASIS&T
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsDan Sullivan, Ph.D.
 
Federated Search Falls Short
Federated Search Falls ShortFederated Search Falls Short
Federated Search Falls Shortslknight
 
DataONE Education Module 09: Analysis and Workflows
DataONE Education Module 09: Analysis and WorkflowsDataONE Education Module 09: Analysis and Workflows
DataONE Education Module 09: Analysis and WorkflowsDataONE
 
Federated Search in a Disparate Environment
Federated Search in a Disparate EnvironmentFederated Search in a Disparate Environment
Federated Search in a Disparate EnvironmentHelen Mitchell
 
Qiagram
QiagramQiagram
Qiagramjwppz
 
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLA STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLijscai
 
Labmatrix
LabmatrixLabmatrix
Labmatrixjwppz
 
IEDA Data Publication Workshop @AGU
IEDA Data Publication Workshop @AGUIEDA Data Publication Workshop @AGU
IEDA Data Publication Workshop @AGUKerstin Lehnert
 
Preparing your data for sharing and publishing
Preparing your data for sharing and publishingPreparing your data for sharing and publishing
Preparing your data for sharing and publishingVarsha Khodiyar
 
Performance analysis of MongoDB and HBase
Performance analysis of MongoDB and HBasePerformance analysis of MongoDB and HBase
Performance analysis of MongoDB and HBaseSindhujanDhayalan
 

Was ist angesagt? (18)

Presentation federated search
Presentation federated searchPresentation federated search
Presentation federated search
 
Adding valuethroughdatacuration
Adding valuethroughdatacurationAdding valuethroughdatacuration
Adding valuethroughdatacuration
 
Federated Search: The Good, The Bad And The Ugly
Federated Search: The Good, The Bad And The UglyFederated Search: The Good, The Bad And The Ugly
Federated Search: The Good, The Bad And The Ugly
 
Schema.org extension for biological database @ Biohackathon2013
Schema.org extension for biological database @ Biohackathon2013Schema.org extension for biological database @ Biohackathon2013
Schema.org extension for biological database @ Biohackathon2013
 
Deep Dive: Security Trimming in Fusion
Deep Dive: Security Trimming in FusionDeep Dive: Security Trimming in Fusion
Deep Dive: Security Trimming in Fusion
 
Landing Pages - Joe Hourcle - RDAP12
Landing Pages - Joe Hourcle - RDAP12Landing Pages - Joe Hourcle - RDAP12
Landing Pages - Joe Hourcle - RDAP12
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in Bioinformatics
 
Federated Search Falls Short
Federated Search Falls ShortFederated Search Falls Short
Federated Search Falls Short
 
DataONE Education Module 09: Analysis and Workflows
DataONE Education Module 09: Analysis and WorkflowsDataONE Education Module 09: Analysis and Workflows
DataONE Education Module 09: Analysis and Workflows
 
BigData Testing by Shreya Pal
BigData Testing by Shreya PalBigData Testing by Shreya Pal
BigData Testing by Shreya Pal
 
Federated Search in a Disparate Environment
Federated Search in a Disparate EnvironmentFederated Search in a Disparate Environment
Federated Search in a Disparate Environment
 
Qiagram
QiagramQiagram
Qiagram
 
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLA STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
 
Labmatrix
LabmatrixLabmatrix
Labmatrix
 
IEDA Data Publication Workshop @AGU
IEDA Data Publication Workshop @AGUIEDA Data Publication Workshop @AGU
IEDA Data Publication Workshop @AGU
 
Preparing your data for sharing and publishing
Preparing your data for sharing and publishingPreparing your data for sharing and publishing
Preparing your data for sharing and publishing
 
Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...
 
Performance analysis of MongoDB and HBase
Performance analysis of MongoDB and HBasePerformance analysis of MongoDB and HBase
Performance analysis of MongoDB and HBase
 

Andere mochten auch

Life Science Accelerator, Tom Palenius
Life Science Accelerator, Tom Palenius Life Science Accelerator, Tom Palenius
Life Science Accelerator, Tom Palenius Business Turku
 
Nishimoto110126 v15-light2
Nishimoto110126 v15-light2Nishimoto110126 v15-light2
Nishimoto110126 v15-light2Takuya Nishimoto
 
The Life Science Product Manager's Toolkit
The Life Science Product Manager's ToolkitThe Life Science Product Manager's Toolkit
The Life Science Product Manager's Toolkitprothenberg
 
JSでファミコンエミュレータを作った時の話
JSでファミコンエミュレータを作った時の話JSでファミコンエミュレータを作った時の話
JSでファミコンエミュレータを作った時の話sairoutine
 
Pharmaceutical Mergers Acquisitions in the U.S
Pharmaceutical Mergers Acquisitions in the U.SPharmaceutical Mergers Acquisitions in the U.S
Pharmaceutical Mergers Acquisitions in the U.SCapgemini
 
Prescription Medicines: Costs in Context
Prescription Medicines: Costs in Context Prescription Medicines: Costs in Context
Prescription Medicines: Costs in Context PhRMA
 
2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"
2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"
2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"jrossibarra
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureArturo Pelayo
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsLinkedIn
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerLuminary Labs
 

Andere mochten auch (10)

Life Science Accelerator, Tom Palenius
Life Science Accelerator, Tom Palenius Life Science Accelerator, Tom Palenius
Life Science Accelerator, Tom Palenius
 
Nishimoto110126 v15-light2
Nishimoto110126 v15-light2Nishimoto110126 v15-light2
Nishimoto110126 v15-light2
 
The Life Science Product Manager's Toolkit
The Life Science Product Manager's ToolkitThe Life Science Product Manager's Toolkit
The Life Science Product Manager's Toolkit
 
JSでファミコンエミュレータを作った時の話
JSでファミコンエミュレータを作った時の話JSでファミコンエミュレータを作った時の話
JSでファミコンエミュレータを作った時の話
 
Pharmaceutical Mergers Acquisitions in the U.S
Pharmaceutical Mergers Acquisitions in the U.SPharmaceutical Mergers Acquisitions in the U.S
Pharmaceutical Mergers Acquisitions in the U.S
 
Prescription Medicines: Costs in Context
Prescription Medicines: Costs in Context Prescription Medicines: Costs in Context
Prescription Medicines: Costs in Context
 
2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"
2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"
2015 SF Exploratorium Lecture: "Corn: Diversity and Origins"
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving Cars
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 

Ähnlich wie Life Science Database Cross Search and Metadata

The Progress on Sagace and Data Integration
The Progress on Sagace and Data IntegrationThe Progress on Sagace and Data Integration
The Progress on Sagace and Data IntegrationMaori Ito
 
Module 1 - Chapter1.pptx
Module 1 - Chapter1.pptxModule 1 - Chapter1.pptx
Module 1 - Chapter1.pptxSoniaDevi15
 
Building genomic data cyberinfrastructure with the online database software T...
Building genomic data cyberinfrastructure with the online database software T...Building genomic data cyberinfrastructure with the online database software T...
Building genomic data cyberinfrastructure with the online database software T...mestato
 
Cedar Overview
Cedar OverviewCedar Overview
Cedar Overviewjbgraybeal
 
CS3270 - DATABASE SYSTEM - Lecture (1)
CS3270 - DATABASE SYSTEM -  Lecture (1)CS3270 - DATABASE SYSTEM -  Lecture (1)
CS3270 - DATABASE SYSTEM - Lecture (1)Dilawar Khan
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Spark Summit
 
Government GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsGovernment GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsNeo4j
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadatamarkgrover
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 Scott Edmunds
 
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...SEAD
 
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...Syed Ahmad Chan Bukhari, PhD
 
2015 09 emc lsug
2015 09 emc lsug2015 09 emc lsug
2015 09 emc lsugChris Dwan
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
Bioschemas Workshop
Bioschemas WorkshopBioschemas Workshop
Bioschemas WorkshopNiall Beard
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Ken Karapetyan
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Lucidworks
 

Ähnlich wie Life Science Database Cross Search and Metadata (20)

The Progress on Sagace and Data Integration
The Progress on Sagace and Data IntegrationThe Progress on Sagace and Data Integration
The Progress on Sagace and Data Integration
 
Module 1 - Chapter1.pptx
Module 1 - Chapter1.pptxModule 1 - Chapter1.pptx
Module 1 - Chapter1.pptx
 
Building genomic data cyberinfrastructure with the online database software T...
Building genomic data cyberinfrastructure with the online database software T...Building genomic data cyberinfrastructure with the online database software T...
Building genomic data cyberinfrastructure with the online database software T...
 
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
 
Cedar Overview
Cedar OverviewCedar Overview
Cedar Overview
 
CS3270 - DATABASE SYSTEM - Lecture (1)
CS3270 - DATABASE SYSTEM -  Lecture (1)CS3270 - DATABASE SYSTEM -  Lecture (1)
CS3270 - DATABASE SYSTEM - Lecture (1)
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
 
Government GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsGovernment GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 Standards
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9
 
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
 
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
 
2015 09 emc lsug
2015 09 emc lsug2015 09 emc lsug
2015 09 emc lsug
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Bioschemas Workshop
Bioschemas WorkshopBioschemas Workshop
Bioschemas Workshop
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
 

Mehr von Maori Ito

42nd MTG in NIBIO
42nd MTG in NIBIO42nd MTG in NIBIO
42nd MTG in NIBIOMaori Ito
 
41st MTG in NIBIO
41st MTG in NIBIO41st MTG in NIBIO
41st MTG in NIBIOMaori Ito
 
40th MTG in NIBIO
40th MTG in NIBIO40th MTG in NIBIO
40th MTG in NIBIOMaori Ito
 
39th MTG in NIBIO
39th MTG in NIBIO39th MTG in NIBIO
39th MTG in NIBIOMaori Ito
 
Test slide for the lab - Target prioritization
Test slide for the lab - Target prioritization Test slide for the lab - Target prioritization
Test slide for the lab - Target prioritization Maori Ito
 
Test for lab_j Psiver j
Test for lab_j Psiver jTest for lab_j Psiver j
Test for lab_j Psiver jMaori Ito
 
38th MTG in NIBIO
38th MTG in NIBIO38th MTG in NIBIO
38th MTG in NIBIOMaori Ito
 
37th mtg in NIBIO
37th mtg in NIBIO37th mtg in NIBIO
37th mtg in NIBIOMaori Ito
 
36th mtg in NIBIO
 36th mtg in NIBIO 36th mtg in NIBIO
36th mtg in NIBIOMaori Ito
 
35th mtg in NIBIO
35th mtg in NIBIO35th mtg in NIBIO
35th mtg in NIBIOMaori Ito
 
34th mtg in NIBIO
34th mtg in NIBIO34th mtg in NIBIO
34th mtg in NIBIOMaori Ito
 
33rd MTG In NIBIO
33rd MTG In NIBIO33rd MTG In NIBIO
33rd MTG In NIBIOMaori Ito
 
32nd MTG in NIBIO
32nd MTG in NIBIO32nd MTG in NIBIO
32nd MTG in NIBIOMaori Ito
 
31st Integrated DB MTG in NIBIO
31st Integrated DB MTG in NIBIO31st Integrated DB MTG in NIBIO
31st Integrated DB MTG in NIBIOMaori Ito
 
30th Integrated DB MTG in NIBIO
30th Integrated DB MTG in NIBIO30th Integrated DB MTG in NIBIO
30th Integrated DB MTG in NIBIOMaori Ito
 
29th Integrated DB MTG in NIBIO
29th Integrated DB MTG in NIBIO29th Integrated DB MTG in NIBIO
29th Integrated DB MTG in NIBIOMaori Ito
 
Bh13.13 sagace 1
Bh13.13 sagace 1Bh13.13 sagace 1
Bh13.13 sagace 1Maori Ito
 

Mehr von Maori Ito (20)

42nd MTG in NIBIO
42nd MTG in NIBIO42nd MTG in NIBIO
42nd MTG in NIBIO
 
41st MTG in NIBIO
41st MTG in NIBIO41st MTG in NIBIO
41st MTG in NIBIO
 
40th MTG in NIBIO
40th MTG in NIBIO40th MTG in NIBIO
40th MTG in NIBIO
 
39th MTG in NIBIO
39th MTG in NIBIO39th MTG in NIBIO
39th MTG in NIBIO
 
Test slide for the lab - Target prioritization
Test slide for the lab - Target prioritization Test slide for the lab - Target prioritization
Test slide for the lab - Target prioritization
 
Test for lab_j Psiver j
Test for lab_j Psiver jTest for lab_j Psiver j
Test for lab_j Psiver j
 
Psiver j
Psiver jPsiver j
Psiver j
 
38th MTG in NIBIO
38th MTG in NIBIO38th MTG in NIBIO
38th MTG in NIBIO
 
37th mtg in NIBIO
37th mtg in NIBIO37th mtg in NIBIO
37th mtg in NIBIO
 
36th mtg in NIBIO
 36th mtg in NIBIO 36th mtg in NIBIO
36th mtg in NIBIO
 
35th mtg in NIBIO
35th mtg in NIBIO35th mtg in NIBIO
35th mtg in NIBIO
 
34th mtg in NIBIO
34th mtg in NIBIO34th mtg in NIBIO
34th mtg in NIBIO
 
33rd MTG In NIBIO
33rd MTG In NIBIO33rd MTG In NIBIO
33rd MTG In NIBIO
 
32nd MTG in NIBIO
32nd MTG in NIBIO32nd MTG in NIBIO
32nd MTG in NIBIO
 
31st Integrated DB MTG in NIBIO
31st Integrated DB MTG in NIBIO31st Integrated DB MTG in NIBIO
31st Integrated DB MTG in NIBIO
 
30th Integrated DB MTG in NIBIO
30th Integrated DB MTG in NIBIO30th Integrated DB MTG in NIBIO
30th Integrated DB MTG in NIBIO
 
29th Integrated DB MTG in NIBIO
29th Integrated DB MTG in NIBIO29th Integrated DB MTG in NIBIO
29th Integrated DB MTG in NIBIO
 
Bh13.13 sagace 1
Bh13.13 sagace 1Bh13.13 sagace 1
Bh13.13 sagace 1
 
28th mtg
28th mtg28th mtg
28th mtg
 
27th mtg 1
27th mtg 127th mtg 1
27th mtg 1
 

Kürzlich hochgeladen

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Life Science Database Cross Search and Metadata

  • 1. Maori Ito @ NIBIO Life Science Database Cross Search and Metadata
  • 2. Database integrate collaboration among 4 ministries with NBDC • Database Catalog • Life Science Database Cross Search • Life Science Database Archive • Database Reconstructive Integration
  • 3. Why Cross Search? • Easy to use • Accustomed to use • Appropriate for comparing various kinds of databases
  • 4. Sagace • Search for Biomedical Data & Resources in Japan
  • 5. Bad Skeptical Reputations for Search Results… • Useless… • Slow…. • What is the advantage?
  • 6. hat is the most Importa thing in cross search ?
  • 8. Mechanism of Search Engine 1. Crawling 2. Indexing 3. Query Processing 4. Scoring
  • 9. Crawling • Crawl databases and pages by program Program
  • 10. Indexing • Split data convenient size and store own server External Data Internal Server
  • 12. In case of Hyper Estraier (Search System) NIBIO AgriTogo Collaborate by using P2P NBDC / DBCLS MEDALS architecture Under Comtemplation JCGGDB 12
  • 13. Back to the simple answers to improvement • Speed (Thanks to Johan-san ,Mizuguchi-san and many collaborators) 1. Relax limits on access of DBCLS (Use a liggle ingenuity in css and images) • Accuracy NIBIO NBDC / DBCLS
  • 14. How to improve accuracy? • What is accuracy for life science database cross search? • What is accuracy for life science specialist?
  • 15. • In general, developers emphasize search algorithms and scorings. • However, general results and methods for cross search may not suitable for life science specialists..? • Data (Index files) from life science databases are sometimes difficult to understand immediately. • It’s hard to make each crawler program for each database and maintenance it. • (We have no extra …. to make proper search page like entrez et al….)
  • 16. To Improve Accuracy • Manually select Databases • Assigned weights to crawled databases for improving the ranking system
  • 17. Metadata! • One way to solve these problems Difficult to understand data immediately
  • 18. If metadata are added data… Data Metadata Disease:Epithelial adenoma Species:Mouse Keywords:DNA sequence Last Modified:2013-01-19
  • 19. Easy to understand for users • It can be a guide to improve user experience. Image
  • 20. Easy to understand for crawlers Metadata Disease:Epithelial adenoma Species:Mouse Keywords:DNA sequence Last Modified:2013-01-19
  • 21. How to use it? • Mark up data by microdata like a tag Image Title ID Last Modified http://www.pdbj.org/emnavi/emnavi_detail.php?id=1556&lang=en
  • 22. Is it a practical suggestion? • Google, Yahoo! and Bing decided to use microdata to show search results more valuable. • Some vocabularies have already applied to search results. • E.g.
  • 23. Schema.org • Provide a collection of schemas (htm tags) • Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages. (quoted by schema.org) • We proposed “schema.org” extensions for “BiologicalDatabaseEntry” and “Biological Database”. • Schema.org proposals : http://www.w3.org/wiki/WebSchemas/SchemaDot OrgProposals
  • 24. Properties for BiologicalDatabaseEntry entryID additionalType dateCreated isEntryof description dateModified taxon image keywords seeAlso url provider reference alternativeHeadline breadcrumb name inLanguage
  • 25. Related Link for our proposal • WebSchemas proposal ‘Biological Databases’ for schema.org – http://www.w3.org/wiki/WebSchemas/BioData bases • Discussions at BioHackathon – https://github.com/dbcls/bh12/wiki/Schema.org -extension • Discussions at BH12.12 (Japanese only) – http://wiki.lifesciencedb.jp/mw/index.php/BH12 .12/schema.org
  • 26. How to markup ? Declaration <div itemscope itemtype=“http://schema.org/BiologicalDatabaseEntry”> ID <span itemprop="entryID">1556</span> Specied <span itemprop="taxon" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry"> <span itemprop="name">Bacillus subtilis</span> </span> Deposition: <span itemprop="dateCreated">2008-09-08</span> Last update: <span itemprop="dateModified">2012-10-24</span> </div> Specify Property and markup with normal tag
  • 27. And then • Crawl these microdata At Present • Reflect Search Results Image Within the fiscal year (Preparation to reflect)
  • 28. Ask for your help • If this approach have some efforts, there are may be chances to reflect major search engines. • Please markup your own site or database and give me feedback. • If you have any suggestions or comments, please let me know.
  • 29. Future Perspective • Focus on Accuracy continuously • Microdata – Discuss many scientists and finalize the proposal of schema.org extension – Boost numbers of databases – Make support tools to mark up microdata • Add appropriate data from high-quality databases