Ontology Modelling of an
 Engineering Document –
Perspectives of Linguistics
        Analysis




          26.08.2012
First Step: Requirements
            Modelling
ROSENERGOATOM project, July 2011
  – Manual processing methodology for Technical
    Requirements document
  – Special software for ISO 15926 data model
    transformation
  – Sample Nuclear Power Plant requirements
    processing:
    • Sample size: 12 paragraphs of text
    • Content identified: 16 requirements, 3 classifiers
    • Resulting model: 96 items, 35 relationships
Technical Document Semantic
            Modelling
TabLan methodology, March 2012
 – Manual processing methodology for technical
   documents (English)
 – Using a subset of Gellish:
   http://sourceforge.net/apps/trac/gellish/
 – Mapping to the enhanced Initial Template Set
 – .15926 Editor for ISO 15926 data model
   transformation

 – Download free from
   http://techinvestlab.ru/files/TabLan/TabLan.rar
Document Modelling Lessons
• Technical document modelling promise:
  – Requirements verification
  – Project IT systems customisation (classifiers for
    CAD/CAM/PLM/ERP/etc.)
  – Data integration support (reference data library content
    generation)
  – Tracing design decisions to requirements
  – Design decisions verification
• Formal modelling problems:
  – Labour-intensive process of manual modelling
  – Large volume of «dumb» preparatory work
  – Need for a professional engineering verification in a new
    formalism unknown to engineers
  – Fragmented architecture of project IT environment — an
    obstacle for model reuse
Preconditions for Automation of
   Technical Document Modelling
• Restricted and relatively formal engineering
  subset of natural language
• Contemporary developments in computer based
  natural language processing
• Contemporary developments in ontology
  extraction from natural language texts
• Controlled language for engineering (Gellish)
• Gellish to ISO 15926 mapping development

Experimenting with
       ABBYY Compreno
Technology That Translates from Human
      into Computer Language
http://www.abbyy.ru/science/technologies/business/compreno
ABBYY Compreno
ABBYY Compreno is ABBYY’s innovative technology that performs full semantic and syntactic analysis for
   comprehensive handling of natural language texts.
   ABBYY Compreno is the first ever practical implementation of fundamental linguistic research carried
   out internationally over the past fifty years. A result of seventeen years of intensive R&D, ABBYY
   Compreno offers robust solutions to many long-standing language processing problems of the
   information age, such as:

•       Intelligent search and retrieval
    –     Intelligent semantic search
    –     Multilingual search
    –     Semantic tagging of documents for more powerful searching
•       Comprehensive text analysis
    –     Information monitoring
    –     Controlling access to confidential information
    –     Summarizing and annotating documents
    –     Sentiment analysis
•       Efficient handling of text documents
    –     Document classification and filtering
    –     Text comparison
•       High quality machine translation
Research Plan

• Starting point – comparison between:
  • syntactic and semantic structure (parsed by ABBYY
    Compreno)
  • formal text model (manually prepared)
• Rule development for mapping between
  linguistic and engineering ontologies (current)
• Customisation with domain thesauri (plans)
• Testing on a corpus of engineering texts (plans)
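
A mapping rule of the kind named in the plan can be sketched as follows. This is only an illustration under stated assumptions: the `toy_parse` dictionary is a hypothetical stand-in for parser output (it is not the ABBYY Compreno format), and `map_shall_include` is an invented helper, not part of any existing tool. The rule turns a "X shall include Y and Z" sentence into whole-part facts, each classified as a requirement.

```python
from typing import Dict, List, Tuple

# (fact label, left object, relation phrase, right object)
Fact = Tuple[str, str, str, str]


def map_shall_include(parse: Dict) -> List[Fact]:
    """Hypothetical rule: 'X shall include Y and Z' -> whole-part facts."""
    if parse.get("modal") != "shall" or parse.get("predicate") != "include":
        return []  # the rule does not apply to this sentence
    facts: List[Fact] = []
    for i, part in enumerate(parse["objects"]):
        label = chr(ord("A") + i)  # A, B, ... as fact labels
        facts.append((label, parse["subject"], "is a whole for", part))
        # A statement about the statement: the fact itself is a requirement.
        facts.append((label + "-cls", label, "is classified as a", "Requirement"))
    return facts


# Hand-written stand-in for a parsed sentence (assumed structure).
toy_parse = {
    "subject": "Containment system",
    "modal": "shall",
    "predicate": "include",
    "objects": ["Primary containment", "Secondary containment"],
}

for fact in map_shall_include(toy_parse):
    print(fact)
```

A real rule set would work on the full syntactic and semantic tree rather than on a flat dictionary; the sketch only shows the shape of one rule.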


«The containment system shall include a
 primary containment and a secondary
            containment.»




     ABBYY Compreno parser results: text view
ABBYY Compreno parser results: tree view
«The containment system shall include a
  primary containment and a secondary
             containment.»
                 Formal model:
Containment system
  A: is a whole for Primary containment
  B: is a whole for Secondary containment
A is classified as a Requirement
B is classified as a Requirement
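
The formal model above can be held in a small data structure, sketched here. The `Fact` class and the `parts_of` helper are hypothetical names chosen for illustration; they are not the Gellish table format or the .15926 Editor API. The point shown is that facts A and B are themselves objects that other facts (the Requirement classifications) can refer to.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fact:
    """One statement: labelled fact linking a left and a right object."""
    label: str
    left: str
    relation: str
    right: str


# Facts A and B, using the labels from the slide.
facts = [
    Fact("A", "Containment system", "is a whole for", "Primary containment"),
    Fact("B", "Containment system", "is a whole for", "Secondary containment"),
]

# Each fact is classified as a Requirement: the left side is a fact label.
classifications = [
    Fact(f.label + "-cls", f.label, "is classified as a", "Requirement")
    for f in facts
]


def parts_of(whole: str, all_facts: List[Fact]) -> List[str]:
    """Return the parts recorded for the given whole."""
    return [
        f.right
        for f in all_facts
        if f.left == whole and f.relation == "is a whole for"
    ]


print(parts_of("Containment system", facts))
```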



«Inner surfaces should be smooth to prevent
corrosion residue and to simplify decontamination.»




     ABBYY Compreno parser results: tree view
«Inner surfaces should be smooth to prevent
corrosion residue and to simplify decontamination.»
                                   Formal model:

Inner surfaces
    is a specialization of Surface
    is a specialization of Inner
Inner surfaces
    A: is a specialization of Smooth
A
    is classified as a Requirement
    is intended to achieve To prevent corrosion residue and to simplify decontamination
To prevent corrosion residue and to simplify decontamination
    is a whole for To prevent corrosion residue
        has as subject Corrosion residue
    is a whole for To simplify decontamination
        has as subject Decontamination
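
The decomposition of the compound goal can be walked programmatically. The sketch below stores the model as plain triples and follows "is a whole for" links down to the elementary goals. The names `triples`, `objects_of` and `leaf_goals` are assumptions made for this example and do not come from the TabLan materials.

```python
from typing import List

GOAL = "To prevent corrosion residue and to simplify decontamination"

# (left object, relation phrase, right object); "A" is the slide's fact label.
triples = [
    ("Inner surfaces", "is a specialization of", "Surface"),
    ("Inner surfaces", "is a specialization of", "Inner"),
    ("Inner surfaces", "is a specialization of", "Smooth"),  # fact A
    ("A", "is classified as a", "Requirement"),
    ("A", "is intended to achieve", GOAL),
    (GOAL, "is a whole for", "To prevent corrosion residue"),
    ("To prevent corrosion residue", "has as subject", "Corrosion residue"),
    (GOAL, "is a whole for", "To simplify decontamination"),
    ("To simplify decontamination", "has as subject", "Decontamination"),
]


def objects_of(subject: str, relation: str) -> List[str]:
    """Right-hand objects of all triples matching subject and relation."""
    return [o for s, r, o in triples if s == subject and r == relation]


def leaf_goals(goal: str) -> List[str]:
    """Recursively split a goal into the goals that have no further parts."""
    parts = objects_of(goal, "is a whole for")
    if not parts:
        return [goal]
    leaves: List[str] = []
    for part in parts:
        leaves.extend(leaf_goals(part))
    return leaves


for goal in leaf_goals(GOAL):
    print(goal, "->", objects_of(goal, "has as subject"))
```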

Thank you!
Anatoly Levenchuk
http://ailev.ru (Rus)
http://levenchuk.com (Eng)
ailev@asmp.msk.su

Victor Agroskin
vic5784@gmail.com

.15926 Editor
http://techinvestlab.ru/dot15926Editor
Feedback and comments:
   dot15926@gmail.com
   http://community.livejournal.com/dot15926/

TechInvestLab.ru
+7 (495) 748-5388
