Ontopia Code Camp TMRA 2009-11-11 Lars Marius Garshol & Geir Ove Grønmo
Agenda About you who are you? what do you want from the code camp? About Ontopia The product The future Participating in the project Writing some code!
Some background About Ontopia
Brief history
- 1999-2000: private hobby project for Geir Ove
- 2000-2009: commercial software sold by Ontopia AS
  - lots of international customers in diverse fields
- 2009-: open source project
The project
- Open source, hosted at Google Code
- Contributors
  - Lars Marius Garshol, Bouvet
  - Geir Ove Grønmo, Bouvet
  - Thomas Neidhart, SpaceApps
  - Lars Heuer, Semagia
  - Hannes Niederhausen, TMLab
  - Stig Lau, Bouvet
  - Baard H. Rehn-Johansen, Bouvet
  - Peter-Paul Kruijssen, Morpheus
  - Quintin Siebers, Morpheus
Current activity (toward 5.1)
- tolog updates, added by LMG
- Various fixes and optimizations, by everyone
- Toma implementation (in sandbox), by Thomas
- TMQL implementation (in sandbox)?, by Sven Krosse
Architecture and modules The product
The big picture (diagram labels: Engine, Ontopoly, Web service, DB2TM, XML2TM, Taxon. import, Auto-class., Portlet support, OKP, Escenic, Other CMSs, CMS integration, Data integration, plus several "A.N.other" placeholders)
The engine Core API TMAPI 2.0 support Import/export RDF conversion TMSync Fulltext search Event API tolog query language tolog update language Engine
Query Engine
- Implementation of Ontopia’s tolog language (based on Prolog and SQL)
- Allows powerful queries on the topic map data structure
- Simplifies application development and improves performance
- Example:

    select $B, count($A) from
      instance-of($B, city),
      { premiere($A : opera, $B : place) |
        premiere($A : opera, $C : place),
        located-in($C : containee, $B : container) }
    order by $A desc?
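To make the example query easier to read, here is a minimal Python sketch of what it appears to compute: for each city, the number of operas premiered either in the city itself or in a place located in it, sorted by that count. The sample data, the function name, and the choice to count distinct operas are assumptions made for illustration; this is not how the tolog engine is implemented.

```python
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
cities = {"milano", "venezia"}
premieres = [                      # (opera, place) pairs
    ("otello", "teatro-alla-scala"),
    ("falstaff", "teatro-alla-scala"),
    ("rigoletto", "la-fenice"),
    ("nerone", "milano"),
]
located_in = {                     # containee -> container
    "teatro-alla-scala": "milano",
    "la-fenice": "venezia",
}

def premieres_per_city(cities, premieres, located_in):
    """Count distinct operas premiered in each city, directly or in a place inside it."""
    operas = defaultdict(set)
    for opera, place in premieres:
        if place in cities:                    # first branch of { ... | ... }
            operas[place].add(opera)
        container = located_in.get(place)
        if container in cities:                # second branch: place located in the city
            operas[container].add(opera)
    # Highest count first; ties broken by city name to keep the result stable.
    return sorted(((city, len(found)) for city, found in operas.items()),
                  key=lambda row: (-row[1], row[0]))

result = premieres_per_city(cities, premieres, located_in)
```

As in the query, a city with no matching premiere produces no row at all.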
How TMSync works
- Define which part of the target topic map you want
- Define which part of the source topic map it is the master for
- The algorithm does the rest
If the source is not a topic map
- Simply do a normal one-time conversion
- Let TMSync do the update for you
- In other words, TMSync reduces the update problem to a conversion problem
(diagram labels: source.xml, convert.xslt, TMSync)
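The three steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not Ontopia's TMSync code: topic maps are plain dicts from a topic id to a set of statements, and the two filter functions (hypothetical names) stand in for "which part of the source" and "which part of the target".

```python
def tmsync(source, target, in_source_scope, in_target_scope):
    """Make the selected part of `target` match the selected part of `source`.

    Assumption of this sketch: a topic map is a dict mapping a topic id to a
    frozenset of (property, value) statements.
    """
    wanted = {tid: stmts for tid, stmts in source.items()
              if in_source_scope(tid, stmts)}
    owned = {tid for tid, stmts in target.items()
             if in_target_scope(tid, stmts)}
    for tid in owned - set(wanted):
        del target[tid]              # owned by the source, but gone from it
    for tid, stmts in wanted.items():
        target[tid] = stmts          # add new topics, overwrite changed ones
    return target

source = {
    "a": frozenset({("name", "A")}),
    "b": frozenset({("name", "B-new")}),
    "z": frozenset({("name", "Z")}),
}
target = {
    "b": frozenset({("name", "B-old")}),
    "c": frozenset({("name", "C")}),
    "local": frozenset({("name", "L")}),
}
synced = tmsync(source, target,
                in_source_scope=lambda tid, stmts: tid != "z",
                in_target_scope=lambda tid, stmts: tid != "local")
```

Topics outside the target filter ("local" here) are left alone, which is what lets a synchronized map carry its own additions.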
The City of Bergen usecase Norge.no Service Unit Person LOS City of Bergen LOS
The backends
- In-memory
  - no persistent storage
  - thread-safe
  - no setup
- RDBMS
  - transactions
  - persistent
  - thread-safe
  - uses caching
  - clustering
- Remote
  - uses web service
  - read-only
  - unofficial
(diagram labels: Engine, Memory, RDBMS, Remote)
RDBMS Backend
- Allows the Engine to use topic maps stored in a relational database
- Based on a generic topic map schema
- Necessary when working with very large topic maps
- Transparent to applications
- Features
  - Automatically loads data when needed
  - Caches frequently used data
  - Full support for RDBMS transactions
  - Supports tolog-to-SQL compilation
  - Statistical reports for performance tuning
- Platform support
  - Oracle, MySQL, PostgreSQL, MS SQL Server
  - Test suite available for verifying compatibility with other JDBC-enabled RDBMSes
DB2TM
- Upconversion to TMs
  - from RDBMS via JDBC
  - or from CSV
- Uses XML mapping
  - can call out to Java
- Supports sync
  - either full rescan
  - or change table
(diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote)
DB2TM example
(figure labels: Ontopia, United Nations, Bouvet; table + mapping = topics)

    <relation name="organizations.csv" columns="id name url">
      <topic type="ex:organization">
        <item-identifier>#org${id}</item-identifier>
        <topic-name>${name}</topic-name>
        <occurrence type="ex:homepage">${url}</occurrence>
      </topic>
    </relation>
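The effect of the mapping above can be imitated in a few lines of Python, since `string.Template` happens to use the same `${column}` placeholder syntax. This is only a sketch of the idea (one topic per row, values filled in from columns); the CSV rows, URLs, and output dict shape are invented, and DB2TM itself does considerably more.

```python
import csv
import io
from string import Template

# Column names and templates mirror the mapping above.
COLUMNS = ["id", "name", "url"]
TOPIC_TEMPLATE = {
    "type": "ex:organization",
    "item-identifier": Template("#org${id}"),
    "topic-name": Template("${name}"),
    "ex:homepage": Template("${url}"),
}

def rows_to_topics(csv_text):
    """Turn each CSV row into a dict describing one topic."""
    topics = []
    for row in csv.reader(io.StringIO(csv_text)):
        values = dict(zip(COLUMNS, row))
        topic = {
            key: tmpl.substitute(values) if isinstance(tmpl, Template) else tmpl
            for key, tmpl in TOPIC_TEMPLATE.items()
        }
        topics.append(topic)
    return topics

# Invented sample rows; the URLs are placeholders.
SAMPLE = "1,Ontopia,http://example.org/ontopia\n2,Bouvet,http://example.org/bouvet\n"
topics = rows_to_topics(SAMPLE)
```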
TMRAP
- Web service interface
  - via SOAP
  - via plain HTTP
- Requests
  - get-topic
  - get-topic-page
  - get-tolog
  - delete-topic
  - ...
Navigator framework
- Servlet-based API
  - manage topic maps
  - load/scan/delete/create
- JSP tag library
  - XSLT-like
  - based on tolog
  - JSTL integration
Ontopia Navigator Framework
- Java API for interacting with TM repository
- JSP tag library
  - based on tolog
  - kind of like XSLT in JSP with tolog instead of XPath
  - has JSTL integration
- Undocumented parts
  - web presentation components
  - some wrapped as JSP tags
  - want to build proper portlets from them
http://www.ontopia.net/operamap
Navigator tag library example

    <%-- assume variable 'composer' is already set --%>
    <p><b>Operas:</b><br/>
    <tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera),
                          { premiere-date($OPERA, $DATE) }?">
      <li>
        <a href="opera.jsp?id=<tolog:id var="OPERA"/>"
          ><tolog:out var="OPERA"/></a>
        <tolog:if var="DATE">
          <tolog:out var="DATE"/>
        </tolog:if>
      </li>
    </tolog:foreach></p>
Elmer Preview
Automated classification
- Undocumented
  - experimental
- Extracts text
  - autodetects format
  - Word, PDF, XML, HTML
- Processes text
  - detects language
  - stemming, stop-words
- Extracts keywords
  - ranked by importance
  - uses existing topics
  - supports compound terms
Example of keyword extraction

    topic maps                1.0
    metadata                  0.57
    subject-based class.      0.42
    Core metadata             0.42
    faceted classification    0.34
    taxonomy                  0.22
    monolingual thesauri      0.19
    controlled vocabulary     0.19
    Dublin Core               0.16
    thesauri                  0.16
    Dublin                    0.15
    keywords                  0.15

Example #2

    Automated classification    1.0     5
    Topic Maps                  0.51    14
    XSLT                        0.38    11
    compound keywords           0.29    2
    keywords                    0.26    20
    Lars                        0.23    1
    Marius                      0.23    1
    Garshol                     0.22    1
    ...

So how could this be used?
- To help users classify new documents in a CMS interface
  - suggest appropriate keywords, screened by user before approval
- Automate classification of incoming documents
  - this means lower quality, but also lower cost
- Get an overview of interesting terms in a document corpus
  - classify all documents, extract the most interesting terms
  - this can be used as the starting point for building an ontology
  - (keyword extraction only)
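A toy version of the keyword-ranking step can be sketched as below. It only does stop-word removal and frequency counting, scaled so the best term scores 1.0 as in the examples above; stemming, language detection, compound terms, and matching against existing topics are left out. The stop-word list, the scoring formula, and the function name are assumptions for illustration, not the classifier's actual method.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be per language and much longer.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

def extract_keywords(text, limit=5):
    """Rank terms by frequency, scaled so the top term scores 1.0."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS]
    counts = Counter(words)
    if not counts:
        return []
    top = max(counts.values())
    # Most frequent first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(term, round(count / top, 2)) for term, count in ranked[:limit]]

keywords = extract_keywords(
    "Topic maps and metadata: topic maps are a standard; "
    "metadata describes the topic.")
```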
Example user interface The user creates an article this screen then used to add keywords user adjusts the proposals from the classifier
Vizigator
- Graphical visualization
- VizDesktop
  - Swing app to configure
  - filter/style/...
- Vizlet
  - Java applet for web
  - uses configuration
  - loads via TMRAP
  - uses “Remote” backend
(diagram labels: Viz, Ontopoly, TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote)
The Vizigator Graphical visualization of Topic Maps Two parts VizDesktop: Swing desktop app for configuration Vizlet: Java applet for web deployment Configuration stored in XTM file
Without configuration
With configuration
The Vizigator The Vizigator uses TMRAP the Vizlet runs in the browser (on the client) a fragment of the topic map is downloaded from the server the fragment is grown as needed Server TMRAP
Ontopoly
- Generic editor
  - web-based, AJAX
  - meta-ontology in TM
- Ontology designer
  - create types and fields
  - control user interface
  - build views
  - incremental dev
- Instance editor
  - guided by ontology
Ontopoly
- A generic Topic Maps editor, in two parts
  - ontology editor: used to create the ontology and schema
  - instance editor: used to enter instances based on ontology
- Built with the Web Editor Framework
  - works with both XTM files and topic maps stored in RDBMS backend
  - supports access control to administrative functions, ontology, and instance editors
  - existing topic maps can be imported
  - parts of the ontology can be marked as read-only, or hidden
Typical deployment Viewing application Engine Users DB Backend Ontopoly Frameworks Editors DB TMRAP DB2TM HTTP DB External application Application server
CMS integration The best way to add content functionality to Ontopia the world doesn’t need another CMS better to reuse those which already exist So far two integrations exist Escenic OfficeNet Knowledge Portal more are being worked on
Implementation
- A CMS event listener
  - the listener creates topics for new CMS articles, folders, etc.
  - the mapping is basically the design of the ontology used by this listener
- Presentation integration
  - it must be possible to list all topics attached to an article
  - conversely, it must be possible to list all articles attached to a topic
  - how close the integration needs to be here will vary, as will the difficulty of the integration
- User interface integration
  - it needs to be possible to attach topics to an article from within the normal CMS user interface
  - this can be quite tricky
- Search integration
  - the Topic Maps search needs to also search content in the CMS
  - can be achieved by writing a tolog plug-in
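The event-listener part can be pictured with a small sketch like the one below. Everything in it is hypothetical: the class names, the `on_created` callback, the id scheme, and the kind-to-type table are invented to show the shape of the idea (the mapping table is where the ontology design lives), and no real CMS or Ontopia API is used.

```python
class TopicMapStub:
    """Stand-in for a topic map: topic id -> {'type': ..., 'name': ...}."""

    def __init__(self):
        self.topics = {}

class CmsListener:
    # Hypothetical mapping from CMS object kinds to topic types.
    TYPE_FOR_KIND = {"article": "article", "folder": "section"}

    def __init__(self, topicmap):
        self.topicmap = topicmap

    def on_created(self, kind, object_id, title):
        """Create a topic for a new CMS object; return its id, or None if unmapped."""
        topic_type = self.TYPE_FOR_KIND.get(kind)
        if topic_type is None:
            return None              # e.g. multimedia objects left out of the mapping
        topic_id = f"{kind}-{object_id}"
        self.topicmap.topics[topic_id] = {"type": topic_type, "name": title}
        return topic_id

tm = TopicMapStub()
listener = CmsListener(tm)
created = listener.on_created("article", 42, "New city council appointed")
skipped = listener.on_created("image", 7, "Town hall photo")
```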
Articles as topics is about Elections New city council appointed Goal: associate articles with topics mainly to say what they are about typically also want to include other metadata Need to create topics for the articles to do this in fact, a general CMS-to-TM mapping is needed must decide what metadata and structures to include
Mapping issues
- Article topics
  - what topic type to use?
  - title becomes name? (do you know the title?)
  - include author? include last modified? include workflow state?
  - should all articles be mapped?
- Folders/directories/sections/...
  - should these be mapped, too?
  - one topic type for all folders/.../.../...?
  - if so, use associations to connect articles to folders
  - use associations to reproduce hierarchical folder structure
- Multimedia objects
  - should these be included?
  - what topic type? what name? ...
Two styles of mappings
- Articles as articles
  - Topic represents only the article
  - Topic type is some subclass of “article”
  - “Is about” association connects article into topic map
  - Fields are presentational: title, abstract, body
- Articles as concepts
  - Topic represents some real-world subject (like a person)
  - article is just the default content about that subject
  - Type is the type of the subject (person)
  - Semantic associations to the rest of the topic map: works in department, has competence, ...
  - Fields can be semantic: name, phone no, email, ...
Article as article Article about building of a new school Is about association to “Primary schools” Topic type is “article”
Article as concept
- Article about a sports hall
- Article really represents the hall
- Topic type is “Location”
- Associations to
  - events in the location
  - category “Sports”
Two projects
The project
- A new citizen’s portal for the city administration
  - strategic decision to make portal main interface for interaction with citizens
  - as many services as possible are to be moved online
- Big project
  - started in late 2004, to continue at least into 2008
  - ~5 million Euro spent by launch date
  - 1.7 million Euro budgeted for 2007
  - Topic Maps development is a fraction of this (less than 25%)
- Many companies involved
  - Bouvet/Ontopia
  - Avenir
  - KPMG
  - Karabin
  - Escenic
Simplified original ontology Service catalog Escenic (CMS) LOS Form Article nearly everything Category Service Subject Department Borough External resource Employee Payroll++
Data flow Ontopoly Ontopia Escenic LOS Integration TMSync DB2TM Fellesdata Payroll (Agresso) Dexter/Extens Service Catalog
Conceptual architecture Data sources Oracle Portal Application Ontopia Escenic Oracle Database
The portal
Technical architecture
NRK/Skole Norwegian National Broadcasting (NRK) media resources from the archives published for use in schools integrated with the National Curriculum In production delayed by copyright wrangling Technologies OKS Polopoly CMS MySQL database Resin application server
Curriculum-based browsing (1) Curriculum Social studies High school
Curriculum-based browsing (2) Gender roles
Curriculum-based browsing (3) Feminist movement in the 70s and 80s Changes to the family in the 70s The prime minister’s husband Children choosing careers Gay partnerships in 1993
One video (prime minister’s husband) Metadata Subject Person Related resources Description
Conceptual architecture Polopoly HTTP Ontopia MediaDB Grep DB2TM TMSync RDBMS backend MySQL Editors
Implementation Domain model in Java Plain old Java objects built on Ontopia’s Java API tolog JSP for presentation using JSTL on top of the domain model Subversion for the source code Maven2 to build and deploy Unit tests
What we’d like to see The future
The big picture (diagram labels: Engine, Ontopoly, Web service, DB2TM, XML2TM, Taxon. import, Auto-class., Portlet support, OKP, Escenic, Other CMSs, CMS integration, Data integration, plus several "A.N.other" placeholders)
CMS integrations The more of these, the better Candidate CMSs Liferay (being worked on at Bouvet) Alfresco (might be started soon) Magnolia Inspera (possible project here) JSR-170 Java Content Repository CMIS (OASIS web service standard)
Portlet toolkit Subversion contains a number of “portlets” basically, Java objects doing presentation tasks some have JSP wrappers as well Examples display tree view list of topics filterable by facets show related topics get-topic-page via TMRAP component Not ready for prime-time yet undocumented incomplete
Ontopoly plug-ins Plugins for getting more data from externals TMSync import plugin DB2TM plugin Subj3ct.com plugin adapted RDF2TM plugin classify plugin ... Plugins for ontology fragments menu editor, for example
TMCL Now implementable We’d like to see an object model for TMCL (supporting changes) a validator based on the object model Ontopoly import/export from TMCL (initially) refactor Ontopoly API to make it more portable Ontopoly ported to use TMCL natively (eventually)
Things we’d like to remove OSL support Ontopia Schema Language Web editor framework unfortunately, still used by some major customers Fulltext search the old APIs for this are not really of any use
Management interface Import topic maps (to file or RDBMS)
What do you think? Suggestions? Questions? Plans? Ideas?
Setting up the developer environment Getting started
If you are using Ontopia... ...simply download the zip, then unzip, set the classpath, start the server, ... ...and you’re good to go
If you are developing Ontopia...
- You must have
  - Java 1.5 (not 1.6 or 1.7 or ...)
  - Ant 1.6 (or later)
  - Ivy 2.0 (or later)
  - Subversion
- Then check out the source from Subversion

    svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only
    ant bootstrap
    ant dist.jar.ontopia
    ant test
    ant dist.ontopia

Beware
- This is fun, because
  - you can play around with anything you want
  - e.g., my build has a faster TopicIF.getRolesByType
  - you can track changes as they happen in svn
- However, you’re on your own
  - if it fails it’s kind of hard to say why
  - maybe it’s your changes, maybe not
- For production use, official releases are best
Participating etc The project
Our goal To provide the best toolkit for building Topic Maps-based applications We want it to be actively maintained, bug-free, scalable, easy to use, well documented, stable, reliable
Our philosophy We want Ontopia to provide as much useful more-or-less generic functionality as possible New contributions are generally welcome as long as they meet the quality requirements, and they don’t cause problems for others
The sandbox There’s a lot of Ontopia-related code which does not meet those requirements some of it can be very useful, someone may pick it up and improve it The sandbox is for these pieces some are in Ontopia’s Subversion repository, others are maintained externally To be “promoted” into Ontopia a module needs an active maintainer, to be generally useful, and to meet certain quality requirements
Communications Join the mailing list(s)! http://groups.google.com/group/ontopia http://groups.google.com/group/ontopia-dev Google Code page http://code.google.com/p/ontopia/ note the “updates” feed! Blog http://ontopia.wordpress.com Twitter http://twitter.com/ontopia
Committers These are the people who run the project they can actually commit to Subversion they can vote on decisions to be made etc Everyone else can use the software as much as they want, report and comment on issues, discuss on the mailing list, and submit patches for inclusion
How to become a committer Participate in the project! that is, get involved first let people get to know you, show some commitment Once you’ve gotten some way into the project you can ask to become a committer best if you have provided some patches first Unless you’re going to commit changes there’s no need to be a committer
Finding a task to work on Report bugs! they exist. if you find any, please report them. Look at the open issues there is always testing/discussion to be done Look for issues marked “newbie” http://code.google.com/p/ontopia/issues/list?q=label:Newbie Look at what’s in the sandbox most of these modules need work Scratch an itch if there’s something you want fixed/changed/added...
How to fix a bug
- First figure out why you think it fails
- Then write a test case based on your assumption
  - make sure the test case fails (test before you fix)
- Then fix the bug
  - follow the coding guidelines (see wiki)
- Then run the test suite
  - verify that you’ve fixed the bug
  - verify that you haven’t broken anything
- Then submit the patch
The test suite
- Lots of *.test packages in the source tree
  - 3148 test cases as of right now
- Test data in ontopia/src/test-data
  - some tests are generators based on files
  - some of the test files come from cxtm-tests.sf.net
- Run with

    ant test
    java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group

Source tree structure: net.ontopia.

    utils          various utilities
    test           various test support code
    infoset        LocatorIF code + cruft
    persistence    OR-mapper for RDBMS backend
    product        cruft
    xml            various XML-related utilities
    topicmaps      next slides

Source tree structure: net.ontopia.topicmaps.

    core            core engine API
    impl            engine backends + utils
    utils           utilities (see next slide)
    cmdlineutils    command-line tools
    entry           TM repository
    nav + nav2      navigator framework
    query           tolog engine
    viz
    classify
    db2tm
    webed           cruft

Source tree structure: net.ontopia.topicmaps.utils

    *        various utility classes
    ltm      LTM reader and writer
    ctm      CTM reader
    rdf      RDF converter (both ways)
    tmrap    TMRAP implementation
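The test-before-fix workflow above, shown on a deliberately tiny made-up case. The function `last_segment` and its bug are hypothetical and unrelated to Ontopia's code; the point is only the order of work: the test is written first and seen to fail against the old code, then the fix goes in and the test is kept.

```python
import unittest

def last_segment(path):
    """Return the last non-empty segment of a slash-separated path."""
    # The (hypothetical) buggy version was `return path.split("/")[-1]`,
    # which returned "" for paths ending in a slash.
    parts = [part for part in path.split("/") if part]
    return parts[-1] if parts else ""

class LastSegmentTest(unittest.TestCase):
    def test_trailing_slash(self):
        # Written before the fix; it failed against the old code.
        self.assertEqual(last_segment("a/b/"), "b")

    def test_plain_path_still_works(self):
        # Guards against breaking the case that already worked.
        self.assertEqual(last_segment("a/b"), "b")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(LastSegmentTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```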
Let’s write some code!
The engine
- The core API corresponds closely to the TMDM
  - TopicMapIF, TopicIF, TopicNameIF, ...
- Compile with

    ant init compile.ontopia
    ant dist.ontopia.jar # makes a jar

  - .class files go into ontopia/build/classes
The importers
- Main class implements TopicMapReaderIF
  - usually, this lets you set up configuration, etc.
  - then uses other classes to do the real work
- XTM importers
  - use an XML parser
  - main work done in XTM(2)ContentHandler
  - some extra code for validation and format detection
- CTM/LTM importers
  - use Antlr-based parsers
  - real code in ctm.g/ltm.g
- All importers work via the core API
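For readers new to the TMDM, the shape the core API follows can be sketched with a handful of data classes. This is a heavily simplified illustration, not Ontopia's Java interfaces: the Python class and field names are invented, types are plain strings here (in the TMDM they are topics), and scope, variants, reification, and subject identifiers are omitted.

```python
from dataclasses import dataclass, field

@dataclass
class TopicName:
    value: str

@dataclass
class Occurrence:
    type: str
    value: str

@dataclass
class Topic:
    item_identifiers: set = field(default_factory=set)
    names: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)

@dataclass
class AssociationRole:
    type: str
    player: Topic

@dataclass
class Association:
    type: str
    roles: list = field(default_factory=list)

@dataclass
class TopicMap:
    topics: list = field(default_factory=list)
    associations: list = field(default_factory=list)

    def roles_played_by(self, topic):
        """All roles, across all associations, that the given topic plays."""
        return [role for assoc in self.associations
                for role in assoc.roles if role.player is topic]

# Invented example: an opera connected to its composer.
verdi = Topic({"#verdi"}, [TopicName("Giuseppe Verdi")])
otello = Topic({"#otello"}, [TopicName("Otello")])
composed_by = Association("composed-by", [
    AssociationRole("composer", verdi),
    AssociationRole("opera", otello),
])
tm = TopicMap([verdi, otello], [composed_by])
verdi_roles = tm.roles_played_by(verdi)
```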
Fixing a real bug There is a failing test case in the TM/XML importer So let’s fix that right now...
- Find an issue in the issue tracker
  - (picking one with “Newbie” might be good, but isn’t necessary)
- Get set up
  - check out the source code
  - build the code
  - run the test suite
- Then dig in
  - we’ll help you with any questions you have
- At the end, submit a patch to the issue tracker
  - remember to use the test suite!

ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptx
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17
 
Indexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdfIndexing Structures in Database Management system.pdf
Indexing Structures in Database Management system.pdf
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 

Ontopia Code Camp Agenda and Overview

  • 1. Ontopia Code Camp TMRA 2009-11-11 Lars Marius Garshol & Geir Ove Grønmo
  • 2. Agenda About you who are you? what do you want from the code camp? About Ontopia The product The future Participating in the project Writing some code!
  • 4. Brief history 1999-2000 private hobby project for Geir Ove 2000-2009 commercial software sold by Ontopia AS lots of international customers in diverse fields 2009- open source project
  • 5. The project Open source hosted at Google Code Contributors Lars Marius Garshol, Bouvet Geir Ove Grønmo, Bouvet Thomas Neidhart, SpaceApps Lars Heuer, Semagia Hannes Niederhausen, TMLab Stig Lau, Bouvet Baard H. Rehn-Johansen, Bouvet Peter-Paul Kruijssen, Morpheus Quintin Siebers, Morpheus
  • 6. Current activity (toward 5.1) tolog updates added by LMG Various fixes and optimizations by everyone Toma implementation (in sandbox) by Thomas TMQL implementation (in sandbox)? by Sven Krosse
  • 8. The big picture Auto-class. A.N.other A.N.other Other CMSs A.N.other A.N.other DB2TM Portlet support OKP XML2TM Engine CMSintegration Data integration Escenic Taxon.import Ontopoly Web service
  • 9. The engine Core API TMAPI 2.0 support Import/export RDF conversion TMSync Fulltext search Event API tolog query language tolog update language Engine
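  • 10. Query Engine Implementation of Ontopia’s tolog language (based on Prolog and SQL) Allows powerful queries on the topic map data structure Simplifies application development and improves performance Example: select $B, count($A) from instance-of($B, city), { premiere($A : opera, $B : place) | premiere($A : opera, $C : place), located-in($C : containee, $B : container) } order by $A desc?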
  • 11. How TMSync works Define which part of the target topic map you want, Define which part of the source topic map it is the master for, and The algorithm does the rest
  • 12. If the source is not a topic map TMSync convert.xslt Simply do a normal one-time conversion let TMSync do the update for you In other words, TMSync reduces the update problem to a conversion problem source.xml
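The TMSync idea on the two slides above can be sketched as a set comparison. This is only an illustration under stated assumptions, not Ontopia's TMSync code: statements are modelled as plain strings, and the class `TmSyncSketch`, the method `sync`, and the `managed` predicate are hypothetical names. The predicate stands for "the part of the target topic map the source is master for".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

public class TmSyncSketch {

    /**
     * Brings the managed part of 'target' in line with 'source'.
     * Statements outside the managed part are never touched.
     * Returns a log of the changes that were applied.
     */
    static List<String> sync(Set<String> source, Set<String> target,
                             Predicate<String> managed) {
        List<String> changes = new ArrayList<>();

        // 1. Remove managed target statements that the source no longer has.
        //    Iterate over a sorted copy so we can modify 'target' safely.
        for (String statement : new TreeSet<>(target)) {
            if (managed.test(statement) && !source.contains(statement)) {
                target.remove(statement);
                changes.add("- " + statement);
            }
        }

        // 2. Add source statements that the target is missing.
        for (String statement : new TreeSet<>(source)) {
            if (target.add(statement)) {
                changes.add("+ " + statement);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Hypothetical data: the source is master for everything under "org:".
        Set<String> source = new TreeSet<>(List.of(
            "org:bouvet name Bouvet", "org:ontopia name Ontopia"));
        Set<String> target = new TreeSet<>(List.of(
            "org:bouvet name Bouvet ASA", "person:lmg name Lars Marius"));

        sync(source, target, s -> s.startsWith("org:"))
            .forEach(System.out::println);
        System.out.println(target);
    }
}
```

In this toy model the `person:` statement survives because it falls outside the managed part, which is the point of declaring what the source is master for.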
  • 13. The City of Bergen use case Norge.no Service Unit Person LOS City of Bergen LOS
  • 14. The backends
      • In-memory: no persistent storage, thread-safe, no setup
      • RDBMS: transactions, persistent, thread-safe, uses caching, clustering
      • Remote: uses web service, read-only, unofficial
      • Diagram labels: Engine, Memory, RDBMS, Remote
  • 15. RDBMS Backend
      • Allows the Engine to use topic maps stored in a relational database
      • Based on a generic topic map schema
      • Necessary when working with very large topic maps
      • Transparent to applications
      • Features
          • Automatically loads data when needed
          • Caches frequently used data
          • Full support for RDBMS transactions
          • Supports tolog-to-SQL compilation
          • Statistical reports for performance tuning
      • Platform support
          • Oracle, MySQL, PostgreSQL, MS SQL Server
          • Test suite available for verifying compatibility with other JDBC-enabled RDBMSes
  • 16. DB2TM
      • Upconversion to TMs: from RDBMS via JDBC, or from CSV
      • Uses XML mapping; can call out to Java
      • Supports sync: either full rescan or change table
      • Diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote
  • 17. DB2TM example
      Figure: sample organizations (Ontopia, United Nations, Bouvet) + the mapping below = topics
      <relation name="organizations.csv" columns="id name url">
        <topic type="ex:organization">
          <item-identifier>#org${id}</item-identifier>
          <topic-name>${name}</topic-name>
          <occurrence type="ex:homepage">${url}</occurrence>
        </topic>
      </relation>
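To make the `${column}` placeholders in the mapping above concrete, here is a minimal sketch of how one CSV row could be expanded against such templates. It is not the DB2TM implementation: `Db2TmSketch`, `toRow`, and `expand` are hypothetical names, the CSV parsing is deliberately naive (no quoted fields), and the sample row and URL are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Db2TmSketch {
    // Matches ${name} placeholders as used in the mapping example.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Pairs the space-separated column names with the values of one CSV line. */
    static Map<String, String> toRow(String columns, String csvLine) {
        String[] names = columns.trim().split("\\s+");
        String[] values = csvLine.split(",", -1); // naive: no quoting support
        if (names.length != values.length) {
            throw new IllegalArgumentException("Expected " + names.length
                + " values but got " + values.length);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], values[i].trim());
        }
        return row;
    }

    /** Replaces every ${column} in the template with the row's value. */
    static String expand(String template, Map<String, String> row) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unknown column: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical row of organizations.csv.
        Map<String, String> row =
            toRow("id name url", "1,Ontopia,http://example.org/ontopia");
        System.out.println("item identifier: " + expand("#org${id}", row));
        System.out.println("topic name:      " + expand("${name}", row));
        System.out.println("homepage:        " + expand("${url}", row));
    }
}
```

Each row of the relation would yield one topic of type `ex:organization`, with the three expanded values as its item identifier, name, and homepage occurrence.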
  • 18. TMRAP
      • Web service interface: via SOAP, via plain HTTP
      • Requests: get-topic, get-topic-page, get-tolog, delete-topic, ...
  • 19. Navigator framework
      • Servlet-based
      • API: manage topic maps (load/scan/delete/create)
      • JSP tag library: XSLT-like, based on tolog, JSTL integration
  • 20. Ontopia Navigator Framework Java API for interacting with TM repository JSP tag library based on tolog kind of like XSLT in JSP with tolog instead of XPath has JSTL integration Undocumented parts web presentation components some wrapped as JSP tags want to build proper portlets from them
  • 22. Navigator tag library example
      <%-- assume variable 'composer' is already set --%>
      <p><b>Operas:</b><br/>
      <tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera), { premiere-date($OPERA, $DATE) }?">
        <li>
          <a href="opera.jsp?id=<tolog:id var="OPERA"/>"><tolog:out var="OPERA"/></a>
          <tolog:if var="DATE">
            <tolog:out var="DATE"/>
          </tolog:if>
        </li>
      </tolog:foreach></p>
  • 27. Automated classification
      • Undocumented, experimental
      • Extracts text: autodetects format (Word, PDF, XML, HTML)
      • Processes text: detects language; stemming, stop-words
      • Extracts keywords: ranked by importance, uses existing topics, supports compound terms
  • 28. Example of keyword extraction
      topic maps               1.0
      metadata                 0.57
      subject-based class.     0.42
      Core metadata            0.42
      faceted classification   0.34
      taxonomy                 0.22
      monolingual thesauri     0.19
      controlled vocabulary    0.19
      Dublin Core              0.16
      thesauri                 0.16
      Dublin                   0.15
      keywords                 0.15
  • 29. Example #2
      Automated classification   1.0    5
      Topic Maps                 0.51   14
      XSLT                       0.38   11
      compound keywords          0.29   2
      keywords                   0.26   20
      Lars                       0.23   1
      Marius                     0.23   1
      Garshol                    0.22   1
      ...
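The ranked lists above always give the top term a score of 1.0. A rough sketch of that kind of ranking is shown below. It assumes plain term frequency with a tiny stop-word list, normalized against the most frequent term. The slides do not describe the classifier's actual scoring, which also involves stemming, compound terms, and existing topics, so `KeywordSketch` and `rank` are purely illustrative names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class KeywordSketch {
    // Tiny illustrative stop-word list (assumption); real lists are per language.
    static final Set<String> STOP_WORDS =
        Set.of("a", "an", "and", "for", "in", "is", "of", "the", "to");

    /** Returns terms ordered by descending score; the top term scores 1.0. */
    static LinkedHashMap<String, Double> rank(String text) {
        // Count every non-stop-word token; TreeMap keeps terms alphabetical.
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }

        // Find the highest count so scores can be normalized to the range (0, 1].
        int max = 1;
        for (int count : counts.values()) {
            max = Math.max(max, count);
        }

        // Stable sort by count, so ties stay in alphabetical order.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> Integer.compare(y.getValue(), x.getValue()));

        LinkedHashMap<String, Double> ranked = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ranked.put(e.getKey(), e.getValue() / (double) max);
        }
        return ranked;
    }

    public static void main(String[] args) {
        rank("Topic maps and metadata: topic maps use metadata. Topic maps!")
            .forEach((term, score) -> System.out.printf("%-10s %.2f%n", term, score));
    }
}
```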
  • 30. So how could this be used? To help users classify new documents in a CMS interface suggest appropriate keywords, screened by user before approval Automate classification of incoming documents this means lower quality, but also lower cost Get an overview of interesting terms in a document corpus classify all documents, extract the most interesting terms this can be used as the starting point for building an ontology (keyword extraction only)
  • 31. Example user interface The user creates an article this screen then used to add keywords user adjusts the proposals from the classifier
  • 32. Vizigator
      • Graphical visualization
      • VizDesktop: Swing app to configure filter/style/...
      • Vizlet: Java applet for web; uses configuration; loads via TMRAP; uses “Remote” backend
  • 33. The Vizigator Graphical visualization of Topic Maps Two parts VizDesktop: Swing desktop app for configuration Vizlet: Java applet for web deployment Configuration stored in XTM file
  • 36. The Vizigator The Vizigator uses TMRAP the Vizlet runs in the browser (on the client) a fragment of the topic map is downloaded from the server the fragment is grown as needed Server TMRAP
  • 37. Ontopoly
      • Generic editor: web-based, AJAX; meta-ontology in TM
      • Ontology designer: create types and fields, control user interface, build views, incremental dev
      • Instance editor: guided by ontology
  • 38. Ontopoly A generic Topic Maps editor, in two parts ontology editor: used to create the ontology and schema instance editor: used to enter instances based on ontology Built with the Web Editor Framework works with both XTM files and topic maps stored in RDBMS backend supports access control to administrative functions, ontology, and instance editors existing topic maps can be imported parts of the ontology can be marked as read-only, or hidden
  • 40. Typical deployment Viewing application Engine Users DB Backend Ontopoly Frameworks Editors DB TMRAP DB2TM HTTP DB External application Application server
  • 41. CMS integration The best way to add content functionality to Ontopia the world doesn’t need another CMS better to reuse those which already exist So far two integrations exist Escenic OfficeNet Knowledge Portal more are being worked on
  • 42. Implementation
      • A CMS event listener
          • the listener creates topics for new CMS articles, folders, etc.
          • the mapping is basically the design of the ontology used by this listener
      • Presentation integration
          • it must be possible to list all topics attached to an article
          • conversely, it must be possible to list all articles attached to a topic
          • how close the integration needs to be here will vary, as will the difficulty of the integration
      • User interface integration
          • it needs to be possible to attach topics to an article from within the normal CMS user interface
          • this can be quite tricky
      • Search integration
          • the Topic Maps search needs to also search content in the CMS
          • can be achieved by writing a tolog plug-in
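The event-listener part of the slide above might look roughly like the sketch below. Everything in it is hypothetical: `CmsListener` stands in for whatever callback interface a given CMS offers, `Topic` is a two-field stand-in for a real topic, and the types `ex:article` and `ex:folder` are an assumed ontology. The point is only that the listener's mapping from CMS object kinds to topic types is where the ontology design shows up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CmsListenerSketch {

    /** Hypothetical CMS callback interface; real CMSs define their own. */
    interface CmsListener {
        void objectCreated(String objectId, String kind, String title);
    }

    /** Minimal stand-in for a topic: just a type and a name. */
    static final class Topic {
        final String type;
        final String name;

        Topic(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    /** Creates one topic per new article or folder and ignores everything else. */
    static final class TopicCreatingListener implements CmsListener {
        // Stand-in for the topic map, keyed by the CMS object id.
        final Map<String, Topic> topicsByCmsId = new LinkedHashMap<>();

        @Override
        public void objectCreated(String objectId, String kind, String title) {
            String topicType;
            if ("article".equals(kind)) {
                topicType = "ex:article";   // assumed ontology type
            } else if ("folder".equals(kind)) {
                topicType = "ex:folder";    // assumed ontology type
            } else {
                return;                     // this mapping skips other kinds
            }
            topicsByCmsId.put(objectId, new Topic(topicType, title));
        }
    }

    public static void main(String[] args) {
        TopicCreatingListener listener = new TopicCreatingListener();
        listener.objectCreated("42", "article", "New city council appointed");
        listener.objectCreated("7", "image", "logo.png");
        listener.topicsByCmsId.forEach((id, topic) ->
            System.out.println(id + " -> " + topic.type + " \"" + topic.name + "\""));
    }
}
```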
  • 43. Articles as topics is about Elections New city council appointed Goal: associate articles with topics mainly to say what they are about typically also want to include other metadata Need to create topics for the articles to do this in fact, a general CMS-to-TM mapping is needed must decide what metadata and structures to include
  • 44. Mapping issues Article topics what topic type to use? title becomes name? (do you know the title?) include author? include last modified? include workflow state? should all articles be mapped? Folders/directories/sections/... should these be mapped, too? one topic type for all folders/.../.../...? if so, use associations to connect articles to folders use associations to reproduce hierarchical folder structure Multimedia objects should these be included? what topic type? what name? ...
  • 45. Two styles of mappings
      • Articles as articles
          • Topic represents only the article
          • Topic type is some subclass of “article”
          • “Is about” association connects article into topic map
          • Fields are presentational: title, abstract, body
      • Articles as concepts
          • Topic represents some real-world subject (like a person); article is just the default content about that subject
          • Type is the type of the subject (person)
          • Semantic associations to the rest of the topic map: works in department, has competence, ...
          • Fields can be semantic: name, phone no, email, ...
  • 46. Article as article Article about building of a new school Is about association to “Primary schools” Topic type is “article”
  • 48. events in the location
  • 55. The project A new citizen’s portal for the city administration strategic decision to make portal main interface for interaction with citizens as many services as possible are to be moved online Big project started in late 2004, to continue at least into 2008 ~5 million Euro spent by launch date 1.7 million Euro budgeted for 2007 Topic Maps development is a fraction of this (less than 25%) Many companies involved Bouvet/Ontopia Avenir KPMG Karabin Escenic
  • 56. Simplified original ontology Service catalog Escenic (CMS) LOS Form Article nearly everything Category Service Subject Department Borough External resource Employee Payroll++
  • 57. Data flow Ontopoly Ontopia Escenic LOS Integration TMSync DB2TM Fellesdata Payroll (Agresso) Dexter/Extens Service Catalog
  • 58. Conceptual architecture Data sources Oracle Portal Application Ontopia Escenic Oracle Database
  • 61. NRK/Skole Norwegian National Broadcasting (NRK) media resources from the archives published for use in schools integrated with the National Curriculum In production delayed by copyright wrangling Technologies OKS Polopoly CMS MySQL database Resin application server
  • 62. Curriculum-based browsing (1) Curriculum Social studies High school
  • 64. Curriculum-based browsing (3) Feminist movement in the 70s and 80s Changes to the family in the 70s The prime minister’s husband Children choosing careers Gay partnerships in 1993
  • 65. One video (prime minister’s husband) Metadata Subject Person Related resources Description
  • 66. Conceptual architecture Polopoly HTTP Ontopia MediaDB Grep DB2TM TMSync RDBMS backend MySQL Editors
  • 67. Implementation Domain model in Java Plain old Java objects built on Ontopia’s Java API tolog JSP for presentation using JSTL on top of the domain model Subversion for the source code Maven2 to build and deploy Unit tests
  • 68. What we’d like to see The future
  • 69. The big picture Auto-class. A.N.other A.N.other Other CMSs A.N.other A.N.other DB2TM Portlet support OKP XML2TM Engine CMSintegration Data integration Escenic Taxon.import Ontopoly Web service
  • 70. CMS integrations The more of these, the better Candidate CMSs Liferay (being worked on at Bouvet) Alfresco (might be started soon) Magnolia Inspera (possible project here) JSR-170 Java Content Repository CMIS (OASIS web service standard)
  • 71. Portlet toolkit Subversion contains a number of “portlets” basically, Java objects doing presentation tasks some have JSP wrappers as well Examples display tree view list of topics filterable by facets show related topics get-topic-page via TMRAP component Not ready for prime-time yet undocumented incomplete
  • 72. Ontopoly plug-ins Plugins for getting more data from externals TMSync import plugin DB2TM plugin Subj3ct.com plugin adapted RDF2TM plugin classify plugin ... Plugins for ontology fragments menu editor, for example
  • 73. TMCL Now implementable We’d like to see an object model for TMCL (supporting changes) a validator based on the object model Ontopoly import/export from TMCL (initially) refactor Ontopoly API to make it more portable Ontopoly ported to use TMCL natively (eventually)
  • 74. Things we’d like to remove OSL support Ontopia Schema Language Web editor framework unfortunately, still used by some major customers Fulltext search the old APIs for this are not really of any use
  • 75. Management interface Import topic maps (to file or RDBMS)
  • 76. What do you think? Suggestions? Questions? Plans? Ideas?
  • 77. Setting up the developer environment Getting started
  • 78. If you are using Ontopia... ...simply download the zip, then unzip, set the classpath, start the server, ... ...and you’re good to go
  • 79. If you are developing Ontopia...
      • You must have
          • Java 1.5 (not 1.6 or 1.7 or ...)
          • Ant 1.6 (or later)
          • Ivy 2.0 (or later)
          • Subversion
      • Then check out the source from Subversion
          svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only
          ant bootstrap
          ant dist.jar.ontopia
          ant test
          ant dist.ontopia
  • 80. Beware
      • This is fun, because
          • you can play around with anything you want (e.g., my build has a faster TopicIF.getRolesByType)
          • you can track changes as they happen in svn
      • However, you’re on your own if it fails
          • it’s kind of hard to say why
          • maybe it’s your changes, maybe not
      • For production use, official releases are best
  • 82. Our goal To provide the best toolkit for building Topic Maps-based applications We want it to be actively maintained, bug-free, scalable, easy to use, well documented, stable, reliable
  • 83. Our philosophy We want Ontopia to provide as much useful more-or-less generic functionality as possible New contributions are generally welcome as long as they meet the quality requirements, and they don’t cause problems for others
  • 84. The sandbox There’s a lot of Ontopia-related code which does not meet those requirements some of it can be very useful, someone may pick it up and improve it The sandbox is for these pieces some are in Ontopia’s Subversion repository, others are maintained externally To be “promoted” into Ontopia a module needs an active maintainer, to be generally useful, and to meet certain quality requirements
  • 85. Communications Join the mailing list(s)! http://groups.google.com/group/ontopia http://groups.google.com/group/ontopia-dev Google Code page http://code.google.com/p/ontopia/ note the “updates” feed! Blog http://ontopia.wordpress.com Twitter http://twitter.com/ontopia
  • 86. Committers These are the people who run the project they can actually commit to Subversion they can vote on decisions to be made etc Everyone else can use the software as much as they want, report and comment on issues, discuss on the mailing list, and submit patches for inclusion
  • 87. How to become a committer Participate in the project! that is, get involved first let people get to know you, show some commitment Once you’ve gotten some way into the project you can ask to become a committer best if you have provided some patches first Unless you’re going to commit changes there’s no need to be a committer
  • 88. Finding a task to work on Report bugs! they exist. if you find any, please report them. Look at the open issues there is always testing/discussion to be done Look for issues marked “newbie” http://code.google.com/p/ontopia/issues/list?q=label:Newbie Look at what’s in the sandbox most of these modules need work Scratch an itch if there’s something you want fixed/changed/added...
  • 89. How to fix a bug
      • First figure out why you think it fails
      • Then write a test case based on your assumption
          • make sure the test case fails (test before you fix)
      • Then fix the bug
          • follow the coding guidelines (see wiki)
      • Then run the test suite
          • verify that you’ve fixed the bug
          • verify that you haven’t broken anything
      • Then submit the patch
  • 90. The test suite
      • Lots of *.test packages in the source tree
          • 3148 test cases as of right now
      • test data in ontopia/src/test-data
          • some tests are generators based on files
          • some of the test files come from cxtm-tests.sf.net
      • Run with
          ant test
          java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group
  • 91. Source tree structure: net.ontopia.
      utils          various utilities
      test           various test support code
      infoset        LocatorIF code + cruft
      persistence    OR-mapper for RDBMS backend
      product        cruft
      xml            various XML-related utilities
      topicmaps      next slides
  • 92. Source tree structure: net.ontopia.topicmaps.
      core           core engine API
      impl           engine backends + utils
      utils          utilities (see next slide)
      cmdlineutils   command-line tools
      entry          TM repository
      nav + nav2     navigator framework
      query          tolog engine
      viz
      classify
      db2tm
      webed          cruft
  • 93. Source tree structure: net.ontopia.topicmaps.utils
      *              various utility classes
      ltm            LTM reader and writer
      ctm            CTM reader
      rdf            RDF converter (both ways)
      tmrap          TMRAP implementation
  • 95. The engine
      • The core API corresponds closely to the TMDM
          • TopicMapIF, TopicIF, TopicNameIF, ...
      • Compile with
          ant init compile.ontopia
          (.class files go into ontopia/build/classes)
          ant dist.ontopia.jar # makes a jar
  • 96. The importers
      • Main class implements TopicMapReaderIF
          • usually, this lets you set up configuration, etc.
          • then uses other classes to do the real work
      • XTM importers use an XML parser
          • main work done in XTM(2)ContentHandler
          • some extra code for validation and format detection
      • CTM/LTM importers use Antlr-based parsers
          • real code in ctm.g/ltm.g
      • All importers work via the core API
  • 97. Fixing a real bug There is a failing test case in the TM/XML importer So let’s fix that right now...
  • 98. Find an issue in the issue tracker (Picking one with “Newbie” might be good, but isn’t necessary) Get set up check out the source code build the code run the test suite Then dig in we’ll help you with any questions you have At the end, submit a patch to the issue tracker remember to use the test suite!