Ontopia Code Camp Agenda and Overview
1. Ontopia Code Camp TMRA 2009-11-11 Lars Marius Garshol & Geir Ove Grønmo
2. Agenda About you who are you? what do you want from the code camp? About Ontopia The product The future Participating in the project Writing some code!
4. Brief history: 1999-2000, private hobby project for Geir Ove; 2000-2009, commercial software sold by Ontopia AS, with lots of international customers in diverse fields; 2009-, open source project.
5. The project: open source, hosted at Google Code. Contributors: Lars Marius Garshol (Bouvet), Geir Ove Grønmo (Bouvet), Thomas Neidhart (SpaceApps), Lars Heuer (Semagia), Hannes Niederhausen (TMLab), Stig Lau (Bouvet), Baard H. Rehn-Johansen (Bouvet), Peter-Paul Kruijssen (Morpheus), Quintin Siebers (Morpheus).
6. Current activity (toward 5.1): tolog updates, added by LMG; various fixes and optimizations, by everyone; Toma implementation (in sandbox), by Thomas; TMQL implementation (in sandbox)?, by Sven Krosse.
8. The big picture Auto-class. A.N.other A.N.other Other CMSs A.N.other A.N.other DB2TM Portlet support OKP XML2TM Engine CMSintegration Data integration Escenic Taxon.import Ontopoly Web service
9. The engine: core API, TMAPI 2.0 support, import/export, RDF conversion, TMSync, fulltext search, event API, tolog query language, tolog update language.
11. How TMSync works: (1) define which part of the target topic map you want, (2) define which part of the source topic map it is the master for, and (3) the algorithm does the rest.
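The three steps above can be sketched with a toy model. This is only an illustration under loud assumptions: topic map parts are reduced to sets of string identifiers selected by a filter, and the names `TmSyncSketch` and `sync` are made up for this example. It is not Ontopia's TMSync API, which works on topics and their characteristics.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Toy model of one-way synchronization. This is NOT Ontopia's TMSync API. */
final class TmSyncSketch {

    /**
     * Makes the selected part of the target equal to the selected part of the
     * source. Target items outside targetPart are left untouched.
     */
    static void sync(Set<String> source, Predicate<String> sourcePart,
                     Set<String> target, Predicate<String> targetPart) {
        // Steps 1 and 2: work out which items the source is the master for.
        Set<String> wanted = new LinkedHashSet<>();
        for (String item : source) {
            if (sourcePart.test(item)) {
                wanted.add(item);
            }
        }
        // Step 3a: drop in-scope target items that the source no longer has.
        target.removeIf(item -> targetPart.test(item) && !wanted.contains(item));
        // Step 3b: add the source items that the target is missing.
        target.addAll(wanted);
    }

    public static void main(String[] args) {
        // Hypothetical identifiers, invented for the example.
        Set<String> source = new LinkedHashSet<>(Arrays.asList(
                "service:kindergarten", "service:parking", "person:ola"));
        Set<String> target = new LinkedHashSet<>(Arrays.asList(
                "service:kindergarten", "service:old-permit", "article:42"));
        Predicate<String> services = id -> id.startsWith("service:");

        sync(source, services, target, services);
        System.out.println(target);
    }
}
```

Running `main` prints `[service:kindergarten, article:42, service:parking]`: the stale service is removed, the new one is added, and the article, which the source is not master for, survives.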
12. If the source is not a topic map: simply do a normal one-time conversion (in the diagram, source.xml goes through convert.xslt), then let TMSync do the update for you. In other words, TMSync reduces the update problem to a conversion problem.
13. The City of Bergen usecase Norge.no Service Unit Person LOS City of Bergen LOS
14. The backends. In-memory: no persistent storage, thread-safe, no setup. RDBMS: transactions, persistent, thread-safe, uses caching, clustering. Remote: uses web service, read-only, unofficial. (Diagram labels: Engine, Memory, RDBMS, Remote.)
15. RDBMS backend. Allows the Engine to use topic maps stored in a relational database; based on a generic topic map schema; necessary when working with very large topic maps; transparent to applications. Features: automatically loads data when needed; caches frequently used data; full support for RDBMS transactions; supports tolog-to-SQL compilation; statistical reports for performance tuning. Platform support: Oracle, MySQL, PostgreSQL, MS SQL Server; a test suite is available for verifying compatibility with other JDBC-enabled RDBMSes.
16. DB2TM: upconversion to Topic Maps from an RDBMS via JDBC, or from CSV. Uses an XML mapping, which can call out to Java. Supports sync, either by full rescan or via a change table. (Architecture diagram labels: TMRAP, Nav, DB2TM, Classify, Engine, Memory, RDBMS, Remote.)
17. DB2TM example (figure: a table listing Ontopia, United Nations, and Bouvet, plus the mapping below, gives topics)
<relation name="organizations.csv" columns="id name url">
  <topic type="ex:organization">
    <item-identifier>#org${id}</item-identifier>
    <topic-name>${name}</topic-name>
    <occurrence type="ex:homepage">${url}</occurrence>
  </topic>
</relation>
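To make the `${column}` placeholders in the mapping above concrete, here is a small sketch of how one CSV row could be substituted into the three templates. It is an assumption-laden illustration, not DB2TM's implementation: the class `Db2tmTemplateSketch`, its methods, the naive comma splitting, and the example row (including the example.org URL) are all invented for this purpose.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrates ${column} substitution only. This is NOT how DB2TM is implemented. */
final class Db2tmTemplateSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Pairs the space-separated column names with the values of one CSV line. */
    static Map<String, String> toRow(String columns, String csvLine) {
        String[] names = columns.trim().split("\\s+");
        // Naive split: quoted fields containing commas are not handled here.
        String[] values = csvLine.split(",", -1);
        if (names.length != values.length) {
            throw new IllegalArgumentException("Expected " + names.length + " values");
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], values[i].trim());
        }
        return row;
    }

    /** Replaces every ${name} in the template with the row's value for that column. */
    static String expand(String template, Map<String, String> row) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No column named " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical row; the slide does not show the actual CSV contents.
        Map<String, String> row = toRow("id name url", "1,Ontopia,http://example.org/ontopia");
        System.out.println(expand("#org${id}", row)); // item identifier
        System.out.println(expand("${name}", row));   // topic name
        System.out.println(expand("${url}", row));    // homepage occurrence
    }
}
```

For the hypothetical row, the three templates expand to `#org1`, `Ontopia`, and `http://example.org/ontopia`, which is the data one organization topic would be built from.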
18. TMRAP: web service interface, via SOAP or via plain HTTP. Requests: get-topic, get-topic-page, get-tolog, delete-topic, ...
19. Navigator framework. Servlet-based API: manage topic maps (load/scan/delete/create). JSP tag library: XSLT-like, based on tolog, with JSTL integration.
20. Ontopia Navigator Framework. Java API for interacting with the TM repository. JSP tag library based on tolog: kind of like XSLT in JSP, with tolog instead of XPath; has JSTL integration. Undocumented parts: web presentation components, some wrapped as JSP tags; we want to build proper portlets from them.
27. Automated classification (undocumented, experimental). Extracts text: autodetects format (Word, PDF, XML, HTML). Processes text: detects language; stemming, stop-words. Extracts keywords: ranked by importance; uses existing topics; supports compound terms.
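The "extract keywords, ranked by importance" step can be approximated with a plain term-frequency count after stop-word removal. This is a deliberately simplified sketch, not the classifier's algorithm: there is no format detection, language detection, stemming, compound-term support, or use of existing topics, and the class name, stop-word list, and ranking rule are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Term-frequency keyword ranking. A toy stand-in, NOT Ontopia's classifier. */
final class KeywordSketch {
    // Tiny illustrative English stop-word list.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "the", "a", "an", "of", "and", "in", "to", "is", "for", "on"));

    /** Returns up to limit terms, most frequent first, ties broken alphabetically. */
    static List<String> topKeywords(String text, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on anything that is not a letter; digits and punctuation separate terms.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String article = "The school board approved the new school in Bergen. "
                + "The school opens in 2010.";
        System.out.println(topKeywords(article, 2));
    }
}
```

For the sample sentence this prints `[school, approved]`: "school" occurs three times, and the remaining terms tie at one occurrence, so the alphabetically first one wins.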
30. So how could this be used? To help users classify new documents in a CMS interface suggest appropriate keywords, screened by user before approval Automate classification of incoming documents this means lower quality, but also lower cost Get an overview of interesting terms in a document corpus classify all documents, extract the most interesting terms this can be used as the starting point for building an ontology (keyword extraction only)
31. Example user interface The user creates an article this screen then used to add keywords user adjusts the proposals from the classifier
32. Vizigator: graphical visualization. VizDesktop: Swing app to configure filter/style/... Vizlet: Java applet for web; uses configuration; loads via TMRAP; uses “Remote” backend.
33. The Vizigator Graphical visualization of Topic Maps Two parts VizDesktop: Swing desktop app for configuration Vizlet: Java applet for web deployment Configuration stored in XTM file
36. The Vizigator The Vizigator uses TMRAP the Vizlet runs in the browser (on the client) a fragment of the topic map is downloaded from the server the fragment is grown as needed Server TMRAP
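The "download a fragment, grow it as needed" idea can be sketched as a client-side cache that only asks the server about a topic the first time that topic is expanded. Everything here is hypothetical: the `FragmentSketch` class, the map standing in for the server, and the topic names are assumptions for illustration, and no real TMRAP request is made.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Lazily grown graph fragment. A toy model, NOT the Vizlet or TMRAP code. */
final class FragmentSketch {
    private final Map<String, List<String>> server;                       // stands in for the server
    private final Map<String, List<String>> loaded = new LinkedHashMap<>(); // client-side fragment
    private int requests = 0;

    FragmentSketch(Map<String, List<String>> server) {
        this.server = server;
    }

    /** Returns the neighbours of a topic, fetching them only on first use. */
    List<String> expand(String topic) {
        List<String> cached = loaded.get(topic);
        if (cached != null) {
            return cached; // already part of the local fragment
        }
        requests++; // one simulated round trip to the server
        List<String> neighbours = server.getOrDefault(topic, Collections.<String>emptyList());
        loaded.put(topic, neighbours);
        return neighbours;
    }

    Set<String> loadedTopics() {
        return loaded.keySet();
    }

    int requestCount() {
        return requests;
    }

    public static void main(String[] args) {
        Map<String, List<String>> server = new LinkedHashMap<>();
        server.put("bergen", Arrays.asList("los", "service-unit"));
        server.put("los", Arrays.asList("norge.no"));

        FragmentSketch fragment = new FragmentSketch(server);
        fragment.expand("bergen");
        fragment.expand("los");
        fragment.expand("bergen"); // served from the local fragment
        System.out.println(fragment.loadedTopics() + " after " + fragment.requestCount() + " requests");
    }
}
```

Here the third call does not count as a request, so the program reports two requests for two loaded topics.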
37. Ontopoly: generic editor; web-based, AJAX; meta-ontology in TM. Ontology designer: create types and fields; control user interface; build views; incremental dev. Instance editor: guided by ontology.
38. Ontopoly A generic Topic Maps editor, in two parts ontology editor: used to create the ontology and schema instance editor: used to enter instances based on ontology Built with the Web Editor Framework works with both XTM files and topic maps stored in RDBMS backend supports access control to administrative functions, ontology, and instance editors existing topic maps can be imported parts of the ontology can be marked as read-only, or hidden
40. Typical deployment Viewing application Engine Users DB Backend Ontopoly Frameworks Editors DB TMRAP DB2TM HTTP DB External application Application server
41. CMS integration The best way to add content functionality to Ontopia the world doesn’t need another CMS better to reuse those which already exist So far two integrations exist Escenic OfficeNet Knowledge Portal more are being worked on
42. Implementation. A CMS event listener: the listener creates topics for new CMS articles, folders, etc.; the mapping is basically the design of the ontology used by this listener. Presentation integration: it must be possible to list all topics attached to an article and, conversely, all articles attached to a topic; how close the integration needs to be here will vary, as will the difficulty of the integration. User interface integration: it needs to be possible to attach topics to an article from within the normal CMS user interface; this can be quite tricky. Search integration: the Topic Maps search needs to also search content in the CMS; this can be achieved by writing a tolog plug-in.
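The first two requirements above (a listener that creates topics for new articles, and lookups in both directions) can be sketched in a few lines. This is a hypothetical in-memory stand-in: the class `CmsListenerSketch`, its callback `onArticleCreated`, and the plain maps replacing the topic map are all assumptions; a real integration would use the CMS's own event API and Ontopia's engine.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for a CMS listener. NOT a real CMS or Ontopia API. */
final class CmsListenerSketch {
    private final Map<String, String> articleNames = new LinkedHashMap<>();
    private final Map<String, Set<String>> subjectsByArticle = new LinkedHashMap<>();
    private final Map<String, Set<String>> articlesBySubject = new LinkedHashMap<>();

    /** Hypothetical CMS callback: creates the "topic" for a new article. */
    void onArticleCreated(String articleId, String title) {
        articleNames.put(articleId, title); // the title becomes the name
        subjectsByArticle.putIfAbsent(articleId, new LinkedHashSet<>());
    }

    /** Records an "is about" link and keeps both lookup directions in step. */
    void attach(String articleId, String subject) {
        if (!articleNames.containsKey(articleId)) {
            throw new IllegalArgumentException("Unknown article: " + articleId);
        }
        subjectsByArticle.get(articleId).add(subject);
        articlesBySubject.computeIfAbsent(subject, k -> new LinkedHashSet<>()).add(articleId);
    }

    /** All subjects attached to an article. */
    Set<String> subjectsOf(String articleId) {
        return Collections.unmodifiableSet(
                subjectsByArticle.getOrDefault(articleId, Collections.<String>emptySet()));
    }

    /** Conversely, all articles attached to a subject. */
    Set<String> articlesAbout(String subject) {
        return Collections.unmodifiableSet(
                articlesBySubject.getOrDefault(subject, Collections.<String>emptySet()));
    }

    public static void main(String[] args) {
        CmsListenerSketch listener = new CmsListenerSketch();
        listener.onArticleCreated("a1", "New city council appointed");
        listener.attach("a1", "Elections");
        System.out.println(listener.subjectsOf("a1"));
        System.out.println(listener.articlesAbout("Elections"));
    }
}
```

Keeping the two indexes updated in one method is the design point: the presentation layer can then answer "topics for this article" and "articles for this topic" without scanning.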
43. Articles as topics is about Elections New city council appointed Goal: associate articles with topics mainly to say what they are about typically also want to include other metadata Need to create topics for the articles to do this in fact, a general CMS-to-TM mapping is needed must decide what metadata and structures to include
44. Mapping issues Article topics what topic type to use? title becomes name? (do you know the title?) include author? include last modified? include workflow state? should all articles be mapped? Folders/directories/sections/... should these be mapped, too? one topic type for all folders/.../.../...? if so, use associations to connect articles to folders use associations to reproduce hierarchical folder structure Multimedia objects should these be included? what topic type? what name? ...
45. Two styles of mappings Articles as articles Topic represents only the article Topic type is some subclass of “article” “Is about” association connects article into topic map Fields are presentational title, abstract, body Articles as concepts Topic represents some real-world subject (like a person) article is just the default content about that subject Type is the type of the subject (person) Semantic associations to the rest of the topic map works in department, has competence, ... Fields can be semantic name, phone no, email, ...
46. Article as article Article about building of a new school Is about association to “Primary schools” Topic type is “article”
55. The project A new citizen’s portal for the city administration strategic decision to make portal main interface for interaction with citizens as many services as possible are to be moved online Big project started in late 2004, to continue at least into 2008 ~5 million Euro spent by launch date 1.7 million Euro budgeted for 2007 Topic Maps development is a fraction of this (less than 25%) Many companies involved Bouvet/Ontopia Avenir KPMG Karabin Escenic
56. Simplified original ontology Service catalog Escenic (CMS) LOS Form Article nearly everything Category Service Subject Department Borough External resource Employee Payroll++
57. Data flow Ontopoly Ontopia Escenic LOS Integration TMSync DB2TM Fellesdata Payroll (Agresso) Dexter/Extens Service Catalog
61. NRK/Skole Norwegian National Broadcasting (NRK) media resources from the archives published for use in schools integrated with the National Curriculum In production delayed by copyright wrangling Technologies OKS Polopoly CMS MySQL database Resin application server
64. Curriculum-based browsing (3) Feminist movement in the 70s and 80s Changes to the family in the 70s The prime minister’s husband Children choosing careers Gay partnerships in 1993
65. One video (prime minister’s husband) Metadata Subject Person Related resources Description
67. Implementation Domain model in Java Plain old Java objects built on Ontopia’s Java API tolog JSP for presentation using JSTL on top of the domain model Subversion for the source code Maven2 to build and deploy Unit tests
69. The big picture Auto-class. A.N.other A.N.other Other CMSs A.N.other A.N.other DB2TM Portlet support OKP XML2TM Engine CMSintegration Data integration Escenic Taxon.import Ontopoly Web service
70. CMS integrations The more of these, the better Candidate CMSs Liferay (being worked on at Bouvet) Alfresco (might be started soon) Magnolia Inspera (possible project here) JSR-170 Java Content Repository CMIS (OASIS web service standard)
71. Portlet toolkit Subversion contains a number of “portlets” basically, Java objects doing presentation tasks some have JSP wrappers as well Examples display tree view list of topics filterable by facets show related topics get-topic-page via TMRAP component Not ready for prime-time yet undocumented incomplete
72. Ontopoly plug-ins Plugins for getting more data from externals TMSync import plugin DB2TM plugin Subj3ct.com plugin adapted RDF2TM plugin classify plugin ... Plugins for ontology fragments menu editor, for example
73. TMCL Now implementable We’d like to see an object model for TMCL (supporting changes) a validator based on the object model Ontopoly import/export from TMCL (initially) refactor Ontopoly API to make it more portable Ontopoly ported to use TMCL natively (eventually)
74. Things we’d like to remove OSL support Ontopia Schema Language Web editor framework unfortunately, still used by some major customers Fulltext search the old APIs for this are not really of any use
78. If you are using Ontopia... ...simply download the zip, then unzip, set the classpath, start the server, ... ...and you’re good to go
79. If you are developing Ontopia... You must have Java 1.5 (not 1.6 or 1.7 or ...), Ant 1.6 (or later), Ivy 2.0 (or later), and Subversion. Then check out the source from Subversion and build:
svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only
ant bootstrap
ant dist.jar.ontopia
ant test
ant dist.ontopia
80. Beware. This is fun, because you can play around with anything you want (e.g., my build has a faster TopicIF.getRolesByType) and you can track changes as they happen in svn. However, you’re on your own if it fails: it’s kind of hard to say why; maybe it’s your changes, maybe not. For production use, official releases are best.
82. Our goal To provide the best toolkit for building Topic Maps-based applications We want it to be actively maintained, bug-free, scalable, easy to use, well documented, stable, reliable
83. Our philosophy We want Ontopia to provide as much useful more-or-less generic functionality as possible New contributions are generally welcome as long as they meet the quality requirements, and they don’t cause problems for others
84. The sandbox There’s a lot of Ontopia-related code which does not meet those requirements some of it can be very useful, someone may pick it up and improve it The sandbox is for these pieces some are in Ontopia’s Subversion repository, others are maintained externally To be “promoted” into Ontopia a module needs an active maintainer, to be generally useful, and to meet certain quality requirements
85. Communications. Join the mailing list(s)! http://groups.google.com/group/ontopia and http://groups.google.com/group/ontopia-dev. Google Code page: http://code.google.com/p/ontopia/ (note the “updates” feed!). Blog: http://ontopia.wordpress.com. Twitter: http://twitter.com/ontopia.
86. Committers These are the people who run the project they can actually commit to Subversion they can vote on decisions to be made etc Everyone else can use the software as much as they want, report and comment on issues, discuss on the mailing list, and submit patches for inclusion
87. How to become a committer Participate in the project! that is, get involved first let people get to know you, show some commitment Once you’ve gotten some way into the project you can ask to become a committer best if you have provided some patches first Unless you’re going to commit changes there’s no need to be a committer
88. Finding a task to work on. Report bugs! They exist; if you find any, please report them. Look at the open issues: there is always testing/discussion to be done. Look for issues marked “newbie”: http://code.google.com/p/ontopia/issues/list?q=label:Newbie. Look at what’s in the sandbox: most of these modules need work. Scratch an itch, if there’s something you want fixed/changed/added...
89. How to fix a bug. First, figure out why you think it fails. Then write a test case based on your assumption, and make sure the test case fails (test before you fix). Then fix the bug, following the coding guidelines (see wiki). Then run the test suite, to verify that you’ve fixed the bug and that you haven’t broken anything. Then submit the patch.
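The "test before you fix" step can be shown on a made-up bug. The sketch below is not taken from Ontopia: the bug (splitting a column list on a single space), both method names, and the self-checking `testPasses` helper are hypothetical, and a real patch would use the project's own test framework rather than a hand-rolled check.

```java
import java.util.Arrays;
import java.util.function.Function;

/** Test-first workflow on an invented bug. NOT Ontopia code. */
final class TestFirstSketch {

    /** Buggy version: a double space produces an empty column name. */
    static String[] splitColumnsBuggy(String columns) {
        return columns.split(" ");
    }

    /** Fixed version: any run of whitespace separates two columns. */
    static String[] splitColumnsFixed(String columns) {
        return columns.trim().split("\\s+");
    }

    /**
     * The test case, written from the assumption about why it fails.
     * Returns true when the given implementation passes.
     */
    static boolean testPasses(Function<String, String[]> splitter) {
        String[] result = splitter.apply("id  name url"); // note the double space
        return Arrays.equals(result, new String[] {"id", "name", "url"});
    }

    public static void main(String[] args) {
        // Step 1: confirm that the test fails against the unfixed code.
        System.out.println("before fix: " + testPasses(TestFirstSketch::splitColumnsBuggy));
        // Step 2: confirm that the same test passes once the fix is in.
        System.out.println("after fix:  " + testPasses(TestFirstSketch::splitColumnsFixed));
    }
}
```

Seeing `false` before the fix and `true` after it is what shows that the test really exercises the bug, rather than passing by accident.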
90. The test suite. Lots of *.test packages in the source tree; 3148 test cases as of right now. Test data is in ontopia/src/test-data; some tests are generators based on files, and some of the test files come from cxtm-tests.sf.net. Run with:
ant test
java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group
91. Source tree structure: net.ontopia.
  utils        various utilities
  test         various test support code
  infoset      LocatorIF code + cruft
  persistence  OR-mapper for RDBMS backend
  product      cruft
  xml          various XML-related utilities
  topicmaps    next slides
92. Source tree structure: net.ontopia.topicmaps.
  core          core engine API
  impl          engine backends + utils
  utils         utilities (see next slide)
  cmdlineutils  command-line tools
  entry         TM repository
  nav + nav2    navigator framework
  query         tolog engine
  viz classify db2tm webed cruft
93. Source tree structure: net.ontopia.topicmaps.utils
  *      various utility classes
  ltm    LTM reader and writer
  ctm    CTM reader
  rdf    RDF converter (both ways)
  tmrap  TMRAP implementation
95. The engine. The core API corresponds closely to the TMDM: TopicMapIF, TopicIF, TopicNameIF, ... Compile with:
ant init compile.ontopia    # .class files go into ontopia/build/classes
ant dist.ontopia.jar        # makes a jar
96. The importers. The main class implements TopicMapReaderIF; usually, this lets you set up configuration, etc., then uses other classes to do the real work. XTM importers use an XML parser: the main work is done in XTM(2)ContentHandler, with some extra code for validation and format detection. CTM/LTM importers use Antlr-based parsers: the real code is in ctm.g/ltm.g. All importers work via the core API.
97. Fixing a real bug There is a failing test case in the TM/XML importer So let’s fix that right now...
98. Find an issue in the issue tracker (Picking one with “Newbie” might be good, but isn’t necessary) Get set up check out the source code build the code run the test suite Then dig in we’ll help you with any questions you have At the end, submit a patch to the issue tracker remember to use the test suite!