2. Overview
OpenTMS Overview
Architecture
Implementation
Current Status
FOLT Overview, as of 03.07.2009; Dr. Klemens Waldhör
3. Overview
4. Goals
Three translation memory systems for one and the same process?
Software investments that make translation costs shoot through the roof?
Exchange formats that put the brakes on productivity?
FOLT (Forum Open Language Tools) is concerned with the entire process of producing multilingual
documentation. From the creation of the source text to production in foreign languages, we analyze our
processes for weaknesses and a lack of standardisation.
Primary objectives:
- Sharing experiences of processes using standard industry software
- Sharing experiences of the use of Open Source software
- Standardisation of interchange formats
- Testing new Open Source technologies and improving existing technologies in the translation market
- Public support for non-proprietary software and software development
- Publication of aims and results
www.folt.de
Development of the Open Source translation memory system OpenTMS
5. OpenTMS Requirements
Software
Web based application
Server / Client Architecture
Thin client
No installation
No proprietary run time components
Preferred open source software
Modular software approach
OS independent
Windows, Linux, Mac …
Standard hardware
Interfaces
Integration into CMS
Workflow management should be supported
Open source database
Basically all SQL databases should be supported
Scalability
Single and multi user requirement
6. Architecture
7. Example Work Flow
Seamless integration of different tools in the translation / localisation workflow
[Figure: workflow diagram with numbered steps 1 to 3. Labels: CMS, Converter, Segmenter, Terminology, Machine Translation, Translation Memory, OpenTMS Translation Editor, XLIFF, Back Converter]
8. Architecture based on Standards
XLIFF
TMX
TBX
SRX
…
In general XML
9. OpenTMS System Architecture
Application Model
GUI Model, Interface Model
Security Model
User Model, Document Model, Data Model
Process Model
OpenTMS Core Library
For details see Waldhör, K. (2008). OPENTMS SOFTWARE ARCHITECTURE.
10. Software Structure
Hierarchy of functions and processes
Common functions / methods stored in a core library
Method calls should be transparent
Running on server or user machine
Scripting language
OpenTMS primitive procedure
OpenTMS core library
OpenTMS Process
OpenTMS Network Process
11. Modelling Language
[Figure: class diagram relating Linguistic Property, General Linguistic Object, Monolingual Object, Multilingual Object, Terminology, Translation Memory and Data Source through "inherits", "mapping" and N:1 relations]
12. OpenTMS Processes
[Figure: process diagram. Labels: Human Initiated Interactions, OpenTMS Initiated Interactions, Data Source, Interactive Terminology, Interactive Translation, Translation Memory, Converter, Segmenter, Pre Translation, OpenTMS Translation Editor, Back Converter]
13. Implementation
14. Programming Language et al.
Java
Java Coding Standards
Java Documentation Standard
Delivered as jar files
Eclipse
Data Sources
SQL DB: Hibernate based
Documentation UML
Generated ESS Model
15. Data Sources
Language related data are represented as “data sources”
Idea
Make the data access interface independent from the data itself
Not being restricted to SQL databases only
Also flat data or XML files
TMX, XLIFF files as a data source
…
Machine translation (MT) as data source
Spread sheets
E.g. Excel as terminology lists
Object Oriented Databases
DMS systems
“Web Sites” (http based interfaces)
Define a common interface for all access functions
Allows adaptation to individual data source properties
e.g. read only data sources like MT
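The idea of one common access interface that still exposes individual data source properties could look roughly like the following Java sketch. All names here (`DataSource`, `lookup`, `add`, `isReadOnly`, `InMemoryDataSource`, `FakeMtDataSource`) are hypothetical and are not taken from the OpenTMS code base; the MT source only imitates a read-only service and calls nothing external.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical common interface; not the actual OpenTMS API. */
interface DataSource {
    /** Returns the stored target texts whose source text equals the query. */
    List<String> lookup(String sourceText);

    /** Stores a source/target pair; read-only sources reject this call. */
    void add(String sourceText, String targetText);

    /** Lets callers adapt to an individual data source property. */
    boolean isReadOnly();
}

/** Writable source, standing in for an SQL database or a TMX file. */
class InMemoryDataSource implements DataSource {
    private final List<String[]> entries = new ArrayList<>();

    @Override
    public List<String> lookup(String sourceText) {
        List<String> hits = new ArrayList<>();
        for (String[] entry : entries) {
            if (entry[0].equals(sourceText)) {
                hits.add(entry[1]);
            }
        }
        return hits;
    }

    @Override
    public void add(String sourceText, String targetText) {
        entries.add(new String[] {sourceText, targetText});
    }

    @Override
    public boolean isReadOnly() {
        return false;
    }
}

/** Read-only source, standing in for an MT service (no real service is called). */
class FakeMtDataSource implements DataSource {
    @Override
    public List<String> lookup(String sourceText) {
        return Collections.singletonList("[MT] " + sourceText);
    }

    @Override
    public void add(String sourceText, String targetText) {
        throw new UnsupportedOperationException("read-only data source");
    }

    @Override
    public boolean isReadOnly() {
        return true;
    }
}
```

A caller would check `isReadOnly()` before offering import or edit functions for a given source.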
16. Data Sources
Access to data sources through standardised interface
[Figure: layers of the OpenTMS software. OpenTMS data access functions; data type specific Data Source Layer, which maps the OpenTMS access functions to the specific data component; various data components like files etc.]
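One way to picture a layer that maps generic access functions to a specific data component is a small registry keyed by data source type. This is a sketch under assumptions: `DataComponent`, `DataSourceLayer`, `registerType`, `open` and `fetch` are invented names, and the type strings are only examples.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical data component behind the layer (file, database, ...). */
interface DataComponent {
    String fetch(String key);
}

/** Hypothetical layer: routes one generic access function to a type-specific component. */
class DataSourceLayer {
    private final Map<String, Function<String, DataComponent>> factories = new HashMap<>();
    private final Map<String, DataComponent> opened = new HashMap<>();

    /** Registers how components of a given type are opened, e.g. "tmx" or "sql". */
    void registerType(String type, Function<String, DataComponent> factory) {
        factories.put(type, factory);
    }

    /** Opens a named data source of the given type at the given location. */
    void open(String name, String type, String location) {
        Function<String, DataComponent> factory = factories.get(type);
        if (factory == null) {
            throw new IllegalArgumentException("unknown data source type: " + type);
        }
        opened.put(name, factory.apply(location));
    }

    /** Generic access function: the caller never sees the component type. */
    String fetch(String name, String key) {
        DataComponent component = opened.get(name);
        if (component == null) {
            throw new IllegalStateException("data source not open: " + name);
        }
        return component.fetch(key);
    }
}
```

New data source types are then added by registering another factory, without touching the callers of `fetch`.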
17. Core Data Model Status
Data Source methods defined
Are extended depending on needs and requirements
SQL
Access optimisation
Hibernate based
First version finished
Other OpenSource databases…
OODBS
DB4O partially implemented for testing purposes
Other data sources
TMX files
XLIFF files
MT
Google & Microsoft Translator
18. Data Source Core Functions
Data Sources
Create
Delete
Import TMX, XLIFF File
Export TMX, XLIFF File
Copy between data sources
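As an illustration of the export function, the following sketch writes source/target pairs as a minimal TMX document using only the JDK's XML classes. It is not the OpenTMS exporter: the class and method names are hypothetical, and the header values ("sketch", "unknown", "plaintext") are placeholders chosen for the example.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper, for illustration only. */
class TmxExportSketch {

    /** Serialises source/target pairs as a minimal TMX 1.4 document. */
    static String export(Map<String, String> pairs, String srcLang, String tgtLang) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element tmx = doc.createElement("tmx");
            tmx.setAttribute("version", "1.4");
            doc.appendChild(tmx);

            // Header attributes required by TMX; the values are placeholders.
            Element header = doc.createElement("header");
            header.setAttribute("creationtool", "sketch");
            header.setAttribute("creationtoolversion", "0");
            header.setAttribute("segtype", "sentence");
            header.setAttribute("o-tmf", "unknown");
            header.setAttribute("adminlang", "en");
            header.setAttribute("srclang", srcLang);
            header.setAttribute("datatype", "plaintext");
            tmx.appendChild(header);

            Element body = doc.createElement("body");
            tmx.appendChild(body);
            for (Map.Entry<String, String> pair : pairs.entrySet()) {
                Element tu = doc.createElement("tu");  // one translation unit per pair
                tu.appendChild(tuv(doc, srcLang, pair.getKey()));
                tu.appendChild(tuv(doc, tgtLang, pair.getValue()));
                body.appendChild(tu);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("TMX export failed", e);
        }
    }

    /** Builds one language variant (tuv) holding a single segment. */
    private static Element tuv(Document doc, String lang, String text) {
        Element tuv = doc.createElement("tuv");
        tuv.setAttribute("xml:lang", lang);
        Element seg = doc.createElement("seg");
        seg.setTextContent(text);  // special characters are escaped on output
        tuv.appendChild(seg);
        return tuv;
    }
}
```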
19. Fuzzy Search – Core Function of TM
Step 1: Search in KD-TREE
Restricts the number of strings to search
Finds possible matching strings
Step 2: Levenshtein Similarity
Compare the candidates from step 1 to determine their real similarity
Step 3: Get source and target MOLs / MUL (monolingual / multilingual objects)
Create translation (alt-trans)
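The first two steps above can be sketched as follows. Two things are assumptions, not statements about OpenTMS: the slides do not say how the k-d tree keys are built, so step 1 is replaced here by a plain length filter as a stand-in for the tree search, and the similarity is normalised as a percentage of the longer string's length. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of candidate filtering plus Levenshtein scoring. */
class FuzzySearchSketch {

    /** Classic dynamic-programming Levenshtein edit distance, using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in percent of the longer string's length (assumed normalisation). */
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        if (max == 0) {
            return 100.0;
        }
        return 100.0 * (max - levenshtein(a, b)) / max;
    }

    /** Step 1 stand-in (length filter), then step 2 (Levenshtein) on the remaining candidates. */
    static List<String> fuzzyMatches(String query, List<String> segments, double minSimilarity) {
        List<String> hits = new ArrayList<>();
        for (String segment : segments) {
            int max = Math.max(query.length(), segment.length());
            int lengthDiff = Math.abs(query.length() - segment.length());
            // The edit distance is at least the length difference, so this bounds the similarity.
            if (max > 0 && 100.0 * (max - lengthDiff) / max < minSimilarity) {
                continue;
            }
            if (similarity(query, segment) >= minSimilarity) {
                hits.add(segment);
            }
        }
        return hits;
    }
}
```

The point of the cheap first step is the same as with the k-d tree: the quadratic Levenshtein comparison only runs on strings that can still reach the threshold.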
20. Data Source Configuration
SQL data source contained in hibernate directory
Existing data sources contained in database directory
21. Translation Core Functions
Convert (to and from XLIFF)
Currently done externally by Araya
Complex document formats like WinWord etc. through OpenOffice converters
Segment
Currently done externally by Araya
Translate
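The segment step could be sketched with SRX-style rules, where each rule says whether to break at a position based on the text before and after it, and the first matching rule wins. This is not Araya's segmenter; the class, the rule representation and the example patterns are assumptions made for illustration, and no SRX file is parsed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical rule-based segmenter in the spirit of SRX break / no-break rules. */
class RuleSegmenter {

    /** One rule: patterns for the text before and after a position, and whether to break there. */
    static final class Rule {
        final boolean isBreak;
        final Pattern before;  // must match text ending exactly at the position
        final Pattern after;   // must match text starting exactly at the position

        Rule(boolean isBreak, String beforeRegex, String afterRegex) {
            this.isBreak = isBreak;
            this.before = Pattern.compile("(?:" + beforeRegex + ")\\z");
            this.after = Pattern.compile(afterRegex);
        }
    }

    private final List<Rule> rules;

    RuleSegmenter(List<Rule> rules) {
        this.rules = rules;
    }

    private static boolean matchesAt(Rule rule, String text, int pos) {
        boolean beforeOk = rule.before.matcher(text.substring(0, pos)).find();
        boolean afterOk = rule.after.matcher(text).region(pos, text.length()).lookingAt();
        return beforeOk && afterOk;
    }

    /** Splits text into trimmed segments; rules are tried in order at every position. */
    List<String> segment(String text) {
        List<String> segments = new ArrayList<>();
        int start = 0;
        for (int pos = 1; pos < text.length(); pos++) {
            for (Rule rule : rules) {
                if (matchesAt(rule, text, pos)) {
                    if (rule.isBreak) {
                        segments.add(text.substring(start, pos).trim());
                        start = pos;
                    }
                    break;  // first matching rule decides, so exceptions go first
                }
            }
        }
        String rest = text.substring(start).trim();
        if (!rest.isEmpty()) {
            segments.add(rest);
        }
        return segments;
    }
}
```

With a no-break rule for `\bDr\.` followed by whitespace placed ahead of a break rule for `[.!?]` followed by whitespace, "Dr. Smith came. He left!" is split after "came." but not after "Dr.".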
22. Current Data Source Interface
23. Security
Managing Security
24. Security Levels
Level 0
No security procedures are applied, data are transferred as
they are.
Level 1
The communication channel is secured. It uses standard
secure protocols here.
Level 2
Security is applied at data level. Basically
this means that strings are encrypted when they are
communicated through a communication channel or are
written to or retrieved from a database. This also involves
encrypted XLIFF files (or parts of them).
Level 4
GUI level related security
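Data-level encryption of individual strings, as described for Level 2, might look like the sketch below. The slides do not name a cipher; AES in GCM mode with a random 12-byte IV and Base64 output is an assumption chosen for the example, and `SegmentCrypto` with its methods is a hypothetical helper, not OpenTMS code. Key storage and distribution are left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical helper that encrypts single segment strings (assumed cipher: AES-GCM). */
class SegmentCrypto {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh 128-bit AES key. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES not available", e);
        }
    }

    /** Encrypts a string; the result is Base64 text of IV followed by ciphertext and tag. */
    static String encrypt(String plain, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);  // a new IV for every string
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] encrypted = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + encrypted.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(encrypted, 0, out, iv.length, encrypted.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reverses encrypt(); fails if the data was modified or the key is wrong. */
    static String decrypt(String encoded, SecretKey key) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

Because the output is plain Base64 text, an encrypted segment can be stored in a database column or placed inside an XML element in place of the original string.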
25. Security and Files
Protection of parts of the document
Encrypt specific parts of the XML documents
Additional security when transferring files
Even if a file gets into the wrong hands, the file cannot be read.
Secure XLIFF
Source
Target
Secure TBX
Secure TMX
TU…
26. Security
Eclipse
34. Data Source Editor
Edit MOL/MUL
Properties
Language Specific Segments
Delete & Save Functions
Search Functions
35. Downloads
http://sourceforge.net/projects/open-tms
Ubuntu Version
Windows Version:
www.heartsome.de/arayatest/opentmsserver.exe
In the XLIFF Editor:
www.heartsome.de/arayatest/araya-freeversion.exe
YourKit Java Profiler for performance measurements
36. Possible Contributions
XML Parser!
Generalise OpenTMS XML interfaces to support any kind of XML parser (currently JDOM)
Faster XML parser?!
Logging Package
Optimised, line numbers, class names
Exception Handling
Improvement
Localisation approach / String handling
Test Environment
XLIFF / TMX package improvements
TBX reader
SRX segmentation
37. Possible Contributions Converters
Document Converters
XML
OpenOffice as central converter for txt, rtf, doc, xls, ppt…
MIF
…
Data Model Converter
Trados
Star
Across
…
38. Contact
Heartsome Europe GmbH
Friedrichstr. 17
D-90574 Roßtal
www.heartsome.de
Dr. Klemens Waldhör
T: +49 9127 579001
F: +49 9127 951178
klemens.waldhoer@heartsome.de