Optimizing your DITA content model for translation
1. Optimizing your DITA Content Model for Translation
Amber Swope
DITA Strategies, Inc.
2. About the Speaker
• Over 20 years of experience in the industry at
multiple companies of varying sizes and
industries
• Author of numerous papers/presentations on
information development and information
architecture, including the “DITA Maturity Model”
with Michael Priestley
3. Process overview
Know what DITA provides
Indicate when to translate content
Use appropriate DITA elements
Remove ambiguity from content model
Avoid inline content or key references
copyright DITA Strategies, Inc. 2012
4. DITA knowledge poll
1. Have implemented DITA and sent content through
multiple rounds of translation
2. Have implemented DITA and sent content through
first translation
3. Have implemented DITA but not yet sent content
through first translation
4. Have some theoretical DITA knowledge, but no
implementation experience
5. Know what the acronym means
5. DITA overview
• Darwin Information Typing Architecture (DITA)
• Modular, structured, XML framework based on a
topic-based architecture
• Open standard approved and supported by OASIS
• Implemented by companies in many industries
around the world
6. Know what DITA provides
DITA translation support
Best practices
7. DITA translation support
Attributes that you can specify on each instance of an
element
@translate attribute
@xml:lang attribute
@dir attribute
8. @translate attribute
Indicates whether the content of the element should be
translated or not.
Default value is “yes”.
Example:
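The slide's original example is not preserved here. A minimal sketch of what it might look like follows; the sentence and the product name are hypothetical, and only the element and the @translate attribute are standard DITA.

```xml
<!-- Hypothetical paragraph: the product name must stay in the source language -->
<p>Start <ph translate="no">AcmeSync Pro</ph> before you import the files.</p>
```

Everything outside the <ph> element keeps the default behavior and is sent for translation.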
9. @xml:lang attribute
Specifies the language of the element content.
Values are language identifiers as defined by the W3C XML Recommendation (http://www.w3.org/TR/REC-xml/), such as en-US or fr-CA
Example:
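The slide's original example is not preserved here. The following sketch assumes a hypothetical concept topic authored in US English that quotes a French string; the ids and text are invented for illustration.

```xml
<!-- Hypothetical concept topic authored in US English -->
<concept id="backup_overview" xml:lang="en-US">
  <title>Backup overview</title>
  <conbody>
    <!-- An embedded French phrase is labeled with its own language -->
    <p>The dialog title reads <q xml:lang="fr-FR">Sauvegarde terminée</q>.</p>
  </conbody>
</concept>
```

Setting @xml:lang on the topic root lets nested elements inherit the language, so only exceptions need their own value.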
10. @dir attribute
Specifies the directionality of text.
Values:
ltr – left-to-right (processing default)
rtl – right-to-left
Example:
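The slide's original example is not preserved here. As an illustrative sketch, a hypothetical Arabic paragraph that embeds a left-to-right product name could be marked up as follows; the text and product name are assumptions.

```xml
<!-- Hypothetical Arabic paragraph that embeds a left-to-right product name -->
<p xml:lang="ar" dir="rtl">افتح <ph dir="ltr" translate="no">AcmeSync Pro</ph> أولاً.</p>
```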
11. Best practices
Update files only with changed text
Translate reused or common content first
Provide translations for generated output
Provide full source language text for verification
Use language-specific stylesheets
12. Goals
Avoid translators changing elements
Automate formatting with language-specific stylesheets
13. Indicate when to translate content
Use @translate attribute on an element
Identify specific elements to not be translated
14. Use @translate attribute on an element
Pro: can control translation for each instance of an element
Con: must specify for each instance of an element
Common elements for which to indicate translation:
<term>
<ph>
<keyword>
<q>
Example
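The slide's original example is not preserved here. The sketch below shows per-instance control on some of the elements listed above; the sentences, option name, and log message are hypothetical.

```xml
<!-- Hypothetical sentences; @translate is set on each instance that must not be translated -->
<p>Set the <keyword translate="no">MAX_RETRIES</keyword> option to limit reconnection attempts.</p>
<p>The log reports <q translate="no">Connection reset by peer</q> when the server drops the link.</p>

<!-- This <term> has no @translate, so by default it is translated with the surrounding text -->
<p>A <term>snapshot</term> is a read-only copy of a volume.</p>
```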
15. Identify specific elements to not be translated
Pro: can globally indicate that content is not to be translated
Con: no flexibility
Elements that are not usually translated:
All elements in the programming domain
(<codeblock>, <codeph>, <parmname>,…) because they
present code, which is usually in English
<tm> because trademarks are not usually translated
17. Use appropriate DITA elements
Elements to use
Glossary element support for alternative forms of a word or phrase
18. Elements to use
<menucascade><uicontrol> for menu option selection
<fn> for footnotes
<note> with appropriate @type attribute value
<prereq> for prerequisites
Any element for which you generate a label
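As an illustrative sketch, the elements above might appear in a hypothetical task topic like this. The ids and text are assumptions; labels such as "Tip" or a prerequisite heading would come from the stylesheets, not from the source.

```xml
<task id="export_report" xml:lang="en-US">
  <title>Exporting a report</title>
  <taskbody>
    <!-- Prerequisite text only; any heading is generated at output time -->
    <prereq>You must have the Editor role.</prereq>
    <steps>
      <step>
        <!-- The separator between menu items is generated, not typed -->
        <cmd>Select <menucascade><uicontrol>File</uicontrol><uicontrol>Export</uicontrol></menucascade>.</cmd>
        <info>
          <p>Large reports export faster as CSV.<fn>CSV files omit formatting.</fn></p>
          <!-- The note label is generated from @type -->
          <note type="tip">Close other reports before you export.</note>
        </info>
      </step>
    </steps>
  </taskbody>
</task>
```

Because the labels and separators are generated, translators receive only the sentence text and each target language can supply its own label strings.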
19. Glossary support
Glossary topic provides full definition of term, including
alternatives for the primary term defined in the
<glossterm> element
Most alternatives are nested within the <glossAlt> element:
<glossAbbreviation> – abbreviated form of the primary term
<glossShortForm> – shorter alternative to the primary term
<glossAcronym> – acronym for the primary term
The surface form is a direct child of the <glossBody> element:
<glossSurfaceForm> – proper presentation for first instance of
term in output
Reference glossary content with the <term> or
<abbreviated-form> element using key referencing
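A minimal sketch of a glossary topic with these elements follows. The term and definition are illustrative; in the DITA 1.2 content model, <glossSurfaceForm> sits directly in <glossBody>, while the acronym is wrapped in <glossAlt>.

```xml
<glossentry id="abs" xml:lang="en-US">
  <!-- Primary term -->
  <glossterm>Anti-lock Braking System</glossterm>
  <glossdef>A brake system that keeps the wheels from locking during hard braking.</glossdef>
  <glossBody>
    <!-- Form to present at the first occurrence in output -->
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
```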
20. Glossary usage
1. Define all information for a term in glossary topic in
source language
2. Create key reference to the glossary topic that defines the
term.
If you want to reuse the primary term, use the <term> element
If you want to reuse an acronym or the surface form, use the
<abbreviated-form> element
3. Translate all elements in the glossary topic as applicable
in each target language; leave empty all inapplicable
elements.
The DITA-OT processing resolves the <abbreviated-form>
element to the <glossterm> element if <glossAcronym> and
<glossSurfaceForm> are empty.
21. Glossary example
Glossary topic
Concept topic
Generated output
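The slide's original illustrations are not preserved here. As a hedged sketch, a map could bind a key to a glossary topic and a concept topic could reference that key; the file names, key name, and text below are hypothetical.

```xml
<!-- File: guide.ditamap (hypothetical). Binds the key "abs" to the glossary topic. -->
<map xml:lang="en-US">
  <title>Brake system guide</title>
  <keydef keys="abs" href="glossary/abs.dita"/>
  <topicref href="concepts/braking.dita"/>
</map>

<!-- File: concepts/braking.dita (hypothetical). References the term by key only. -->
<concept id="braking" xml:lang="en-US">
  <title>Braking</title>
  <conbody>
    <p>The <abbreviated-form keyref="abs"/> prevents wheel lock.
       Service the <abbreviated-form keyref="abs"/> yearly.</p>
    <p>See the definition of <term keyref="abs"/>.</p>
  </conbody>
</concept>
```

A processor that supports <abbreviated-form> would typically render the surface form at the first occurrence and the acronym afterwards, but the actual result depends on the output format and toolkit version.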
22. Remove ambiguity from content model
Guidelines
Element usage
Single purpose for each element
Manual formatting
Specialization or @outputclass attribute
23. Guidelines
Avoid
• Using the formatting elements
• Using an element for multiple purposes
• Typing formatting, such as quotation marks
• Adding unnecessary formatting that processing can handle
• Relying on @outputclass attribute values for element identification
Instead
• Use the element that identifies the content
• Clearly indicate the proper usage for each element
• Use the proper element and update stylesheets
• Specialize to create elements if necessary
24. Element usage
Content purpose            Ambiguous           Clear
User interface item        <b>                 <uicontrol>
Citation of resource       <i> or “…”          <cite>
Presentation of new term   <i>                 <term>
Quotation                  “…”                 <q> or <lq>
Directory path             <codeph> or <ph>    <filepath>
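To make the contrast concrete, here is a sketch of the same hypothetical sentence marked up both ways; the text and path are invented for illustration.

```xml
<!-- Ambiguous: formatting elements and typed quotation marks -->
<p>Click <b>Save</b>, then open <codeph>C:\reports\summary.txt</codeph>.
   See <i>Administrator Guide</i> for “advanced” options.</p>

<!-- Clearer: each element names the content it holds -->
<p>Click <uicontrol>Save</uicontrol>, then open <filepath>C:\reports\summary.txt</filepath>.
   See <cite>Administrator Guide</cite> for <q>advanced</q> options.</p>
```

In the second version, quotation marks and emphasis can be generated per language by the stylesheets instead of being typed by authors or changed by translators.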
25. Single purpose for elements
Guidelines
Be reasonable – find the balance between clarity and
complexity
Use elements for their intended purpose
Clearly define usage for content authors
Examples
<filepath> – if the formatting for directory paths and file
names is the same, then use for both purposes
<pre> versus <codeblock> versus <screen> versus
<systemoutput> – if the formatting is the same, then use <codeblock>
26. Manual formatting to avoid
Quotation marks
Table headings
Titles
Terms
Labels
27. Specialization versus @outputclass attribute
Specialization
Allows you to create new element types and attributes that are
explicitly and formally derived from existing types
Provides selectable elements or attributes for authors
@outputclass attribute
Names a role that the element is playing
Used primarily to provide styling instructions during generation
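The difference can be sketched with a hypothetical sidebar. The @outputclass value "sidebar", the <sidebar> element, and the module name "article" in its @class value are all assumptions made for illustration, not part of standard DITA.

```xml
<!-- @outputclass: a standard element carrying a styling hint -->
<section outputclass="sidebar">
  <title>Did you know?</title>
  <p>Snapshots are read-only.</p>
</section>

<!-- Specialization: a hypothetical <sidebar> element derived from <section>.
     The @class value, normally defaulted by the DTD or schema, records the ancestry. -->
<sidebar class="- topic/section article/sidebar ">
  <title>Did you know?</title>
  <p>Snapshots are read-only.</p>
</sidebar>
```

With the specialized element, authors pick <sidebar> from the list of valid elements; with @outputclass, they must type the agreed value correctly each time.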
28. Specialization versus @outputclass attribute
Specialize element
• No DITA element properly identifies the content
• Authors need to use frequently and consistently
• Authors must specify usage
Use @outputclass
• You need to indicate a variation on output formatting for existing element
• Expert needs to use infrequently
• You can incorporate into templates (no author specification)
29. Specialization considerations
When authors must have control over processing, such as
collapsible/expandable substeps
When authors must manually type a value
30. Specialization examples
Sidebar support to provide sidebars for articles
Specific table types to support consistency
Collapsible/expandable elements to allow authors to
control display
Emphasis element to eliminate <b> or <i> usage
Foreign word to identify non-translated foreign words
Custom list structures to support consistency
31. Avoid inline content or key references
Definitions
Referencing issues
Best practices
Strategies
32. Definitions
Content references allow you to directly reuse or include
elements into topics
Key references allow you to indirectly reuse content (like a
placeholder)
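The two mechanisms can be sketched as follows. The file name, ids, key name, and product name are hypothetical.

```xml
<!-- Content reference: directly pulls the element with id "power_warning"
     from the topic with id "warnings" in a hypothetical shared file -->
<note conref="shared/warnings.dita#warnings/power_warning"/>

<!-- Key reference, part 1 (in the map): define what the key resolves to -->
<keydef keys="product-name">
  <topicmeta>
    <keywords><keyword>AcmeSync Pro</keyword></keywords>
  </topicmeta>
</keydef>

<!-- Key reference, part 2 (in the topic): an indirect placeholder for the text -->
<p>Install <keyword keyref="product-name"/> on the server.</p>
```

With the key reference, a different map can bind "product-name" to different text without any change to the topic.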
33. Referencing issues
Article agreement of reused words or phrases
Gender
Singular v. plural
Capitalization
First word in a sentence
Expansion of abbreviated forms
Inflection in translated content
Word changes by role in sentence
34. References best practices
Reference
• Complete units of content
• Block elements
• Full sentences
• Non-translated text
• Proper nouns (when subject of sentence)
Do not reference
• Common nouns
• Translated text
35. Strategies
Consider including the article in the reference
Avoid using references as the first word in a sentence
For commands, do not include the noun
No:
Yes:
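The slide's original No and Yes examples are not preserved here. One possible reading of the guideline for commands, sketched with hypothetical key names and a hypothetical command, is that the referenced text should hold only the command name while the noun stays in the sentence:

```xml
<!-- Hypothetical key definitions in the map -->
<keydef keys="cmd-sync-with-noun">
  <topicmeta><keywords><keyword>acmesync command</keyword></keywords></topicmeta>
</keydef>
<keydef keys="cmd-sync">
  <topicmeta><keywords><keyword>acmesync</keyword></keywords></topicmeta>
</keydef>

<!-- No: the referenced text carries the noun, so it cannot be inflected
     or given an article to fit each translated sentence -->
<p>Run the <keyword keyref="cmd-sync-with-noun"/> to start replication.</p>

<!-- Yes: only the untranslated command name is referenced;
     the noun stays in the sentence where translators can adapt it -->
<p>Run the <cmdname keyref="cmd-sync"/> command to start replication.</p>
```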
36. Summary
Know what DITA provides
Indicate when to translate content
Use appropriate DITA elements
Remove ambiguity from content model
Avoid inline content or key references
37. Resources
OASIS DITA Translation Subcommittee
“Best Practice for Managing Acronyms and Abbreviations in DITA”
“Translation Best Practice for Leveraging Translation Memory”
“Best Practice for Indexing DITA Topics for Translation”
“Best Practice for Using the DITA CONREF Attribute for Translation”
http://dita.xml.org/wiki/optimizing-dita-for-translations