
5 Reasons not to use DITA from a CCMS Perspective

A critique of the OASIS DITA XML Standard including final recommendations on how to improve DITA.

Published in: Software

  1. 5 Reasons not to use DITA from a CCMS Perspective
     Marcus Kesseler, Managing Director, SCHEMA GmbH
     TEKOM 2015, Stuttgart, November 10
  2. Some Definitions and Terminology
  3. Definitions and Terminology: Marcus Kesseler, SCHEMA & DERCOM (SCHEMA Group 2015 – All rights reserved)
     Marcus Kesseler
     • Computer scientist with a heavy artificial intelligence background.
     • One of the two founders and managing directors of SCHEMA GmbH.
     SCHEMA
     • A software company based in Nürnberg.
     • SCHEMA is 20 years old and has been making and selling CCMSs from day one.
     DERCOM
     • The Association of German Manufacturers of Authoring and Content Management Systems.
     • Currently 7 companies, with 1,400 customers between them.
  4. Definitions and Terminology: CCMS
     CCMS: Component Content Management System.
     The main difference between a CMS and a CCMS: a CCMS can aggregate content components into larger documents.
     • A CCMS can publish content as "classic" documents, as Web portal content or as app content, all with very high quality.
  5. Definitions and Terminology: DITA
     DITA: Darwin Information Typing Architecture, an XML- and file-based standard for the representation of componentized and interlinked content.
     Although there are several DITA-based CCMS implementations, DITA can be used with just an XML editor, the file system and the DITA Open Toolkit.
     What we like about DITA is the visibility it brings to the enormous advantages of componentized content. We fully agree with the DITA community that there really is no alternative to working with components (or topics) in large-scale, state-of-the-art technical content authoring, management and distribution.
  6. More Terminology: Essential and Incidental Complexity
     Essential complexity, also called intrinsic or inherent complexity, is the complexity you cannot hide or get rid of in a software implementation. It is directly derived from the domain you are modelling.
     Example: when moving from document-based content authoring to componentized authoring, the number of objects you have to deal with goes up by two or three orders of magnitude. The only way to hide this increase would be to hide the components, which, of course, would defeat the purpose.
     Incidental complexity is an extra dose of complexity added on top of the essential complexity by bad choices of architecture, data representation or user experience design.
  7. Context of this Talk: Large Technical Content Departments
  8. Our context is not the Lone Technical Content Ranger
     All arguments in this talk assume that we are talking about the processes and needs of large technical content departments operating at a high level of maturity.
     We are not talking about the perspective of the Lone Technical Content Ranger. Russell Ward presented this perspective in his great talk last year here at tekom 2014: Five reasons not to use DITA [http://conferences.tekom.de/fileadmin/tx_doccon/slides/742_5_Reasons_Not_to_Use_DITA.pdf]
  9. Large Technical Content Departments: Some Parameters
     So, what is a Large Technical Content Department?
     • 5 to several dozen technical writers.
     • Publications have to be regularly updated in 5 to 30 (or more) languages.
     • Multiple publication formats, including:
       • Paginated formats, like PDF (directly or via InDesign, FrameMaker or Word).
       • Online formats, like HTML, HTML5, EPUB, etc.
       • Custom XML formats.
  10. Large Technical Content Departments: Processes & Workflows
      The following are defined and enforced:
      • Writing standards and terminology
      • Translation standards and workflows
      • Artwork & media standards and workflows
      • Publication workflows
      • Release workflows
      • Distribution workflows
  11. Large Technical Content Departments: Core Challenges
      • Layout has to be of the highest quality, strictly adhering to Corporate Design standards.
      • Products are highly modular or organized in product families with common base features, both of which are key requirements for effective and massive content reuse.
      • Product innovation is fast and relentless; the technical content team is always under pressure to keep product and information life cycles in sync.
      • So, just another great day in the wonderful world of technical content publishing. Life is good!
  12. Reason 1: Coverage of Component Content Management Requirements in DITA is Surprisingly Small
  13. Requirements Coverage of XML, DITA and CCMS

      | # | Process Name & Requirements | Max Points | XML | DITA | CCMS |
      |---|---|---|---|---|---|
      | 1 | Topics management (classes, workflows, versioning, ownership, access control). | 10 | 0 | 3 | 9 |
      | 2 | Manage the links between topics (classes, workflows, versioning, ownership, referential integrity). | 10 | 0 | 3 | 9 |
      | 3 | Management of the maps that build the publications out of the underlying components (versioning, ownership, referential integrity). | 10 | 0 | 3 | 9 |
      | 4 | Manage the metadata on topics, links and maps (classes, workflows, versioning, ownership). | 10 | 1 | 2 | 9 |
      | 5 | Translation management with automatic flagging of topics needing re-translation (ownership, workflow, dataflow). | 10 | 1 | 1 | 8 |
      | 6 | Media assets management (classes, workflows, ownership, guidelines, conversion, translation). | 10 | 1 | 2 | 7 |
      | 7 | Publication formats and layout management (design within corporate guidelines, implementation, revisions). | 10 | 0 | 4 | 8 |
      | 8 | Automatic publication generation and channel-specific distribution (workflow, IT systems integration). | 10 | 0 | 2 | 6 |
      | 9 | Overall content, links and publications quality assurance and approval processes (correctness, writing style, terminology, translations, links, publication maps, graphics and layout). | 10 | 2 | 3 | 8 |
  14. Requirements Coverage of XML, DITA and CCMS (continued)

      | # | Process Name & Requirements | Max Points | XML | DITA | CCMS |
      |---|---|---|---|---|---|
      | 10 | Information model management (conceptual design, classes, roles, rights, workflows, evolution). | 10 | 0 | 2 | 9 |
      | 11 | Performance & costs management (financial controlling, key performance indicators monitoring, tracking, corrective actions). | 10 | 0 | 2 | 4 |
      | 12 | Security (user management, user roles, access control, change tracking). | 10 | 0 | 0 | 8 |
      | 13 | IT and software infrastructure management (change, updates and upgrades). | 10 | 0 | 0 | 4 |
      | 14 | Manage the communication with adjacent departments, like product management, engineering and marketing (responsibilities, workflows). | 10 | 0 | 0 | 3 |
      | 15 | Team management (skills, training, structure, responsibilities, motivation). | 10 | 0 | 0 | 0 |
      |  | Coverage [Points] | 150 | 5 | 27 | 101 |
      |  | Coverage [Percent] |  | 3% | 18% | 67% |
      |  | Coverage with CCMS baseline [Percent] |  |  | 27% | 100% |
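The summary rows of the coverage table can be re-derived from the per-requirement points. The sketch below only repeats that arithmetic; the list and function names are illustrative and not part of the talk.

```python
# Points per requirement (rows 1 to 15), copied from the two table slides.
XML = [0, 0, 0, 1, 1, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0]
DITA = [3, 3, 3, 2, 1, 2, 4, 2, 3, 2, 2, 0, 0, 0, 0]
CCMS = [9, 9, 9, 9, 8, 7, 8, 6, 8, 9, 4, 8, 4, 3, 0]

# 15 requirements with a maximum of 10 points each.
MAX_POINTS = 10 * len(CCMS)


def coverage(points, baseline=MAX_POINTS):
    """Return the coverage of `points` as a rounded percentage of `baseline`."""
    return round(100 * sum(points) / baseline)


for name, points in (("XML", XML), ("DITA", DITA), ("CCMS", CCMS)):
    print(name, sum(points), f"{coverage(points)}%")

# "Coverage with CCMS baseline": DITA measured against what a CCMS covers.
print("DITA vs. CCMS baseline", f"{coverage(DITA, baseline=sum(CCMS))}%")
```

The last line is the figure the slide calls "Coverage with CCMS baseline": DITA's 27 points relative to the 101 points reached by a CCMS rather than relative to the theoretical maximum of 150.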
  15. Requirements Coverage of XML, DITA and CCMS
      [Figure: coverage comparison of XML, DITA, CCMS [DITA] and CCMS [DERCOM]. Annotations: "Business logic in DITA Open Toolkit"; "Business logic in database, workflow system, TMS interfaces, media assets management, etc."; "Non-DITA CCMSs: bonus for being on the market for at least 10 years longer?"]
  16. Drawbacks of a Small Requirements Coverage
      Comparing CCMSs based on their level of DITA compliance would not yield much insight, since most requirements are outside of DITA's scope.
      Features outside DITA's scope would not be trivially portable to other DITA-based systems. Some examples:
      • Versioning
      • Translation states & dataflow
      • Release and ongoing workflow states
      • Media assets management
      • Access rights & user management
      Note: Even with a DITA-based CCMS, you would incur a significant amount of vendor lock-in!
  17. Reason 2: Evolution of the DITA Standard is too Slow
  18. Evolution of DITA is too Slow
      An update every five years is just not compatible with the demands of an ever-accelerating market (variables? scoped keys?).
      Fast evolution of DITA is impeded by the following two inherently conflicting requirements:
      • The need to add features that are crucially missing in real-life application scenarios.
      • The need to prevent new features that would add even more incidental complexity to the standard.
  19. Evolution of DITA is too Slow
      Scoped keys are a good example:
      • Under heavy reuse scenarios you are very, very likely to need them.
      • On the other hand, should tech writers really need to be trained in the scoping concepts of programming languages, just to be able to handle reuse variability?
  20. Reason 3: How DITA deals with the Number of Files Explosion
  21. How is a DITA Topic Represented in a File System?
      [Figure: a DITA topic file "TOP" [XML], made up of file metadata (name, owner, last write date, …), metadata within the XML DITA topic (class, author, target audience, …) and the XML content.]
  22. Now we add some translations…
      [Figure: one topic file per language: TOP EN, TOP FR, TOP JA, TOP PT, ...]
  23. … and some versions …
      [Figure: the language files TOP EN, TOP FR, TOP JA, TOP PT, ... repeated for each version V1, V2, ..., Vn.]
  24. … and after several years, a single topic may have proliferated into m × n files!
      [Figure: grid of topic files, m languages (EN, FR, JA, PT) by n versions (V1, V2, ..., Vn).]
  25. How m × n Topics are Accessed in DITA
      In DITA each single translation or version is a unique, individual file and hence a distinct topic. The user has to know exactly what language and version is being referenced.
      Keys or file names will likely follow some pattern like this:
      Topic_Intro_en_V1
      Topic_Intro_fr_V1
      Topic_Intro_ja_V1
      Topic_Intro_en_V2
      Topic_Intro_fr_V2
      Topic_Intro_ja_V2
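To make the m × n growth concrete, the following sketch enumerates the names a file-based naming convention would need. The pattern `<topic>_<language>_V<version>` is taken from the slide; the function name is illustrative only.

```python
from itertools import product


def topic_file_names(topic, languages, versions):
    """List one name per (version, language) pair, following the slide's pattern."""
    return [f"{topic}_{lang}_V{version}"
            for version, lang in product(versions, languages)]


names = topic_file_names("Topic_Intro", ["en", "fr", "ja"], [1, 2])
for name in names:
    print(name)

# m languages times n versions: 3 * 2 = 6 distinct files for a single topic.
print(len(names))
```

Every one of these names has to be chosen, typed and kept consistent by people or by custom tooling, which is the bookkeeping the following slides propose to hand over to the system.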
  26. How m × n Topics are Accessed in a CCMS
      In a CCMS implemented on top of a database, all these m × n topics can be addressed with a single key:
      [ID_Intro, Language, LatestReleasedVersion]
      where Language and LatestReleasedVersion are variables that the system automatically populates as needed.
      In computer science this is called a composite key, a concept invented over 45 years ago at IBM. Composite keys capture and optimally encode the regularities in the target domain and let the computer do the tedious bookkeeping. This is what computers are good at!
  27. How m × n Topics are Accessed by the Author in a CCMS
      Authors will rarely need to see, insert or handle full CCMS composite topic keys:
      [ID_Intro, Language, LatestReleasedVersion]
      Since the composite key structure is universal within the system, there is no need to explicitly represent the variable parts. They are optional and will be implicitly added at document aggregation time.
      What the author sees and handles is just:
      [ID_Intro]
      And, of course, usually even this is hidden by the GUI.
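A minimal sketch of how such a composite key might be resolved, assuming an in-memory list as a stand-in for the CCMS database. The class names, fields and sample rows are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopicVersion:
    topic_id: str
    language: str
    version: int
    released: bool
    content: str


class TopicStore:
    """Hypothetical in-memory stand-in for the CCMS database."""

    def __init__(self, rows):
        self._rows = list(rows)

    def resolve(self, topic_id: str, language: str,
                version: Optional[int] = None) -> TopicVersion:
        """Resolve [topic_id, language, version].

        version=None plays the role of LatestReleasedVersion.
        """
        candidates = [r for r in self._rows
                      if r.topic_id == topic_id and r.language == language]
        if version is None:
            candidates = [r for r in candidates if r.released]
        else:
            candidates = [r for r in candidates if r.version == version]
        if not candidates:
            raise KeyError((topic_id, language, version))
        return max(candidates, key=lambda r: r.version)


store = TopicStore([
    TopicVersion("ID_Intro", "en", 1, True, "Intro, first release"),
    TopicVersion("ID_Intro", "en", 2, True, "Intro, second release"),
    TopicVersion("ID_Intro", "en", 3, False, "Intro, draft"),
    TopicVersion("ID_Intro", "fr", 1, True, "Intro, première version"),
])

# The author only references "ID_Intro"; the language comes from the publication.
for publication_language in ("en", "fr"):
    topic = store.resolve("ID_Intro", publication_language)
    print(publication_language, topic.version, topic.content)
```

The author-facing reference stays `ID_Intro` throughout; language and version are filled in at aggregation time, so the unreleased English draft is skipped without the author having to know it exists.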
  28. Advantages of Composite Keys
      DITA would be so much easier if references were defined as composite keys:
      • Maps would be directly reusable. No need to create and maintain a map for each language. A change to the map structure in English is automatically available in all other languages.
      • New languages (or versions) can be added to your pool without touching the maps at all!
      • No need to develop, train and enforce sophisticated file name or key patterns to manually capture and encode these rather trivial domain regularities.
      • Authors need only insert a reference to the topic; the system does the tedious and error-prone bookkeeping.
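The map-reuse argument can be illustrated with a single language-neutral map that is aggregated once per target language. This is a sketch under simplifying assumptions: `POOL`, `MAP` and `aggregate` are invented names, every stored version counts as released, and a real map would also carry hierarchy.

```python
# Hypothetical content pool: (topic_id, language) -> {version: text}.
POOL = {
    ("ID_Intro", "en"): {1: "Intro v1", 2: "Intro v2"},
    ("ID_Intro", "fr"): {1: "Intro v1 (fr)"},
    ("ID_Safety", "en"): {1: "Safety v1"},
    ("ID_Safety", "fr"): {1: "Sécurité v1"},
}

# One map for all languages: it contains topic IDs only.
MAP = ["ID_Intro", "ID_Safety"]


def aggregate(topic_map, language):
    """Build a publication by completing each topic ID with language and latest version."""
    document = []
    for topic_id in topic_map:
        versions = POOL[(topic_id, language)]
        document.append(versions[max(versions)])
    return document


print(aggregate(MAP, "en"))
print(aggregate(MAP, "fr"))

# Adding a language only touches the pool; MAP stays unchanged.
POOL[("ID_Intro", "ja")] = {1: "Intro v1 (ja)"}
POOL[("ID_Safety", "ja")] = {1: "Safety v1 (ja)"}
print(aggregate(MAP, "ja"))
```

Reordering or extending `MAP` changes the structure of every language variant at once, which is the behaviour the first bullet describes.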
  29. Representation of m × n Topics in a CCMS
      [Figure: nested containers. A topic container (TOPIC) holds the metadata for all versions in all languages and one language container per language (EN, FR, JA, PT). Each language container holds the metadata for all versions in this language and the XML containers XML V1, XML V2, ..., XML Vn. Each XML container holds the metadata for this version in this language and the XML content.]
  30. Cool stuff you can easily do with Composite Keys
      A complete and detailed translation status report is just a trivial query.
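As an illustration of the "trivial query" claim, here is a sketch using Python's built-in `sqlite3` module. The table layout is an assumption made for this example, as is the convention that a translation carries the version number of the source text it was translated from, with English as the source language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE topic_version (
        topic_id TEXT,
        language TEXT,
        version  INTEGER,
        PRIMARY KEY (topic_id, language, version)  -- the composite key
    )
""")
conn.executemany(
    "INSERT INTO topic_version VALUES (?, ?, ?)",
    [
        ("ID_Intro", "en", 1), ("ID_Intro", "en", 2),
        ("ID_Intro", "fr", 1),   # French is one version behind
        ("ID_Intro", "ja", 2),   # Japanese is up to date
        ("ID_Safety", "en", 1),
        ("ID_Safety", "fr", 1),  # up to date
    ],
)

# Translations whose newest version is older than the newest source version.
REPORT_SQL = """
    SELECT t.topic_id, t.language, MAX(t.version) AS translated, s.latest AS source
    FROM topic_version AS t
    JOIN (SELECT topic_id, MAX(version) AS latest
          FROM topic_version
          WHERE language = 'en'
          GROUP BY topic_id) AS s ON s.topic_id = t.topic_id
    WHERE t.language <> 'en'
    GROUP BY t.topic_id, t.language, s.latest
    HAVING MAX(t.version) < s.latest
    ORDER BY t.topic_id, t.language
"""
outdated = conn.execute(REPORT_SQL).fetchall()
for row in outdated:
    print(row)
```

This query only flags outdated translations. Languages that are missing entirely for a topic would need a further join against the list of target languages, which is left out here.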
  31. Translation Report: Details
  32. Representation of a Graphic in a CCMS
      [Figure: nested containers. A graphic container (GRAPHIC) holds one language container per language (Neutral, EN, PT). Each language container holds the format containers Vector [SVG], Pixel [PNG] and Source, and each format container holds the graphics files for versions V1, V2, ..., Vn.]
  33. Call Out Designer
  34. Call Out Designer
  35. Reason 4: DITA's XML-first Paradigm vs. a Database-first Paradigm
  36. DITA's XML-first Paradigm vs. a Database-first Paradigm
      In DITA, all information or data needed to drive business processes has to be inside the XML files, together with the content itself (= DITA's XML-first paradigm). This goes against quite a few computer science principles of information model design.
      Any change to a topic, however minimal, can affect content, structure, linking or metadata, and therefore has to be carefully scrutinized to identify what exactly changed and whether any consistency rules were broken.
      Enforcing the principles of Atomicity, Consistency and Isolation in DITA is quite a challenge (cf. the ACID principles of database design).
  37. DITA's XML-first vs. Database-first
      Please note that DITA's XML-first paradigm is a huge driver of incidental complexity for DITA-based CCMS implementations:
      • There is pressure to improve metadata handling by keeping the metadata in the database, but with XML-first you also have to keep it in the DITA files. Now there are two distinct and separate representations. You've lost your single source of truth.
      • The database value and the DITA XML value can become inconsistent through update conflicts and may have to be manually corrected by the users.
      • Controlling change permissions for individual metadata values in a file is also a huge challenge. It is possible to do it in good XML editors. But users can still open the XML file in Notepad…
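The consistency problem in the second bullet can be sketched as a comparison between a database row and the metadata embedded in a topic file. The topic below uses DITA's `<prolog>`, `<author>` and `<audience type="...">` elements; the dictionary standing in for the database row and both function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TOPIC_XML = """<topic id="intro">
  <title>Introduction</title>
  <prolog>
    <author>Ann</author>
    <metadata><audience type="administrator"/></metadata>
  </prolog>
  <body><p>Welcome.</p></body>
</topic>"""


def xml_metadata(xml_text):
    """Extract the metadata values that are stored inside the topic file."""
    root = ET.fromstring(xml_text)
    audience = root.find("./prolog/metadata/audience")
    return {
        "author": root.findtext("./prolog/author"),
        "audience": audience.get("type") if audience is not None else None,
    }


def find_conflicts(db_row, xml_text):
    """Return {field: (database value, XML value)} for every mismatch."""
    in_xml = xml_metadata(xml_text)
    return {field: (db_row.get(field), value)
            for field, value in in_xml.items()
            if db_row.get(field) != value}


# Hypothetical database row: someone changed the audience in the CCMS only.
db_row = {"author": "Ann", "audience": "user"}
conflicts = find_conflicts(db_row, TOPIC_XML)
print(conflicts)
```

Detecting the mismatch is the easy part. Deciding which of the two values is authoritative is the manual correction step the slide refers to.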
  38. Reason 5: The Default DITA Content Model is too Complex
  39. Trend in CCMS: Content Model Complexity Reduction
      In the last 10 years, there has been a very strong trend in the CCMS market to reduce content model complexity (aka semantic DTDs).
      Content departments observed that, in the long term, they never got back their investment in the design, implementation, training and especially maintenance of their sophisticated, made-to-order content models.
      The trend is simply to move the needed business data from the XML content into the database, where it is much easier to implement, manage, interface with, retrieve and use productively.
  40. Examples of Content Model Complexity Reduction
      Some examples:
      • Topic types or classes are just metadata in the database. The variability on the XML editor (DTD) level is reduced to an absolute minimum.
      • All metadata assigned to a topic is moved from the XML into the database.
      • Fine-grained variability in the content is handled by variables, which on the XML content level are just very simple references into the database. The data model for variables in the database is very powerful and table-oriented (= Excel), so that it is easy to maintain versions, languages and taxonomic dependencies of variable names and values without touching the XML content.
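The variables mechanism in the last bullet might look roughly like this. Everything here is hypothetical: the `<var ref="..."/>` element, the table contents and the product name are invented for illustration, and a real system would resolve references in a proper XML pipeline rather than with a regular expression.

```python
import re

# Hypothetical variables table: (variable name, language) -> value.
VARIABLES = {
    ("product_name", "en"): "TurboPump 3000",
    ("product_name", "de"): "TurboPumpe 3000",
    ("max_pressure", "en"): "16 bar",
    ("max_pressure", "de"): "16 bar",
}

# In the XML content a variable is only a simple reference (invented element).
VAR_REF = re.compile(r'<var ref="([^"]+)"/>')


def resolve_variables(xml_text, language):
    """Replace every variable reference with its value for the given language."""
    def lookup(match):
        return VARIABLES[(match.group(1), language)]
    return VAR_REF.sub(lookup, xml_text)


content = '<p>The <var ref="product_name"/> is rated for <var ref="max_pressure"/>.</p>'
print(resolve_variables(content, "en"))
print(resolve_variables(content, "de"))
```

Changing a value, or adding one for a new language, is an edit to the table only; the XML content is not touched.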
  41. DITA Specialization
      As a computer scientist, I think DITA Specialization is a really impressive and elegant solution for the implementation of sophisticated content models.
      But again, DITA adds all this sophistication at the XML level, where it incurs a big cost in incidental complexity.
      I think there is a consensus that even the default DITA content model is already challenging for most technical writers new to component-based authoring.
  42. DITA Specialization
      There is a paradox: just to trim the content model down to a more manageable scope, you already need a significant amount of consulting and configuration.
      The OASIS Lightweight DITA Initiative, chaired by Michael Priestley (IBM), is trying to remedy this situation, so that you can start simple and add more features later, when you understand the principles and can be sure that you really need them.
  43. Summary & Conclusion
  44. Summary of our 5 Reasons against DITA
      1. Coverage of Component Content Management Requirements in DITA is Surprisingly Small.
      2. Evolution of the DITA Standard is too Slow.
      3. How DITA deals with the Number of Files Explosion.
      4. DITA's XML-first Paradigm.
      5. The Default DITA Content Model is too Complex.
  45. Conclusion
      As long as the DITA standard is based on a non-negotiable XML-first paradigm, it will always incur a tremendous incidental complexity cost on multiple levels:
      • Initial configuration, even if just to trim DITA back, is significant.
      • Integrating DITA into a CCMS (or database) is fragile and expensive.
      • Technical writers will need a lot of training and close motivation monitoring.
  46. Recommendation
      Our recommendation would be to decouple the DITA business logic from the XML-first principle.
      In the end, this means the DITA Open Toolkit would no longer be just a smart topic aggregation compiler, but would behave much more like an integrated database application, in short: just like a state-of-the-art CCMS.
      Tekom 2015 presents a very convenient opportunity to take a closer look at these systems!
  47. Thank you very much for your attention!
  48. Read our blog: http://blog.schema.de