Reports and DITA Metrics IXIASOFT User Conference 2016

Reports and DITA Metrics IXIASOFT presentation at the IXIASOFT User Conference 2016, by Keith Schengili-Roberts, DITA Information Architect, IXIASOFT, Nathalie Laroche, Lead Technical Writer, IXIASOFT and Dustin Clark, Lead DITA Architect, Intel

Published in: Software

  1. 1. Reports and DITA Metrics. Keith Schengili-Roberts, DITA Information Architect, IXIASOFT; Nathalie Laroche, Lead Technical Writer, IXIASOFT; Dustin (Dusty) Clark, Lead DITA Architect, Intel
  2. 2. Agenda •  Introduction •  DITA Metrics for Production Purposes •  DITA Metrics with the DITA CMS •  Agile, the DITA CMS and DITA Metrics •  Topic Type Usage/Ratios •  DITA Structural Metrics •  DITA Metrics for Checking Consistency •  Migrating to DITA 1.3 •  DITA Reuse Metrics •  How To Use the DITA CMS Reports Feature (Live Demo) [Nathalie] •  The DITA QA Plugin (Live Demo) [Dusty] •  Q/A
  3. 3. Introductions Keith Schengili-Roberts, DITA Specialist, IXIASOFT What I do: •  Liaison with OASIS; on DITA Adoption and Technical Committees •  Industry researcher •  DITA evangelist •  Have 10+ years of experience with DITA XML Nathalie Laroche, Lead Technical Writer and Product Owner, IXIASOFT What I do: •  Technical Writer for more than 20 years •  Working in DITA for 7+ years •  Now also Product Owner for the IXIASOFT web tools Dusty Clark, Lead DITA Architect, Intel Corporation What I do: •  Information architect for a distributed team •  Tools developer and systems integrator •  10+ years authoring experience, 6+ years DITA experience
  4. 4. Why Measure Documentation Production? Provides the ability to: • Set more accurate project estimates • Justify need for more resources (tools/people) • Understand quality of production • It’s also an opportunity to measure value
  5. 5. Documentation Metrics: The Bad Old Days Prior to the advent of structured content, documentation managers could easily measure only a few things: •  How many pages/publications were produced over time •  Individual writer productivity •  Quality, through painstaking reviews
  6. 6. DITA and Return on Investment (ROI) • This has been a primary focus of much work on DITA metrics •  Topic-based nature of DITA lends itself to cost-based measures • The book DITA Metrics 101 (2013) looks at this aspect almost exclusively, focusing on justifying the cost of investing in DITA + CMS
  7. 7. DITA Metrics for Production Purposes • But not everything is about ROI: § What if you have a mature DITA environment and have already established your ROI? § Or, are simply looking for ways to use DITA + metrics to measure things that are not cost-related? •  DITA metrics can be used to guide managers, information architects and writers on how to improve their content
  8. 8. DITA Production Metrics without a CMS • DITA metrics outside of a CMS are limited to information contained within the DITA files + file system § Can search for text strings within the XML, and also use date/time info from filename… and that’s about it §  Make no mistake though, there’s plenty of information there to be mined • But more options are available within the IXIASOFT DITA CMS
  9. 9. DITA Production Metrics with the DITA CMS •  The DITA CMS captures additional information which can be used for metrics, including: § Author information § Workflow status § How many times a topic has been modified/versioned § Topic/map dependencies § Word count …and much more!
  10. 10. Agile and DITA Metrics •  At Scrum meetings, the doc manager can report on the topics assigned to their group and on how “done” they are •  The DITA CMS enables you to capture a snapshot of how “done” (i.e. at what workflow status) the topics/images/objects in your map are
  11. 11. Time and Workflow Metrics • The DITA CMS tracks who is responsible for producing each topic and whether it is on schedule • This is possible because workflow data is stored as an object associated with maps/topics; it is not possible with DITA alone
  12. 12. Content Types and Document Make-up Looks at the topic types that go into maps •  Why would this matter? It can provide you with an idea as to whether content is being properly “typed”. It ensures that writers are writing/structuring content properly. Some examples: § A typical “Installation Guide” ought to be made up primarily of task topics § APIs ought to have a lot of reference topics § Would generally expect to have more maps than bookmaps
  13. 13. Content Types within a Single Document • Single document: § Joe Gollner and Eliot Kimber have uploaded an excellent set of sample DITA demo files at: github.com/gnostyx/dita-demo-content-collection § It is a User Guide for a fictional software application called “Thunderbird” § It’s a User Guide, but there don’t seem to be many task topics to help people use the product…
  14. 14. Counting String Instances in Excel •  Do a search for each topic type contained in the map, then count the results •  If you can output the results to Excel, simply select the column and use COUNTIF with the string you are looking for, in this case: =COUNTIF(B2:B100, "concept")
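
The same count can be scripted instead of done in a spreadsheet. The following is a minimal sketch, assuming you have already exported one topic type name per topic in the map; the function name and the sample list are hypothetical and are not taken from the presentation.

```python
from collections import Counter


def topic_type_breakdown(topic_types):
    """Return {topic_type: (count, percent_of_total)}, most common first."""
    counts = Counter(t.strip().lower() for t in topic_types if t.strip())
    total = sum(counts.values())
    return {t: (n, round(100 * n / total, 1)) for t, n in counts.most_common()}


# Hypothetical sample: one entry per topic referenced by the map.
sample = ["concept", "concept", "task", "reference",
          "concept", "task", "concept", "topic"]

breakdown = topic_type_breakdown(sample)
for topic_type, (count, percent) in breakdown.items():
    print(f"{topic_type}: {count} ({percent}%)")
```

Reporting a percentage alongside the raw count makes it easier to compare documents of different sizes, which is what the charts on the following slides do.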
  15. 15. Thunderbird Document Metrics •  I would argue that a user-oriented document ought to have a more even balance of concepts and tasks than we see here •  My direction to the Thunderbird technical writers: check that all possible tasks a user might encounter are explained Count: 87
  16. 16. Content Types within All Documents Over a Year •  This chart looks at the DITA topic breakdown for all documentation produced by IXIASOFT in 2015 •  Documentation consists of User/ Admin Guides for our DITA CMS and TEXTML software •  Good ratio of concept to task topics •  When I showed this to our Lead Tech Doc person, she immediately wanted to investigate the 3% of generic topic types §  Nice practical example of how DITA metrics can improve quality! Count: 1307
  17. 17. Tracking Topic Type Usage Over Several Years •  These charts look at several years’ worth of semiconductor documents •  While I expected a high percentage of reference topics, I wondered whether some topics that ought to have been tasks had been written as references instead
  18. 18. Tracking Topic Type Usage Directing Change •  Asked writers to be more diligent about writing task topics where they might be tempted to write them as references instead •  Result was a measurable increase in the percentage of task topics created over the course of the following year
  19. 19. Ratio of Structural Elements •  Look at the ratio of structural elements, such as bookmaps and maps •  Why? Provides an idea as to how content is being structured •  If, for example, you use maps as “sub-maps”, you would expect to see more maps than bookmaps § That’s exactly what we see here Count: 118
  20. 20. Readability Metrics •  Readability statistics provide an idea as to how easy or hard a document is to read •  Why? The need for clarity and simplicity. “Most users prefer clear, simple language, [web]site visitors with poor reading skills need it.” (Nielsen & Loranger) •  In documentation you want to aim at or below the likely reading level of your audience •  Among the most widely accepted readability metrics are the Flesch-Kincaid reading ease and grade level tests •  Can do this either topic-by-topic or document-by-document
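
For reference, both tests are simple formulas over word, sentence and syllable counts. The sketch below uses the standard published coefficients; the syllable counter is a rough vowel-group heuristic chosen for illustration (an assumption, not what any particular tool uses), so scores will differ slightly from those reported by word processors.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_counts(text):
    """Return (words, sentences, syllables) for plain text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words, sentences, syllables):
    """Higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


counts = text_counts("Click Save. The topic is checked in to the repository.")
print(round(flesch_reading_ease(*counts), 1))
print(round(flesch_kincaid_grade(*counts), 1))
```

To run this topic-by-topic, strip the markup first and pass only the text content of each topic.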
  21. 21. Other Possible DITA Consistency Checks • If you have a house style that recommends against certain tags (for example: <b>, <i> or <u>) search for topics containing those tags • If you want to optimize use of relationship tables, look at the ratio between the number of topics in a map and the number of topicrefs within the relationship table • Are you adding short descriptions to your topics?
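
Two of these checks (discouraged tags and missing short descriptions) can be sketched with the Python standard library. This is an illustration only: the tag list mirrors the slide's example, the function name is hypothetical, and the sample topic is made up.

```python
import xml.etree.ElementTree as ET

# Assumption: the house style discourages these tags (the slide's example).
DISCOURAGED_TAGS = {"b", "i", "u"}


def check_topic(xml_text):
    """Report discouraged tags and whether a non-empty <shortdesc> exists."""
    root = ET.fromstring(xml_text)
    found = sorted({el.tag for el in root.iter() if el.tag in DISCOURAGED_TAGS})
    # ".//shortdesc" also finds a shortdesc nested inside <abstract>.
    shortdesc = root.find(".//shortdesc")
    has_shortdesc = (shortdesc is not None
                     and bool("".join(shortdesc.itertext()).strip()))
    return {"discouraged_tags": found, "has_shortdesc": has_shortdesc}


# Hypothetical topic used only to exercise the checks.
sample = """<concept id="c1">
  <title>Example</title>
  <conbody><p>Press <b>Enter</b> to <i>continue</i>.</p></conbody>
</concept>"""

result = check_topic(sample)
print(result)
```

Running the check over every topic in a map gives the kind of non-compliance list shown on the next slide.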
  22. 22. A Couple of Sample Results •  Clearly Thunderbird is doing something right! ;) •  List of non-compliant IXIASOFT topics merits further investigation
  23. 23. Preparing to Move Content to DITA 1.3 •  DITA 1.3 adds new elements and features that open up many new possibilities for structuring and describing content •  A couple of easy examples: § The new XML Mention domain means that you can replace tag names written with escaped angle brackets (i.e. &lt; and &gt;) with a pair of “<xmlelement>” tags §  This is the most common example; other elements in this domain describe attributes, parameter entities, numeric character references and more § With the new Troubleshooting topic type, look for obvious candidates for topic conversion by searching for the word “troubleshoot”
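
Both searches can be approximated with regular expressions over the raw topic source. The sketch below is an assumption about how one might script it, not the search the presenters ran; the pattern only recognizes simple tag mentions without attributes, and every match should still be reviewed by a writer.

```python
import re

# Matches escaped tag mentions such as &lt;p&gt;, &lt;/p&gt; or &lt;br/&gt;.
TAG_MENTION = re.compile(r"&lt;/?([A-Za-z][\w.-]*)\s*/?&gt;")


def suggest_xmlelement(raw_xml):
    """Rewrite escaped tag mentions as <xmlelement>name</xmlelement>."""
    return TAG_MENTION.sub(
        lambda m: f"<xmlelement>{m.group(1)}</xmlelement>", raw_xml)


def is_troubleshooting_candidate(raw_xml):
    """True if the topic source contains a word starting with 'trouble'."""
    return re.search(r"\btrouble\w*", raw_xml, re.IGNORECASE) is not None


# Hypothetical paragraph from a topic, as it appears in the raw file.
raw = "<p>Use the &lt;shortdesc&gt; element in every topic.</p>"
converted = suggest_xmlelement(raw)
print(converted)
```

Note that opening and closing mentions are both reduced to the bare element name, which is usually what an `<xmlelement>` mention should contain.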
  24. 24. Results from Search for Angle Bracket Entities • 40 matches were found in 1502 topics from 2015 Sample DITA file full of &lt;*&gt; examples
  25. 25. Results from Search for “trouble*” in Topics •  38 file matches from 1502 topics; each would need to be investigated •  Example above is a solid troubleshooting topic candidate
  26. 26. Other DITA 1.3 Possibilities • Search all maps for the names of keys and look for those that have the same value (“name”) § Introduction of keyscopes in DITA 1.3 allows you to share keys (and the values) across maps; identifying key matches suggests opportunities for key scoping • Search for instances of MathML or SVG graphics § DITA 1.3 has MathML and SVG “baked in”, so you can insert code directly or partition them off as referenced topics § In most instances, search for content contained within <foreign> tags for likely candidates
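
The key-name search can be sketched as follows, assuming the maps are available as XML text. The map names, key names and helper functions are hypothetical; the only DITA detail relied on is that `@keys` holds one or more space-separated key names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def keys_by_name(maps):
    """maps: {map_name: xml_text}. Return {key_name: [maps defining it]}."""
    seen = defaultdict(set)
    for map_name, xml_text in maps.items():
        for el in ET.fromstring(xml_text).iter():
            for key in el.get("keys", "").split():
                seen[key].add(map_name)
    return {key: sorted(names) for key, names in seen.items()}


def shared_keys(maps):
    """Keys defined in more than one map: candidates for key scoping."""
    return {k: v for k, v in keys_by_name(maps).items() if len(v) > 1}


# Hypothetical maps used only for illustration.
maps = {
    "admin.ditamap": '<map><keydef keys="product-name"/><keydef keys="install"/></map>',
    "user.ditamap": '<map><keydef keys="product-name"/></map>',
}
shared = shared_keys(maps)
print(shared)
```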
  27. 27. DITA Reuse Metrics • Arguably the most influential article on this topic is Bill Hackos’ “Reuse of DITA Topics? What is the Best Metric to Measure the Success of Your Reuse of DITA Topics?” ( http://ow.ly/X7mzM)
  28. 28. DITA Reuse Metrics • Bill Hackos proposed “Percent Repository Words Reused in Context” (PRWRC) where: PRWRC = (Words in All Produced Content – Words in the Repository)/(Words in the Repository) §  From his example: §  Document1 – 25,413 words §  Document2 – 23,069 words §  Document3 – 26,366 words §  Total number of words in the produced documents – 74,848 words §  Total number of words in the repository – 40,060 words PRWRC = (74,848 – 40,060)/40,060 = 87%
  29. 29. Example Based on IXIASOFT DITA Documents Based on 2015 numbers from IXIASOFT documentation: • Total number of words in the repository: 268,663 • Words in All Produced Content: 623,078 • PRWRC = (623,078 – 268,663)/268,663 • PRWRC = 354,415 / 268,663 = 132%
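
The PRWRC arithmetic from the two preceding slides can be checked with a few lines of code. The function name is ours; the figures are the ones quoted on the slides.

```python
def prwrc(words_produced, words_in_repository):
    """Percent Repository Words Reused in Context, as a fraction."""
    return (words_produced - words_in_repository) / words_in_repository


# Bill Hackos' example: three documents built from a 40,060-word repository.
hackos_docs = [25_413, 23_069, 26_366]
hackos = prwrc(sum(hackos_docs), 40_060)

# IXIASOFT 2015 documentation figures.
ixiasoft = prwrc(623_078, 268_663)

print(f"Hackos example: {hackos:.0%}")
print(f"IXIASOFT 2015: {ixiasoft:.0%}")
```

A value above 100% simply means that the produced content contains more than twice as many words as the repository it was built from.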
  30. 30. How is a +100% Value Possible? •  Easy: ditaval. Conditional filtering lets the same repository content be published several times in different outputs •  Though ditaval is not mentioned in the original article, Bill Hackos does say that values over 100% are entirely possible •  We have a number of publications that are created based on a series of ditaval values, as many as 21 per bookmap
  31. 31. How To Use the IXIASOFT DITA CMS Reports Feature • DITA CMS contains a tremendous amount of info: § Workflow § Authors § Number of revisions § Creation and modification dates § Versions § Labels § Conditions § Localization § Reviews • DITA CMS Reports Feature: Data mining tool
  32. 32. The DITA CMS Reports Feature 3 steps: 1.  Create a query: What information do you want to extract from the Content Store? 2.  Create a viewpoint: How do you want to organize the information? 3.  Create the report: Associate a query with a viewpoint
  33. 33. Running the Report § The DITA CMS runs the query, organizes the results according to the viewpoint specified, and then uses an XSL file to transform the data § By default, it generates an HTML report (this can be configured) § Two ways of generating report: •  Manually: HTML report •  Scheduler: HTML report + .tsv file of the results And that’s where the fun begins :)
  34. 34. Step 1: Create a Query • Search all topics • Save query as xml (e.g., “All topics”)
  35. 35. Step 2: Create a Viewpoint
  36. 36. Step 3: Create the Report
  37. 37. Open .tsv in Data Mining Tool
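
Once the scheduler has produced a .tsv file, any tool that reads tab-separated data can summarize it. The sketch below is illustrative only: the column names (`title`, `author`, `status`) and the sample rows are assumptions, since the real columns depend on the viewpoint you defined.

```python
import csv
import io
from collections import Counter

# Hypothetical report content; real columns depend on the viewpoint.
SAMPLE_TSV = "\n".join([
    "title\tauthor\tstatus",
    "Installing the client\twriter_a\tdone",
    "Configuring the server\twriter_b\treview",
    "Log in\twriter_a\tdone",
    "Log in\twriter_b\twork",
])


def count_by(tsv_text, column):
    """Count report rows per distinct value of the given column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[column] for row in reader)


status_counts = count_by(SAMPLE_TSV, "status")
title_counts = count_by(SAMPLE_TSV, "title")
duplicate_titles = [t for t, n in title_counts.items() if n > 1]

print(status_counts)
print(duplicate_titles)
```

For a real report, replace `io.StringIO(tsv_text)` with a file opened via `open(path, newline="", encoding="utf-8")`.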
  38. 38. Other Examples • Reports of maps and topics that were created in the last release cycle: § Is the # of topics what you expected? §  # is higher: Not enough reuse? Unplanned features? §  # is lower: Overestimates? Length of topics? § Look at topic titles: Can you see possibilities for reuse? (e.g., two topics with the same title, yet another “log in” topic)
  39. 39. Other Examples • Report of topics that were reused in the release cycle: § Search for topics that were created before the start of the release and were modified during the release § Compare with topics that were created during the release to get a ratio • Report of modified topics over a week: § Performance issues? Look at modification dates; are your writers all checking in at the same time?
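
The reuse comparison described above reduces to two counts over creation and modification dates. A minimal sketch, assuming the report gives you one (created, modified) date pair per topic; the dates and release window below are invented for illustration.

```python
from datetime import date


def reuse_counts(topics, release_start, release_end):
    """Return (reused, created) for (created_on, modified_on) date pairs.

    reused:  created before the release and modified during it
    created: created during the release
    """
    reused = created = 0
    for created_on, modified_on in topics:
        if release_start <= created_on <= release_end:
            created += 1
        elif (created_on < release_start
              and release_start <= modified_on <= release_end):
            reused += 1
    return reused, created


# Hypothetical topics and release window.
topics = [
    (date(2015, 6, 1), date(2016, 2, 10)),   # reused
    (date(2015, 9, 15), date(2015, 10, 1)),  # untouched this release
    (date(2016, 1, 20), date(2016, 2, 1)),   # created
    (date(2016, 2, 5), date(2016, 3, 1)),    # created
    (date(2014, 3, 3), date(2016, 1, 5)),    # reused
]
reused, created = reuse_counts(topics, date(2016, 1, 1), date(2016, 3, 31))
print(f"reused: {reused}, created: {created}")
```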
  40. 40. Why Use Reports? • Reports can be scheduled at specific intervals (e.g., weekly and monthly reports) • Queries can be complex (Think once, do many times) • Process can be reproduced systematically (you know it’s always the same query that gets run) • Reports can be discussed and planned before the start of the work
  41. 41. The DITA QA Plugin
  42. 42. Overview • DITA Open Toolkit Plugin • Part of the DITA Community project on GitHub • Generates: § HTML dashboard overview § Detailed CSV report § XML data file • Checks can be customized § Structural § Terminology § Count metrics
  43. 43. DITA CMS – QA Output Type
  44. 44. DITA QA Plugin • Download: https://github.com/dita-community/org.dita-community.qa • IXIASOFT documentation: QA Plugin setup • Ditanauts blog: http://ditanauts.org/tag/qa
  45. 45. Setting up the QA Plugin • Follow instructions in IXIASOFT documentation to set up • Reminder: chunk attribute must be set on root map § Set using xmltask (for IXIASOFT CMS) § Set on map itself § Set using setchunk parameter
  46. 46. Output • HTML report – dashboard overview • CSV – detailed list of violations • Output map – violations ditamap • Database (.dita) file – database of all collected values
  47. 47. Database File Output
  48. 48. Creating Rules • Rules are XPath “if” statements • Add rules to xsl/qachecks/_qa_checks.xsl • Hint: there’s a compiler tool that allows you to maintain your checks in DITA
  49. 49. Compiler Tool – Keep Rules in DITA
  50. 50. QA Rule Best Practices • Supply specific resolutions for each violation • Keep the list of violations short and impactful •  Be as specific as possible (minimize false positives) • Match on the @class value instead of the element name
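
The last recommendation matters because specialized elements carry their ancestry in `@class`, so a class test also catches specializations that a name test misses. The plugin's rules are XPath (the usual DITA idiom is `contains(@class, ' topic/topic ')`); the sketch below shows the same idea in Python and assumes the `@class` attributes are present in the XML, as they are after DTD defaulting or DITA-OT preprocessing. The sample document is made up.

```python
import xml.etree.ElementTree as ET


def matches_class(element, class_token):
    """True if the DITA @class value contains the token as a whole word."""
    return f" {class_token} " in f" {element.get('class', '')} "


# Hypothetical preprocessed content with @class attributes filled in.
sample = """<dita>
  <task class="- topic/topic task/task ">
    <title class="- topic/title ">Install</title>
  </task>
  <concept class="- topic/topic concept/concept ">
    <title class="- topic/title ">About</title>
  </concept>
</dita>"""

root = ET.fromstring(sample)
by_name = [el.tag for el in root.iter("topic")]
by_class = [el.tag for el in root.iter() if matches_class(el, "topic/topic")]

print(by_name)
print(by_class)
```

Matching on the element name `topic` finds nothing here, while matching on the class token finds both the task and the concept.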
  51. 51. Q/A