... a[n] [XML] topic is a chunk of content, or content module, that is understood in isolation and used in multiple contexts
If you're a player in the ED&D side of the house for an automotive (or defense or commercial vehicle) OEM or supplier, you may have stumbled upon folks from the service side of the house referencing "XML," "single-source publishing," "dynamic publishing," etc. And you may have walked away from the encounter with a clear-as-mud understanding of what they're talking about.
The snapshot explanation that follows will acquaint you with the concept and lingo of structured content, the approach that has been fundamental to developing service/operator technical publications and parts information since the late 1990s.
Extensible Markup Language (XML)
Structured authoring for automotive (and defense and commercial truck) is based on Extensible Markup Language (XML), which puts into practice the strategy of developing and managing content separately from format. Content owners and Subject Matter Experts (SMEs) plan the topics to develop; topics are subjects, themes, or talking points. Essentially, a topic is a chunk of content, or content module, that is understood in isolation and used in multiple contexts. Content developers, or authors, manage the development, editing, review, translation, and approval of each topic as a single source, or element. This includes both the structural hierarchy of topics and the markup of topics, which adds meaning, or intelligence, to content as hidden, background information about a topic in the form of metadata and attributes.
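As a rough illustration of a topic that carries its own metadata and attributes, here is a minimal Python sketch using only the standard library. The element names follow common DITA conventions, but the topic id, attribute values, wording, and publication names are all invented for this example and do not come from any real program.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: the id, attribute values, and text are assumptions
# made up for illustration only.
TOPIC_XML = """
<task id="replace-cabin-air-filter" audience="technician" product="model-x">
  <title>Replace the cabin air filter</title>
  <prolog>
    <metadata>
      <keywords><keyword>HVAC</keyword><keyword>filter</keyword></keywords>
    </metadata>
  </prolog>
  <taskbody>
    <steps>
      <step><cmd>Open the glove box.</cmd></step>
      <step><cmd>Remove the filter cover.</cmd></step>
      <step><cmd>Slide out the old filter and insert the new one.</cmd></step>
    </steps>
  </taskbody>
</task>
"""

def describe(topic):
    """Collect the 'hidden' information the markup attaches to the content."""
    return {
        "id": topic.get("id"),
        "title": topic.findtext("title"),
        "audience": topic.get("audience"),
        "keywords": [k.text for k in topic.iter("keyword")],
        "steps": [c.text for c in topic.iter("cmd")],
    }

topic = ET.fromstring(TOPIC_XML.strip())
info = describe(topic)

# Hypothetical publications that reuse the same single-source topic by id.
PUBLICATIONS = {
    "service-manual": ["replace-cabin-air-filter"],
    "owner-quick-guide": ["replace-cabin-air-filter"],
}
reused_in = [name for name, ids in PUBLICATIONS.items() if info["id"] in ids]

print(info["title"], "is used in:", ", ".join(reused_in))
```

The point of the sketch is that the topic is written once, and each publication refers to it by id rather than holding its own copy.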
On December 18, 2015, the Organization for the Advancement of Structured Information Standards (OASIS) ratified DITA 1.3 as its approved standard for authoring and publishing technical content. DITA stands for Darwin Information Typing Architecture, an open, topic-based XML standard first developed by IBM (in 2000) and used increasingly in the automotive industry since the 1.2 release in 2010. Enterprise IT rules and requirements determine what an organization uses as its Content Management System (CMS) repository for DITA files. End-user experience and allocated budget factor into which XML-based authoring tool is chosen.
XML-based Content Planning
Initial XML-based content planning determines the XML Schema Definition (XSD), a document definition that specifies what counts as valid content structure. The XSD becomes the content model, or architecture. This requires functional analysis by a team of representatives from all functional areas that develop and use the content. For many industries, a standard schema is prescribed, such as DITA for automotive or S1000D for defense. Analysis and input by a functionally representative team ensure that the structure and markup, as consistent with the chosen definition or schema, make sense.
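To make the idea of a content model concrete, the following Python sketch checks a topic against a tiny, hand-written set of structure rules. It is only a stand-in: a real XSD expresses far more (element order, cardinality, data types, attributes) and would be enforced by a schema-aware validator. The `CONTENT_MODEL` table and the sample topics are assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, much-simplified content model: for each parent element,
# the child elements it must contain at least once.
CONTENT_MODEL = {
    "task": ["title", "taskbody"],
    "taskbody": ["steps"],
    "steps": ["step"],
    "step": ["cmd"],
}

def check_structure(xml_text):
    """Return a list of problems; an empty list means the text conforms."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The text is not even well-formed XML, so no model can be applied.
        return [f"not well-formed: {err}"]
    problems = []
    for element in root.iter():
        for required in CONTENT_MODEL.get(element.tag, []):
            if element.find(required) is None:
                problems.append(f"<{element.tag}> is missing <{required}>")
    return problems

good = ("<task><title>T</title><taskbody><steps>"
        "<step><cmd>Do it.</cmd></step></steps></taskbody></task>")
no_title = ("<task><taskbody><steps>"
            "<step><cmd>Do it.</cmd></step></steps></taskbody></task>")
broken = "<task><title>T</task>"

for sample in (good, no_title, broken):
    print(check_structure(sample))
```

The second sample is well-formed XML but does not conform to the model, while the third fails before the model is even consulted. That is the distinction between well-formed content and content that is valid against a definition.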
The team will also plan the Extensible Stylesheet Language Transformations (XSLT), or style templates needed for the required deliverables, such as XSLT for the Web (XML-to-HTML conversion) and Extensible Stylesheet Language Formatting Objects (XSL-FO) for a formal print document (XML conversion to a PDF file). XSLT determines the formatting output for imported XML text, much like a styles template in a word-processing document. The difference is that the XSLT formatting rules are stored and developed independently of the XML content module files.
So with the task of responding to a Request for Proposal (RFP), for example, a program associate would have available in an XML-based content repository a high-level product design description leveraged from previous programs. The program associate would publish this content, together with other content relevant to the proposal response, through the XSL-FO stylesheet (pre-configured to show the company logo/branding) for distribution to management for review in Adobe Acrobat, before the new business lead submits the formal response to the customer as a cleanly published PDF document.
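The separation of content from formatting can be sketched in a few lines of Python. This is not XSLT or XSL-FO; two plain functions stand in for two stylesheets, and the topic text and function names are assumptions made up for the example. What it shows is that the same source content feeds each output without being edited.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source content module.
SOURCE = (
    "<topic id='design-overview'>"
    "<title>Product design overview</title>"
    "<p>The module mounts behind the instrument panel.</p>"
    "<p>It connects to the vehicle network.</p>"
    "</topic>"
)

def to_html(topic):
    """Stand-in for a Web stylesheet: title becomes a heading."""
    parts = [f"<h1>{escape(topic.findtext('title'))}</h1>"]
    parts += [f"<p>{escape(p.text)}</p>" for p in topic.findall("p")]
    return "\n".join(parts)

def to_plain_text(topic):
    """Stand-in for a print stylesheet: title is upper-cased and underlined."""
    title = topic.findtext("title")
    lines = [title.upper(), "=" * len(title)]
    lines += [p.text for p in topic.findall("p")]
    return "\n".join(lines)

topic = ET.fromstring(SOURCE)
html_output = to_html(topic)
text_output = to_plain_text(topic)

print(html_output)
print()
print(text_output)
```

Changing the branding or layout means changing only a formatting function (in practice, a stylesheet); the content module itself stays untouched.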
DITA is a highly adaptable approach to XML and the fastest growing XML architecture for developing and publishing technical content, largely due to its adoption as an OASIS standard for structured authoring and content reuse. Initially embraced for documenting software, it is where the idea of object-oriented content (for reuse) matured.
DITA plus a CMS (tied to a workflow and versioning API) is the automated form of XML-based structured authoring for the automotive industry. It is ideal for enterprises with “big data”: many documents that require disciplined processes for change management and configuration control to meet legal and regulatory compliance and the industry standards included as requirements in program source packages. From a resource-efficiency and cost-reduction standpoint, the DITA approach to XML-based content management is lightweight on a network and optimized for reuse and for a rapid, tracked release cycle.
Adobe FrameMaker XML Author, described as “a complete solution for bi-directional technical content,” has DITA 1.3 support built in. FM users may author native XML content through a WYSIWYG interface and further customize it with dynamic content filters, a Quick Element Toolbar (QET), and new table features. Authors may also write inline math equations through FrameMaker XML’s native integration with the MathFlow Structure and Style editors from Design Science; the equations are exported to published deliverables as high-quality vector graphics. The Packager feature allows content to be shared with all referenced object files from the DITA exchange CMS included.
Converting Legacy Content to XML-ready Content
A structured-content strategy moves content developers away from inefficient, formatting-centric word processing to strategic, topic-based structured authoring and single-source publishing. Topic-element structure allows content to be assigned to SMEs by topic, so that each develops and reviews content in their area of specialization. Because a legacy document typically combines many topics in a single file, it is not feasible to convert each legacy file, one-to-one, to an XML-structured file.
Conversion from word-processing formats to XML maps styles to XML element tags. Conversion is imperfect, as word-processing styles are formatting-based and XML tags are semantically driven. To eliminate redundancy, a good practice is to index legacy content prior to the conversion process and take care to convert only “final” or latest versions. Word-processing styles in legacy documents should also be updated for topic-associated modules of content so that they’ll map directly to the corresponding tags for XML topic elements. (New content generated directly in the authoring tool is tagged from the start to the refined, prescribed topic-element structure.) A best practice when initiating the actual conversion process is to conduct a pilot run with a small set of legacy files to validate and refine the steps for preparing files for conversion.
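The style-to-tag mapping described above can be sketched as a simple lookup in Python. The style names, the tag choices, and the sample paragraphs are hypothetical; a real conversion would read styles from the legacy files themselves and target the tags of the chosen schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from legacy word-processing styles to XML tags.
STYLE_TO_TAG = {
    "Heading 1": "title",
    "Body Text": "p",
    "Warning": "note",
}

def convert(paragraphs):
    """Convert (style_name, text) pairs into a topic element.

    Returns the topic plus the paragraphs whose style had no mapping,
    so they can be reviewed during a pilot run.
    """
    topic = ET.Element("topic")
    unmapped = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style)
        if tag is None:
            # Formatting-only styles carry no meaning a tag can capture.
            unmapped.append((style, text))
            continue
        ET.SubElement(topic, tag).text = text
    return topic, unmapped

legacy = [
    ("Heading 1", "Bleeding the brake system"),
    ("Body Text", "Park the vehicle on level ground."),
    ("Warning", "Brake fluid damages paint."),
    ("Bold Centered", "IMPORTANT"),
]

topic, unmapped = convert(legacy)
print(ET.tostring(topic, encoding="unicode"))
print("Needs review:", unmapped)
```

The leftover "Bold Centered" paragraph shows why conversion is imperfect: a purely visual style says nothing about what the text means, so someone has to restyle it in the legacy file or tag it by hand.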
Further reading:
Technical Documentation and Process
DITA – the Topic-Based XML Standard
Content Strategy 101
© 2017, Powerplay Communications