What is DITA XML? It’s a technology designed to facilitate text authoring, reuse, and translation.
It’s a bit of a foreign concept to many people, but we’ll try to make it easy to grasp so that you can get a basic understanding of DITA XML as well as how to start using it.
DITA XML Article Series
This article is part one of our DITA XML series. You can find part two of the series on DITA localization here: How to Translate DITA Projects [Step-by-Step Guide].
In this article we’ll bring you up to speed on the fundamentals of DITA technical writing: why you need it, how it works, and how you can use different tools to author, edit, translate, and transform files according to your business’s needs.
Table of Contents
- How is DITA Different From Other Mark-up Languages?
- Structured Authoring in Technical Writing Using DITA
Why Use DITA XML?
Before we explain exactly what DITA XML is or how it works, let’s show why it’s useful.
Imagine you have two similar products, for example a bicycle and a tricycle, and you have to write a user guide for each of them. Since they are so similar, you don’t want to write two user guides from scratch, so you copy most of the text from the bicycle user guide to the tricycle user guide.
Now imagine you have updated the product line, and you need to update the user guides. Will you update each guide separately?
Now imagine you have dozens of similar products. Updating all related content for each product separately will be a huge task and may result in hundreds of errors.
That is, if you use an old-fashioned workflow.
But with DITA XML, things are different.
With DITA XML you can write the content in modules, which can then be plugged into user guides as needed. So for example, your section on wheels might be identical between the bicycle user guide and the tricycle user guide . . . and the unicycle user guide.
If you are using DITA XML, you can update the section on wheels, and your changes will be automatically propagated to all of the publications and user guides that include that text.
So with DITA XML, your content can be reused as needed throughout all of the organization’s publications.
Reusing text like this across publications in the organization can save a fortune and improve consistency and quality.
This strategy of reusing content can also save money on translations. When content reuse is maximized, that means less original content needs to be translated.
So that’s why DITA XML is worth knowing about. But what exactly is it?
DITA XML: What is it?
DITA XML is a powerful technology designed to facilitate text authoring, reuse, and translation, and it is especially popular in the field of technical writing for user guides and elearning materials.
It’s one of the leading standards for structured content across all kinds of industries. Companies of all sizes have reported efficiency gains of 30% to 50% in their publication systems after switching to the DITA documentation standard.
With its advanced reuse mechanisms, DITA XML can be used to produce documents of all types, sizes, and structures.
DITA XML is an open standard for authoring, editing, organizing, and managing content. IBM created it for internal use and later donated it to the Organization for the Advancement of Structured Information Standards (OASIS), which approved DITA 1.0 as a standard in 2005.
It’s tool independent, which means that if you are just learning DITA, you don’t need to depend on specific premium tools or DITA editors to write DITA documentation. If you have set up the DITA Open Toolkit (DITA-OT) on your computer, you can create DITA files in a basic text editor (such as Notepad++ or EmEditor) and use the command line to transform the content you wrote into different DITA output formats according to your needs.
So that’s a summary of what it is and how it’s used. As for what it looks like, a DITA XML file is plain text in which tags mark the role of each piece of content, such as a title, a paragraph, or a list item; publishing tools then turn that markup into the final formatted output.
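To make this concrete, here is a minimal sketch of what a small DITA file might look like. The topic id, the title, and the wording are made up for illustration (borrowing the wheels example from earlier), so treat it as a sample rather than a template.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<!-- Hypothetical reusable module: the id, title, and text are illustrative only -->
<topic id="wheel-care">
  <title>Caring for the wheels</title>
  <shortdesc>Check the wheels before every ride.</shortdesc>
  <body>
    <p>Inspect each wheel for damage and keep the tires inflated.</p>
    <ul>
      <li>Look for cracks or bent spokes.</li>
      <li>Check the tire pressure.</li>
    </ul>
  </body>
</topic>
```

When published, the title would typically be rendered as a heading, the paragraph as body text, and the list as bullet points, with the exact styling decided by the output format rather than by the file itself.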
How is DITA Different From Other Mark-up Languages?
DITA is an XML-based mark-up language, and at first glance it looks a lot like one of the most popular mark-up languages, HTML. But HTML, DITA, and XML differ from each other in several respects.
- Both HTML and DITA use tags and attributes within the tags, but whereas HTML may still work when you forget to add a closing tag, DITA and XML are very strict about their closing tags.
- HTML uses a set of pre-defined tags (<body>, <p>, <span>, etc.), which you cannot change. XML, however, lets you define your own tags, which are typically declared in a separate file such as a DTD or schema. DITA sits in between: it provides a standard set of tags that can be extended.
- HTML is lenient about the order in which different tags appear in a file, but DITA is very particular about which tags must come first and the order of the tags that follow.
- The outermost, root tag in an HTML file is <html>, while the root tag in a DITA file depends on the type of topic you are creating, for example <concept>, <task>, or <reference>.
Structured Authoring in Technical Writing Using DITA
Structured authoring, typically used for writing user guides and elearning materials, is a way of creating content according to a predefined standard and organization method.
Although DITA is one such authoring standard, it is not the only kind of structured authoring in technical writing. Alternatives to DITA include:
- Troff (Unix-based document processing system by AT&T Corporation)
- LinuxDoc (SGML Document Type Definition for Linux systems)
- DocBook (XML based mark-up language for technical documentation)
- LaTeX (preferred software system for document preparation in academia)
- S1000D (XML-based specification for the production of technical publications)
While most of these standards remain restricted to certain platforms and industries, DITA documentation is used in all kinds of industries and on all kinds of computer systems, which helps explain why so many people are learning it.
Not only does DITA provide a set of pre-defined rules and a DITA style guide to structure a document correctly, but it also allows you to add different rules or element names to match your industry’s terminologies. You can depend on an Information Architect with DITA training to define these names and rules according to your company or industry.
What Software Reads, Creates, and Transforms DITA Files?
A wide range of DITA software is available for creating, editing, and translating DITA files and for transforming them into formats such as TXT, PDF, HTML, HTML Help, Windows help, web help, and Markdown. Let’s look at some basic DITA software and resources you need to set up a DITA environment on a system.
There are free tools you can use, and there are premium tools. First, let’s look at how you can write and publish DITA XML with free tools. Next, we can look at the premium tools.
How to Publish DITA XML for Free
Writing and publishing DITA for free takes a little work, but at least it’s free. To do it you’ll need:
- Text editor
- DITA Open Toolkit
- The terminal or command-line interface of your computer
Here are the steps:
1. Write the text with a text editor
For creating and editing DITA files, you can pick a basic text editor such as Notepad or Notepad++, or a free DITA IDE such as Codex. You just need to save the text file with the relevant DITA extension (.dita for topics or .ditamap for maps).
2. Download and Install DITA Open Toolkit
Next, download and install the DITA Open Toolkit (DITA-OT). DITA-OT is an open-source publishing engine for transforming DITA files into the desired formats. In addition to common formats such as PDF and HTML, the DITA-OT engine supports output formats including:
- Normalized DITA
- Eclipse Help
- HTML Help
3. Use the Command-Line Interface to Transform the DITA Files
After you have set up DITA-OT, you can use the terminal or command-line interface of your operating system to transform a DITA file (which you created using a basic text editor) into any of the formats described above.
You simply need to navigate to the directory of your DITA file and run the dita command (this assumes the DITA-OT bin directory is on your PATH). On Windows or Linux, the command takes this form:
dita --input=input-file --format=format
In the command above, “input-file” is your DITA or DITAmap file and “format” is the output format you want, such as html5 or pdf.
How to Publish DITA XML With Premium Tools
So you probably don’t want to go through all those complex steps, right?
Fortunately, you can use specialized DITA software or IDEs, which can handle authoring, managing, and publishing DITA technical writing all in one tool.
DITA IDEs are premium tools developed by third parties, often building on DITA-OT, to keep massive projects organized and easier for an organization to manage.
Here is a list of some well known DITA XML editors or authoring tools:
- Oxygen XML
- Adobe FrameMaker
- XMetaL Author Enterprise
- Madcap Flare
- Content Mapper
- Easy DITA
- DITA Exchange
These are all-in-one, premium DITA software packages with authoring, transformation, and publishing capabilities.
These tools also make it easier to create DITA documentation, with drag-and-drop editing or auto-complete suggestions for tags. Using these tools, you won’t have to memorize every DITA tag, just the basic DITA syntax.
These tools also have dedicated interfaces for a typical DITA project to make collaborations simple. For example they might have a developer interface, author interface, DITA editor interface, and publisher interface.
Most of them also offer an additional component content management system, known as a CCMS, to facilitate management of content repositories.
So what exactly does a CCMS do?
IXIASOFT, a major CCMS, explains it this way:
Component Content Management Software (CCMS) manages content at a granular level. Unlike platforms like WordPress, which manage content at the document level, a CCMS stores words, phrases, chapters, and sometimes even entire procedures in a central repository to maximize content reuse. Each of these components are stored only once, which creates a trusted and consistent single source from which content can be published across multiple platforms, such as for print, mobile, or desktop.
According to IXIASOFT, using a CCMS has the following benefits:
- Improved UX for end user
- Reduced localization costs
- Ease of searching for content
- “Where used” and “used by” information
- Enhanced content development team collaboration
- Revision history
- Version control
- Taxonomy term maintenance and application
- Automated publishing
- Improved security with variable user permissions
In summary, free tools can be used for DITA, but there is a wealth of feature-rich software available on the market to make life easier for content developers.
DITA Topics: The Basic Building Block of DITA
Since DITA is a topic-based authoring standard, the building block of any DITA file or project is the topic. We represent it with the tag <topic> in a DITA project.
The tag <topic> is called a ‘generic topic’. Its general structure is a <title>, an optional short description (<shortdesc>), and a <body> that holds the content itself.
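As a rough sketch, the skeleton of a generic topic might look like the following. The id, the placeholder text, and the linked file name another-topic.dita are assumptions made for illustration; only the <title> is strictly required, and the comments mark the optional parts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic-id">
  <!-- Required: every topic starts with a title -->
  <title>Topic title</title>
  <!-- Optional: a one- or two-sentence summary -->
  <shortdesc>Short description of the topic.</shortdesc>
  <!-- Optional: metadata such as the author -->
  <prolog>
    <author>Author name</author>
  </prolog>
  <!-- The main content: paragraphs, lists, sections, tables, figures -->
  <body>
    <p>Body text.</p>
    <section>
      <title>Section title</title>
      <p>Section text.</p>
    </section>
  </body>
  <!-- Optional: links to related topics (placeholder file name) -->
  <related-links>
    <link href="another-topic.dita"/>
  </related-links>
</topic>
```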
Topics help to create modular content with a specific topic forming a module of unique information. We then combine different topics together to form a “map,” which we transform into different DITA output formats (e.g. HTML, PDF, etc) as per requirements. In this way, the same topic can be used in different maps according to the content needs of different types of documents.
For example, let’s say a company that manufactures refrigerators creates a DITA topic about “Specifications of Refrigerator Model Number 1005”. Once created, the company can use this particular topic in DITA maps as a part of user manuals, repair guides, marketing documents, and a whole variety of related documents about the product.
Then if something about the product specifications changes, the content manager can simply update the topic for “Specifications of Refrigerator Model Number 1005” and the changes will be propagated to all of the DITA maps referencing the topic.
Put more simply, the content manager can change the specifications once, and the change will be automatically implemented in all articles, documents, and curricula that include those specifications.
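To sketch how that might look in practice, here is a hypothetical DITA map for the refrigerator’s user manual. All of the file names are placeholders invented for this example; the point is only that the map refers to the specifications topic by file name instead of copying its text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Hypothetical user-manual map; all file names are placeholders -->
<map>
  <title>Refrigerator Model 1005 User Manual</title>
  <topicref href="introduction.dita"/>
  <!-- The shared specifications topic, referenced rather than copied -->
  <topicref href="specifications-model-1005.dita"/>
  <topicref href="maintenance.dita"/>
</map>
```

A second map, say for the repair guide, would contain the same <topicref> to specifications-model-1005.dita alongside its own topics, so an edit to that one file would show up in both publications the next time they are built.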
Types of DITA Topics
The generic topic, which we mentioned before, is OK, but actual DITA documentation projects rarely use the generic topic tags. Based on the information to be documented, we use more refined topic types.
Here are the four standard topic types in DITA architecture with information about each type:
Concept Topic
Concept topics contain background information that answers the question “Why?”
We use concept topics to give background information on the matter under discussion. A concept topic may include notes, paragraphs, tables, images, etc.
Important Elements used in Concept Topics
Here are some tags used in concept topics.
- <conbody>: To contain the body of a concept topic.
- <p>: To contain a paragraph
- <ul>: To contain an unordered list
- <ol>: To contain an ordered list
- <li>: To contain the items in an ordered or unordered list
- <fig>: To contain a figure
- <image>: Used inside <fig> to contain an image
- <section>: Creates subdivisions within concept topics
A Simple Concept Topic Syntax
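A simple concept topic consists of a <concept> root tag containing a <title> and a <conbody>, which in turn holds the paragraphs, lists, figures, and sections.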
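Here is a minimal sketch of such a concept topic using the tags listed above. The subject matter, the id, and the image path images/tire-gauge.png are hypothetical and only there to make the example readable.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="why-tire-pressure-matters">
  <title>Why tire pressure matters</title>
  <shortdesc>Correct tire pressure makes riding safer and easier.</shortdesc>
  <conbody>
    <p>Tires that are too soft or too hard affect how the bicycle handles.</p>
    <ul>
      <li>Soft tires make pedaling harder.</li>
      <li>Overinflated tires give a rougher ride.</li>
    </ul>
    <fig>
      <title>Checking the pressure</title>
      <!-- Placeholder path: point this at a real image in your project -->
      <image href="images/tire-gauge.png">
        <alt>A pressure gauge attached to a tire valve</alt>
      </image>
    </fig>
    <!-- Sections come after the paragraphs, lists, and figures -->
    <section>
      <title>Where to find the recommended pressure</title>
      <p>The recommended range is usually printed on the side of the tire.</p>
    </section>
  </conbody>
</concept>
```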
Task Topic
We use task topics to explain how to do something with step-by-step instructions for the process. There are two types of task topics: Strict Task and General Task.
While using the Strict Task topic, we cannot use more than one set of steps, and steps must occur in a specific order. However, a General Task topic allows us to add more than one set of steps, and it’s forgiving about the order in which they occur.
Important Elements used in Task Topics
- <task>: The root task topic tag
- <taskbody>: The body of your task topic
- <steps>: To contain the ordered sequence of <step> tags
- <step>: To contain each action, along with any further detail about that action
- <cmd>: To contain the instruction the user carries out in each step
- <example>: To contain an example of how to do the entire task
A Simple Task Topic Syntax
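The following is a minimal sketch of a task topic, written to fit the Strict Task model (a single set of steps, with the surrounding tags in a fixed order). The task itself, the id, and the wording are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<task id="inflate-tire">
  <title>Inflating a tire</title>
  <shortdesc>Bring a soft tire back up to its recommended pressure.</shortdesc>
  <taskbody>
    <!-- What the reader needs before starting -->
    <prereq>You need a pump that fits the valve on your tire.</prereq>
    <steps>
      <step>
        <cmd>Remove the valve cap.</cmd>
      </step>
      <step>
        <cmd>Attach the pump to the valve.</cmd>
        <!-- Extra detail about a step goes after its cmd -->
        <info>Make sure the pump head is locked in place.</info>
      </step>
      <step>
        <cmd>Pump until the tire reaches the recommended pressure.</cmd>
      </step>
    </steps>
    <!-- What the reader should see when the task is done -->
    <result>The tire feels firm when you press on it.</result>
    <example>
      <p>For a tire marked with a pressure range, stop pumping once the gauge reads within that range.</p>
    </example>
  </taskbody>
</task>
```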
Reference Topic
As the name suggests, a reference topic gives quick-lookup information about an item, much as you might consult Wikipedia to look something up.
Important Elements used in Reference Topics
- <reference>: The root tag of a reference topic
- <refbody>: The body of a reference topic
- <section>: To create subdivisions inside a reference topic
- <table>: To contain a table
- <fig>: To contain a figure
- <properties>: To contain a list of properties in a reference topic
- <refsyn>: To contain the syntax or signature of the subject of a reference topic
A Simple Reference Topic Syntax
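As a sketch, a reference topic for the refrigerator specifications mentioned earlier might be laid out like this. The property names and values are placeholders rather than real product data, and the id is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd">
<reference id="model-1005-specifications">
  <title>Specifications of Refrigerator Model Number 1005</title>
  <shortdesc>Key specifications for quick lookup.</shortdesc>
  <refbody>
    <section>
      <title>Overview</title>
      <p>This topic lists the main specifications of the product.</p>
    </section>
    <!-- Placeholder rows: replace with the real property names and values -->
    <properties>
      <prophead>
        <proptypehd>Property</proptypehd>
        <propvaluehd>Value</propvaluehd>
        <propdeschd>Notes</propdeschd>
      </prophead>
      <property>
        <proptype>Capacity</proptype>
        <propvalue>Example value</propvalue>
        <propdesc>Example note about how the value is measured.</propdesc>
      </property>
      <property>
        <proptype>Dimensions</proptype>
        <propvalue>Example value</propvalue>
        <propdesc>Example note about what the dimensions include.</propdesc>
      </property>
    </properties>
  </refbody>
</reference>
```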
Glossary Entry Topic
We use glossary entry topics to give definitions of certain terms in a document.
Important Elements Used in Glossary Entry Topics
- <glossentry>: Root element to create a glossary entry
- <glossterm>: To add a word or phrase
- <glossdef>: To add the definition of the word or phrase
A Simple Glossary Entry Topic Syntax
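A glossary entry is the shortest of the topic types. Here is a minimal sketch with a made-up id and definition. The DOCTYPE line is an assumption: the identifier shown is the long-standing one for glossary entries, but your toolkit version may expect a different one, so check the templates that ship with your editor or DITA-OT installation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed DOCTYPE: your DITA version may use a different identifier -->
<!DOCTYPE glossentry PUBLIC "-//OASIS//DTD DITA Glossary//EN" "glossary.dtd">
<glossentry id="dita-map">
  <glossterm>DITA map</glossterm>
  <glossdef>A file that combines topics into a publication by referring to them in order.</glossdef>
</glossentry>
```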
Bringing It All Together
DITA XML is a powerful solution for companies that produce a large amount of content that is published across multiple media and is updated over time.
It’s easy to imagine that when doing curriculum development or when writing user guides for the first time it might save some money. But the real magic happens when it is time to update those materials or translate them.
The ability to treat sections of text as modules that can be reused an unlimited number of times saves companies innumerable work hours (and thus money) and helps them avoid human error due to manually making changes.
It also helps companies reduce the cost and difficulty of translating corporate content. Since DITA XML reduces the incidence of similar but slightly different sets of content, it also reduces the amount of unique text that must be translated.
The tragedy of DITA is the same as the tragedy of modern translation project management: while large enterprises take DITA and effective translation project management for granted, many of the struggling small to midsize companies that need them most don’t realize the power these tools hold for them.
They don’t realize that every day they don’t use DITA and modern translation workflows, they squander more money, waste more work hours, and publish more avoidable human errors. They don’t realize that every word published the old-fashioned way represents a ballooning cost to them later.
That’s why at IVANNOVATION we publish articles like this. It’s our passion to help growing companies understand that affordable solutions exist that will transform their effectiveness as surely as printing presses transformed the work of scribes.
IVANNOVATION has had a long history of translating or localizing DITA XML content using modern translation best practices. If you want to get on the road to effective translation project management or if you need to localize your DITA content, contact us and let’s take your content to a new level.
Vinay Kumar, Software Engineer and Technical Writer by profession. He has 5+ years of experience in DITA-XML, Web Development, Digital Marketing, Blogging, and Google SEO. Loves reading thrillers and playing video games in his leisure time.
Darren Jansen, business development and content manager for IVANNOVATION, has a lifetime love for tech and languages. At IVANNOVATION he helps software developers get professional localization for their apps, software, and websites. On his time away from the office, he can be found hiking the Carolina wilderness or reading Chinese literature.
This article was written by Vinay Kumar and Darren Jansen