
What is DITA XML? It’s a technology designed to facilitate text authoring, reuse, and translation.

It’s a bit of a foreign concept to many people, but we’ll try to make it easy to grasp so that you can get a basic understanding of DITA XML as well as how to start using it.


DITA XML Article Series

This article is part one of our DITA XML series. You can find part two of the series on DITA localization here: How to Translate DITA Projects [Step-by-Step Guide].


Summary

In this article we’ll catch you up to speed on the fundamentals of DITA technical writing, why you need it, how it works, and how you can use different tools to author, edit, translate, and transform files according to your business’s needs.

Why Use DITA XML?

Before we explain exactly what DITA XML is or how it works, let’s show why it’s useful.

Imagine you have two similar products, such as a bicycle and a tricycle, and you have to write a user guide for each of them. Because the products are so similar, you don’t want to write two user guides from scratch, so you copy most of the text from the bicycle user guide into the tricycle user guide.

The old fashioned copy and paste method is inefficient

Now imagine you have updated the product line, and you need to update the user guides. Will you update each guide separately?

Now imagine you have dozens of similar products. Updating all related content for each product separately will be a huge task and may result in hundreds of errors.

That is, if you use an old fashioned workflow.

But with DITA XML, things are different.

With DITA XML you can write the content in modules, which can then be plugged into user guides as needed. So for example, your section on wheels might be identical between the bicycle user guide and the tricycle user guide . . . and the unicycle user guide.

If you are using DITA XML, you can update the section on wheels, and your changes will be automatically propagated to all of the publications and user guides that include that text.

Using DITA XML can save hours of work and prevent errors

So with DITA XML, your content can be reused as needed throughout all of the organization’s publications.

Reusing text like this across publications in the organization can save a fortune and improve consistency and quality.

This strategy of reusing content can also save money on translations. When content reuse is maximized, that means less original content needs to be translated.

So that’s why DITA XML is worth knowing about. But what exactly is DITA XML?



DITA XML: What is it?

DITA XML is a powerful technology designed to facilitate text authoring, reuse, and translation, and it is especially popular in the field of technical writing for user guides and elearning materials.

What is DITA XML? It is a powerful tool for curriculum and user documentation development

DITA XML is often used by technical writers to create curriculum, help documentation, user guides and other such materials.

It’s one of the best standards for structured content available, and it serves all kinds of industries. Companies of all sizes have reported efficiency gains of 30% to 50% in their publication systems after switching to the DITA documentation standard.

What Is Structured Content?

Structured content is text that’s written according to a set of rules that tell machines what to do with the content. HTML, which is used on the web, is a great example of structured content. It has tags that tell the browser how to display the text. For example, the HTML tag <h1> tells the browser that the text should be a large heading, and the tag <p> tells it that the text should be a paragraph. DITA has similar tags that tell computers about the content.

With its advanced reuse mechanisms, DITA XML lets you produce documents of all types, sizes, and structures from the same source content.

What Does DITA Stand For?

DITA stands for “Darwin Information Typing Architecture.”

The word “Darwin” in the name makes sense when you realize that the system is built on the idea of inheritance. Child elements “evolve” from the parent elements and are based on them.

What Does XML Stand For?

XML stands for “Extensible Markup Language,” which is a system that uses tags to tell the computer what to do with the text. It has opening tags and closing tags surrounding the text such as this:

<section>Here’s some text with an opening tag in front and a closing tag at the end.</section>

DITA XML is an open standard for authoring, editing, organizing, and managing content. IBM created it for internal use and later donated it to the Organization for the Advancement of Structured Information Standards (OASIS), which approved it as a standard in 2005.

It’s tool independent, which means that if you are just learning DITA, you don’t need to depend on specific premium tools or DITA editors to write DITA documentation. Once you have set up the DITA Open Toolkit (DITA-OT) on your computer, you can create DITA files in a basic text editor (such as Notepad++ or EmEditor) and use the command line to transform the content you wrote into different DITA output formats according to your needs.

Here Are Some Rapid Fire Facts on DITA:
  • DITA is an XML-based mark-up language.
  • It allows for topic-based authoring.
  • It follows structured authoring concepts.
  • It separates content from formatting: the topics hold the content, and the formatting is applied when the content is published.
  • It encourages minimalism in your original content while allowing for the versatility of transforming the content into different DITA output formats and localizing it into different languages, all without affecting your original content repository.



How is DITA Different From Other Mark-up Languages?

DITA is written in XML, and it looks a lot like one of the most popular mark-up languages, HTML. But HTML, XML, and DITA differ from each other in several respects.

  • Both HTML and DITA use tags and attributes within the tags, but whereas HTML may still work when you forget to add a closing tag, DITA and XML are very strict about their closing tags.
  • HTML uses a set of pre-defined tags (<body>, <p>, <span>, etc.), which you cannot change. XML, however, has no pre-defined tags of its own. Each XML vocabulary, DITA included, defines its tags in separate files (such as DTDs), and DITA also lets you add your own tags.
  • HTML is lenient about the order in which different tags appear in a file, but DITA is very particular about the tags that should come first and the order of the tags that should follow.
  • The outermost, root tag in an HTML file is <html>, while the root tag in a DITA file depends on the type of topic you are creating: for example, <concept>, <task>, or <reference>.


Structured Authoring in Technical Writing Using DITA

As explained before, structured authoring, typically used for writing user guides and elearning materials, is a way of creating content according to a predefined standard and organization method.

Although DITA is one authoring standard or organization method, it is not the only kind of structured authoring in technical writing. Alternatives to DITA include:

  • Troff (Unix-based document processing system by AT&T)
  • LinuxDoc (SGML Document Type Definition for Linux systems)
  • DocBook (XML-based mark-up language for technical documentation)
  • LaTeX (preferred software system for document preparation in academia)
  • S1000D (XML-based specification for the production of technical publications)

While most of these standards remain restricted to certain platforms and industries, DITA documentation is used in all kinds of industries and on all kinds of computer systems, which is a big part of why so many people choose to learn it.

DITA XML remains one of the most popular standards for structured content management.

Not only does DITA provide a set of pre-defined rules and a DITA style guide to structure a document correctly, but it also allows you to add different rules or element names to match your industry’s terminologies. You can depend on an Information Architect with DITA training to define these names and rules according to your company or industry.



What Software Reads, Creates, and Transforms DITA Files?

A ton of DITA software is available for creating, editing, and translating DITA files and for transforming them into formats such as TXT, PDF, HTML, HTML Help, Windows help, web help, Markdown, etc. Let’s look at the basic DITA software and resources you need to set up a DITA environment on your system.

There are free tools you can use, and there are premium tools. First, let’s look at how you can write and publish DITA XML with free tools. Next, we can look at the premium tools.


How to Publish DITA XML for Free

Writing and publishing DITA for free takes a little work, but at least it’s free. To do it you’ll need:

  • Text editor
  • DITA Open Toolkit
  • The terminal or command-line interface of your computer

Here are the steps:

1. Write the text with a text editor

For creating and editing DITA files, you can pick a basic text editor such as Notepad or Notepad++, or a free DITA IDE such as Codex. You just need to save your text file with the relevant DITA extension (.dita for a topic or .ditamap for a map).

Note:

Basic text editors and Codex can create DITA files, but they cannot transform them into publishing formats. They are tools for creating DITA files, not for publishing DITA. For the transformation and publication functionalities, you need an additional resource, which is explained below.
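
As a rough illustration of step 1, here is a minimal sketch of what a complete topic file typed into a plain text editor might look like. The file name (wheels.dita), the id, and the sample text are assumptions made up for this example. The DOCTYPE line uses the standard OASIS public identifier for a generic topic.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<!-- Hypothetical file: wheels.dita (name, id, and text are placeholders) -->
<topic id="wheels">
  <title>Wheels</title>
  <shortdesc>How to check the wheels before a ride.</shortdesc>
  <body>
    <!-- Every opening tag needs a matching closing tag -->
    <p>Check the tire pressure before every ride.</p>
  </body>
</topic>
```

Saving this text with the .dita extension is all it takes to create a topic. Turning it into HTML or PDF is a separate step.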

2. Download and Install DITA Open Toolkit

Next, download and install DITA Open Toolkit (DITA-OT). DITA-OT is an open-source publishing engine for transforming DITA files into the formats you need. Out of the box, the DITA-OT engine can transform DITA content into output formats including:

  • HTML
  • PDF
  • Markdown
  • Normalized DITA
  • Eclipse Help
  • HTML Help

3. Use the Command-Line Interface to Transform the DITA Files

After you have set up DITA-OT, you can use the terminal or command-line interface of your operating system to transform a DITA file (which you created using a basic text editor) into different formats (as explained above).

You simply need to navigate to the directory of your DITA file and use the following command (this assumes the bin directory of your DITA-OT installation is on your system path):

On Windows or Linux:


dita --input=input-file --format=format

In the code example above, “input-file” is your .dita or .ditamap file and “format” is the output format you want, such as html5 or pdf.

Free tools exist for handling DITA XML since it is an open standard, but premium DITA tools make authoring with DITA much easier and faster.



How to Publish DITA XML With Premium Tools

So you probably don’t want to go through all those complex steps, right?

Fortunately, you can use specialized DITA software or IDEs, which can handle authoring, managing, and publishing DITA technical writing all in one tool.

DITA IDEs are premium tools developed by third parties, typically on top of the DITA-OT platform, to make massive projects more organized and easier for an organization to handle.

The well-known DITA XML editors and authoring tools are all-in-one, premium DITA software with authoring, transformation, and publishing capabilities.

These tools also provide easier ways of creating DITA documentation, with drag-and-drop or assistive suggestions that complete the mark-up for you. Using these tools, you won’t have to memorize every DITA element, just the basic DITA syntax.

These tools also have dedicated interfaces for the typical roles in a DITA project to make collaboration simple. For example, they might have a developer interface, author interface, DITA editor interface, and publisher interface.

Most of them also offer an additional DITA content management system, known as a CCMS, to facilitate management of content repositories.

So what exactly does a CCMS do?

IXIASOFT, a major CCMS, explains it this way:

Component Content Management Software (CCMS) manages content at a granular level. Unlike platforms like WordPress, which manage content at the document level, a CCMS stores words, phrases, chapters, and sometimes even entire procedures in a central repository to maximize content reuse. Each of these components are stored only once, which creates a trusted and consistent single source from which content can be published across multiple platforms, such as for print, mobile, or desktop.

According to IXIASOFT, using a CCMS has the following benefits:

  • Improved UX for end user
  • Reduced localization costs
  • Ease of searching for content
  • “Where used” and “used by” information
  • Enhanced content development team collaboration
  • Revision history
  • Version control
  • Taxonomy term maintenance and application
  • Automated publishing
  • Improved security with variable user permissions

In summary, free tools can be used for DITA, but there is a wealth of feature-rich software available on the market to make life easier for content developers.



DITA Topics: The Basic Building Block of DITA

Since DITA is a topic-based authoring standard, the building block of any DITA file or project is the topic. In its generic form, it is represented by the tag <topic>.

What’s a Topic?

“In DITA, a topic is the basic unit of authoring and reuse. All DITA topics have the same basic structure: a title and, optionally, a body of content. Topics can be generic or more specialized; specialized topics represent more specific information types or semantic roles, for example, <concept>, <task>, <reference>, or <learningContent>.”

OASIS

The tag <topic> is called a ‘generic topic’, and it has a general structure like this:


<topic id="your_topic_id">
   <title>Title of your topic</title>
   <shortdesc>A brief description of the topic</shortdesc>
   <body>
        <p>The actual content of your topic goes here.</p>
   </body>
</topic>

Topics help to create modular content with a specific topic forming a module of unique information. We then combine different topics together to form a “map,” which we transform into different DITA output formats (e.g. HTML, PDF, etc) as per requirements. In this way, the same topic can be used in different maps according to the content needs of different types of documents.

What’s a DITA Map?

A DITA map is a kind of XML file composed of references to individual topics. By using references to topics, a DITA map file pulls the topics together into a collection in order to form an entire document.

For example, let’s say a company that manufactures refrigerators creates a DITA topic about “Specifications of Refrigerator Model Number 1005”. Once the topic is created, the company can use it in DITA maps as a part of user manuals, repair guides, marketing documents, and a whole variety of related documents about the product.

Then if something about the product specifications changes, the content manager can simply update the topic for “Specifications of Refrigerator Model Number 1005” and the changes will be propagated to all of the DITA maps referencing the topic.

Put more simply, the content manager can change the specifications once, and the change will be automatically implemented in all articles, documents, and curricula that include those specifications.
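
To make the refrigerator example concrete, here is a minimal sketch of a DITA map for a user manual. The file names and the map title are hypothetical and exist only for illustration. The <map> and <topicref> elements and the href attribute are standard DITA.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Hypothetical file: user_manual_1005.ditamap -->
<map>
  <title>Refrigerator Model 1005 User Manual</title>
  <!-- Each topicref points to a topic file; nesting sets the hierarchy -->
  <topicref href="model_1005_overview.dita">
    <!-- The shared specifications topic, referenced rather than copied -->
    <topicref href="model_1005_specifications.dita"/>
  </topicref>
  <topicref href="model_1005_cleaning.dita"/>
</map>
```

A repair guide would be a second map file that contains its own <topicref> pointing to the same model_1005_specifications.dita file. Because both maps only reference the topic, editing that one file changes the output of both documents the next time they are published.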



Types of DITA Topics

The generic topic, which we mentioned before, is OK, but actual DITA documentation projects rarely use the generic topic tags.  Based on the information to be documented, we use more refined topic types.

What Are the Main DITA Topic Types?

There are four standard topic types in DITA architecture:

  1. Concept: contains background information that answers the question “Why?”
  2. Task: contains procedural information that addresses the question “How?”
  3. Reference: contains facts, frequently in a table, about a topic that answers the question “What?”
  4. Glossary entry: defines a single term and gives its meaning, answering the question “What does it mean?”

The sections below cover each of these topic types in more detail.

Concept Topic

We use concept topics to give the background information on the matter under discussion, answering the question “Why?” A concept topic may include notes, paragraphs, tables, images, etc.

Important Elements used in Concept Topics

Here are some tags used in concept topics.

  • <conbody>: To contain the body of a concept topic.
  • <p>: To contain a paragraph
  • <ul>: To contain an unordered list
  • <ol>: To contain an ordered list
  • <li>: To contain the items in an ordered or unordered list
  • <fig>: To contain a figure
  • <image>: Used inside <fig> to contain an image
  • <section>: Creates subdivisions within concept topics

A Simple Concept Topic Syntax

A simple concept topic looks like this:


<concept id="hello_world">
    <title>My First Concept</title>
    <conbody> (body elements go here) </conbody>
</concept>
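
To show how the elements listed above fit together, here is a slightly fuller sketch of a concept topic. The id, the wording, and the image path are placeholders invented for this example. Note that the paragraph, list, and figure come before the <section>, since a concept body expects its sections at the end.

```xml
<concept id="about_wheels">
  <title>About Wheels</title>
  <shortdesc>Background information on wheel sizes.</shortdesc>
  <conbody>
    <p>The wheels carry the full weight of the rider.</p>
    <ul>
      <li>Sample point about larger wheels</li>
      <li>Sample point about smaller wheels</li>
    </ul>
    <fig>
      <title>Parts of a wheel</title>
      <!-- Hypothetical image path -->
      <image href="images/wheel_parts.png">
        <alt>Diagram showing the parts of a wheel</alt>
      </image>
    </fig>
    <!-- Sections subdivide the topic and come after the other body elements -->
    <section>
      <title>Tire pressure</title>
      <p>Sample paragraph about tire pressure.</p>
    </section>
  </conbody>
</concept>
```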

Task Topic

We use task topics to explain how to do something with step-by-step instructions for the process. There are two types of task topics: Strict Task and General Task.

With the Strict Task topic, we cannot use more than one set of steps, and the elements of the task must occur in a specific order. A General Task topic, however, allows us to add more than one set of steps, and it’s forgiving about the order in which the elements occur.

Important Elements used in Task Topics
  • <task>: The root task topic tag
  • <taskbody>: The body of your task topic
  • <steps>: To contain the ordered series of steps
    • <step>: To contain a single action, along with any supporting information about it
      • <cmd>: To contain the instruction for the step (each step has exactly one)
  • <example>: To contain an example of how to do the entire task

A Simple Task topic Syntax:

<task id="sample_task">
    <title>Sample Task Title</title>
    <taskbody>
        <steps>
            <step>
                <cmd>Sample step 1</cmd>
            </step>
            <step>
                <cmd>Sample step 2</cmd>
            </step>
        </steps>
    </taskbody>
</task>
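
For a sense of what a fuller task looks like, here is a sketch that follows the strict task model, in which the parts of the task body appear in a fixed order: prerequisites, context, steps, result, example, and post-requirements. The id and all of the wording are placeholders made up for this example.

```xml
<task id="inflate_tire">
  <title>Inflating a Tire</title>
  <shortdesc>Sample short description of the task.</shortdesc>
  <taskbody>
    <!-- In a strict task, these parts must appear in this order -->
    <prereq>You need a pump that fits the valve.</prereq>
    <context>Inflate the tires whenever they feel soft.</context>
    <steps>
      <step>
        <cmd>Remove the valve cap.</cmd>
      </step>
      <step>
        <!-- One cmd per step; extra detail goes in info -->
        <cmd>Attach the pump to the valve.</cmd>
        <info>Make sure the pump head is seated firmly.</info>
      </step>
      <step>
        <cmd>Pump until the tire feels firm.</cmd>
        <stepresult>The tire holds its shape when pressed.</stepresult>
      </step>
    </steps>
    <result>The tire is ready for riding.</result>
    <example>Sample example of the completed task.</example>
    <postreq>Put the valve cap back on.</postreq>
  </taskbody>
</task>
```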

Reference Topic

As the name suggests, a reference topic is used for giving quick-lookup information about items, much as you might look something up on Wikipedia.

Important Elements used in Reference Topics
  • <reference>: The root tag of a reference topic
  • <refbody>: The body of a reference topic
  • <section>: To create subdivisions inside a reference topic
  • <table>: To contain a table
  • <fig>: To contain a figure
  • <properties>: To contain a list of properties in a reference topic
  • <refsyn>: To contain syntax or signature information, such as the calling syntax of a command

A simple Reference topic Syntax:

<reference id="sample_reference_id">
    <title>Sample Reference Title</title>
    <shortdesc>Sample brief information about the reference topic</shortdesc>
    <refbody>
        <properties>
            <property>
                <proptype>Sample property type 1</proptype>
                <propvalue>Sample property value 1</propvalue>
            </property>
            <property>
                <proptype>Sample property type 2</proptype>
                <propvalue>Sample property value 2</propvalue>
            </property>
        </properties>
    </refbody>
</reference>
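
Since reference topics frequently present their facts in a table, here is a sketch of a reference topic that uses <section> and <table> instead of <properties>. The id, headings, and cell values are all placeholders. The table mark-up (<tgroup>, <thead>, <tbody>, <row>, <entry>) is the standard DITA table structure.

```xml
<reference id="sample_specifications">
  <title>Sample Specifications</title>
  <shortdesc>Sample brief information about the specifications.</shortdesc>
  <refbody>
    <section>
      <title>Dimensions</title>
      <table>
        <title>Sample dimensions (placeholder values)</title>
        <!-- cols must match the number of entries in each row -->
        <tgroup cols="2">
          <thead>
            <row>
              <entry>Measurement</entry>
              <entry>Value</entry>
            </row>
          </thead>
          <tbody>
            <row>
              <entry>Height</entry>
              <entry>Sample value 1</entry>
            </row>
            <row>
              <entry>Width</entry>
              <entry>Sample value 2</entry>
            </row>
          </tbody>
        </tgroup>
      </table>
    </section>
  </refbody>
</reference>
```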

Glossary Entry Topic

We use glossary entry topics to give definitions of certain terms in a document.

Important Elements Used in Glossary Entry Topics
  • <glossentry>:  Root element to create a glossary entry
  • <glossterm>: To add a word or phrase
  • <glossdef>: To add the definition of the word or phrase

A simple Glossary Entry Topic Syntax

<glossentry id="sample_entry">
    <glossterm>Sample Term or Word</glossterm>
    <glossdef>Sample definition</glossdef>
</glossentry>


Bringing It All Together

DITA XML is a powerful solution for companies that produce a large amount of content that is published across multiple media and is updated over time.

It’s easy to imagine that when doing curriculum development or when writing user guides for the first time it might save some money. But the real magic happens when it is time to update those materials or translate them.

The ability to treat sections of text as modules that can be reused an unlimited number of times saves companies innumerable work hours (and thus money) and helps them avoid human error due to manually making changes.

It also helps companies reduce the cost and difficulty of translating corporate content. Since DITA XML reduces the incidence of similar but slightly different sets of content, it also reduces the amount of unique text that must be translated.

The tragedy of DITA is the same as the tragedy of modern translation project management: while large enterprises take DITA and effective translation project management for granted, many of the struggling small to midsize companies that need them most don’t realize the power these tools hold for them.

They don’t realize that every day they don’t use DITA and modern translation workflows, they squander more money and waste more work hours and they publish more avoidable human errors. They don’t realize that every word published the old fashioned way represents a ballooning cost to them later.

That’s why at IVANNOVATION we publish articles like this. It’s our passion to help growing companies understand that affordable solutions exist that will transform their effectiveness as surely as printing presses transformed the work of scribes.

IVANNOVATION has had a long history of translating or localizing DITA XML content using modern translation best practices. If you want to get on the road to effective translation project management or if you need to localize your DITA content, contact us and let’s take your content to a new level.


Vinay Kumar, Software Engineer and Technical Writer by profession. He has 5+ years of experience in DITA-XML, Web Development, Digital Marketing, Blogging, and Google SEO. Loves reading thrillers and playing video games in his leisure time.

Darren Jansen, business development and content manager for IVANNOVATION, has a lifetime love for tech and languages. At IVANNOVATION he helps software developers get professional localization for their apps, software, and websites. On his time away from the office, he can be found hiking the Carolina wilderness or reading Chinese literature.

This article was written by Vinay Kumar and Darren Jansen
