
Software documentation with DocBook quick how-to

November 19, 2005

I am amazed that we still don't have proper technology for producing technical content. If you are just starting on a software project you can, for example, choose to use your favourite text processor. (That is what I initially did for ModSecurity.) This choice is quick to start with and allows you to write comfortably. Unfortunately, it is not adequate when it comes to publishing. The text processor I used, OpenOffice, produces nice PDF documents but fails miserably when it comes to HTML output.

One approach that looks particularly promising is DocBook; I have been looking at it for years. DocBook is an XML-based markup language designed specifically for technical content. The people behind DocBook have done tremendous work on the back end. DocBook appears to be well designed and well documented. You will even find two complete DocBook books, containing everything you need to know, freely available online. The problematic area is authoring, because support for DocBook in text processors is very limited. Until recently your choice was to write XML by hand or, at best, with the help of an XML editor. But it is insane to write anything but the simplest documents this way. Writing is difficult enough without your tools making it more difficult.

Book publishers are trying to get round this problem by customising text processors with special templates and macros. (Publishers also have a much bigger problem, as they need to support collaboration between the people involved in writing a book.) This approach generally works but it is a one-way street: toward the end of the process the manuscript is converted into something more suitable for use in production. (I don't know what happens when you need to write the second edition; I haven't tried that with my book yet.)


For me, the discovery of the XMLmind XML editor was a glimmer of hope. Here we have a tool that allows you to write DocBook in much the same way as you would write in a normal text processor. Naturally, the feature set of this young tool cannot be compared with those of mature text writing tools. Still, the XMLmind editor is quite usable in its current state. What's even better, the Standard edition is completely free. We appear to have finally sorted the authoring part of the problem. All you now need is a little patience to learn the DocBook ways (you can start with DocBook 5.0: The Definitive Guide).
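
To give a feel for the markup, here is a minimal sketch of a DocBook document, written to a file from the shell. The elements (article, title, section, para) are standard DocBook; the text inside them is made up for illustration, and the file name input.xml is only chosen to match the commands later in this post. I have left out the DOCTYPE declaration to keep the example short, on the assumption that you only want to transform the file and not validate it.

```shell
# Write a minimal DocBook article to input.xml.
# The quoted 'EOF' keeps the shell from expanding anything inside the document.
cat > input.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<article>
  <title>Example Manual</title>
  <section>
    <title>Introduction</title>
    <para>This paragraph is placeholder text for a real manual.</para>
  </section>
</article>
EOF
```

An editor such as XMLmind hides the tags from you while you write, but this is roughly what ends up in the file.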


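After writing the documentation in DocBook you need to figure out how to convert it into one of the supported output formats. For that you will need the DocBook XSL stylesheets (the commands below expect to find them under $DOCBOOK_XSL_HOME), an XSLT processor (Xalan is used here), and FOP for PDF output.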

To produce PDF:

fop.sh -xsl $DOCBOOK_XSL_HOME/fo/docbook.xsl -xml input.xml -pdf output.pdf

To produce single-page HTML:

xalan.sh -xsl $DOCBOOK_XSL_HOME/html/docbook.xsl -in input.xml -out output.html

To produce multi-page HTML:

xalan.sh -xsl $DOCBOOK_XSL_HOME/html/chunk.xsl -in input.xml -param base.dir ./output/
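
If you run these conversions often, it may be worth wrapping them in a small script. What follows is only a sketch under a few assumptions: fop.sh and xalan.sh are on your PATH, DOCBOOK_XSL_HOME is set as in the commands above, and the names run, convert and DRY_RUN are my own inventions for this example, not part of any of the tools.

```shell
#!/bin/sh
# Sketch of a wrapper around the three conversion commands from this post.
# "run", "convert" and DRY_RUN are hypothetical names used only here.

# Print the command when DRY_RUN is non-empty, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# convert FORMAT INPUT.xml, where FORMAT is pdf, html or chunk.
convert() {
    format=$1
    input=$2
    base=${input%.xml}    # strip the .xml suffix to derive output names
    xsl=${DOCBOOK_XSL_HOME:?DOCBOOK_XSL_HOME is not set}

    case "$format" in
        pdf)
            run fop.sh -xsl "$xsl/fo/docbook.xsl" -xml "$input" -pdf "$base.pdf" ;;
        html)
            run xalan.sh -xsl "$xsl/html/docbook.xsl" -in "$input" -out "$base.html" ;;
        chunk)
            # chunk.xsl is the multi-page stylesheet in the DocBook XSL distribution
            run xalan.sh -xsl "$xsl/html/chunk.xsl" -in "$input" -param base.dir ./output/ ;;
        *)
            echo "usage: convert pdf|html|chunk input.xml" >&2
            return 1 ;;
    esac
}
```

After sourcing the script, convert pdf input.xml runs the FOP command shown earlier. Setting DRY_RUN=1 first makes it print the command instead of running it, which is handy for checking paths before a long build.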

Although it is possible to use XSL to publish DocBook to text format, I did not find that option very useful. You can get much better results by creating the text output from the single-page HTML using Lynx:

lynx -dump input.html > output.txt

FOP does not support RTF output at this time (although there is some talk of it being supported shortly), but you can produce it with the XSL utility. From the command line:

xslutil -out rtf output.rtf input.xml $DOCBOOK_XSL_HOME/fo/docbook.xsl