Document Type Definition (DTD)

A DTD is the original type of XML schema. XML schemas are used to formally describe the contents of an XML document. An XML schema describes the shape of the XML document, defining the data, sub-elements, and attributes it can contain, along with the number of times each may occur.

There are a number of different standards for describing an XML schema:

  • DTD (Document Type Definition) - the original standard, defined within the W3C's XML standard. The DTD standard is all but obsolete now, replaced by the W3C's XSD standard. DTDs have their own format and can define substitutions internally within themselves, requiring multiple passes to extract the normalised document. They are also quite limited, allowing only coarse validation and minimal re-use.
  • XDR (XML-Data Reduced) - a standard developed by Microsoft that bridged the gap between DTD and XSD schemas. MSXML included a parser for it until version 6, when support was dropped. It was also used to describe data in older versions of BizTalk. The schema was itself written in XML and was very simplistic, offering minimal validation or re-use, but it was simple to parse and extensible.
  • XSD (XML Schema Definition) - ratified by the W3C, it is now the de facto mechanism for describing XML documents. It allows for complex validation and re-use via inheritance and type creation, is itself written in XML, so is easy to parse, and has support on most platforms. Almost all major data standards are now described in terms of XSDs.
  • RELAX NG (REgular LAnguage for XML Next Generation) - RELAX NG has a relatively simple structure and shares many features with the W3C XSD standard: data typing, regular expression support, namespace support, and the ability to reference complex definitions. Open source parsers exist on most platforms, but it is not widely used.

DTD Editor

The Free Community Edition of Liquid XML Studio contains a DTD editor that provides syntax highlighting and validation. It also contains an XML editor that can validate documents against embedded and external DTDs.

 

The DTD Standard Overview

The XSD standard has now superseded the DTD standard, and DTDs are rarely used except to support legacy standards. DTDs can be converted to XSD with a reasonably high level of fidelity, but the XSD standard does not support entity substitutions (see the entity example below).

The DTD standard is very simplistic. It can describe basic conditions, e.g. [a or b] must be present, or [a or [b and c]] must be present (written (a | b) and (a | (b, c)) in DTD syntax). It defines a number of basic data types, which are not extensible.

<!ELEMENT people_list (person*)>
<!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT birthdate (#PCDATA)>
<!ELEMENT gender (#PCDATA)>
<!ELEMENT socialsecuritynumber (#PCDATA)>
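
Each content model in the DTD above is effectively a regular expression over child element names. The Python sketch below illustrates that idea for this one DTD only. It is not a DTD validator: the CONTENT_MODELS table, the PCDATA_ONLY set, and the conforms function are hypothetical names introduced here, the models are transcribed by hand, and text between child elements is ignored.

```python
import re
import xml.etree.ElementTree as ET

# Hand-transcribed content models (an assumption of this sketch, not parsed
# from the DTD). Each child name is followed by a space so names cannot run
# together when matched.
CONTENT_MODELS = {
    "people_list": r"(person )*",
    "person": r"name (birthdate )?(gender )?(socialsecuritynumber )?",
}
# Elements declared as (#PCDATA): text only, no child elements.
PCDATA_ONLY = {"name", "birthdate", "gender", "socialsecuritynumber"}


def conforms(element):
    """Return True if the element and its descendants match the models above."""
    if element.tag in PCDATA_ONLY:
        return len(element) == 0
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # element not declared in the DTD
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(conforms(child) for child in element)


VALID = ("<people_list><person><name>Fred</name>"
         "<gender>male</gender></person></people_list>")
# Same children, but gender appears before name, which the DTD forbids.
INVALID = ("<people_list><person><gender>male</gender>"
           "<name>Fred</name></person></people_list>")

print(conforms(ET.fromstring(VALID)))    # True
print(conforms(ET.fromstring(INVALID)))  # False
```

The second document is rejected because the person content model fixes the order of its children: name must come first, followed by the optional elements in the declared sequence.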

However, because the DTD standard can define substitutions (declared using ENTITY declarations), it can take a number of passes to determine a document's true meaning. This makes parser design much more difficult, slows validation, and is error prone, as the following example shows.

1: <?xml version='1.0'?>
2: <!DOCTYPE test [
3: <!ELEMENT test (#PCDATA) >
4: <!ENTITY % xx '&#37;zz;'>
5: <!ENTITY % zz '&#60;!ENTITY tricky "error-prone" >' >
6: %xx;  
7: ]>
8: <test>This sample shows a &tricky; method.</test>

The first pass expands the parameter entity reference %xx; on line 6 using its definition on line 4, giving %zz; (the character reference &#37; expands to %):

1: <?xml version='1.0'?>
2: <!DOCTYPE test [
3: <!ELEMENT test (#PCDATA) >
4: <!ENTITY % xx '&#37;zz;'>
5: <!ENTITY % zz '&#60;!ENTITY tricky "error-prone" >' >
6: %zz;  
7: ]>
8: <test>This sample shows a &tricky; method.</test>

The second pass expands %zz; on line 6 using its definition on line 5 (&#60; expands to <), which produces a new entity declaration:

1: <?xml version='1.0'?>
2: <!DOCTYPE test [
3: <!ELEMENT test (#PCDATA) >
4: <!ENTITY % xx '&#37;zz;'>
5: <!ENTITY % zz '&#60;!ENTITY tricky "error-prone" >' >
6: <!ENTITY tricky "error-prone" >  
7: ]>
8: <test>This sample shows a &tricky; method.</test>

Finally, the reference &tricky; in the document content on line 8 is expanded to "error-prone":

1: <?xml version='1.0'?>
2: <!DOCTYPE test [
3: <!ELEMENT test (#PCDATA) >
4: <!ENTITY % xx '&#37;zz;'>
5: <!ENTITY % zz '&#60;!ENTITY tricky "error-prone" >' >
6: <!ENTITY tricky "error-prone" >  
7: ]>
8: <test>This sample shows a error-prone method.</test>
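
The passes above can be mimicked with plain string substitution. The Python sketch below is only a toy model of the walkthrough, not an XML parser: the helper names (expand_char_refs, expand_parameter_refs) are hypothetical, the entity values are copied by hand from lines 4 and 5, and it assumes decimal character references and no recursive entity definitions.

```python
import re

# Entity values exactly as written on lines 4 and 5 of the example.
LITERALS = {
    "xx": "&#37;zz;",
    "zz": '&#60;!ENTITY tricky "error-prone" >',
}


def expand_char_refs(text):
    """Turn decimal character references such as &#37; into characters."""
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


def expand_parameter_refs(text, entities):
    """Expand one %name; reference per pass and record each intermediate result.

    Assumes the entities are not recursive; otherwise this loop never ends.
    """
    passes = []
    while re.search(r"%(\w+);", text):
        text = re.sub(r"%(\w+);", lambda m: entities[m.group(1)], text, count=1)
        passes.append(text)
    return passes


# Character references are resolved when each entity value is read.
parameter_entities = {name: expand_char_refs(value)
                      for name, value in LITERALS.items()}

# Line 6 of the example contains only the reference %xx;
passes = expand_parameter_refs("%xx;", parameter_entities)

# The final pass is a general entity declaration; extract its name and value.
declaration = re.fullmatch(r'<!ENTITY\s+(\w+)\s+"([^"]*)"\s*>', passes[-1])
general_entities = {declaration.group(1): declaration.group(2)}

# Expand the general entity reference in the content of line 8.
content = "This sample shows a &tricky; method."
expanded = re.sub(r"&(\w+);", lambda m: general_entities[m.group(1)], content)

for number, text in enumerate(passes, start=1):
    print(f"pass {number}: {text}")
print(expanded)
```

Even in this simplified form, the entity tricky does not exist until two rounds of substitution have run, which is why a parser cannot resolve line 8 from a single reading of the DTD.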