This section is intended for people who have modified the TEI DTD and want to migrate these modifications from SGML to XML, i.e., who want to use the XML-based P4 DTD with equivalent modifications. We begin with some general remarks, then describe an example DTD modification that covers the most important issues, outline a recommended migration procedure, and finally carry out the key step hands-on using the example.
If the elements or content models that the TEI provides don't
quite meet the requirements of your project, there is an official
escape route: you can modify the DTD in a number of well-defined ways
and your documents still remain TEI conformant.
This involves
creating two extension files, setting some parameter entities,
possibly defining new elements or redefining existing ones, and
making these modifications known to the parser in the DTD subset at
the beginning of the document.
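As a rough illustration of that last step, the DTD subset of an XML document using the P4 DTD might look like the fragment below. The extension file names my_xml.ent and my_xml.dtd are the ones used later in this section; the system identifier tei2.dtd and the choice of the prose base tag set are assumptions that depend on the local installation and on the project.

```xml
<?xml version="1.0"?>
<!DOCTYPE TEI.2 SYSTEM "tei2.dtd" [
  <!-- select the XML flavour of the P4 DTD -->
  <!ENTITY % TEI.XML   "INCLUDE">
  <!-- base tag set (assumed here: prose) -->
  <!ENTITY % TEI.prose "INCLUDE">
  <!-- make the two local extension files known to the parser -->
  <!ENTITY % TEI.extensions.ent SYSTEM "my_xml.ent">
  <!ENTITY % TEI.extensions.dtd SYSTEM "my_xml.dtd">
]>
<!-- the TEI.2 document element follows here -->
```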
Although the process is a lot simpler than it looks at first
glance, many people have taken unofficial escape routes, especially
the users of the TEI Lite DTD, who would have been required to first
switch to a full TEI DTD before applying local extensions. It is
admittedly simpler to just open your local copy of teilite.dtd
and change a few lines. Only later will you find out why the TEI
Guidelines don't advertise this, and one of those moments could be
the migration of your customized DTD to XML.
This section is meant to support you in migrating DTD extensions made using the official procedures. If you went your own way when you hacked your DTD, there is little general advice to give, except this: you can either take the new XML-based TEI DTDs and reapply your hacks, or you can take this occasion to become conformant by redoing your modifications the official way. We recommend the latter. The hassle of merely finding the modifications in a patched DTD should convince you that the little extra effort needed for conformance is worthwhile.
That said, what types of TEI extensions exist, and what is involved in migrating them from SGML to XML? The Guidelines distinguish four kinds of modification:
- deletion of elements;
- renaming of elements;
- extension of classes;
- modification of content models or attribute lists.
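For orientation, the first three kinds of modification usually amount to one parameter entity declaration each in the entity extension file. The following is only a sketch: the elements chosen (mentioned, p) and the new names para and myElem are hypothetical and are not the modifications of the tutorial example in this section.

```xml
<!-- deletion: switch off the marked section that declares the element -->
<!ENTITY % mentioned "IGNORE">

<!-- renaming: give the element p the generic identifier para -->
<!ENTITY % n.p "para">

<!-- class extension: add a (hypothetical) new element myElem to the
     phrase class; myElem itself must then be declared in the DTD
     extension file -->
<!ENTITY % x.phrase "myElem |">
```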
For practical purposes, the fourth item can be subdivided into:
The first three cases are extremely easy; the items of group 4 require more detailed attention. The following is a short list of some critical issues involved. In the following subsections, we will work through a fictitious example that covers most of these issues.
- SGML element declarations contain the tag omission indicators "-" or "O" that indicate whether start and end tag are required or can be omitted. These indicators no longer exist in XML DTDs, so your private DTD snippets need to be modified.
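To illustrate the point about tag omission indicators, here is a sketch with two hypothetical element declarations (the names myMarker and myNote are invented for illustration and do not come from the TEI DTD). The two forms are alternatives and would not appear together in one file.

```xml
<!-- SGML form: "- O" = start tag required, end tag may be omitted;
     "- -" = start and end tag both required -->
<!ELEMENT myMarker - O EMPTY>
<!ELEMENT myNote   - - (#PCDATA)>

<!-- XML form: the same declarations without the indicators -->
<!ELEMENT myMarker EMPTY>
<!ELEMENT myNote   (#PCDATA)>
```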
In this subsection, we will do some simple TEI DTD modifications
in SGML. This will then serve as a tutorial example for the
migration to XML. While working on this example, the main problems
in converting DTD extensions should be covered. Not everyone will
need everything treated here, and some needs might not be covered,
but this should be an easy, hands-on starting point for most
projects.
Let's assume that five years ago, we wanted a TEI P3 DTD for prose that meets the following extra requirements (these requirements are tutorial examples only and make no statement about recommended TEI practice):
imageurl that contains a URL for an image of the page;
These requirements can be cast into TEI SGML by creating two files, my_sgml.ent and my_sgml.dtd, that look as follows:
A sample document godot.sgml using these extensions would look like this:
This example will be migrated to TEI P4 XML in the next two subsections.
Although the following step-by-step list may sound over-protective, this approach is recommended to help you keep a clear head while you are converting DTD and documents. You are switching from P3 to P4, from SGML to XML DTDs, from SGML to XML documents and from SGML to XML parsers at the same time, and it can be difficult to find one's way through these many potential pitfalls.
Of the procedure recommended above, we will now focus on rewriting the
DTD extensions in XML, with the example DTD modification described
earlier as a basis. We will be creating the files my_xml.ent and
my_xml.dtd.
If we first check for consistent case of the element and attribute
names, we discover that element
Some things are easy: the renaming of
We have to decide now what to do with the imageUrl
and it will be
merged with the existing ATTLIST in the TEI DTD files; there is no need
to suppress and copy the definition of
On the other hand, the P4 DTD can be used to parse both XML and SGML,
and we can do the same for our customized DTD if we need to support
SGML in parallel for a while. In that case we need a different
solution, one that makes use of the parameter entities om.RO etc. that
the P4 DTD introduces for writing DTDs that are parseable as both SGML
and XML. om.RO is used for elements that require only a start tag
(mostly empty elements); om.RR is used for elements that require start
and end tag (non-empty elements should be defined that way). See the
example files for the application of these parameter entities.
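As a sketch of how these parameter entities are applied, the following declarations use two hypothetical elements (myMarker and myNote are invented names, not TEI elements). It assumes that the P4 DTD expands om.RO and om.RR to the SGML tag omission indicators when parsed as SGML and to nothing when parsed as XML, so that one declaration serves both parsers.

```xml
<!-- empty element: only the start tag is required in SGML -->
<!ELEMENT myMarker %om.RO; EMPTY>

<!-- non-empty element: start and end tag are both required -->
<!ELEMENT myNote %om.RR; (#PCDATA)>
```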
The most difficult problem is the
Also, the simple way of allowing Incl
that is part of
every content model within
The first runs with the XML parser result in many warnings because of
redefined parameter entities; this is normal. Some syntax correction is
required where XML is stricter than SGML: we forgot a semicolon in a
parameter entity reference, %paraContent must not be in brackets
while the #PCDATA for
The cross-check of the double-use version with the SGML parser exposes a small additional problem: the document now uses the character entities &lt; and &gt;, which are predefined in XML but not in SGML; once discovered, this is easily fixed.
The reworked extension files in the XML-only form look like this:
In the double-use form, the DTD extension file comes out a little
longer (my_double.ent stays the same):
We can easily convert our short test document manually. All that
needs to change is the initial XML declaration, the XML-specific
parameter entity, the empty-tag syntax for