Tobias Rischer, 2 Nov 2002.

Applying my checklist to a sample from the TEI web page. It took me about 30 minutes, with small interruptions to check on the dinner in the oven.

101. What is the title and source repository of the sample? What files are there?
"The Poetical Works of Elizabeth Margaret Chandler, 1845" from the Women Writers Project. The sample consists of three files: wwp-chandler-poetical.sgml, wwp-chandler-poetical.xml, and wwp-store-xml.flat.dtd. The following analysis looks at the SGML file.

102. Does the sample come with: no DTD / copy of standard DTD (which?) / Pizza DTD?
It comes with a Pizza XML DTD -- so that DTD can't be used for parsing the SGML.

103. Was the sample (by the look of it) generated by a program, or written/edited by a person?
It looks hand-written; there are line breaks and indentation for clarity, but no rigid pretty-printing.

104. Is the sample document all in one file, or distributed over several SGML files?
All in one file. It doesn't seem to reference non-SGML files either (no tags).

201. Is the sample valid (parseable) SGML? (with its own DTD / with some standard TEI DTD?)
A special SGML DTD is required but is not included, so this cannot be checked. (The XML version parses against the XML DTD provided.)

202. Is the sample already XML? Is it valid?
Both an SGML and an XML version are there; the XML version parses against its DTD (see 201).

203. Does the sample use SUBDOCs? For anything other than WSDs?
No.

204. Are all elements fully tagged without minimization techniques?
No, there are some unclosed tags (checked for ; one spot is at line 1978, but it is not the only one).

205. Are all attribute values quoted?
Yes (ran a perl script and glanced over the document).

206. Are there any omitted attribute names (as in )?
I guess not; it seems pretty strict in its attribute syntax. I didn't check properly.

207. Does the text use SDATA entity references for well-known (Unicode) characters? Are there any self-defined / non-ISO / non-Unicode SDATA entities?
The text uses non-standard entities -- e.g. &legalese; and &rule; -- but which of those are SDATA cannot be determined without the DTD.

208. Are there comments? In formats not legal in XML?
Just two, and they are XML-safe.

209. Are there Processing Instructions? Do they start with a name?
No (grepped for "<?").

210. Does the sample use really obscure SGML features? (CONCUR, ...)
Most probably not.

211. What kind of warnings and errors do you get from sx? Something not yet probed by the previous checks?
Pointless without the DTD.

301. On which TEI DTD is the sample based? (P2, P3, P4, TEILite, unknown)
Seemingly a customized P3 Pizza DTD (judging by the XML DTD and assuming it is a close relative of the original SGML DTD). There is a tag "hyperDiv" that is not TEI (found it with the perl code for item 302).

302. Does the sample (consistently) use the TEI camelCase spelling?
It seems so (checked the tag list created with perl code; didn't check attributes).

303. Is there a substantial DTD subset? (That is the part between [ and ] at the beginning of the document, within the DOCTYPE declaration.) Does it contain more than ENTITY declarations with TEI DTD parameters and invocations of character sets?
None at all. It's all in the custom DTD.

304. Does the sample DTD rename TEI tags?
All tags in the list for 302 looked familiar.

305. Are there real DTD modifications? With recommended technique or by editing DTD files?
The new element <hyperDiv>; otherwise I don't know. This one is discreetly merged into the Pizza DTD.
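
The perl scripts used for items 205, 209 and 302 are not reproduced in this report. As a rough illustration only, the following Python sketch shows one way such checks could be done with regular expressions. It is not the original code; the names `survey`, `START_TAG`, `QUOTED_VALUE` and `UNQUOTED_VALUE` and the small sample string are made up for this example, and the patterns assume that no attribute value contains "<" or ">".

```python
import re
from collections import Counter

# Hypothetical patterns for this sketch (not the author's perl code).
# A start tag: element name, optional raw attribute text, optional "/" before ">".
START_TAG = re.compile(r"<([A-Za-z][\w.:-]*)(\s[^<>]*?)?\s*/?>")
# A quoted attribute value, with double or single quotes.
QUOTED_VALUE = re.compile(r'"[^"]*"|\'[^\']*\'')
# An "=" followed by something other than a quote, i.e. an unquoted value.
UNQUOTED_VALUE = re.compile(r"=\s*[^\s\"']")


def survey(text):
    """Collect element names, unquoted attribute values and the count of '<?'."""
    tags = Counter()
    unquoted = []
    for match in START_TAG.finditer(text):
        tags[match.group(1)] += 1
        # Blank out quoted values first, so "=" inside a value is not flagged.
        attrs = QUOTED_VALUE.sub('""', match.group(2) or "")
        if UNQUOTED_VALUE.search(attrs):
            line = text.count("\n", 0, match.start()) + 1
            unquoted.append((line, match.group(0)))
    return {"tags": tags, "unquoted": unquoted, "pi_count": text.count("<?")}


# Invented sample input, only to show the shape of the result.
sample = (
    "<text>\n"
    "<hyperDiv>\n"
    '<div type="poem"><lg rend=indent><l>One</l>\n'
    "<l>Two</l></lg></div>\n"
    "</hyperDiv>\n"
    "</text>"
)
result = survey(sample)
for name in sorted(result["tags"]):
    print(name, result["tags"][name])
print("unquoted:", result["unquoted"])
print("processing instructions:", result["pi_count"])
```

The sorted tag list is meant to be read by eye, as in items 302 and 304: an unfamiliar name such as hyperDiv stands out, and inconsistent capitalisation shows up as two separate entries. Without the DTD this is only a lexical check, not a substitute for parsing.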