California Digital Library TEI Best Practice Guidelines for Encoding Printed Books

California Digital Library Structured Text Working Group

The encoding guidelines provided here are unedited working drafts produced by CDL's Structured Text Working Group. They should not be treated as final documents. Updated guidelines will be available in November 2004.


Table of Contents

Introduction
Using These Guidelines
1. General Instructions
1.1. File Management and ARKs
1.1.1. Naming
1.1.2. Associated Content Files
1.1.3. Image Files
1.2. Invoking the CDL TEI Printed Book DTD
1.3. Case Sensitivity
1.4. Character Encoding
1.5. Hyphenation
1.6. Extent of Encoding
1.7. Metadata Encoding and Transmission Standard (METS) Record
2. Encoding Practice
2.1. Root Element
2.1.1. <TEI.2>
2.2. Document Header
2.2.1. <teiHeader>
2.3. Text Structure
2.3.1. <text>
2.3.2. <group>
2.4. Front Matter
2.4.1. <titlePage>
2.4.2. Tables of Contents
2.5. Document Body
2.5.1. <body>
2.6. Back Matter
2.6.1. <back>
2.6.2. Appendices
2.6.3. Indexes
2.7. Divisions
2.7.1. <divn>
2.8. Division Headings, Openers, and Closers
2.8.1. <head>
2.8.2. <epigraph>
2.8.3. <byline>
2.8.4. <dateline>
2.8.5. <closer>
2.8.6. <trailer>
2.9. Paragraphs
2.9.1. <p>
2.10. Page Breaks and Milestones
2.10.1. <pb>
2.10.2. <fw>
2.10.3. <lb>
2.10.4. <milestone>
2.11. Typographical Phenomena and Formatting
2.11.1. <hi>
2.11.2. Nested <hi> Tags
2.11.3. <emph>
2.11.4. Alignment and Indention
2.12. Language Shifts
2.12.1. <foreign>
2.13. Quotations
2.13.1. <quote>
2.13.2. <cit>
2.14. Speech
2.14.1. <sp>
2.14.2. <speaker>
2.15. Verse
2.15.1. <divn> in Verse
2.15.2. <head> in Verse
2.15.3. <l>
2.15.4. <lg>
2.16. Notes
2.16.1. <note>
2.16.2. In-line notes
2.16.3. Footnotes
2.16.4. Endnotes
2.16.5. <bibl> in <note>
2.17. Names, Dates, and Addresses
2.17.1. <name>
2.17.2. <date>
2.17.3. <address>, <addrLine>
2.18. Lists
2.18.1. <list>
2.18.2. Standard Ordered Lists
2.18.3. Non-standard Ordered Lists
2.18.4. <label>
2.19. Bibliographies
2.19.1. <bibl>, <listBibl>
2.19.2. <title> levels
2.19.3. <note> in Bibliographic Citations
2.20. Internal Links and Cross References
2.20.1. <ref>
2.21. External Objects
2.21.1. <xref>
2.22. Graphic Elements
2.22.1. Tables
2.22.2. <figure>
2.22.3. Formulas
2.23. Arbitrary Containers and Segments
2.23.1. <seg>
2.23.2. <ab>
3. Quality Assurance
3.1. Validation
3.2. Best Practice Checking
3.3. Proofreading
Tag Library

Introduction

This document is part of a collection of best practice guidelines established by the California Digital Library's Structured Text Working Group for encoding electronic texts. The guidelines provide best practices for marking up XML documents in accordance with the Text Encoding Initiative's TEI P4: Guidelines for Electronic Text Encoding and Interchange (TEI P4). All projects submitting text documents to the CDL must follow the CDL TEI best practices in order to produce files that may be automatically ingested and distributed by the CDL. There are four separate but related guidelines available, each geared toward a specific type of text and each accompanied by a specific DTD.

All of the above guidelines also require projects to consult the CDL's separate, universal set of guidelines for creating a TEI header: California Digital Library Best Practice Guidelines for Encoding TEI Headers.

These documents assume that readers are already familiar with the basics of XML and TEI P4 and are only seeking guidance as to how to apply them to specific cases. In other words, the best practice guidelines are not exhaustive instructions on XML or the TEI. Not every element or attribute available through a particular CDL TEI DTD is discussed in depth, although a complete list of available elements and attributes for each set of best practices can be found in the appendix of each set of guidelines.

All CDL TEI best practices also assume that the electronic text being encoded is being derived principally, if not wholly, from an existing paper source document. That is, these guidelines are not expressly intended for projects creating born-digital texts, although they may be adapted for such use. These guidelines are intended for projects that are producing semi-diplomatic transcriptions of a source document with few if any editorial changes. While projects may choose not to reproduce the look or layout of a source document through their encoding, no emendation (meaning deliberate editorial change) of any textual element in the source document is permitted unless the project's emendation policy, spelling out what has been changed and what preserved, can be consistently applied and clearly explained in the document header.

The CDL TEI Best Practice Guidelines for Encoding Printed Books are for the encoding of printed volumes. Though they are intended to cover general books, they may also be used to encode rare books (published materials that have added value because of age, scarcity, aesthetic properties, association or subject matter, whether or not they include handwritten marginalia). Encoders of rare books, especially those books with marginalia, must consult the CDL TEI Manuscript guidelines as well, as they provide instruction on how to encode handwritten documents. Encoders of rare books should also be aware that all CDL guidelines are skewed toward capturing semantic content rather than physical description. Therefore, projects seeking to provide full physical description of artifactual documents will wish to supplement these guidelines. As with all CDL TEI guidelines, this document is meant to be used in conjunction with full documentation for TEI P4. Where an issue is not directly addressed in these guidelines, the official TEI Guidelines should be consulted.

Using These Guidelines

These guidelines are prescriptive. However, not all individual practices mentioned here are absolutely required for compliance with the standard. The following list provides the words and phrases that should serve as cues throughout this document as to whether a practice is required, recommended, or optional:

  • REQUIRED

    must, must not, will, will not, do, do not

    Unless the practice is followed, the document will not be considered valid as a CDL TEI document. Where possible, these practices will be enforced by the DTD or schema.

  • RECOMMENDED

    should, should not

    The recommendation should be followed if possible; it should only be violated if the encoder has a good reason for doing so. Where possible, these recommendations will be enforced by the CDL using a Schematron assertion language schema.

  • OPTIONAL

    may, may not, can, cannot

    Although suggested, the practice is optional. Encoders may choose other valid strategies as necessary.

If a question arises that cannot be resolved through consulting these guidelines, the encoder should consult official TEI P4 documentation. Throughout these guidelines, relevant sections of TEI P4 will be referenced using the following notation:

[P4: 11.2]

Chapter 1. General Instructions

1.1. File Management and ARKs

Every digital object submitted to the CDL, including objects that are associated files referenced by the main XML document, must be assigned an Archival Resource Key (ARK) that will serve as the object's unique and persistent identifier. Projects may obtain ARKs through the CDL for use in their encoding, or their files may automatically be assigned ARKs by the CDL upon ingest. The method by which a project's files will receive ARKs should be negotiated in advance and laid out in each project's submission agreement with the CDL.

For TEI files, each text's ARK will also be assigned as the value of the id attribute in the root element of the text's XML file. It will also be recorded in an <idno> element in the text's TEI header.

1.1.1. Naming

It is highly recommended that where possible the ARK also be used for naming TEI files, using the following convention:

ARK.xml, where "ARK" is the unique key assigned.

To facilitate the ingest of files, projects should use the following naming conventions for images, PDFs, and other associated content:

ARK_NAME.EXTENSION, where "ARK" is the unique key assigned, "NAME" is the result of whatever local naming convention has been applied to individual files, and "EXTENSION" is the normal file format extension (".gif", ".jpg", ".pdf", etc.).

type of file    ARK           file name
TEI             kt167nb66r    kt167nb66r.xml
GIF             kt167nb66r    kt167nb66r_fig002.gif

1.1.2. Associated Content Files

All digital objects referenced as external entities by a TEI document must first be declared as entities at the beginning of the document. The entity declaration must give the object's entity reference and then define the reference using the object's system identifier. The system identifier must either be a system path relative to the document or, preferably, a URL. Ideally, to facilitate the preview and ingest of TEI objects, projects should make their documents and all associated content files (DTDs, images, PDFs, etc.) available via the web. Entity declarations must use the full object filename and the appropriate file format notation (e.g., GIF, JPG, or PDF).

<!ENTITY fig002 SYSTEM "http://www.server.domain/figures/kt167nb66r_fig002.gif" NDATA GIF>
...
<figure id="fig002" entity="fig002" rend="block">
        

1.1.3. Image Files

The CDL will accept image files in either the GIF or JPEG format. If possible, two derivative images should be created for each plate, figure, graphic, or other pictorial element that appears as a discrete element in the text. One of these derivatives should be at web resolution (72 ppi) and the same size as the figure in the printed text. The other image should be at higher resolution (300 ppi), again at original size, but not exceeding 768 pixels in width. In-line images, such as images of formulas, need only be provided in the low-resolution version. When necessary, images should be cropped and flipped for proper orientation for web display. For more information about the CDL's digital image standards, see the California Digital Library Digital Object Standard: Metadata, Content and Encoding and the California Digital Library Digital Image Format Standards.

The master version of the image (usually a TIFF) does not need to be submitted to CDL. However, projects interested in preserving master images for future use should consider submitting them to the UC Libraries Digital Preservation Repository, scheduled to launch in 2005.

If images are to be supplied in multiple resolutions, it will be necessary to encode this fact in a metadata record conforming to the Metadata Encoding and Transmission Standard (METS) schema.

<fileGrp ID="figures">
   <fileGrp ID="fig1">
      <file ID="fig1-m" ADMID="image-rights" USE="med-res" MIMETYPE="image/gif">
         <FLocat LOCTYPE="URL" 
                 xlink:href="/dynaxml/data/cj/kt109nc2cj/figures/fig1.gif"/>
      </file>
      <file ID="fig1-h" ADMID="image-rights" USE="hi-res" MIMETYPE="image/gif">
         <FLocat LOCTYPE="URL" 
                 xlink:href="/dynaxml/data/cj/kt109nc2cj/figures/fig1_h.gif"/>
      </file>
   </fileGrp>
   ...
            

Please consult the CDL ingest team before constructing a METS record for objects with multiple resolutions.

1.2. Invoking the CDL TEI Printed Book DTD

All documents complying with these guidelines must explicitly invoke the CDL TEI Printed Book DTD. To do this, declare the TEI XML DTD and include the prose, figures, and linking tag sets. Then include the CDL user extension files and the entity "CDL.book". Other external entity declarations should directly follow. (See the section on associated files for instructions on how to declare entities.)

<!DOCTYPE TEI.2 SYSTEM "../dtd/tei2.dtd" [
<!ENTITY % TEI.XML "INCLUDE">
<!ENTITY % TEI.prose "INCLUDE">
<!ENTITY % TEI.figures "INCLUDE">
<!ENTITY % TEI.linking "INCLUDE">

<!ENTITY % TEI.extensions.ent SYSTEM '../dtd/CDL_base.ent'>
<!ENTITY % TEI.extensions.dtd SYSTEM '../dtd/CDL_base.dtd'>
<!ENTITY % CDL.book "INCLUDE">
. . .
<!ENTITY fig002 SYSTEM "http://www.server.domain/figures/kt167nb66r_fig002.gif" NDATA GIF>
. . . 
]>
            

Encoders of rare books should take note that the CDL TEI Printed Book DTD does not include the TEI's transcription tag set. Therefore, projects wishing to encode handwritten marginalia should instead use the CDL TEI Manuscript DTD. To do this, declare the TEI XML DTD and include the prose, figures, transcription, and linking tag sets. Then include the CDL user extension files and the entity "CDL.ms". To encode the printed text, follow the instructions detailed here. To encode the marginalia, follow the instructions detailed in the CDL TEI Manuscript guidelines.

<!DOCTYPE TEI.2 SYSTEM "../dtd/tei2.dtd" [
<!ENTITY % TEI.XML "INCLUDE">
<!ENTITY % TEI.prose "INCLUDE">
<!ENTITY % TEI.figures "INCLUDE">
<!ENTITY % TEI.transcr "INCLUDE">                  
<!ENTITY % TEI.linking "INCLUDE">

<!ENTITY % TEI.extensions.ent SYSTEM '../dtd/CDL_base.ent'>
<!ENTITY % TEI.extensions.dtd SYSTEM '../dtd/CDL_base.dtd'>
<!ENTITY % CDL.ms "INCLUDE">
. . .
<!ENTITY fig002 SYSTEM "http://www.server.domain/figures/kt167nb66r_fig002.gif" NDATA GIF>
. . . 
]>
            

[P4: 3.3]

1.3. Case Sensitivity

Please take note that XML is case-sensitive. All elements and attributes must be in the proper case to be valid. In the CDL TEI DTDs, all elements made up of compound words use the "camel case" format: e.g., "teiHeader" instead of "teiheader" or "TEIHEADER".
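
As a brief illustration of the rule above, the following sketch contrasts the correctly cased element name with the two incorrect forms mentioned. The comments are explanatory only and are not part of any CDL requirement.

```xml
<!-- Valid: the element name matches the case declared in the DTD -->
<teiHeader> . . . </teiHeader>

<!-- Invalid: neither form matches the declared name "teiHeader" -->
<teiheader> . . . </teiheader>
<TEIHEADER> . . . </TEIHEADER>
```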

1.4. Character Encoding

Special characters in the text must be encoded using the Unicode Standard (UTF-8) and documents must include "UTF-8" as the value of the encoding attribute in the XML declaration.

<?xml version="1.0" encoding="UTF-8"?>
            

Special characters may be incorporated into a document directly as native Unicode (à) or may be represented by numeric character entities. These numeric character entities can take either the decimal (&#224;) or hexadecimal forms (&#x00E0;). Characters must not be represented using named character entities (&agrave;), with the exception of those specifically exempted in the XML 1.0 Specification. These must be used to avoid validation errors:

character    description     entity
<            less than       &lt;
>            greater than    &gt;
&            ampersand       &amp;

<p>The &lt;body&gt; element contains the main body of the text.</p>

Named character entities must also be used within attribute values that need to contain double quotation marks, single quotation marks, or apostrophes. Use the following named character entities to avoid a parser error:

character    description                            entity
"            quotation marks                        &quot;
'            apostrophe or single quotation mark    &apos;

<name reg="Ol&apos; Yeller">

As part of the CDL ingest process, documents will be checked for the correct Unicode character encoding and rejected if nonconforming characters or encodings are detected.
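
The permitted ways of representing a special character can be compared side by side. In this sketch the sample word is an illustrative assumption, not taken from any CDL text. The three paragraphs are equivalent to an XML parser, and the final comment shows the named form that must not be used.

```xml
<p>voilà</p>          <!-- native Unicode -->
<p>voil&#224;</p>     <!-- decimal numeric form -->
<p>voil&#x00E0;</p>   <!-- hexadecimal numeric form -->
<!-- Not permitted: voil&agrave; (named character entity) -->
```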

1.5. Hyphenation

When encoding the text, take care not to transcribe end-of-line hyphens that have been introduced into the text as a result of typesetting. Record all hyphens that are required by the source for the correct spelling of a compound word or phrase. Similarly, record all hyphens that are absolutely necessary to the meaning of an expression, e.g., hyphens in dates, formulas, code, etc.
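
The distinction might be applied as in the following sketch. The source lines are an invented illustration, not taken from any CDL text. The hyphen that splits "encoding" across two lines was introduced by typesetting and is dropped, while the hyphen in "well-known" belongs to the spelling of the compound and is kept.

```xml
<!-- Source as printed (hypothetical):
       The guidelines for en-
       coding a well-known
       text are given below.
-->
<p>The guidelines for encoding a well-known text are given below.</p>
```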

1.6. Extent of Encoding

All sections of printed books should be encoded, from title pages up to, but not including, colophons. Bastard titles or series titles, series lists, or frontispieces need not be included. The half title following the front matter sections may also be ignored. If a particular section is encoded but need not be displayed or accessed, it may be commented out of the XML file. A project's specific policies regarding what has been encoded and what left out, including policies adopted at the suggestion of these guidelines, must be articulated in the file's <editorialDecl> in the <teiHeader>.
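
A minimal sketch of both practices follows: a policy statement recorded in the header, and a section that has been encoded but is commented out so that it is not displayed. The wording of the policy statement, the id value, the heading, and the type value "advertisement" are illustrative assumptions. The placement of <editorialDecl> inside <encodingDesc> follows TEI P4; consult the CDL TEI Header guidelines for the form the CDL expects.

```xml
<teiHeader>
  . . .
  <encodingDesc>
    <editorialDecl>
      <p>The series list and half title have not been encoded. The publisher's
        advertisements have been encoded but are commented out.</p>
    </editorialDecl>
  </encodingDesc>
  . . .
</teiHeader>
. . .
<back>
  . . .
  <!-- Encoded but not displayed:
  <div1 id="adv01" type="advertisement">
    <head>New Titles</head>
    <p> . . . </p>
  </div1>
  -->
</back>
```

Note that a commented-out section must not itself contain a double hyphen, since XML does not allow that sequence inside a comment.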

1.7. Metadata Encoding and Transmission Standard (METS) Record

The principal container for metadata at the CDL is a digital object's METS record. TEI documents should be submitted with as complete a METS record as possible. The CDL may generate METS records for projects that are unable to provide them. For more information, see the CDL METS Repository's web site.

Chapter 2. Encoding Practice


2.1. Root Element

2.1.1. <TEI.2>

Each document should contain one and only one <TEI.2> root element. The id attribute is required and must contain the unique ARK assigned to the text in question.

<TEI.2 id="kt5n39n99v">
            

2.2. Document Header

2.2.1. <teiHeader>

Generally, the <teiHeader> for each document must conform to the practices described in detail in the California Digital Library Best Practice Guidelines for Encoding TEI Headers. Those guidelines cover both mandatory practices as well as suggested or optional practices. It is often sufficient to follow the instructions there for encoding the mandatory minimal header. However, projects that will depend on the TEI header as their principal source of metadata (e.g., projects not providing their own METS records) are advised to use the recommendations for full header encoding.

CDL search indexing and metadata collection depend on using a crosswalk that maps individual TEI header elements to their Dublin Core Metadata Initiative (DC) equivalents. A detailed list of which elements in the TEI header map to which elements in DC can be found in Appendices A and B of the CDL TEI Header guidelines.

It is particularly important to note that every TEI document must make use of the <idno> element in the TEI header to record both the text's ARK and its local object identifier. Each must be given as the content of a separate <idno> element. The type attribute must be used to identify whether an "ARK" or "LOCAL" identifier is being given. These <idno> elements are essential to maintaining the link between the document and its various identities.

The following is an example of a minimal TEI header suitable for CDL TEI Printed Book documents.

<teiHeader type="cdl/tei bk">
  <fileDesc>
    <titleStmt>
      <title>The Opening of the Apartheid Mind : Electronic Version</title>
      <respStmt>
        <resp>Text encoder:</resp>
        <name reg="Hastings, Kirk">Kirk Hastings</name>
      </respStmt>
    </titleStmt>
    <extent>816 Kb</extent>
    <publicationStmt>
      <publisher>University of California Press</publisher>
      <pubPlace>Berkeley</pubPlace>
      <date>1993</date>
      <idno type="ARK">ark:/13030/ft958009mm</idno>
      <idno type="LOCAL">6178</idno>
    </publicationStmt>
    <sourceDesc>
      <biblFull>
        <titleStmt>
          <title>The Opening of the Apartheid Mind: Options for the New South Africa</title>
          <author><name>Heribert Adam</name> and <name>Kogila Moodley</name></author>
        </titleStmt>
        <editionStmt>
          <p>1st ed.</p>
        </editionStmt>
        <extent>xvi, 277 p. : map ; 23 cm.</extent>
        <publicationStmt>
          <publisher>University of California Press</publisher>
          <pubPlace>Berkeley</pubPlace>
          <date>1993</date>
          <idno type="ISBN">0520081994 (alk. paper)</idno>
        </publicationStmt>
      </biblFull> 
    </sourceDesc>
  </fileDesc>
</teiHeader>
            

[P4: 5.6]

2.3. Text Structure

2.3.1. <text>

The <teiHeader> is directly followed by the mandatory <text> element, which fully contains the content of the book being encoded. The <text> element contains three subelements: <front> for front matter (e.g., title pages, prefaces, and introductions), <body> for the main body of the text, and <back> for back matter (e.g., endnotes and appendices). Of these three, only <body> is required.

<TEI.2 id="ARK">
   <teiHeader> . . . </teiHeader>
   <text>
      <front> . . . </front>       OPTIONAL
      <body>                       REQUIRED
         <div1> . . . </div1>
      </body>
      <back>                       OPTIONAL
         <div1> . . . </div1>
      </back>
   </text>
</TEI.2>
            

[P4: 7.1]

2.3.2. <group>

Groups of individual texts are sometimes packaged together within a single document. Normally, the <divn> element will be enough to create the structural divisions necessary for documents that can share a TEI header and thus may be encoded together in a single file. However, groups of texts that each need their own distinct title pages or other <front> sections may be encoded using the <group> element. Each text would then be encoded within a separate <text> element within <group>. Each <text> element can carry its own <front> section. Avoid using <teiCorpus>.

<TEI.2>
  <teiHeader></teiHeader>
  <text>
    <front>
      <titlePage></titlePage>
    </front>
    <group>
      <text>
        <front>
          <titlePage></titlePage>
        </front>
        <body></body>
        <back></back>
      </text>
      <text></text>
      <text></text>
      <text></text>
    ...
    </group>
  </text>
</TEI.2>
        

2.4. Front Matter

The <front> element is used to contain the various components that make up front matter, including prefaces, introductions, and title pages. Each of these sections is normally contained within another structural element such as <titlePage> or a <divn>. For a full list of the types of <divn>s available in <front>, please see the section on divisions. In printed books, front matter can usually be distinguished from the body of the text because the page numbering almost always uses roman numerals.
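
The arrangement described above might be sketched as follows. The id values and the type values "preface" and "introduction" are illustrative assumptions; the type values actually supported are listed in the section on divisions.

```xml
<front>
  <titlePage> . . . </titlePage>
  <div1 id="fm01" type="preface">
    <head>Preface</head>
    <p> . . . </p>
  </div1>
  <div1 id="fm02" type="introduction">
    <head>Introduction</head>
    <p> . . . </p>
  </div1>
</front>
```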

2.4.1. <titlePage>

Do not encode a <titlePage> unless the book itself contains a formal title page. Title pages may use a number of formatting peculiarities, such as specific alignment, fonts, incidental images, etc. It is not necessary to attempt to reproduce the look of the title page in the book exactly. It is often enough to convey to users that the book has a title page, what textual information it contains, and the order in which the information appears.

Example:

<titlePage>
   <docTitle>
      <titlePart type="main">The Opening of <lb/>the Apartheid Mind</titlePart>
      <titlePart type="subtitle">Options for the New South Africa</titlePart>
   </docTitle>
   <docAuthor><name>Heribert Adam</name> and 
              <name>Kogila Moodley</name></docAuthor>
   <docImprint>
      <publisher>UNIVERSITY OF CALIFORNIA PRESS</publisher>
      <pubPlace>Berkeley · Los Angeles · London</pubPlace>
      <docDate>1993</docDate>
   </docImprint>
</titlePage>
          

2.4.1.1.  <docTitle>, <titlePart>

The <docTitle> element is required within <titlePage>. Use <titlePart> within <docTitle> to encode individual formal titles, subtitles, and other subsidiary title parts as they appear on the title page. If more than one <titlePart> is given, the type attribute is mandatory for all of them and must be used to classify each <titlePart>. Supported type attribute values are "main", "subtitle", "alternate", and "abbreviated". Any <titlePart> without a type attribute will be considered and formatted as a "main" title.

<docTitle>
      <titlePart type="main">Inventory of furniture and art.</titlePart>
</docTitle>
                    

2.4.1.2.  <docAuthor>

Record here the names of authors and others responsible for the intellectual content of the document as they appear on the title page. Each <docAuthor> element will be displayed by the stylesheet on a single line. Therefore, projects may choose to encode multiple names within a single <docAuthor> if it is desired that they display on a single line, or may choose to repeat <docAuthor> if the names should be displayed on separate lines.

Projects may use the <name> element to surround each author's name. This practice is optional, but is particularly useful when more than one name has been encoded in a single <docAuthor>. The <name> element also allows projects to regularize names using the reg attribute.

In the content of <docAuthor> and <name>, names should be recorded as they appear on the title page. Do not attempt to reorder the name into catalog entry form or use the form of the name as it may appear in a name authority file. Again, the reg attribute may be used to correlate a name to an authority.

<docAuthor>
  <name>Tom Jennings </name> and 
  <name>Julia Hoffman, MD</name>
</docAuthor>
          

OR:

<docTitle>
  <titlePart type="main">Canine morphotypes and physiology</titlePart>
</docTitle>
<docAuthor><name reg="Jennings, Tom">Tom Jennings</name></docAuthor>
<docAuthor><name reg="Hoffman, Julia">Julia Hoffman, MD</name></docAuthor>
          

2.4.1.3. <byline>

Authors are frequently listed on the title page accompanied by a more explicit description of their role in the creation of the document; e.g., "foreword by" or simply "by." In such cases, encode both the <docAuthor>s and their statements of responsibility inside an encompassing <byline> element.

<docTitle>
  <titlePart type="main">Canine morphotypes and physiology</titlePart>
</docTitle>
<byline>By <docAuthor>Tom Jennings </docAuthor> and 
<docAuthor>Julia Hoffman, MD</docAuthor></byline>
          

2.4.1.4. <docImprint>, <pubPlace> , <publisher>

Record the remaining publication information in <docImprint>. Within <docImprint>, use <pubPlace> and <publisher> in any order and as often as necessary to record every place of publication and every publisher respectively.

<docImprint>
  <pubPlace> Collinsport:</pubPlace>
  <publisher> Stoddard and Associates, 1993.</publisher>
</docImprint>
          

2.4.1.5. <docDate>

Record copyright and publication dates within <docDate> in <docImprint>. Do not include any associated text or symbols such as the word "copyright" or the symbol "©". Such words and symbols may be kept in the surrounding <docImprint> element. A regularized form of the date may be encoded in ISO 8601:2000 5.2.1.1 standard form (e.g., YYYY-MM-DD) in the value attribute of the <docDate> element. This is useful if document dates need to be consistently indexed.

<docImprint>New York Publishing Company &#xA9; <docDate value="1971">1971.</docDate></docImprint>
          

[P4: 7.5]

2.4.1.6. <epigraph>

Record quotations that may appear on the title page in the <epigraph> element. Unattributed epigraphs may be recorded in a <quote> element within <epigraph>. Attributed quotations should be encoded in <cit> within <epigraph>. Within <cit>, the quotation is surrounded by <quote>, while the attribution is given inside <bibl>. (See the section on quotations for further information.)

<epigraph rend="italic">
  <quote>The price we pay one day may make us weep.</quote>
</epigraph>
                            
                            
<epigraph rend="italic">
  <cit>
    <quote>No man is an island, but some men are peninsulas.</quote>
    <bibl>Joe Haskell</bibl>
  </cit>
</epigraph> 
          

2.4.2. Tables of Contents

For every TEI document, the CDL will automatically create a navigational table of contents using the <head>s encoded throughout the document. Projects may therefore choose to forgo encoding the table of contents in a source document. However, projects wishing to retain the original table of contents, which will often differ from that which would be produced by collecting the document's <head>s, may encode the original in a <divn type="contents">. The table of contents is normally encoded as a <list>, using <ref>s to link each entry to its proper section. (See the section on lists for information on encoding lists. See the section on internal linking for more information on <ref>s.)

Contents
Upward  . . . . . . 1
January . . . . . . 4
Unto This Present . 7

<div1 type="contents">
  <head>Contents</head>
  <list type="simple">
    <item>Upward<ref target="p1" type="pageref" rend="align right">1</ref></item>
    <item>January<ref target="p4" type="pageref" rend="align right">4</ref></item>
    <item>Unto This Present<ref target="p7" type="pageref" rend="align right">7</ref></item>
  </list>
</div1>
            

2.5. Document Body

2.5.1. <body>

Containing the main body of the text, the mandatory <body> element is further subdivided into a hierarchy of nested divisions beginning with a mandatory <div1>. Use the type attribute in each <divn> to describe the type of section being encoded. For a full list of the types available, please see the section on divisions.

[P4: 7.1]
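
The nesting described in this section might look like the following sketch. The id values and headings are illustrative assumptions; the type values "chapter" and "section" are borrowed from examples elsewhere in these guidelines.

```xml
<body>
  <div1 id="ch01" type="chapter">
    <head>Chapter One</head>
    <p> . . . </p>
    <div2 id="ch01.s1" type="section">
      <head> . . . </head>
      <p> . . . </p>
    </div2>
  </div1>
  <div1 id="ch02" type="chapter">
    . . .
  </div1>
</body>
```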

2.6. Back Matter

2.6.1. <back>

The optional <back> element may contain any number of <divn> elements containing advertisements, afterwords, indexes, bibliographies, appendices, or other sections that appear at the end of the document after the main body of the text. Use the type attribute in each <divn> to describe the type of back matter being encoded. For a full list of the types of <divn>s available in <back>, please see the section on divisions.

<back>
   <div1 type="appendix">
      <head>Photographs</head>
      <p>The author was a prolific photographer who. . .
      </p>
   </div1>

            

2.6.2. Appendices

The <divn type="appendix"> element should be used within <back> for back matter sections collected together under a common heading, usually "Appendix".

<back>
  <div1 type="appendix">
    <head>Appendix</head>
    <div2 type="biography">
      <head>Niels Reimers Curriculum Vitae</head>
      ...
    </div2>
    <div2 type="section">
      <head>Stanford Office of Technology Licensing web page.</head>
      ...
    </div2>
    <div2 type="chronology">
      <head>Cohen/Boyer Patent Chronology.</head>
      ...
    </div2>
  </div1>
        

2.6.3. Indexes

The <divn type="index"> element should be used to encode indexes in <back>. Indexes should be encoded as lists or nested lists as appropriate. (Note that indexes encoded as lists will be displayed using the standard indention used for lists.) Page numbers in indexes should be tagged as <ref>s with target attributes containing the unique ids of the pages being referenced. (See the section on internal linking for more information.)

Example:

Index

[The numbers below represent page numbers in the volume. Clicking on the hyperlink will take you to the top of that page.]

  • Abbott, Grace, 137, 142, 143, 144

    • personality, 116, 138, 148

  • Acheson, Dean, 102

    • administration:

      • bureau autonomy, 220‑224

      • by presidential appointees (Puerto Rico), 120-122, 124-126

      • educational process, 73, 81, 87, 198‑199

<div1 type="index">
  <head>Index</head>
  <p>[The numbers below represent page numbers in the volume. 
    Clicking on the hyperlink will take you to the top of that page.]</p>
  <list type="simple">
    <item>Abbott, Grace, <ref target="p137">137</ref>, <ref target="p142">142</ref>,
      <ref target="p143">143</ref>, <ref target="p144">144</ref>
      <list>
        <item>personality, <ref target="p116">116</ref>, <ref target="p138">138</ref>,
          <ref target="p148">148</ref></item>
      </list>
    </item>
    <item>Acheson, Dean, <ref target="p102">102</ref></item>
    <item>administration:
      <list>
        <item>bureau autonomy, <ref target="p220">220‑224</ref></item>
        <item>by presidential appointees (Puerto Rico), <ref target="p120">120‑122</ref>,
          <ref target="p124">124‑126</ref></item>
        <item>educational process, <ref target="p73">73</ref>, <ref target="p81">81</ref>,
          <ref target="p87">87</ref>, <ref target="p198">198‑199</ref></item>
      </list>
    </item>
  </list>
</div1>
        

2.7. Divisions

2.7.1. <divn>

The <front>, <body>, and <back> elements in the document must use a hierarchical structure of numbered <divn> elements to identify their significant divisions. The elements <body> and <back> are both required to contain at least one <div1>. No unnumbered <div> or <div0> elements are permitted.

Each <divn> element throughout the text must have a unique id attribute to serve as an identifier. If necessary, these can be added automatically on ingest by the CDL, depending on the project's submission agreement with the CDL.

All <divn>s must also contain a type attribute describing the kind of division being encoded. Every attempt should be made to supply the most specific and consistent type values possible for <divn> elements.

<div1 id="ch01" type="chapter">
  <div2 id="ss1.1" type="ss1">
    <div3 id="ss2.1" type="ss2">
      <div4 id="ss3.1" type="ss3">
          

The following table lists the <divn> types available for printed books. Please note that the types listed below may be used for <divn>s in <front>, <body>, or <back> as necessary.

value          description
copyright      copyright information page for the printed book
dedication     book dedication, epigraph, or author's inscription
contents       table of contents
frontispiece   a pictorial frontispiece, possibly containing text
preface        a foreword or preface explaining the content, origin, or purpose of the text
fmsec          other front matter sections, such as illustration and table lists, acknowledgments, introductions, etc.
epigraph       epigraph appearing on its own page
halftitle      half title between the front matter and the text
volume         volume in a text that contains multiple volumes; this is rarely used
part           book part
chapter        book chapter
ss1-ss6        sub-sections 1-6; these have no relation to the number of the <divn> element itself and need not be hierarchically applied but should reflect the formatting and arrangement of the book itself
appendix       book appendix
endnotes       endnotes section in the back matter or at the end of a part, chapter or sub-section
glossary       book glossary
bibliography   book bibliography
index          book index
colophon       a statement that describes the conditions of the book's physical production, often including details about number of copies printed
advertisement  publisher's advertisements, or advertisements for other books or products
errata         errata
subscribers    lists of subscribers to the publication
bmsec          other back matter sections

[P4: 7.1.2]

2.8. Division Headings, Openers, and Closers

Significant textual divisions often open with a heading identifying the content of the division. They may also begin and end with phrases such as bylines, epigraphs, datelines, and the like.

2.8.1. <head>

The <head> element is used to record division headings, such as chapter or section titles, and is used by the system to allow users to navigate easily from one section to another.

Specific guidelines are supplied below regarding where <head>s may or may not appear. Generally, record headings as they appear in the source document.

Headings may be supplied by the encoder if they are not available in the text but are necessary in order to provide a way of navigating to a particular division. Headings may also be supplied in cases in which a <head> is necessary to conform to rules about when they must appear.

Supplied headings should be enclosed in square brackets or signalled by some other convention expressly detailed in the <editorialDecl> of the <teiHeader>.

Title transcribed from text:

<head>Chapter 4. The Ghost Returns to Middlington Manor.</head>
        

Title supplied by encoder:

          <head>[Segment 2]</head>
        

It is good practice to provide a <head> tag for all major textual divisions. In any case, the following rules must be strictly followed:

  1. If any <divn> at any level contains a <head>, then all of its sibling <divn>s at the same level must also contain a <head>. Therefore, if any <div1> uses a head, all <div1>s in the text must do so. If any <div2> contains a <head>, all other <div2>s nested with that <div2> in its parent <div1> must also contain <head>s, etc.

  2. If a <divn> at any level is left without a <head>, then any subordinate <divn>s below the headless <divn> are not permitted to have <head>s. Conversely, if any subordinate <divn> contains a head, the parent <divn> must also contain a <head>.

The following example is incorrect because one of the <divn> descendants contains a <head> but none of its ancestors contain one. If the rules are strictly followed, the single <div4> with a <head> forces all other <divn>s in the tree to contain <head>s:

<div1>
  <div2></div2>
  <div2></div2>
  <div2>
    <div3></div3>
    <div3>
      <div4><head></head></div4>
    </div3>
  </div2>
  <div2></div2>
</div1>
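
A version of the same tree that satisfies both rules might look like the following sketch. The bracketed heading text is a hypothetical placeholder for headings supplied by the encoder, and the id and type attributes are omitted as in the example above; only the placement of the <head> elements matters here.

```xml
<div1>
  <head>[Part 1]</head>
  <div2><head>[Section 1]</head></div2>
  <div2><head>[Section 2]</head></div2>
  <div2>
    <head>[Section 3]</head>
    <div3><head>[Subsection 1]</head></div3>
    <div3>
      <head>[Subsection 2]</head>
      <div4><head>[Item 1]</head></div4>
    </div3>
  </div2>
  <div2><head>[Section 4]</head></div2>
</div1>
```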
          

Multiple <head> elements may be differentiated using the type attribute (e.g., "subtitle" for a subtitle).

<div1 id="ch01">
  <head type="main"> . . . </head>
  <head type="subtitle"> . . . </head>
        

2.8.2. <epigraph>

Epigraphs contain quotations, anonymous or attributed, appearing at the start of a section, chapter, or other major division. They should be enclosed within the <epigraph> element. An epigraph appearing on a page by itself should be encoded in <epigraph> within a <divn type="epigraph">.

Within <epigraph>, attributed epigraphs should be enclosed entirely within the <cit> element, with <quote> containing the quoted passage and <bibl> containing the attribution. Within <quote>, use <p>, <lg>, or other block elements as necessary.

<epigraph>
  <cit>
    <quote>"I believe that any other ideal is impracticable and is a collision with human destiny
    and God."</quote>
    <bibl>Attributed to George Herron.</bibl>
  </cit>
</epigraph>
 
                        
<epigraph>
   <cit>
      <quote>
            <lg>
             <l>`Twas brillig, and the slithy toves</l>
             <l>Did gyre and gimble in the wabe:</l>
             <l>All mimsy were the borogoves,</l>
             <l>And the mome raths outgrabe.</l>
             </lg>
      </quote>
        <bibl>"Jabberwocky"--Lewis Carroll</bibl>
   </cit> 
</epigraph>
          

Within <epigraph>, unattributed epigraphs should simply be encoded within <quote>, with <p> and other block elements used as necessary to contain the quoted passage. There is no need to use <cit> for unattributed epigraphs.

<div1 id="ch01" type="chapter" n="1">
   <head>Chapter 1</head>
   <epigraph>
      <quote>
         <p>I pity the man who can travel from Dan to Beersheba</p>
      </quote>
   </epigraph>

<epigraph>
  <quote rend="italic">
    <lg>
      <l>What you have seen to love in me</l>
      <l>I do not know.</l>
      <l>What I have seen to love in thee</l>
      <l>No word can show. </l>
      <l>But word or knowledge, dear, we lay aside.</l>
      <l>We need them not for compass or for guide.</l>
      <l>By love we go.</l>
    </lg>
  </quote>
</epigraph>
        

2.8.3. <byline>

Bylines are formal statements of responsibility, which may sometimes be found near the top of a division (usually after a <head>) and sometimes at the bottom. Do not use <byline> to record attributive information for quoted passages; use instead the <cit>/<quote>/<bibl> structure described in the section on quotations. Do not use <byline> for the attribution of correspondence, which is normally signed (<signed>). Do not use <byline> when a more complete bibliographical citation is present; in that case <bibl> is normally more appropriate. (See the section on bibliographic citation.) Take care not to confuse the use of <byline> and similar elements within <divn>s with their use within formal <titlePage>s.

<div1 type="introduction">
  <head>Introduction</head> 
  <byline>by Sherna Gluck</byline>
  <p>The following interviews with Sylvie Thygeson represent two distinct interviews ...

                            
<div2 type="essay">
  <head>In the Public Interest——Jeannette Rankin</head>
  <bibl>by <author>Ralph Nadar</author>
    (<title rend="italic">The New Republic Feature Syndicate</title>
    <biblScope>Number 33</biblScope>
    <date>September 11, 1972</date>)
  </bibl>
  <p>WASHINGTON——A few weeks ago we sent a questionnaire ...
        

2.8.4. <dateline>

Use <dateline> to encode a place and date associated with the creation of the document. Encode the place name directly within <dateline>, but use <date> to enclose the date itself within <dateline>. When additional address information is available, use <address> within <dateline>. (See the section on addresses.) As with <byline>, do not use <dateline> to encode more complete bibliographic citations. Use <bibl> instead.

Example:

<div1 type="chapter">
  <head>Prologue</head>
  <dateline><date>March 1945</date>: Shensi Province, China</dateline>
  <p>A dull orange haze, the first light of dawn, ...
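
When fuller address information accompanies the date, the rule above suggests nesting <address> inside <dateline>. A minimal sketch; the place and date shown are invented for illustration and do not come from any source document:

```xml
<dateline>
  <address>
    <addrLine>Berkeley, California</addrLine>
  </address>
  <date>June 3, 1972</date>
</dateline>
```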
        

2.8.5. <closer>

Often poems, chapters, or essays will end with a closing statement, such as "The End" or "Finis," that is not considered part of the section it closes. These statements can be enclosed within <closer>.

          <closer>Finis</closer>
        

If a single poem or essay ends with "The End" or "Finis," the statement should be tagged using <closer> inside the <divn>.

If the last poem or essay in the book ends with "The End" or "Finis," consider the statement as applying to the entire book, and encode it as a <closer> outside of the last <divn>, but inside the <body> of the text.

Dates and datelines that act as closers may be, but need not necessarily be, encoded in <closer>.

                
<div1 type="poem">
                    . . .
<lg type="stanza">
<l>Nor certitude, nor peace, nor help for pain;</l>
<l>And we are here as on a darkling plain</l>
<l>Swept with confused alarms of struggle and flight,</l>
<l>Where ignorant armies clash by night.</l>
</lg>
<closer><date>1867</date></closer> 
</div1>                    
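
When the closing statement applies to the whole book, the <closer> moves outside the final division but stays inside <body>. A minimal sketch of that placement, assuming a hypothetical final chapter with the id "ch12":

```xml
<body>
  . . .
  <div1 id="ch12" type="chapter">
    <head>Chapter 12</head>
    <p>. . .</p>
  </div1>
  <closer>The End</closer>
</body>
```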
                    
                

2.8.6. <trailer>

Use the <trailer> element to encode printers' or publishers' names and addresses that appear at the end of the book. Use <address> and <addrLine> within <trailer> as necessary.
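
A sketch of a <trailer> recording a printer's name and address. The printer named here is invented for illustration, and placing the <trailer> at the end of a colophon <div1> in <back> is an assumption, since these guidelines do not say where it should sit.

```xml
<back>
  <div1 id="col01" type="colophon">
    <p>. . .</p>
    <trailer>Printed by Example Press
      <address>
        <addrLine>12 Sample Street</addrLine>
        <addrLine>San Francisco</addrLine>
      </address>
    </trailer>
  </div1>
</back>
```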

[P4: 7.2]

2.9. Paragraphs

2.9.1. <p>

The paragraph is the fundamental organizational unit for all prose texts. Paragraphs are encoded within <p>s, which, by default, begin a new line and are displayed with the first line indented. To dictate a different display, use the rend attribute in <p>. Please see the section on alignment and indention for a list of available rend values.

                
<p>In another moment down went Alice after it, never once
considering how in the world she was to get out again.</p>               
                    
                

[P4: 6.1]

2.10. Page Breaks and Milestones

Milestones are empty elements (<lb>, <milestone>, <pb>) that serve a function in the text analogous to the one mileposts serve on a road. They are used to mark significant points in the text, often beginnings or endings of sections, that exist outside the hierarchy of <divn> containers.

2.10.1. <pb>

Projects must use the empty <pb> element to mark the beginning of each physical page of the source document (including the first page). The <pb> element should be placed at the beginning of each page, but entirely within any overlapping <divn>. Never encode <pb>s between <divn> elements. All such interstitial page breaks should be encoded as if they belonged to the nearest subsequent <divn>, before the <head> element. If a page break occurs in the middle of a smaller block element (e.g., <p>), it can simply be encoded there.
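
The placement rules above can be sketched as follows, with hypothetical page numbers, ids, and text. The first <pb> falls between two chapters and so is encoded inside the following <div1>, before its <head>; the second falls in the middle of a paragraph and is encoded where it occurs.

```xml
<div1 id="ch01" type="chapter">
  . . .
  <p>. . . the last paragraph of the first chapter.</p>
</div1>
<div1 id="ch02" type="chapter">
  <pb n="15" id="p15"/>
  <head>Chapter 2</head>
  <p>The paragraph begins on one page
  <pb n="16" id="p16"/>
  and continues on the next.</p>
</div1>
```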

If desired, the n attribute of <pb> may be used to record page numbers as they appear in the source document so that the system can subsequently render those page numbers for display. Do not supply page numbers if they do not exist in the source document. Page numbers should be recorded using the n attribute of the <pb> element at the beginning of the page, regardless of where the number appears on the document.

<div1 type="chapter" n="I" id="ch01">
  <pb n="1" id="p1"/>
  <head>Introduction</head>

          

If a page number is given, the id attribute is also highly recommended. If anything is linked to the page breaks (such as an index entry or table of contents that refers to pages), the id attribute is required.

<p>of the Sea, <ref target="p1" type="pageref">1</ref></p>
. . .
<pb n="1" id="p1"/>
          

2.10.2. <fw>

Projects that wish to capture catchwords and running heads may record them in <fw> ('forme work'). The <fw> element must contain a type attribute. Possible values are:

value   description
header  a running title at the top of the page
footer  a running title at the bottom of the page
sig     a signature or gathering symbol
catch   a catch-word

The <fw> element should directly follow the <pb> element indicating the start of the page on which it appears, regardless of where it actually physically appears on the page. Projects that wish to record the location of the content of <fw> should use the place attribute.

<fw type="sig" place="bottom">C3</fw>
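
In context, each <fw> follows the <pb> for the page on which it appears. A sketch with a hypothetical page number, id, and running head; the place value "top" is assumed by analogy with "bottom" above:

```xml
<pb n="35" id="p35"/>
<fw type="header" place="top">The Ghost Returns</fw>
<fw type="sig" place="bottom">C3</fw>
<p>. . .</p>
```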
          

[P4:18.3]

2.10.3. <lb>

The <lb> element marks the start of a new line. Use this element only when it is absolutely essential to preserve line breaks as they appear in the source document. (Note that the <lb> tag is intended for producing line breaks in prose only. The <l> element must be used to encode lines of verse.)

<p>When I approached the door, I saw that its knocker yawned as a great
<lb/>O
<lb/>before me, impossibly heavy. . . </p>
          

2.10.4. <milestone>

The empty <milestone> element may be used to mark significant boundaries between sections of text that are neither page breaks nor normal divisions. For instance, it may be used to encode the decorative section breaks common to monographs. The unit attribute is required to describe the kind of break being marked. The n attribute must be used to record any characters or symbols that are used to create the boundary.

<milestone unit="endPart" n="&#x2766;"/>
                        
<milestone unit="endPart" n="****"/>
          

[P4: 6.9.3]

2.11. Typographical Phenomena and Formatting

2.11.1. <hi>

Record font changes and other typographical highlighting with the <hi> element. Use the required rend attribute to record the type of font shift employed in the source document. Unless otherwise stated in the <editorialDecl>, the value of the rend attribute must convey and ultimately display (if possible) the actual marking in the source document. In other words, do not use <hi> to introduce editorial changes to a text's typesetting.

When text with special formatting has already been tagged for other structure or content, and when the special formatting is consistent, the rend value can be applied directly to the encompassing tag. For example, if the contents of <name> are underscored, or if the contents of <p> are entirely in bold font, then the rend values of those tags can be defined accordingly. Because rend is a global attribute, it is available for all TEI elements. When special formatting does not coincide perfectly with an encompassing tag (as is often the case), <hi> is used to surround the special text.

          <p><hi rend="underline">Where</hi> did he go?</p>
                        
          <head rend="smallcaps">The Last Stand</head> 
          

The CDL supports the following rend values for display:

value          display
normal         standard font for the document; unemphasized, unhighlighted text; should be used to format unemphasized text in the middle of an emphasized passage
mono           mono-spaced font, e.g., Courier
italic         italics
smallcaps      small caps
bold           bold
bolder         extra bold
lighter        extra light
underline      underscored
overline       written with a line drawn above the text
strikethrough  strikethrough
subscript      below the baseline of standard text
superscript    above the baseline of standard text
hide           do not display

Projects requiring more specialized display may include syntax from the Cascading Style Sheet (CSS) standard in the rend attribute.

    
<p rend="color: white; background-color: red">This text will be white on a red background.</p>

2.11.2. Nested <hi> Tags

When multiple rend values are required for a single element, repeat <hi> elements as necessary. For instance, in the following example, the word "wow!" is rendered in both bold and italics.

        <hi rend="bold"><hi rend="italic">wow!</hi></hi>

Remember that once a rend value has been applied to a tag, the display is applied to the entire contents of that tag unless it is explicitly negated by another tag. For instance, the tagging

        <hi rend="bold">w<hi rend="italic">ow!</hi></hi> 

will produce the word "wow!" with the "w" in bold and the "ow!" in bold italics.

On the other hand, the tagging

        <hi rend="bold">w<hi rend="normal"><hi rend="italic">ow!</hi></hi></hi> 

will produce the word "wow!" with the "w" in bold and the "ow!" in italics only.

2.11.3. <emph>

If desired, the <emph> element may be used instead of <hi> to mark a typographic shift that explicitly conveys emphasis rather than simply a change in typography or other meaning. In the following example, the word "very" is underscored to provide emphasis.

        <hi rend="bold">Once Upon a Time</hi> Chicken Little decided to build a
        <emph rend="underline">very</emph> big house.
 

The same rend values available for <hi> are also available for <emph>, as they are for all rend attributes in any element.

[P4: 6.3.2.2]

2.11.4. Alignment and Indention

Alignment and indention of text can also be represented using the rend attribute in <hi> or any other encompassing tag. Available rend attribute values for alignment are:

value        display
left         justify left, ragged right, initial indent
center       center
right        justify right, ragged left, initial indent
justify      fully justify, initial indent
indent       standard paragraph indent
hang         hanging indent
blockindent  full block indent
blockquote   full block indent used for quotes (<quote>)
noindent     no initial indent

Projects requiring more precise alignment of text may also use CSS language within the rend attribute to describe the alignment required.

    [4em hanging indent]
<p rend="text-indent: -4em; margin-left: 4em">

Note that all <p>s are flush left with an initial indent by default, so any paragraph that should not be indented must be given a rend value of "noindent".
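
For example, a paragraph that should appear without the default indent could be tagged as in this sketch (the paragraph text is invented for illustration):

```xml
<p>The first paragraph takes the default initial indent.</p>
<p rend="noindent">This paragraph is displayed without an initial indent.</p>
```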

2.12. Language Shifts

2.12.1. <foreign>

Use the <foreign> element to tag text that appears in a language that will require the use of a different character set or writing direction. The lang attribute must contain the name of the applicable language as given in the <language> element of the TEI header. Note that the language must be declared in the TEI header in order for this attribute to function. The enclosed text should be input using the appropriate Unicode character entities. (See the section on character encoding.)


<profileDesc>
  <langUsage>
    <language id="Greek">(Range: 0370-03FF)</language>
  </langUsage>
. . .
<foreign lang="Greek">&#x0371;&#x0372;&#x0399;</foreign>


ͱͲΙ

2.13. Quotations

Quotations that are set apart from the rest of the text by quotation marks need not be specially encoded. Quotation marks are normally left intact in the text and, if possible, recorded in the form in which they appear (i.e., straight or curly, single or double). (The exceptions to this rule are quotation marks around <title>s in bibliographic citations that use the level attribute to provide their formatting [see the section on bibliographies].)

Quotations that employ formatting beyond the simple use of quotation marks must be specifically tagged. Simple block quotes containing only one paragraph may be recorded using <p>.

<p rend="blockindent">It was seen from the beginning of the study . . . </p>
      

2.13.1. <quote>

Quotes comprising multiple paragraphs or lines of verse should be enclosed in the <quote> element, with individual paragraphs contained in <p>s and lines of verse contained in <lg> and <l>.

      
<quote rend="blockquote">
  <p>It was seen from the beginning that the study . . . </p>
  . . .
</quote>
                        
 <quote>
    <lg>
      <l>What you have seen to love in me</l>
      <l>I do not know.</l>
      <l>What I have seen to love in thee</l>
      <l>No word can show. </l>
      <l>But word or knowledge, dear, we lay aside.</l>
      <l>We need them not for compass or for guide.</l>
      <l>By love we go.</l>
    </lg>
  </quote>  
          

2.13.2. <cit>

If desired, quotations that are accompanied by citations may be encoded using <cit>. Enclose both the quote and the citation within <cit>. The text of the quote should be further enclosed within <quote>, using <p>, <lg>, or <l> inside it as necessary, and the bibliographic citation should be further enclosed within <bibl>.

<cit>
   <quote>
      <l>Since I can do no good because a woman</l>
      <l>Reach constantly at something that is near it.</l>
   </quote>
   <bibl>
      <title>The Maid's Tragedy</title>
      <author>Beaumont and Fletcher</author>
   </bibl>
</cit>
                        
<cit>
      <quote>
            <lg>
             <l>`Twas brillig, and the slithy toves</l>
             <l>Did gyre and gimble in the wabe:</l>
             <l>All mimsy were the borogoves,</l>
             <l>And the mome raths outgrabe.</l>
             </lg>
      </quote>
        <bibl>"Jabberwocky"--Lewis Carroll</bibl>
 </cit> 
        

[P4: 6.3.3]

2.14. Speech

Texts that are made up primarily of attributed speech--e.g., plays, screenplays, and interview transcripts--should be encoded using the <sp> and <speaker> elements. Transcriptions of speech embedded in prose or verse texts may also be encoded using these elements.

2.14.1. <sp>

The <sp> element is used to contain instances of speech in a performance text or a transcript of spoken words in a prose or verse text. The entire speech along with its attribution should be encoded within <sp>. Within <sp>, use <p>, <lg>, and other block elements as necessary to format and contain the contents of the speech.

<sp>
   <speaker>FILCH.</speaker>
   <p>Sir, Black Moll hath sent word her Trial comes on in
      the Afternoon, and she hopes you will order Matters
      so as to bring her off.</p>
</sp>
<sp>
   <speaker>PEACHUM.</speaker>
   <p>Why, she may plead her Belly at worst; to my
      Knowledge she hath taken care of that Security.
      But, as the Wench is very active and industrious,
      you may satisfy her that I'll soften the Evidence.</p>
</sp>

The <sp> element may also carry an optional who attribute that gives the identity of the speaker. The value of who must refer to the id of a person previously identified in either a cast list (<role> in <castItem> in <castList> for dramas and screenplays) or a description of the participants in a transcribed speech or interview (<person> in <particDesc>). For more specific information on how to assign ids that would be valid in who, see P4: 10.1.4 for encoding cast lists and P4: 5.4 for encoding participants.

<profileDesc>
  <particDesc>
    <person id="LaBerge" role="interviewer/editor">
      <persName reg="LaBerge, Germaine">Germaine LaBerge</persName>
    </person>
    <person id="Bouche" role="interviewee">
      <persName reg="Bouché, Brieuc">Brieuc Bouché</persName>
    </person>
  </particDesc>
...
  <sp who="LaBerge">
    <speaker>LaBerge</speaker>
    <p>Why don't we start with where you were born, and a little bit about your family background?</p>
  </sp>
  <sp who="Bouche">
    <speaker>Bouché</speaker>
    <p>Yes. How much detail do you want? Full detail or just very sketchy?</p>
  </sp>          
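
For dramas, the same mechanism points at a <role> in the cast list instead of a <person>. A minimal sketch, assuming hypothetical ids "peachum" and "filch"; where the <castList> sits in the document is left open here (see P4: 10.1.4):

```xml
<castList>
  <castItem><role id="peachum">Peachum</role></castItem>
  <castItem><role id="filch">Filch</role></castItem>
</castList>
. . .
<sp who="filch">
  <speaker>FILCH.</speaker>
  <p>Sir, Black Moll hath sent word her Trial comes on in
     the Afternoon, and she hopes you will order Matters
     so as to bring her off.</p>
</sp>
```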
          

2.14.2. <speaker>

The <speaker> element is used within <sp> as a specialized form of heading giving the name of the speaker responsible for the spoken words. Encode the name of the speaker as it is given in the source document. Do not supply a name if one does not appear in the source text. The content of <speaker> is displayed in bold and flush left on a line preceding the text of the speech.

<sp who="LaBerge">
    <speaker>LaBerge</speaker>
    <p>Why don't we start with where you were born, and a little bit about your family background?</p>
  </sp>
  <sp who="Bouche">
    <speaker>Bouché</speaker>
    <p>Yes. How much detail do you want? Full detail or just very sketchy?</p>
  </sp>          
    

is displayed as:

LaBerge

Why don't we start with where you were born, and a little bit about your family background?

Bouché

Yes. How much detail do you want? Full detail or just very sketchy?

Projects that require a different kind of styling for the display of speaker names should use the rend attribute to override the default styling imposed by <speaker>.
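
For instance, a project that prefers speaker names in italics might write the following. This is only a sketch: whether "italic" replaces the default bold or combines with it depends on the display system, which these guidelines do not specify.

```xml
<sp who="LaBerge">
  <speaker rend="italic">LaBerge</speaker>
  <p>Why don't we start with where you were born, and a little bit about your family background?</p>
</sp>
```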

[P4: 10.2.2]

2.15. Verse

2.15.1. <divn> in Verse

Generally, verse or verse fragments in a text should be enclosed within a separate <divn> element with an identifying type attribute. Projects must enclose a poem in a <divn> if they wish to attach a searchable, indexable title to the poem using <head> or if they wish to encode a <closer> at the end of the poem. The most common type attribute values for verse are:

verse
poem
sonnet
drama
free-verse
song

If projects do not wish to enclose a poem within a separate <divn>, they may simply enclose its lines using the mandatory <lg> element. (See below.)

2.15.2. <head> in Verse

Projects may use the <head> element for all titles, subtitles, etc., for verse encoded within a <divn>, bearing in mind the rules for using <head> within <divn>. When more than one <head> is required, use the type attribute to describe the different types of headings or titles being applied. Any <head> element for verse that does not have a type attribute will be considered a "main" title.

      
<head type="main">
<head type="subtitle">
<head type="dedication">
        

2.15.3. <l>

Individual lines of verse must be surrounded by the <l> tag. Lines that are numbered may use the n attribute to encode the line number. Use the rend attribute as necessary to provide proper indention.

<l n="5" rend="indent">
      

2.15.4. <lg>

Regardless of whether verse is contained within its own <divn>, groups of lines must be encoded within the <lg> element, with each individual line also encoded in the <l> element. The <lg> tag is used to identify groups of lines that carry coherent poetic structure (i.e., function as a formal unit, such as a stanza) within a poem. The type of structure may be identified with the type attribute. Some available type values are:

stanza
verse
paragraph
couplet
quatrain
fragment
refrain

The value "fragment" should be used for line groups that do not carry poetic structure.

      
<div1 type="poem">
   <lg type="stanza">
      <l>How doth the little crocodile</l>
      <l>Improve his shining tail,</l>
      <l>And pour the waters of the Nile</l>
      <l>On every golden scale!</l>
   </lg>
</div1>
          

The following text could be tagged in different ways:

`Repeat, "You are Old, Father William,"' said the Caterpillar. 
                 Alice folded her hands, and began:--
                 `You are old, Father William,' the young man said,            
                          `And your hair has become very white; 
                  And yet you incessantly stand on your head- 
                            Do you think, at your age, it is right?' 
                            

within <divn>:

<div1 type="chap5">
. . .
<p>`Repeat, "You are Old, Father William,"' said the Caterpillar.</p>
<p>Alice folded her hands, and began:--</p>
<div2 type="poem">
<head type="poem-title" rend="center">[You Are Old, Father William]</head>
<lg type="stanza" rend="blockindent">
<l>`You are old, Father William,' the young man said,</l>
<l rend="indent">`And your hair has become very white;</l>
<l>And yet you incessantly stand on your head-</l>
<l rend="indent">Do you think, at your age, it is right?'</l>
</lg>
</div2>
          

or without <divn>:

. . .
<div1 type="chap5">
<p>`Repeat, "You are Old, Father William,"' said the Caterpillar.</p> 
<p>Alice folded her hands, and began:--</p>
<lg type="stanza" rend="blockindent"> 
<l>`You are old, Father William,' the young man said,</l>
<l rend="indent">`And your hair has become very white;</l>
<l>And yet you incessantly stand on your head-</l>
<l rend="indent">Do you think, at your age, it is right?'</l>
</lg>
            
          

2.16. Notes

2.16.1. <note>

Use the <note> element to encode notes, using the place attribute to indicate the location of the note. Available place values are:

value   type of note
end     endnote, note appears at the end of a chapter, part, or volume
foot    footnote, note appears at the foot of the page
inline  note appears as a marked section in the body of the text

Notes without the place attribute will be considered in-line. For notes that are tagged at the point of reference, the numbers attached to the notes (as distinct from reference numbers that are located elsewhere) are normally recorded as the value of the n attribute and should not be included in the text of the note itself. Similarly, dingbats, crosses, daggers, and the like used to label notes for referencing may also be recorded as Unicode characters within the n attribute. A separate <ref> is not necessary. If a note is targeted by a <ref> elsewhere, it must contain a unique id attribute. Be sure to enclose the contents of notes in <p>s or other appropriate block elements if necessary. (See the section on internal linking for more information about <ref>s.)

2.16.2. In-line notes

In-line notes may be tagged directly in place.

<p>Collections are ensembles of distinct entities or objects of any sort.
<note place="inline">We explain below why we use the uncommon term collection instead
of the expected set. Our usage corresponds to the aggregate of many mathematical writings
and to the sense of class found in older logical writings.</note> The elements. . .</p>

2.16.3. Footnotes

Footnotes (those references, notes, and citations appearing at the bottom of the page) must be encoded where they are referenced. In other words, at the location of the footnote reference in the text, embed the <note> itself in place. If a footnote is tagged in place and the n attribute contains the note's reference number, projects must not encode a separate <ref> with that same number in the same location. The result would be two duplicate numbers appearing in place at the point of reference. However, if no n attribute is given in <note>, then a separate <ref> may be used in place. In either case, other references to that footnote from other locations in the text may be tagged with <ref>. If a footnote is targeted by any <ref> anywhere in the text, it must include an id attribute. (See the section on internal linking.)

<p>...Whites, however, did not vote to transfer power 
<hi rend="italic">to</hi> the black majority, as the 
media reported, but only to share power. 
<note id="fn0.1" place="foot" n="*">
<p>The use of racial and ethnic labels is not meant 
to reproduce, uncritically...</p></note>...</p>
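
A later passage that refers back to the same footnote might be tagged as in the following sketch. The surrounding sentence is invented for illustration; the target value reuses the id from the example above, and the type value "fnoteref" is taken from the list of supported <ref> types in the section on internal linking.

```xml
<p>...As the earlier note on terminology explains,
<ref target="fn0.1" type="fnoteref">*</ref> these labels
are used with caution...</p>
```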
            

2.16.4. Endnotes

Endnotes (those appearing at the end of a chapter, section, or other significant textual division) must be encoded where they appear in the document, in a separate <divn> if necessary. For an endnote to function properly, the reference to the note in the text must be tagged with <ref> and each endnote <note> must carry an id attribute. Further, if projects wish to allow users to link directly from the note back to its reference in the text, then the id and corresp attributes must also be properly used in <ref> and <note> respectively. (See the section on internal linking for more detailed instructions.)

<p>...falsely assumed South Africa to be the only developed 
capitalist country “[that] is not only ‘objectively’ ripe for 
revolution but has actually entered a stage of overt and 
seemingly irreversible revolutionary struggle.”
<ref target="bn0.1" id="d0e912" type="noteref">1</ref> ...</p>

....

<div2 id="d0e1020" type="endnotes">
   <head type="main">Notes</head>
   <note id="bn0.1" place="end" n="1" corresp="d0e912">
      <p>Paul M. Sweezy and Harry Magdoff, “The Stakes in 
         South Africa,” <hi rend="italic">Monthly Review,
         </hi> April 1986.</p>
   </note>
</div2>
            

[P4: 6.8.1]

2.16.5. <bibl> in <note>

When the footnote is clearly bibliographic in nature, enclose it within the TEI <bibl> element inside <note>. Projects may further encode the author or authors as <author>, titles as <title>, dates as <date>, and references to a page number, span of page numbers, or chapters as <biblScope>. (See the section on bibliographic citations.)

Example:


  <note n="5" id="n5" place="foot">
    <bibl>
      <author>Gallagher, Robert S., </author> 
      <title level="a">I Was Arrested, Of Course, </title>an interview,
      <title level="j">American Heritage, </title>
      <date>February, 1974, </date> 
      <biblScope>pp. 17‑24, 92‑94. </biblScope>
    </bibl>
  </note>
 
        

2.17. Names, Dates, and Addresses

Although it is not required, it is sometimes useful to tag names, dates, and addresses as they occur throughout the text, not only when they occur on the title page. Tagging names and dates also allows them to be regularized in order to provide more fruitful searching.

2.17.1. <name>

The <name> element may be used to encode any proper noun or proper noun phrase. The type attribute can be used to indicate the type of name. Supported type values are "person" and "place". The reg attribute may be used to give a normalized or regularized form of the name.

At the time of the events which led to
<name reg="Benedict XII, Pope of Avignon (Jacques Fournier)" 
type="person">Fournier's</name> investigations, 
the local population consisted of between 200 and
250 inhabitants.
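
A place name can be tagged in the same way. In this sketch the sentence and the regularized form are invented for illustration; only the element and its type and reg attributes come from these guidelines.

```xml
The party reached <name reg="San Francisco (Calif.)"
type="place">Frisco</name> late in the autumn.
```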
        

2.17.2. <date>

Use <date> to encode a date that has been given in any format. The value attribute can be used to contain the value of the date in the standard ISO 8601:2000 5.2.1 format (e.g., YYYY-MM-DD). Again, this is useful if document dates need to be indexed for searching.

Because the <date> element is not directly allowed within <divn> it can be surrounded by <dateline> if necessary. When it appears at the beginning or end of a division, <date> is normally located within the <opener> or <closer> elements. Projects not wishing to use <opener> and <closer> may also insert <date> directly within <p> if that is appropriate.

<p>Given on the <date value="1977-06-12">Twelfth Day of June
in the Year of Our Lord One Thousand Nine Hundred and
Seventy-seven of the Republic the Two Hundredth and first
and of the University the Eighty-Sixth.</date></p>
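
Where a date stands on its own at the head of a division rather than inside a paragraph, it might be wrapped in <dateline> as sketched below. The id, heading, and date are hypothetical; the type value "chapter" is one of the supported division types.

```xml
<div1 id="ch01" type="chapter">
   <head>The Morning After</head>
   <dateline><date value="1906-04-19">April 19, 1906</date></dateline>
   <p>. . .</p>
</div1>
```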
        

2.17.3. <address>, <addrLine>

The <address> and <addrLine> elements can be used to encode postal or other addresses. Enclose the entire address within <address> and each individual line within <addrLine>.

<address>
   <addrLine>110 Southmoor Road,</addrLine>
   <addrLine>Oxford OX2 6RB,</addrLine>
   <addrLine>UK</addrLine>
</address>
        

Because <address> is not allowed directly in <divn>, when it appears at the beginning or end of a division, it normally is enclosed within the <opener> or <closer> elements. Projects not wishing to use <opener> or <closer> may insert <address> directly inside a <p> if that is appropriate.

<div1 type="letter">
  <head>Appendix: Letter to Earl Warren</head>
  <opener>
    <date>November 10, 1971</date>
    <address>
      <addrLine>Honorable Earl Warren</addrLine>
      <addrLine>Supreme Court of the United States</addrLine>
      <addrLine>Washington, D. C.</addrLine>
      <addrLine>Re: ACLU Proposed Earl Warren Civil Liberties Award</addrLine>
    </address>
    <salute>Dear Governor:</salute>
  </opener>
        

[P4: 6.4]

2.18. Lists

2.18.1. <list>

Individual items in a list must be encoded as <item>s within <list> rather than as a series of <p>s or <l>s. Use the <list> element's type attribute to define the type of list appearing in the document. Valid values for the type attribute are:

value      type of list
ordered    numbered lists and other lists with sequential markers
bulleted   marked or bulleted lists
simple     unmarked or unnumbered lists
gloss      definition lists (e.g., glossary, chronology, etc.) consisting of a term encoded in <label> and a definition or expansion of the term encoded in <item>
label      non-gloss lists whose items are each labeled with a <label>

Nest lists as appropriate, noting that they will be automatically indented to reflect the nesting. Use the <head> element to provide headings for lists.
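
A nested list with a heading might look like the following sketch. The content is invented for illustration; the inner list is placed inside the <item> it belongs to, which is where TEI allows a nested <list>.

```xml
<list type="bulleted">
   <head>Field equipment</head>
   <item>Tents</item>
   <item>Cooking gear
      <list type="simple">
         <item>Stove</item>
         <item>Kettle</item>
      </list>
   </item>
   <item>Maps</item>
</list>
```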

2.18.2. Standard Ordered Lists

Encode lists that include sequential markers, numbers, or letters as <list type="ordered">. Use the rend attribute to describe the kind of sequential system used. Each item in the list is encoded as an <item>, without the sequential marker. The rend attribute will tell the stylesheet what kind of enumerative system to supply for display. If no system is specified in the rend attribute, then the default system of "arabic" (arabic integers starting with "1.") will be applied. The available rend values are as follows.

value        enumerators
arabic       1., 2., 3., etc.
upperalpha   A., B., C., etc.
loweralpha   a., b., c., etc.
upperroman   I., II., III., etc.
lowerroman   i., ii., iii., etc.
supplied     non-standard enumerations encoded within each <item>'s n attribute (see below)

Departments

  A. English

  B. History

  C. Biology

  D. Political Science

<list type="ordered" rend="upperalpha">
   <head>Departments</head>
   <item>English</item>
   <item>History</item>
   <item>Biology</item>
   <item>Political Science</item>
</list>
              

2.18.3. Non-standard Ordered Lists

Lists that use a non-sequential or otherwise non-standard method of enumeration may still carry the type attribute value of "ordered" if the specific mark of numeration is explicitly supplied in the n attribute of each individual <item> element. Whatever is encoded as the value of the n attribute will be displayed exactly as the enumerator for the item. Therefore, don't forget to include punctuation if it is desired. In such cases, set the <list>'s rend attribute to "supplied".

  1. Food and supplies

  2. Medicine

  3. Fuel

  5. Fuel storage containers

  6. Radios

<list type="ordered" rend="supplied">
   <item n="1.">Food and supplies</item>
   <item n="2.">Medicine</item>
   <item n="3.">Fuel</item>
   <item n="5.">Fuel storage containers</item>
   <item n="6.">Radios</item>
</list>
              

Note that all <item>s in a <list rend="supplied" type="ordered"> must contain an n attribute, even if some of the items conform to the standard enumerative conventions. Again, never encode the sequential marker within the text of the <item> as well. Such encoding will usually result in duplicate markers appearing before each <item> in the list.

[P4: 6.7]

2.18.4. <label>

Rather than enumerators, items in a <list type="gloss"> have labels, such as headwords in a glossary or dates in a chronology. The <label> element is used to capture each label immediately preceding its associated <item>.


<list type="gloss" rend="label">
<label>1835</label><item>born in Florida, MO</item> 
<label>1848</label><item>apprenticed</item>
. . .
</list>

2.19. Bibliographies

2.19.1. <bibl>, <listBibl>

Individual bibliographic citations should be encoded using the <bibl> element. Groups of <bibl>s are further contained within a <listBibl>.

The <bibl> element allows unstructured bibliographic data, including standard bibliographic elements as well as uncontained text such as more discursive or descriptive citations or annotation. Unlike the stricter bibliographic containers found in the TEI, <bibl> allows the encoder some latitude both in the order of subelements and the level of encoding.

No elements are absolutely required within <bibl>. However, most projects will likely take advantage of the following: <author>, <date>, <title>, <pubPlace>, <publisher>, and <biblScope>.

<listBibl>
   <bibl id="bib010_ch02">
      <author>Johnson, Douglas W.</author> 
      <date>1919</date>. 
      <title level="m">Shore processes</title>. 
      <pubPlace>New York</pubPlace>, 
      <publisher>Wiley &amp; Sons</publisher>, 
      <biblScope type="pages">584 pp.</biblScope>, 
      <date>1919</date>.
   </bibl>
   . . .
</listBibl>
          

2.19.2. <title> levels

Projects using the <title> element may also use its level attribute to define the type of title being provided and dictate the standard typographic styling used to display the title. Therefore, <title>s that carry a level attribute do not need to be tagged again for italics, quotation marks, and the like. Titles that require special formatting not supported by the available levels can use the rend attribute to dictate the styling required. The supported attribute values and their resulting display are as follows:

value   type of title; type of styling
a       analytic title (article, poem, or other item published as part of a larger item); surrounded in quotation marks
m       monographic title (book, collection, or other item published as a distinct item, including single volumes of multi-volume works); italics
j       journal title; italics
s       series title; italics
u       title of unpublished material (including theses and dissertations unless published by a commercial press); surrounded in quotation marks

2.19.3. <note> in Bibliographic Citations

The CDL TEI DTDs all allow the <note> element within <bibl>. Use it to record notes, including in-line bibliographic annotation and footnotes, that occur within bibliographic citations.

                                     
<bibl><title level="m">Alice's adventures in Wonderland</title> by 
<author>Lewis Carroll</author>. 
<pubPlace>London</pubPlace>: 
<publisher>Macmillan</publisher>, 
<date value="1869">1869</date>.  
<note>This work is a remarkable example of the intersection of mathematics and literature.</note></bibl>
                    

[P4: 6.10.1]

2.20. Internal Links and Cross References

Internal references and links can take many forms: numbers in the text that point to endnotes, page numbers in indexes that point to specific pages, pointers to specific sections of the text (e.g., "See Section 2A"), or short form bibliographic references (e.g., "Baxter 1978"). The practice described in this section applies only to references pointing to elements within the same file. See the section on external references to point to locations outside the document.

2.20.1. <ref>

Internal references will be encoded using the <ref> element and are required to have both a target and type attribute to indicate the id of the element being targeted and the nature of the target. (No specific system need be employed for creating ids in the elements being targeted as long as they are unique and begin with a letter character [e.g., id="id001"].) The following type attribute values are supported for <ref>:

value        type of reference
citeref      bibliographic citation reference
figref       figure reference
fnoteref     footnote reference
formularef   formula reference
noteref      endnote or general note reference
pageref      reference to a <pb> element, such as would be used in an index
secref       section reference, usually used to refer to a chapter or subsection
tableref     table reference

(Note that the use of a <ref> for footnotes is normally optional as the in-line presence of the <note> will automatically create a reference. References to the footnote from other locations are to be treated as <ref>s. See the section on footnotes for more information.)

<ref target="enote1" type="noteref">1</ref>

<note id="enote1" place="end" n="1">
. . .
</note>
        

In order to create a bidirectional link (i.e., from the reference [i.e., <ref>] to the referenced object [e.g., <note>] and then from the object back to the reference), projects must also include a unique id attribute in the <ref>. The value of the id in <ref> is then recorded in the corresp attribute of the element that is being referenced.

<ref id="bkd0e131" target="d0e131" type="noteref">1</ref>

<note id="d0e131" corresp="bkd0e131" place="end" n="1">
. . .
</note>
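
The "pageref" and "secref" types follow the same pattern as the note references above. The sketch below is illustrative only: the id values, the index entry, and the section heading are hypothetical, and "ss1" is one of the supported division type values.

```xml
<!-- in the body of the text -->
<pb id="p045" n="45"/>
. . .
<div2 id="sec2a" type="ss1">
   <head>Section 2A</head>
   . . .
</div2>
. . .
<!-- in the index and in running text -->
<item>Sanctions, <ref target="p045" type="pageref">45</ref></item>
. . .
<p>See <ref target="sec2a" type="secref">Section 2A</ref>.</p>
```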
        

[P4: 6.6]

2.21. External Objects

2.21.1. <xref>

Use the <xref> element to refer to objects or locations outside of the encoded document. There are six attributes available for <xref>. Take care to note which of these are required.

doc
   use: contains the object's entity name
   possible values: local entity name; must resolve to a valid declared entity
   required?: required when href is not used
href
   use: contains the external URI, may be URL or ARK
   possible values: external URI (e.g., URL or ARK)
   required?: required when doc is not used
type
   use: indicates the type of object being linked to
   possible values: obj | mets | url | pdf | sound | video | stream
   required?: required
rend
   use: defines the way the linking takes place
   possible values: new | replace | embed | none
   required?: required
from
   use: contains the starting location of the portion of the digital object being linked to; also used to record single locations within objects
   possible values: usually a unique id on a structural element
   required?: optional
to
   use: contains the ending location of the portion of the digital object being linked to
   possible values: usually a unique id on a structural element; not required when only a single location in the object is being linked to
   required?: optional

Note that every <xref> must have either a doc attribute or an href attribute or both.

The following table describes the actions dictated by the rend attribute:

value     resulting action
new       a new window displaying the referenced external object appears
replace   document view replaced by the referenced external object
embed     the referenced external object is embedded in place
none      no action

URL:

<xref href="http://www.cdlib.org" type="url" rend="new">
          

Result: new window displaying referenced URL.

CDL digital object:

<xref href="ark:/13030/kt5n39n99v" type="obj" rend="replace" from="ch02">
          

Result: document view replaced with Chapter 2 of referenced object.

PDF document:

<xref doc="kt167nb66r_ch19.pdf" type="pdf" rend="new">
          

Result: new window displaying a PDF of Chapter 19.
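
The from and to attributes can be combined to link to a span of an object rather than a single location. The following is a sketch only, assuming the referenced object contains structural elements with the ids "ch02" and "ch04" (both hypothetical) and that the link text shown is what appears in the source.

```xml
<xref href="ark:/13030/kt5n39n99v" type="obj" rend="new"
      from="ch02" to="ch04">Chapters 2 through 4</xref>
```

Under those assumptions, the expected result would be a new window displaying the referenced object from Chapter 2 through Chapter 4.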

[P4: 14.2]

2.22. Graphic Elements

When encoding graphic elements such as illustrations, formulas, and tables, take special care to preserve both the information represented and, as far as possible, the form of presentation.

2.22.1. Tables

The CDL TEI Printed Book guidelines employ the full XHTML table module instead of the TEI default table scheme to encode tables. See the full XHTML table module guidelines for detailed instructions on how to encode tables. Projects should try as much as possible to encode for correct display in both Netscape and Internet Explorer browsers on the Windows and Mac platforms. (Take care to encode definition lists as <list type="gloss"> when encountered; these can sometimes be mistaken for two-column tables.)

<table id="tab001">
   <caption>PERCENTAGES OF THE EARTH'S SURFACE</caption>
   <colgroup span="3">
      <col align="right" span="1"/>
      <col align="char" char="." span="1"/>
      <col align="char" char="." span="1"/>
   </colgroup>
   <thead>
      <tr>
         <th>Latitude</th>
         <th>%</th>
         <th>Cumulative %</th>
      </tr>
   </thead>
   <tbody>
      <tr>
         <td>40 N 30 W</td>
         <td>8.68</td>
         <td>8.68</td>
      </tr>
      . . .
   </tbody>
</table>
        

[P4: 22.1]

2.22.2. <figure>

Figures, charts, plates, formulas, or any other component of the text that must be delivered as an image must be encoded using the <figure> element. Any <figure> must contain a unique id attribute and an entity attribute that contains a valid entity name that resolves to a real file. The entity named in the entity attribute must be declared at the beginning of the document in order for the document to validate and function properly during ingest and preview. See the sections on associated files and image files for detailed instructions on how to create entities and produce image files. The rend attribute is also required. The following rend values are available:

value    display
inline   in-line as part of a text string
block    as a block separate from the surrounding text
popup    linked to a higher resolution version; for pop-up figures use the following syntax: rend="popup(ENTITY_NAME)", where the value in the parentheses is a valid entity name

Figure captions may be encoded in the <head> element within <figure> using the type attribute value "caption".

<!ENTITY fig001   SYSTEM "http://www.server.domain/figures/fig001.gif" NDATA GIF>
<!ENTITY fig001_h SYSTEM "http://www.server.domain/figures/fig001_h.gif" NDATA GIF>
]>

<figure id="fig001" entity="fig001" rend="popup(fig001_h)">
   <head type="caption">Bottom topography in the South Atlantic Ocean.</head>
</figure>
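
A figure displayed as a separate block, with no higher resolution version, needs only one entity. The following is a minimal sketch; the entity name, URL, and caption are hypothetical.

```xml
<!ENTITY fig002 SYSTEM "http://www.server.domain/figures/fig002.gif" NDATA GIF>
]>
. . .
<figure id="fig002" entity="fig002" rend="block">
   <head type="caption">Surface currents in the South Atlantic Ocean.</head>
</figure>
```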
        

[P4: 22.3]

2.22.3. Formulas

2.22.3.1. Formulas in <figure>

The difficulty of encoding mathematical and chemical formulas almost always makes it necessary for projects to submit an image of a formula rather than a marked-up representation. To provide the image of a formula, use <figure>.

                
<!ENTITY formula001 SYSTEM "http://www.server.domain/kt168nb88r_formula001.gif" NDATA GIF>
<!ENTITY formula001_h SYSTEM "http://www.server.domain/figures/formula001_h.gif" NDATA GIF>
]>
. . .
<figure id="formula001" entity="formula001" rend="inline"/>
                    

2.22.3.2. <formula>

The CDL also supports the encoding of TeX formulas within the TEI's <formula> element. To encode TeX formulas, give the notation attribute a value of "TeX" and use the rend attribute to indicate whether the formula should be displayed "inline" or as a "block". Projects that wish to give both a TeX expression and an image of the formula may do both.

<formula notation="TeX" rend="block">
\[
\sigma_{s, \vartheta, p} = ({{\rho_{s, \vartheta, p}} -1})1000.
\]
</formula>
        

[P4: 22.2]

2.23. Arbitrary Containers and Segments

Arbitrary containers (<ab> and <seg>) can be nested virtually anywhere in the document and therefore can be used sparingly to resolve otherwise impossible encoding problems. When a necessary element is not valid in the location where it should logically go within a TEI document, an arbitrary container can be inserted in the correct place instead. The text can then either be tagged directly as the content of the arbitrary container, or it can be tagged first with the desired element, which is then placed inside the arbitrary container.

Arbitrary containers may also be used when no other available container element is appropriate for the text being marked up. This usage, however, should be very limited.

The type attribute is required for both <ab> and <seg> elements. Suggested attribute values for type are "figure", "illgrp", "tblgrp", and "text". Projects may assign other values as needed.

2.23.1. <seg>

Use <seg> to contain a segment of text or an element that may normally appear in a paragraph but needs to be encoded inside another element in which it is not otherwise allowed.

    
<address>
   <addrLine>The Compton Hotel<seg type="figure"><figure id="seal1" entity="fig001" rend="inline"/></seg></addrLine>
   <addrLine>1515 42nd Street</addrLine>
   <addrLine>Chicago, IL</addrLine>
</address> 
    

2.23.2. <ab>

Use <ab> to contain an element that may normally appear in a paragraph but needs to be encoded directly in a major division such as a <divn>, where it is not otherwise allowed.

<ab type="illgrp">
  <figure id="fig001" entity="kt167nb66r_fig001.gif" rend="block"/>
</ab>

      
          

[P4: 14.3]

Chapter 3. Quality Assurance

3.1. Validation

All documents must parse correctly before being submitted to the CDL. All texts will be validated on ingest and rejected if errors are detected.

3.2. Best Practice Checking

In addition to being validated against the supplied DTDs, documents will be checked for conformance to the appropriate CDL TEI best practice guidelines using a Schematron assertion language schema. Users can check their documents on their own by using the CDL Text Preview page:

http://texts.cdlib.org/dynaxml/preview.html

3.3. Proofreading

Proofreading the actual text of submitted documents is the responsibility of the contributor. It is highly recommended that all texts at least be spot-checked for major errors before submission. If the project warrants it, documents should be proofread by a professional using the CDL Text Preview page:

http://texts.cdlib.org/dynaxml/preview.html

Tag Library

Below you will find a brief description of every element supported under the CDL Standard for Printed Books and their attributes. Attribute value definitions take one of the following forms:

ENTITY                Entity name defined in an entity declaration (<!ENTITY fig1 SYSTEM "fig1.gif" NDATA GIF>)
ID                    Unique ID
ID REFERENCE          Reference to an existing ID
TEXT                  Unrestricted text
URI                   Uniform Resource Identifier
(OPTION1 | OPTION2)   A set list of optional values from which the encoder must choose

In addition, each element and attribute is declared REQUIRED, RECOMMENDED, or OPTIONAL. (See Using These Guidelines.)

<ab>

Anonymous block. Contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter-level elements analogous to, but without the semantic baggage of, a paragraph.

Attributes:

type  (illgrp | tblgrp | text)  REQUIRED

See Also Arbitrary Containers .

<address>

Contains a postal or other address, for example of a publisher, an organization, or an individual.

See Also Names, Dates, and Addresses .

<addrLine>

Contains one line of a postal or other address.

See Also Names, Dates, and Addresses .

<author>

In a bibliographic reference, contains the name of the author(s), personal or corporate, of a work; the primary statement of responsibility for any bibliographic item. The 'rend' attribute can be used to hide authors that are implied in the text by a long dash, but need to be present for searching.

Attributes:

rend  (hide | show)  OPTIONAL

See Also Bibliographies .

<availability>

Supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, etc.

See Also Document Header .

<back>

Back matter. Contains any appendixes, etc. following the main part of a text.

See Also Text Structure .

<bibl>

Bibliographic citation. Contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.

Attributes:

id  ID  REQUIRED
corresp  ID REFERENCE  OPTIONAL

See Also Bibliographies .

<biblFull>

Contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<biblScope>

Scope of citation. Defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work.

Attributes:

type  (article | chapter | issue | pages | part | section | volume)  OPTIONAL

See Also Bibliographies .

<body>

Text body. Contains the whole body of a single unitary text, excluding any front or back matter.

See Also Text Structure .

<byline>

Contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.

See Also Division Openers and Closers .

<catDesc>

Category description. Describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<category>

Category. Contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<change>

Summarizes a particular change or correction made to a particular version of an electronic text which is shared between several researchers.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<cit>

A quotation from some other document, together with a bibliographic reference to its source.

See Also Quotations .

<classDecl>

Classification declarations. Contains one or more taxonomies defining any classificatory codes used elsewhere in the text.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<date>

Contains a date in any format. The content of 'value' must follow the ISO 8601:2000 5.2.1 date format (yyyy-mm-dd).

Attributes:

value  (yyyy-mm-dd)  OPTIONAL

See Also Names and Dates .

<dateline>

Contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.

Attributes:

id  ID  OPTIONAL

See Also Division Openers and Closers .

<div1-7>

Level 1-7 text divisions. Used to encode the structural subdivisions of the front, body, or back of a text.

Attributes:

id  ID  REQUIRED
n  TEXT  OPTIONAL
type  (copyright | dedication | contents | fmsec | halftitle | volume | part | chapter | ss1-ss6 | bmsec | appendix | endnotes | glossary | bibliography | index)  REQUIRED

See Also Divisions .

<docAuthor>

Document author. Contains the name of the author of the document, as given on the title page.

See Also Title Page .

<docDate>

Document date. Contains the date of a document, as given (usually) on a title page.

See Also Title Page .

<docEdition>

Document edition. Contains an edition statement as presented on a title page of a document.

Attributes:

id  ID  OPTIONAL

See Also Title Page .

<docImprint>

Document imprint. Contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.

See Also Title Page .

<docTitle>

Document title. Contains the title of a document, including all its constituents, as given on a title page.

See Also Title Page .

<edition>

Edition. Describes the particularities of one edition of a text.

See Also Bibliographies .

<editionStmt>

Edition statement. Groups information relating to one edition of a text.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<editor>

Editor. Secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization (or of several such) acting as editor, compiler, translator, etc.

See Also Bibliographies .

<editorialDecl>

Editorial practice declaration. Provides details of editorial principles and practices applied during the encoding of a text.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<emph>

Emphasized. Marks words or phrases which are stressed or emphasized for linguistic or rhetorical effect.

Attributes:

rend  (bold | italic | mono | roman | smallcaps | strikethrough | subscript | superscript | underline)  OPTIONAL

See Also Font Changes .

<encodingDesc>

Encoding description. Documents the relationship between an electronic text and the source or sources from which it was derived.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<epigraph>

Epigraph. Contains a quotation, anonymous or attributed, appearing at the start of a section or chapter, or on a title page.

See Also Division Openers and Closers .

<extent>

Describes the approximate size of the electronic text as stored on some carrier medium, specified in any convenient units.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<figure>

Indicates the location of a graphic, illustration, or figure.

Attributes:

id  ID  REQUIRED
entity  ENTITY  REQUIRED
rend  (block | hide | inline | popup(ENTITY))  REQUIRED

See Also Figures .

<fileDesc>

File Description. Contains a full bibliographic description of an electronic file.

See Also Document Header .

<foreign>

Identifies a word or phrase as belonging to some language other than that of the surrounding text. The value of 'lang' should be a UNICODE code chart name (e.g., Greek, Hebrew, etc.).

Attributes:

lang  ID REFERENCE  REQUIRED

See Also Foreign Words .

<formula>

Contains a mathematical or other formula.

Attributes:

id  ID  REQUIRED
notation  (mathML | TeX)  OPTIONAL
rend  (block | inline)  REQUIRED

See Also Formulas .

<front>

Front matter. Contains any prefatory matter (headers, title page, prefaces, dedications, etc.) found at the start of a document, before the main body.

See Also Front Matter .

<funder>

Funding body. Specifies the name of an individual, institution, or organization responsible for the funding of a project or text.

Attributes:

id  ID  OPTIONAL

See Also Document Header .

<group>

Contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc.

See Also Groups of Texts .

<head>

Heading. Contains any heading, for example, the title of a section, or the heading of a list or glossary.

Attributes:

type  (main | subtitle | alternate | abbreviated)  OPTIONAL

See Also Division Openers and Closers .

<hi>

Highlighted. Marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.

Attributes:

rend  (bold | italic | mono | roman | smallcaps | strikethrough | subscript | superscript | underline)  OPTIONAL

See Also Font Changes .

<idno>

Identifying number. Supplies any standard or non-standard number used to identify a bibliographic item.

Attributes:

type  (ARK | ISBN | ISSN | LCCN | LOCAL | OTHER)  REQUIRED

See Also Document Header .

<imprint>

Groups information relating to the publication or distribution of a bibliographic item.

Attributes:

id  ID  OPTIONAL

See Also Bibliographies .

<item>

Contains one component of a list.

See Also Lists .

<keywords>

Keywords. Contains a list of keywords or phrases identifying the topic or nature of a text.

Attributes:

id  ID  OPTIONAL
scheme  ID REFERENCE  OPTIONAL

See Also Document Header .

<l>

Verse line. Contains a single, possibly incomplete, line of verse.

Attributes:

n  TEXT  OPTIONAL
rend  (indent1 | indent2 | indent3 | indent4 | indent5 | indent6 | indent7 | indent8 | indent9 | indent10)  OPTIONAL

See Also Lines of Verse .

<label>

Contains the label associated with an item in a list; in glossaries, marks the term being defined.

<language>

Characterizes a single language or sub-language used within a text. The 'id' attribute should use a UNICODE code chart name (e.g., Greek, Hebrew, etc.).

Attributes:

id  ID  REQUIRED

See Also Document Header .

<langUsage>

Language usage. Describes the languages, sub-languages, registers, dialects etc. represented within a text.

See Also Document Header .

<lb/>

Line break. Marks the start of a new (typographic) line in some edition or version of a text.

See Also Milestones .

<lg>

Line group. Contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.

Attributes:

type (couplet | paragraph | quatrain | stanza | verse) OPTIONAL

See Also Line Groups and Fragments .
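
As an illustration of how <lg> and <l> combine, the following is a minimal sketch rather than a prescribed encoding. The stanza text, the line numbers in 'n', and the choice of 'indent1' for the second and fourth lines are assumptions made for the example.

```xml
<!-- One four-line stanza; attribute values are taken from the lists above -->
<lg type="stanza">
  <l n="1">Tyger Tyger, burning bright,</l>
  <l n="2" rend="indent1">In the forests of the night;</l>
  <l n="3">What immortal hand or eye,</l>
  <l n="4" rend="indent1">Could frame thy fearful symmetry?</l>
</lg>
```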

<list>

Contains any sequence of items organized as a list.

Attributes:

rend (arabic | upperalpha | loweralpha | upperroman | lowerroman | supplied) OPTIONAL
type (bulleted | gloss | ordered | simple) REQUIRED

See Also Lists .
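
The sketch below shows one way the required 'type' and optional 'rend' attributes might appear together, followed by a glossary list pairing <label> with <item>. The item text is invented for illustration, and whether printed numerals are transcribed is a matter for the Lists section, not something this example settles.

```xml
<!-- An ordered list whose source numbering is arabic -->
<list type="ordered" rend="arabic">
  <item>Collect the page images.</item>
  <item>Transcribe the text.</item>
  <item>Validate the file against the DTD.</item>
</list>

<!-- A glossary list: each label is the term, each item its definition -->
<list type="gloss">
  <label>recto</label>
  <item>The front side of a leaf.</item>
  <label>verso</label>
  <item>The back side of a leaf.</item>
</list>
```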

<listBibl>

Citation list. Contains a list of bibliographic citations of any kind.

See Also Bibliographies .

<milestone>

Marks the boundary between sections of a text, as indicated by changes in a standard reference system.

Attributes:

id ID REQUIRED
rend (decorative) OPTIONAL
unit (section) REQUIRED

See Also Milestones .

<monogr>

Monographic level. Contains bibliographic elements describing an item (e.g. a book or journal) published as an independent item (i.e. as a separate physical object).

Attributes:

id ID OPTIONAL

See Also Bibliographies .

<name>

Name, proper noun. Contains a proper noun or noun phrase.

Attributes:

reg TEXT OPTIONAL
type (personal | place) OPTIONAL

See Also Names and Dates .
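
A brief sketch of <name> with both attributes follows. The surrounding sentence and the regularized forms given in 'reg' are hypothetical; the Names and Dates section governs which authority form should actually be used.

```xml
<p>
  <!-- 'reg' carries a regularized form of the name as printed -->
  <name type="personal" reg="Muir, John">John Muir</name> first visited
  <name type="place" reg="Yosemite Valley">the Yosemite</name> in 1868.
</p>
```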

<note>

Contains a note or annotation.

Attributes:

corresp ID OPTIONAL
id ID REFERENCE REQUIRED
n TEXT OPTIONAL
place (end | foot | inline) OPTIONAL

See Also Notes .

<p>

Paragraph. Marks paragraphs in prose.

Attributes:

rend (blockquote | center | hang | indent | noindent | left | right | CSS) OPTIONAL

See Also Paragraphs .

<pb>

Page break. Marks the boundary between one page of a text and the next in a standard reference system.

Attributes:

id ID REQUIRED
n TEXT OPTIONAL

See Also Milestones .
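
The two empty boundary elements <milestone> and <pb> might be encoded as in the sketch below. The 'id' values are hypothetical placeholders; the only constraints assumed here are the required and optional attributes listed in their entries.

```xml
<p>... the last words printed on page 16.</p>
<!-- Page break: 'n' records the page number as printed -->
<pb id="p017" n="17"/>
<p>The first paragraph on page 17.</p>
<!-- A decorative section break that has no heading of its own -->
<milestone id="ms001" unit="section" rend="decorative"/>
<p>The first paragraph after the break.</p>
```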

<principal>

Principal researcher. Supplies the name of the principal researcher responsible for the creation of an electronic text.

Attributes:

id ID OPTIONAL

See Also Document Header .

<profileDesc>

Text-profile description. Provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sub-languages used, the situation in which it was produced, the participants and their setting.

See Also Document Header .

<projectDesc>

Project description. Describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.

Attributes:

id ID OPTIONAL

See Also Document Header .

<ptr>

Defines a pointer to another location in the current document in terms of one or more identifiable elements.

Attributes:

id ID OPTIONAL
target ID REFERENCE REQUIRED
type (citeref | figref | fnoteref | formularef | noteref | pageref | secref | tableref) REQUIRED

See Also Internal Links and Cross References .

<publicationStmt>

Publication statement. Groups information concerning the publication or distribution of an electronic or other text.

See Also Document Header .

<publisher>

Provides the name of the organization responsible for the publication or distribution of a bibliographic item.

See Also Title Page .

<pubPlace>

Contains the name of the place where a bibliographic item was published.

See Also Title Page .

<q>

Quoted speech or thought. Contains a quotation or apparent quotation: a representation of speech or thought marked as being quoted from someone else (whether in fact quoted or not). In narrative, the words are usually those of a character or speaker; in dictionaries, <q> may be used to mark real or contrived examples of usage.

Attributes:

rend (blockquote) OPTIONAL

See Also Quotations .

<ref>

Defines a reference to another location in the current document, in terms of one or more identifiable elements, possibly modified by additional text or comment.

Attributes:

id ID OPTIONAL
target ID REFERENCE REQUIRED
type (citeref | figref | fnoteref | formularef | noteref | pageref | secref | tableref) REQUIRED

See Also Internal Links and Cross References .
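
As a sketch of how an internal reference could be tied to a footnote, the example below assumes that the 'target' of <ref> carries the same value as the 'id' of the <note> it points to. The identifier "fn1" and the sentence text are hypothetical; consult the Notes and Internal Links sections for the practice actually recommended.

```xml
<p>The survey was completed in 1871.<ref target="fn1" type="fnoteref">1</ref></p>

<!-- The note itself, placed where the footnote is encoded -->
<note id="fn1" n="1" place="foot">
  <p>Some sources give the date as 1872.</p>
</note>
```

An empty <ptr> takes the same 'target' and 'type' attributes when no reference text needs to be carried in the element.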

<refsDecl>

References declaration. Specifies how canonical references are constructed for this text.

Attributes:

id ID OPTIONAL
doctype TEXT OPTIONAL

See Also Document Header .

<resp>

Contains a phrase describing the nature of a person's intellectual responsibility.

Attributes:

id ID OPTIONAL

See Also Document Header .

<respStmt>

Statement of responsibility. Supplies a statement of responsibility for someone responsible for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.

Attributes:

id ID OPTIONAL

See Also Document Header .

<revisionDesc>

Revision description. Summarizes the revision history for a file.

Attributes:

id ID OPTIONAL

See Also Document Header .

<seg>

Arbitrary segment. Contains any arbitrary phrase-level unit of text.

Attributes:

type (illgrp | tblgrp | text) REQUIRED

See Also Arbitrary Containers .

<series>

Series information. Contains information about the series in which a book or other bibliographic item has appeared.

Attributes:

id ID OPTIONAL

See Also Document Header .

<seriesStmt>

Series statement. Groups information about the series, if any, to which a publication belongs.

Attributes:

id ID OPTIONAL

See Also Document Header .

<sourceDesc>

Supplies a bibliographic description of the copy text(s) from which an electronic text was derived or generated.

See Also Document Header .

<sp>

Speech. An individual speech in a performance text, or a passage presented as such in a prose or verse text.

See Also Speech .

<speaker>

A specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.

See Also Speech .

<sponsor>

Specifies the name of a sponsoring organization or institution.

Attributes:

id ID OPTIONAL

See Also Document Header .

<table>

Contains text displayed in tabular form, in rows and columns. Can contain the following elements: caption, td, th, tr, col, colgroup, tbody, thead, tfoot.

See Also Tables .

<taxonomy>

Taxonomy. Defines a typology used to classify texts either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.

Attributes:

id ID OPTIONAL

See Also Document Header .

<TEI.2>

TEI document. Contains a single TEI-conformant document, comprising a TEI header and a text. The value of 'id' should be the unique key of the ARK assigned to the text.

Attributes:

id ID REQUIRED

See Also Root Element .

<teiHeader>

TEI Header. Supplies the descriptive and declarative information making up an ‘electronic title page’ prefixed to every TEI-conformant text.

See Also Document Header .

<text>

Contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, or a dictionary.

See Also Text Structure .
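
The outline below sketches how <TEI.2>, <teiHeader>, and <text> nest. It is a skeleton only and will not validate as written, because the required header and body content is replaced by comments. The 'id' value assumes that the unique key of the ARK ark:/13030/ft958009mm is its final segment.

```xml
<TEI.2 id="ft958009mm">
  <teiHeader>
    <!-- descriptive and declarative header content; see Document Header -->
  </teiHeader>
  <text>
    <!-- front matter, body, and back matter; see Text Structure -->
  </text>
</TEI.2>
```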

<textClass>

Text classification. Groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.

Attributes:

id ID OPTIONAL

See Also Document Header .

<title>

Contains the title of a work, whether article, book, journal, or series, including any alternative titles or subtitles.

Attributes:

level (a | m | j | s | u) OPTIONAL
type (main | subtitle | alternate | abbreviated) OPTIONAL

See Also Bibliographies .

<titlePage>

Title page. Contains the title page of a text, appearing within the front or back matter.

See Also Title Page .

<titlePart>

Title part. Contains a subsection or division of the title of a work, as indicated on a title page.

Attributes:

type (main | subtitle | alternate | abbreviated) OPTIONAL

See Also Title Page .

<titleStmt>

Title statement. Groups information about the title of a work and those responsible for its intellectual content.

See Also Document Header .

<xptr>

Extended pointer. Defines a pointer to another location in the current document or an external document. NOTE: The value of 'href' can be either a URL (e.g. http://texts.cdlib.org/xtf/servlet/dynaXML?docId=ft958009mm) or a CDL ARK (e.g. ark:/13030/ft958009mm).

Attributes:

doc ENTITY OPTIONAL
href URI OPTIONAL
type (mets | obj | pdf | sound | stream | url | video) REQUIRED
rend (embed | new | none | replace) REQUIRED
from ID OPTIONAL
to ID OPTIONAL

See Also External Objects .

<xref>

Extended reference. Defines a reference to another location in the current document, or an external document, using an extended pointer notation, possibly modified by additional text or comment.

Attributes:

doc ENTITY OPTIONAL
href ARK or URL OPTIONAL
type (mets | obj | pdf | sound | stream | url | video) REQUIRED
rend (embed | new | none | replace) REQUIRED
from ID OPTIONAL
to ID OPTIONAL

See Also External Objects .
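
The following sketch shows <xref> and <xptr> pointing at an external document, reusing the URL and ARK quoted in the <xptr> entry. Pairing type="url" with the URL form is a reasonable reading of the attribute list; pairing type="obj" with the ARK form, and rend="new" in both cases, are assumptions made for illustration. The External Objects section should be consulted for the intended values.

```xml
<!-- Extended reference: the link text is carried inside the element -->
<p>See the
  <xref href="http://texts.cdlib.org/xtf/servlet/dynaXML?docId=ft958009mm"
        type="url" rend="new">online edition</xref>
  for the full text.</p>

<!-- Extended pointer: an empty element with no link text of its own -->
<xptr href="ark:/13030/ft958009mm" type="obj" rend="new"/>
```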