California Digital Library

Digital Object Standard: Metadata, Content and Encoding

Version 2

May 18, 2001

Reviewed and updated annually

Table of Contents

 

 

Introduction

Definition of a Digital Object

Metadata and Digital Object Encoding Standards

Descriptive Metadata

Metadata and Encoding Tables

Summary of Required Metadata Elements for Simple Digital Objects

Required Metadata for Complex Digital Objects

Table of Required Metadata for Complex Digital Objects

Example of Metadata for Complex Digital Objects

Standards Development Process

Distribution

Appendix A: CDL Digital Object Document Type Definition Tutorial

Appendix B: CDL Digital Object Document Type Definition

Appendix C: Metadata for Digital Objects

Metadata and Encoding Tables

Content File Inventory

Structural Metadata Table

Administrative Metadata Table - General

Administrative Metadata Table - Technical

Administrative Metadata Table - Rights

Administrative Metadata Table - Source

Descriptive Metadata Table - Generic

Introduction

This document addresses the standards for digital object collections for the California Digital Library. These standards describe the file formats, storage and access standards for digital objects created by or incorporated into the CDL as part of the permanent collections. They attempt to balance adherence to industry standards, reproduction quality, access, potential longevity and cost. Adherence to these standards is required for all CDL contributors and may also serve University of California staff as guidelines for digital object creation and presentation. These standards are not intended to address all of the administrative, operational, and technical issues surrounding the creation of digital object collections.

Definition of a Digital Object

A digital object is defined for the purposes of this document as something (e.g., an image, an audio recording, a text document) that has been digitally encoded and integrated with metadata to support discovery, use, and storage of that object.

It should be noted that there is an important distinction between digital objects (e.g., an encoded text document or a digitized image) and the digital collections (e.g., the Online Archive of California) to which they belong. The distinction between a digital object and a digital collection is analogous to the distinction between a particular copy of The Hound of the Baskervilles and a collection of works by Arthur Conan Doyle. Continuing the analogy, this document would describe standards for the description, structure and content of the digital Hound. This document, however, would be silent on how to represent the fact that A Study in Scarlet was also part of the collection.

Metadata and Digital Object Encoding Standards

Metadata is an important component of digital objects, as it supports the discovery, use, storage and migration of these objects over time. Metadata must be collected and associated with each digital object as part of the collection development process. Two types of digital objects may be created: simple objects, each consisting of a single file (as in a digital image collection), and complex objects, such as digital books, which include many files that are related and linked to one another in a specific order.

It is important to make a distinction between the individual image files created in a digitization project and a digital library object, which is the aggregation of digitized content and its related metadata. For example, a digital book object could include hundreds of page image files and metadata which describes their relationship to one another. A more complete digital library object example is a digitized diary pointed to by an EAD encoded finding aid, which itself is part of the Online Archive of California. The diary may have three sections, one each for dated entries, personal contacts and account information. In addition, the diary may have 50 pages that were digitized as 600 DPI TIFF master files, and then 50 lower resolution JFIF viewing files and 50 GIF thumbnail files which were derived from the masters for each page. The diary may have also been transcribed using TEI. This digital diary object now includes the content (150 files of digitized page images and one text transcription file), a reference to the descriptive metadata (the EAD), structural metadata that allows you to jump to a particular section or to see the transcribed text that relates to a digitized page you are viewing, and all administrative metadata (the master files were scanned as 600 DPI TIFFs, etc.).

Management, migration and transport of digital objects require a standard method of encoding metadata for digital objects. The CDL has adopted an XML DTD as a means to encode these complex objects and requires that all metadata submitted with a collection be encoded in this format. This DTD was originally created for the Making of America II project (http://sunsite.berkeley.edu/moa2). This XML encoding keeps track of which files represent master, viewing and thumbnail images, the administrative metadata that relates to these different file groups, the object’s internal organization (encoded as a structural map), etc. A tutorial for this DTD can be found in Appendix A, which also includes a sample encoding of a simple object. The full DTD can be found in Appendix B.

It's not expected that digital collection staff will have to encode the metadata for these objects in XML by hand. Instead, digitization management software can be used to provide a more cost-effective process for defining an object’s structure and collecting its metadata. A program can read a database created by the digitization management software and automatically create the XML encoded objects. This same software and database can also be used to maintain the information in these objects over time. This type of software and XML generation process was developed with standard personal computing database software as part of the MOA II project and may be available to CDL contributors.
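
The database-to-XML step described above might look roughly like the following sketch. It is illustrative only and does not reflect the actual MOA II software: the `files` table, its column names, the `example.org` URLs and the `build_file_group` helper are all assumptions made for this example. Only the element and attribute names (FileGrp, File, FLocat, MIMETYPE, SEQ, USE, LOCTYPE) are taken from the tutorial in Appendix A.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_file_group(conn):
    """Build a <FileGrp> element from rows of a hypothetical 'files' table."""
    group = ET.Element("FileGrp")
    rows = conn.execute(
        "SELECT file_id, mimetype, seq, file_use, url FROM files ORDER BY seq"
    )
    for file_id, mimetype, seq, file_use, url in rows:
        file_el = ET.SubElement(
            group,
            "File",
            {"ID": file_id, "MIMETYPE": mimetype, "SEQ": str(seq), "USE": file_use},
        )
        # The file location is element content; LOCTYPE says what kind of locator it is.
        locat = ET.SubElement(file_el, "FLocat", {"LOCTYPE": "URL"})
        locat.text = url
    return group


# In-memory stand-in for a digitization management database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files "
    "(file_id TEXT, mimetype TEXT, seq INTEGER, file_use TEXT, url TEXT)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
    [
        ("FID6", "image/jpg", 1, "REFERENCE", "http://example.org/page1.jpg"),
        ("FID7", "image/jpg", 2, "REFERENCE", "http://example.org/page2.jpg"),
    ],
)

file_group = build_file_group(conn)
print(ET.tostring(file_group, encoding="unicode"))
```

Because the XML is derived entirely from the database, regenerating an object after a correction only requires updating the database and rerunning the export.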

Three types of metadata are associated with digital objects.

1) Descriptive Metadata is used in the discovery and identification of an object. Examples include EAD, MARC and Dublin Core records. Additionally, descriptive metadata for digital objects applies to information on the full collection of files associated with the digital object and their relationships to one another. The descriptive metadata actually stored within a digital library object is minimal; most of the descriptive metadata regarding the object is stored externally to the object and is only referenced (or, in Warwick Framework terms, is an indirect package).

2) Structural Metadata is used to display and navigate a particular object for a user and includes the information on the internal organization of that object (e.g., a book may have an introduction, chapters, pages and an index).

3) Administrative Metadata represents the management information for this object, including the date it was created, its content file format (JPEG, GIF, etc.), scanning resolutions used, rights information, etc.

Descriptive Metadata

Descriptive metadata for the discovery and identification of a digital file or object is not specified in detail in this document, as these metadata elements need to be defined by the community that works with the content. For example, the Online Archive of California community has standardized on collection-level MARC cataloging and EAD encoded digital finding aids. Other communities will adopt their own standards. A suggested minimum descriptive metadata set could include the major elements of the Dublin Core, simply to provide basic interoperability with other collections that have utilized Dublin Core attributes. Descriptive metadata for the management and display of digital objects is fully specified below.

Metadata and Encoding Tables

The tables in Appendix C are intended to be comprehensive and are recommendations for the full set of metadata elements (except for descriptive metadata, as noted above) that may be useful in the management of a digital image collection. Simple digital objects will only use the minimum set of required elements described below. The full suite of metadata applies to master images only. Derivatives may use a subset.

These tables include both minimal and maximal values; identify required and repeatable fields; and identify which field values may be automatically generated or supplied manually. The columns of the table are:

    1. Element: The name of the metadata element
    2. Example: Examples of this element’s content
    3. Description/Comments: A definition of this metadata element
    4. Req'd for These Types: The digital object types for which this element is required.
    5. Rep: Shows if the element is repeatable
    6. Source: Given reasonable digitization management software, the column describes how the element is created (e.g., manually supplied, automatically generated)
    7. Element / Attribute: Shows where this element is encoded in the XML DTD

Summary of Required Metadata Elements for Simple Digital Objects

Simple digital objects, for example a single archival image of a three-dimensional object together with its derivatives, none of which is part of a complex digital object such as a book, may be described by the following metadata. This is a minimal subset of the "Metadata for Digital Objects" described in detail in Appendix C. Note that only two elements require manual input. All other elements can be automatically generated by software as part of the digitization process.

| Element | Example | Description | Req'd for These Types | Repeatable | Source |
|---|---|---|---|---|---|
| Unique identifier reference | urn:ucb:I0182A, 10.1000/I0182A, http://purl.berkeley.edu/I0182A | This element uniquely identifies a particular digital object | All | No | Automatically generated |
| Descriptive Metadata Reference | http://sunsite2.berkeley.edu:28008/dynaweb/oac/calher/breen/ | An identifier or location for descriptive metadata regarding this object. | All | Yes | Manually supplied in data capture |
| "Generic" Descriptive Metadata 1 | | The MOA2 DTD contains descriptive metadata elements that may be used directly without reference or type declaration. | All | No | Manually supplied in data capture |
| Descriptive Metadata Type 1 | MARC, EAD, RDF, Dublin Core | The form of descriptive metadata associated with this object. | All | No | Automatically generated |
| Version | | A digital library object may encapsulate several different electronic expressions of the original work which has been digitized in different formats. A version within a digital library object consists of all files necessary to process and display a particular expression to a user (e.g., an SGML transcription + DTD + DSSSL style sheet). Files within a single, root <FileGrp> element constitute a digitized version of the object. | All | Yes | Automatically derived from sub-object hierarchy |
| File ID | <File ID="I0182A"> | A unique identifier, internal to the object, for referencing this particular File from the Structural Map. | All | Yes | Automatically generated |
| File Type | text/sgml, text/xml, image/tiff, etc. | Used to inform client software regarding the file's data format, and hence what general viewer type will be needed. | All | No | Automatically generated from defaults |
| File Sequence | 23rd of 42 page images | Relative position of a particular file within its encapsulating subset of files. | All | No | Automatically generated |
| File Date | 1999-05-13 | The date the file was created expressed as ISO 8601 Date Format YYYY-MM-DD | All | No | Automatically generated from defaults |
| File Use | Archive, reference, thumbnail | Used to describe generic instances of an image. | All | No | Automatically generated from Master/Derivative distinctions in database |
| File Locator | urn:ucb:I0182A, http://purl.berkeley.edu/I0182A.jpg | A unique identifier or locator which may be used by client software to retrieve the file in question. | All | No | Automatically generated from defaults |
| Administrative Metadata ID | <AdminMD ID="AM183"> | A unique identifier, internal to a digital library object, which allows this metadata to be referenced by other portions of the object | All | No | Automatically generated |
| Compression Format | LZW | Type of algorithm needed to decompress the image | Image | No | Automatically generated from defaults |
| Color Space | CMYK, RGB, CIELab | Color space used. | Image | No | Automatically generated from defaults |
| Source Item ID | A local catalog number plus page number for a book; an accession number (and possibly a page or part number) for a special collections item | A number or alphanumeric string uniquely identifying the source of this file (recursively). | All | No | Manually supplied |
| Source Type | Photographic print, slide, manuscript, printed page(s), VHS Tape, wire recorder, another digital object | To identify the material from which the digital file was created - the item on hand, even if it itself is a reformatted version. | All | No | Automatically generated from defaults |
| Physical Dimensions of Source | 10.2cm x 18.4cm | Actual physical dimension of source. Needed for appropriate facsimile output. | Image | No | Automatically generated from defaults |
| Descriptive Metadata ID | <DMD ID="DM3"> <wrapper ID="DM1"> | A unique identifier, internal to a digital library object, which allows this descriptive metadata to be referenced by other portions of the object | Yes | No | Automatically generated in case of GDM. |

 

Example of Metadata for Simple Digital Objects

Here is an example of values for a simple object, a stereograph from the Alfred Hart Collection at the UC Berkeley Bancroft Library, for which the only digital version is a TIFF master image:

| Element | Example |
|---|---|
| Unique identifier reference | 19xx.141:356t |
| Descriptive Metadata Reference | http://sunsite2.berkeley.edu/cgi-bin/oac/calher/centpac |
| Descriptive Metadata Type | EAD Instance |
| Version | 2001-02-20 |
| File ID | urn:x-ucb:19xx.141:356t.mstr |
| File Type | image/tiff |
| File Sequence | 23rd of 42 page images |
| File Date | 1999-07-14 |
| File Use | archive |
| File Locator | urn:x-ucb:19xx.141_356t.mstr |
| Administrative Metadata ID | <AdminMD ID="AM183"> |
| Compression Format | none |
| Color Space | CIELAB |
| Source Item ID | urn:x-ucb:19xx.141:356t |
| Source Type | Stereographic print |
| Physical Dimensions of Source | 18 x 9 cm. |
| Descriptive Metadata ID | <DMD ID="DM3"> <wrapper ID="DM1"> |

Required Metadata for Complex Digital Objects

The minimum metadata required for complex digital objects, for example a book, falls into the following categories. The full description of metadata for complex digital objects is provided in Appendix C.

Content File Inventory

A listing of all of the files containing digital content derived from the primary source.

Structural Metadata

Records the abstract structure of the work from which the digital object is derived.

General Administrative Metadata

All information necessary for objects' long term use and management.

Technical Administrative Metadata

Information necessary to document the technical processes employed in both digitizing primary source material and storing the digitization for future use.

Rights Administrative Metadata

Information regarding the intellectual property rights relevant to the digital object's storage, transmission and use.

Source Administrative Metadata

All information necessary to determine the origin of the current file, including both the sources used to produce the current file and any transformations which were applied to the content of the file in moving from an earlier version to the current resource.

 

Table of Required Metadata for Complex Digital Objects

The following table lists the minimum metadata required for complex digital objects, in addition to that required for simple digital objects. These requirements do not depend on the types of simple (or complex) objects contained.

 

Structural Metadata Table

| Element | Example | Description/Comments | Rep | Source | Element/Attribute |
|---|---|---|---|---|---|
| Structural Type | Logical, physical, etc. | Structural type is used to indicate whether the internal structure of the object is best described as a logical structure (e.g., this is a diary consisting of entries) or a physical structure (e.g., this is a book consisting of pages). | Yes | Automatically generated | TYPE Attribute of StructMap element |
| Structural Divisions/Sub-object Relationships | Parent div of diary, with two child divs of type entry, which are siblings: <div TYPE='diary'> <div TYPE='entry'> </div> <div TYPE='entry'> </div> </div> | A digital object may be logically divided into parts (e.g., letters in a diary). If resources are made available to support some level of encoding, structural divisions are encoded with the TEI element DIV. Many of the attributes of the Digital Object will be applicable to the Structural Divisions. DIVs provide information on sub-object relationships. A diary entry in a diary section (e.g., a year) would have as its parent the section, and would have as siblings the previous and next diary entries. If, for example, it was an unusually long diary entry with sections of its own, its "children" would be the sections within the entry. | Yes | Automatically derived from sub-object hierarchy | div element |
| Sub-object Format | text/sgml, text/xml, image/tiff, etc. | Images of all types (e.g., page images and continuous tone images) require format information. The contents of the Sub-object Format element are coordinated with the Content Type element (see above). While Content Type declares the available formats for a particular "type" of information (e.g., encoded text), the Sub-object Format element refers to these declarations to inform the intermediary of the available formats for the object at hand. For example, a page image may be said to be available as a GIF image, a PDF file, and a TIFF G4 image. | No | Automatically generated from defaults | MIMETYPE attribute on fptr element |
| Sub-object reference | <fptr FILEID="I0182A"> | This attribute carries information needed to locate the sub-object. In the digital library object, this consists of an IDREF attribute referring to a particular file within the File Inventory section, possibly combined with a reference to a tagged item within a file. | Yes | Automatically generated | FILEID attribute and possibly TAGID attribute of fptr element |

 

Example of Metadata for Complex Digital Objects

The following table shows the required elements for a complex object, the Breen diary, a book with both multiple scanned image versions and an SGML transcription available. This is a minimal subset of the "Metadata for Digital Objects" described in detail in Appendix C.

| Element | Example |
|---|---|
| Unique Identifier Reference | BANC MSS C-E 176 |
| Descriptive Metadata Reference | http://sunsite2.berkeley.edu:8000/login:sessionid=0:bad=html/no_auth.html:entitybakermsg=%22%22:next=nextcmd%22/query:%7fnext=html/glad_results.html%7fformat=b%7fentityheader=&amp;headergenwithwordlist;%7fentityvalidpatron=true:entitycurrentsearchscreen=html/glad_search.html%7fentitycurrentresultsscreen=html/glad_results.html%7fentitylocalnext=html/glad_results.html%7fnumrecs=10%7fentitytoprecno=1%7fentitycurrecno=1%7ftempjds=true%7fentitycounter=1%7fsessionid=0%7fdbname=glad%7fkey1=date%7fdirection1=d%7fterm-gl:=167968461%22 and http://sunsite2.berkeley.edu:28008/dynaweb/oac/calher/breen/@generic__bookview |
| Descriptive Metadata Type | MARC Record and EAD Instance |
| Version | 4 <FileGrp> elements, 3 for different scanned image versions, and one for an SGML transcription |
| File ID | FID1, FID2, FID3 etc. through FID97 |
| File Type | image/jpg, image/gif and text/sgml |
| File Sequence | 1-32 for scanned image versions, 1 for the SGML transcription |
| File Date | 1998-04-12 for SGML transcription, 1998-04-03 for image files |
| File Use | ARCHIVE, REFERENCE and THUMBNAIL |
| File Locator | http://sunsite.berkeley.edu/~jmcdonou/BREEN/sgml/breen2.sgm, http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018235A.jpg, etc. for all 97 data files in the object |
| Structural Type | logical |
| Structural Divisions / Sub-object Relationships | <div> structural divisions are marked by existence of <div> elements. |
| Sub-object Format | image/jpg, image/gif, text/sgml |
| Sub-object reference | FID1 - FID97 |
| Administrative Metadata ID | ADM1, ADM2, ADM3, etc. through ADM143 |
| Compression Format | JPEG, LZW |
| Color Space | RGB |
| Source Item ID | BANC MSS C-E 176, BANC MSS C-E 176: Friday 11th, BANC MSS C-E 176: Satd. 12th, etc. |
| Source Type | Diary entry |
| Physical Dimensions of Source | 12 x 17 cm. |
| Descriptive Metadata ID | <DMD ID="DM3"> <wrapper ID="DM1"> |

Standards Development Process

This is the first version of the CDL Digital Object Standard. This version is based upon the September 1, 1999 version of the CDL's Digital Image Standard, which included recommendations of the Museum Educational Site Licensing Project (MESL), the Library of Congress and the MOA II participants.

The Museum Educational Site Licensing Project (MESL) offered a framework for seven collecting institutions, primarily museums, and seven universities to experiment with new ways to distribute visual information--both images and related textual materials.

The Fowler Museum of Cultural History

The George Eastman House

Harvard University Art Museums

The Library of Congress

The Museum of Fine Arts, Houston

The National Gallery of Art

The National Museum of American Art

American University

Columbia University

Cornell University

University of Illinois at Urbana-Champaign

University of Maryland

University of Michigan

The University of Virginia

The Getty Information Institute

MUSE Educational Media

The Making of America (MoA II) Testbed Project is a Digital Library Federation (DLF) coordinated, multi-phase endeavor to investigate important issues in the creation of an integrated, but distributed, digital library of archival materials (i.e., digitized surrogates of primary source materials found in archives and special collections). The participants include Cornell University, New York Public Library, Pennsylvania State University, Stanford University and UC Berkeley.

The Library of Congress white papers and standards are based on the experience gained during the American Memory Pilot Project. The concepts discussed and the principles developed still guide the Library's digital conversion efforts, although they are under revision to accommodate the capabilities of new technologies and new digital formats.

The CDL Technical Architecture and Standards Workgroup includes the following members with extensive experience with digital object collection and management:

 

Distribution

Draft standards are provided to each UC campus for distribution to staff for review and comment. The final copy of these standards is submitted to the CDL University Librarian annually by the CDL Technical Architecture and Standards Workgroup after review and updates based upon changes in technical standards and current practice.

Appendix A: CDL Digital Object Document Type Definition Tutorial

 

An XML Document Type Definition has been created for CDL digital objects. This DTD provides a means of encoding the various descriptive, administrative and structural metadata for all electronic versions of a particular archival object.

A CDL digital object consists of four major sections:

    1. Descriptive Metadata
    2. File Inventory
    3. Administrative Metadata
    4. Structural Map

A more detailed explanation of each section and their inter-relations follows.

The examples in this tutorial are taken from a simplified version of the CDL encoding of the Breen Diary (from the collection of The Bancroft Library at UC Berkeley). This simplified Breen diary consists of two pages of the diary proper, followed by a two page letter that has been inserted at the end of the diary. You may view the entire XML source for the sample encoding at the end of this tutorial.
 

Descriptive Metadata

The Descriptive Metadata section of a CDL XML Object (the <DescMD> element) may refer to an external source of descriptive metadata, or may itself contain embedded descriptive metadata. References to external descriptive metadata appear in <DMDRef> elements. Embedded descriptive metadata appears within a <DMD> element.

External Descriptive Metadata: Descriptive Metadata Reference. A descriptive metadata reference element (<DMDRef>) simply provides the URI for an external source of descriptive metadata. For example, the descriptive metadata reference below points to an external finding aid:

<DescMD>
    <DMDRef LOCTYPE='URL' DMDTYPE='FINDAID'>http://sunsite2.berkeley.edu/cgi-bin/oac/calher/breen</DMDRef>
</DescMD>

This <DMDRef> contains two attributes. The LOCTYPE specifies the type of URI being provided (PURL, HANDLE, DOI and PDI would be other options). The DMDTYPE identifies the type of descriptive metadata being referred to: MARC record, FINDAID, RDF, PICS or OTHER. Additional supported attributes provide for specifying the MIMETYPE of the external descriptive metadata, and a LABEL that can be used to identify the available descriptive metadata to the user.
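
As a rough illustration of how client software might read such a reference, the sketch below parses the sample with Python's standard library and returns the two attributes together with the URI. The `external_metadata_refs` helper is a hypothetical name introduced for this example; it is not part of the DTD or of any CDL tool.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<DescMD>
  <DMDRef LOCTYPE='URL' DMDTYPE='FINDAID'>http://sunsite2.berkeley.edu/cgi-bin/oac/calher/breen</DMDRef>
</DescMD>
"""


def external_metadata_refs(desc_md):
    """Return (location type, metadata type, URI) for each <DMDRef>."""
    refs = []
    for ref in desc_md.iter("DMDRef"):
        # The URI is element content; strip whitespace left by line breaks.
        uri = (ref.text or "").strip()
        refs.append((ref.get("LOCTYPE"), ref.get("DMDTYPE"), uri))
    return refs


refs = external_metadata_refs(ET.fromstring(SAMPLE))
print(refs)
```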

Embedded Descriptive Metadata. Embedded descriptive metadata appearing under a <DMD> element can either use generic descriptive metadata elements defined in the DTD, or another user-defined text format (e.g., MARC, Dublin Core) enclosed in a wrapper. The generic descriptive metadata elements provided for by the DTD are grouped under a <GDM> element. These are closely related to the descriptive metadata fields supported by GenDB, a database designed and used at UC Berkeley to gather the metadata needed to construct both EAD and CDL objects. The core descriptive metadata elements include title, date, caption, dimensions, and material origin. In addition to the core descriptive metadata elements, the following elements are supported: administrative information, alternate date, content, creator, general notes, physical description, related materials, subobject source, and subject.

Note that a <GDM> element includes an ID attribute. This attribute provides a unique, internal name for each GDM element which can be used in the StructMap to link a particular division of the document hierarchy to a particular GDM element. This allows specific sections of descriptive metadata to be linked to specific parts of the digital object. In other words, a <GDM> element may pertain to the entire digital object described by a CDL.xml file or just a portion of it. For example, in the case of the Breen Diary it would be possible to set up one <GDM> element that pertained to the Breen Diary as a whole, and a second <GDM> element that just pertained to the letter appended to the end of the Diary.
 

File Inventory

The file inventory section consists of one or more <FileGrp> elements used to group together related files.  A <FileGrp> lists all of the files which comprise a single electronic version of the archival object. For example, there might be separate <FileGrp> elements for the thumbnails, the master archival images, the pdf versions, the TEI encoded text versions, etc.

Consider the first <FileGrp> from the simplified encoding of the Breen Diary:

<FileGrp VERSDATE ='12/4/1998' >
    <File ID ='FID1' MIMETYPE ='text/sgml' SEQ = '1' CREATED = '12/4/1998' ADMID = 'ADM4 ADM4 ADM6'
    GROUPID = 'GID1' USE = 'ARCHIVE'>
        <FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/sgml/breen2.sgm</FLocat>
    </File>
</FileGrp>

The <FileGrp> above represents the single file containing the SGML encoded text transcription of the Breen Diary. The <FileGrp> tag contains a VERSDATE attribute, which provides the date the SGML version was created. It could also include an ADMID, which would provide the names of the various sections within the administrative metadata portion of the document which apply to all the files in the file group. However, the ADMID information may also be supplied, as it is here, at the <File> element level. The <FileGrp> here contains a single <File> element, which identifies the one file in this file group. Its attributes include such information as the MIME type of the file and its intended use. The <File> element in turn contains a <FLocat> (file location) element. The <FLocat> provides a network location for the file (in this case, a URL), and provides an attribute to specify whether this location is a URL, PURL, URN, etc.

A more complicated <FileGrp> from the simplified Breen Diary is shown below. This aggregates all of the <File> elements that represent medium resolution jpeg versions of the diary.  Within the highest level <FileGrp> the <File> elements are divided between two secondary <FileGrp> elements.  The first secondary <FileGrp> represents the medium resolution jpegs of the diary pages; the second represents the medium resolution jpegs of the letter pages.

<FileGrp VERSDATE ='4/3/1998' >
    <FileGrp>
        <File ID ='FID6' MIMETYPE='image/jpg' SEQ = '1' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998'
        ADMID ='ADM2 ADM4 ADM11' GROUPID = 'GID2' USE = 'REFERENCE' >
            <FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018236B.jpg</FLocat>
        </File>
        <File ID ='FID7' MIMETYPE='image/jpg' SEQ = '2' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998'
        ADMID ='ADM2 ADM4 ADM12' GROUPID = 'GID3' USE = 'REFERENCE' >
            <FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018237B.jpg</FLocat>
        </File>
    </FileGrp>
    <FileGrp>
        <File ID ='FID8' MIMETYPE='image/jpg' SEQ = '3' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998'
        ADMID ='ADM2 ADM4 ADM9' GROUPID = 'GID31' USE = 'REFERENCE' >
            <FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018266B.jpg</FLocat>
        </File>
        <File ID ='FID9' MIMETYPE='image/jpg' SEQ = '4' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998'
        ADMID ='ADM2 ADM4 ADM10' GROUPID = 'GID32' USE = 'REFERENCE' >
            <FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018267B.jpg</FLocat>
        </File>
    </FileGrp>
</FileGrp>

Note that the <File> element contains an ID attribute. This attribute provides a unique, internal name for this file which can be referenced by other portions of the document. You’ll see this type of referencing in action when we look at the Structural Map Section.
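
One way software might use these internal names is to build an index from File ID to file location, regardless of how deeply the <FileGrp> elements are nested. The sketch below is a minimal illustration under stated assumptions: the sample is an abbreviated, well-formed rendering of the file group above (one file per secondary group, fewer attributes), and `index_files` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<FileGrp VERSDATE='4/3/1998'>
  <FileGrp>
    <File ID='FID6' MIMETYPE='image/jpg' SEQ='1' USE='REFERENCE'>
      <FLocat LOCTYPE='URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018236B.jpg</FLocat>
    </File>
  </FileGrp>
  <FileGrp>
    <File ID='FID8' MIMETYPE='image/jpg' SEQ='3' USE='REFERENCE'>
      <FLocat LOCTYPE='URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018266B.jpg</FLocat>
    </File>
  </FileGrp>
</FileGrp>
"""


def index_files(root_group):
    """Map each File ID to its location, at any depth of <FileGrp> nesting."""
    index = {}
    for file_el in root_group.iter("File"):
        locat = file_el.find("FLocat")
        url = (locat.text or "").strip() if locat is not None else None
        index[file_el.get("ID")] = url
    return index


file_index = index_files(ET.fromstring(SAMPLE))
for file_id, url in file_index.items():
    print(file_id, url)
```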

 

Administrative Metadata

<AdminMD> elements contain the administrative metadata pertaining to the files comprising a CDL document. Three main forms of administrative metadata are provided for: file management information (<FileMgmt> element), intellectual property rights information (<Rights> element), and information regarding the original source of the electronic files referred to by the document (<Source> element). Multiple instances of each of these types of information may occur within a single document.

An example of file management information for an image file associated with the Breen Diary appears below:

<AdminMD ID='ADM2'>
    <FileMgmt>
        <Image>
            <Compression>JPEG</Compression>
            <BitDepth BITS='24' />
            <ColorSpace>RGB</ColorSpace>
            <Resolution>90</Resolution>
        </Image>
    </FileMgmt>
</AdminMD>

Note that each administrative metadata section (<AdminMD>) has an ID attribute. In the sample above it is "ADM2". This ID attribute allows the <AdminMD> element to be linked to particular files or file groups. For example, the <File> element below links to the <AdminMD> element shown above, as well as to two additional <AdminMD> elements, one containing rights information and one containing source information.

<File ID ='FID7' MIMETYPE='image/jpg' SEQ = '2' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998'
ADMID ='ADM2 ADM4 ADM12' GROUPID = 'GID3' USE = 'REFERENCE'>
    <FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018237B.jpg
    </FLocat>
</File>

Notice that the <File> tag has an ADMID attribute, the first item in which is the name ADM2, providing the link to the <AdminMD> element above. Note that if a particular <AdminMD> element pertains to all of the files in a <FileGrp>, the pertinent ADMID attribute can be specified at the <FileGrp> level rather than as an attribute of each <File> in the <FileGrp>.

You’ll note that the <File> tag also has ADMID names of ADM4 and ADM12. If you examine the XML for the  simplified Breen Diary at the end of this tutorial, you’ll find the administrative metadata sections carrying these names. These sections provide additional administrative metadata describing the files in this file group.
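As a rough illustration of how software might follow these links, the Python sketch below looks up the <AdminMD> sections that apply to a single <File>. It is only a sketch under stated assumptions: the XML fragment is a trimmed-down, hypothetical stand-in for the Breen Diary encoding, the function name admin_sections_for is invented for this example, and listing file-level references before those inherited from an enclosing <FileGrp> is just one possible way of combining the two levels.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down fragment modeled on the tutorial's example.
SAMPLE = """
<ArchObj OBJID='demo'>
  <FileGrp ADMID='ADM4'>
    <File ID='FID7' MIMETYPE='image/jpg' ADMID='ADM2 ADM12'/>
    <File ID='FID8' MIMETYPE='image/jpg'/>
  </FileGrp>
  <AdminMD ID='ADM2'><FileMgmt><Image/></FileMgmt></AdminMD>
  <AdminMD ID='ADM4'><Rights/></AdminMD>
  <AdminMD ID='ADM12'><Source SOURCEID='demo-source'/></AdminMD>
</ArchObj>
"""


def admin_sections_for(root, file_id):
    """Return (ADMID name, kind) pairs for one <File>, file-level names first."""
    by_id = {el.get("ID"): el for el in root.iter("AdminMD")}
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: par for par in root.iter() for child in par}
    node = next(f for f in root.iter("File") if f.get("ID") == file_id)
    names = []
    # Collect ADMID names from the <File>, then from each enclosing <FileGrp>.
    while node is not None and node.tag in ("File", "FileGrp"):
        for name in node.get("ADMID", "").split():
            if name not in names:
                names.append(name)
        node = parent.get(node)
    # The first child of each <AdminMD> tells us what kind of metadata it holds.
    return [(name, by_id[name][0].tag) for name in names]


root = ET.fromstring(SAMPLE)
for name, kind in admin_sections_for(root, "FID7"):
    print(name, kind)
```

In this fragment FID8 carries no ADMID of its own, so the sketch reports only the rights section named on its <FileGrp>.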
 

Structural Map

The structural map section of a CDL object defines a hierarchical structure (or structures) that will eventually be presented to users of the electronic archival object to allow them to navigate through it. The <StructMap> element encodes this hierarchy as a nested series of <div> elements. Each <div> carries attribute information specifying what kind of division it is, and may also contain multiple file pointer (<fptr>) elements. File pointers specify files (or in some cases, locations within files) that correspond to the portion of the hierarchy represented by the <div>.

To get a sense of the information encoded in <div> elements, consider the following <div> element for the first entry in the Breen Diary:

<div N = '1' TYPE = 'entry' LABEL = 'Friday Nov. 20th 1846'>
    <fptr FILEID = 'FID2' MIMETYPE = 'image/tif' />
    <fptr FILEID = 'FID6' MIMETYPE = 'image/jpg' />
    <fptr FILEID = 'FID10' MIMETYPE = 'image/GIF' />
    <fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'entry1' />
</div>

The type of object represented by this <div> is a diary entry (TYPE='entry'), and the entry has a label that should be displayed to the user ('Friday Nov. 20th 1846'). The <fptr> elements specify the files that correspond to this level of the hierarchy: a master TIFF file, a JPEG file, a GIF file, and an SGML file containing a transcription. The FILEID attributes in the <fptr> elements link to the corresponding <File> elements in the file inventory portion of the CDL XML document. To see the medium-resolution JPEG image associated with the 'Friday Nov. 20th 1846' entry in the diary, for example, you would look at the <File> element with the ID attribute of 'FID6'.

Note in the case of the SGML file (see the last <fptr> element in the example above), there is one additional piece of information provided as an attribute, a TAGID (‘entry1’). This indicates that within the actual file identified within this document by the <File> element ‘FID1,’ you should find an SGML element tag with the ID attribute value of ‘entry1.’ This element within the SGML document marks the beginning of the diary entry in question.
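The lookup described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the fragment is hypothetical, the example.org URLs are placeholders, and resolve_fptrs is a name invented here. It shows how each <fptr> in a <div> can be followed to its <File> element, and how a TAGID, when present, travels along as the location inside that file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the URLs are placeholders, not real locations.
SAMPLE = """
<ArchObj OBJID='demo'>
  <FileGrp>
    <File ID='FID1' MIMETYPE='text/sgml'>
      <FLocat LOCTYPE='URL'>http://example.org/breen/breen2.sgm</FLocat>
    </File>
    <File ID='FID6' MIMETYPE='image/jpg'>
      <FLocat LOCTYPE='URL'>http://example.org/breen/page1.jpg</FLocat>
    </File>
  </FileGrp>
  <StructMap TYPE='logical'>
    <div N='1' TYPE='entry' LABEL='Friday Nov. 20th 1846'>
      <fptr FILEID='FID6' MIMETYPE='image/jpg'/>
      <fptr FILEID='FID1' MIMETYPE='text/sgml' TAGID='entry1'/>
    </div>
  </StructMap>
</ArchObj>
"""


def resolve_fptrs(root, label):
    """Return (MIME type, location, TAGID or None) for each <fptr> in a <div>."""
    files = {f.get("ID"): f for f in root.iter("File")}
    div = next(d for d in root.iter("div") if d.get("LABEL") == label)
    resolved = []
    for fptr in div.findall("fptr"):
        # FILEID names the <File>; its <FLocat> content is the location.
        location = files[fptr.get("FILEID")].findtext("FLocat").strip()
        resolved.append((fptr.get("MIMETYPE"), location, fptr.get("TAGID")))
    return resolved


root = ET.fromstring(SAMPLE)
for mimetype, location, tagid in resolve_fptrs(root, "Friday Nov. 20th 1846"):
    print(mimetype, location, tagid)
```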

To get a sense of the hierarchical structure that can be encoded in a <StructMap> we need to look at the entire <StructMap> from the sample document.

<StructMap TYPE='logical'>
    <div N = '1' TYPE = 'diary' LABEL = '[Patrick Breen Diary November 20, 1846 - March 1, 1847]'>
        <div N = '1' TYPE = 'entry' LABEL = 'Friday Nov. 20th 1846'>
            <fptr FILEID = 'FID2' MIMETYPE = 'image/tif' />
            <fptr FILEID = 'FID6' MIMETYPE = 'image/jpg' />
            <fptr FILEID = 'FID10' MIMETYPE = 'image/GIF' />
            <fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'entry1' />
        </div>
        <div N = '2' TYPE = 'entry' LABEL = 'sat. 21st'>
            <fptr FILEID = 'FID3' MIMETYPE = 'image/tif' />
            <fptr FILEID = 'FID7' MIMETYPE = 'image/jpg' />
            <fptr FILEID = 'FID11' MIMETYPE = 'image/GIF' />
            <fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'entry2' />
        </div>
        <div N = '1' TYPE = 'letter' LABEL = 'Letter by George McKinstry, tipped into original diary'>
            <div N = '1' TYPE = 'page' LABEL = 'Letter, G. McKinstry, page 1'>
                <fptr FILEID = 'FID4' MIMETYPE = 'image/tif' />
                <fptr FILEID = 'FID8' MIMETYPE = 'image/jpg' />
                <fptr FILEID = 'FID12' MIMETYPE = 'image/GIF' />
                <fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'GMletter1' />
            </div>
            <div N = '2' TYPE = 'page' LABEL = 'Letter, G. McKinstry, Page 2'>
                <fptr FILEID = 'FID5' MIMETYPE = 'image/tif' />
                <fptr FILEID = 'FID9' MIMETYPE = 'image/jpg' />
                <fptr FILEID = 'FID13' MIMETYPE = 'image/GIF' />
                <fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'GMletter2' />
            </div>
        </div>
    </div>
</StructMap>
 

This structural map indicates that the document has a three-level hierarchy: it is a 'diary' with two 'entry' components (<div> elements) and one 'letter' component. (The letter is appended at the end of the diary.) The 'letter' component has, in turn, two 'page' components.
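To make the nesting concrete, the following Python sketch walks a <StructMap> and reports each <div> with its depth. It is an illustration only: the fragment omits the <fptr> elements and shortens the labels, and the outline function is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: same shape as the tutorial's map, fptr elements omitted.
SAMPLE = """
<StructMap TYPE='logical'>
  <div N='1' TYPE='diary' LABEL='Diary'>
    <div N='1' TYPE='entry' LABEL='Friday Nov. 20th 1846'/>
    <div N='2' TYPE='entry' LABEL='sat. 21st'/>
    <div N='1' TYPE='letter' LABEL='Letter'>
      <div N='1' TYPE='page' LABEL='page 1'/>
      <div N='2' TYPE='page' LABEL='page 2'/>
    </div>
  </div>
</StructMap>
"""


def outline(node, depth=1):
    """Yield (depth, TYPE, LABEL) for each <div>, parents before children."""
    for div in node.findall("div"):
        yield depth, div.get("TYPE"), div.get("LABEL")
        yield from outline(div, depth + 1)


rows = list(outline(ET.fromstring(SAMPLE)))
for depth, kind, label in rows:
    # Indent each line according to its level in the hierarchy.
    print("  " * (depth - 1) + f"{kind}: {label}")

max_depth = max(depth for depth, _, _ in rows)
print("levels:", max_depth)
```

A display tool could use the same traversal to build a table of contents for the object.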

 

Conclusion


The DTD provides a flexible mechanism for encoding the descriptive, administrative and structural metadata that describe the files comprising multiple electronic versions of an archival object and their relationships. It also manages to encode this information in a relatively efficient format. This flexibility and efficiency do come at the cost of some complexity. However, it is anticipated that CDL XML documents will be primarily machine-generated and machine-processed for display, so that complexity should be relatively well hidden from those producing documents and from users examining them.
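As one illustration of what machine generation might look like, the Python sketch below builds a <FileGrp> from a list of records. Everything specific in it is an assumption made for the example: the record fields, the build_file_group name, the example.org URLs, and the VERSDATE value are placeholders, and a production tool would also need to emit the matching <AdminMD> and <StructMap> sections.

```python
import xml.etree.ElementTree as ET


def build_file_group(records, versdate):
    """Build a <FileGrp> holding one <File>/<FLocat> pair per record."""
    group = ET.Element("FileGrp", {"VERSDATE": versdate})
    for seq, rec in enumerate(records, start=1):
        file_el = ET.SubElement(group, "File", {
            "ID": rec["id"],
            "MIMETYPE": rec["mimetype"],
            "SEQ": str(seq),                   # position within the group
            "ADMID": " ".join(rec["admids"]),  # space-separated AdminMD names
            "USE": rec["use"],
        })
        flocat = ET.SubElement(file_el, "FLocat", {"LOCTYPE": "URL"})
        flocat.text = rec["url"]
    return group


# Hypothetical records; the IDs and URLs are placeholders.
records = [
    {"id": "FID6", "mimetype": "image/jpg", "admids": ["ADM2", "ADM4"],
     "use": "REFERENCE", "url": "http://example.org/breen/page1.jpg"},
    {"id": "FID7", "mimetype": "image/jpg", "admids": ["ADM2", "ADM4"],
     "use": "REFERENCE", "url": "http://example.org/breen/page2.jpg"},
]

group = build_file_group(records, versdate="2001-05-18")
xml_text = ET.tostring(group, encoding="unicode")
print(xml_text)
```

Generating the markup through an XML library rather than by string concatenation helps ensure that tags are closed and attribute values are quoted.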

 

Complete Sample Encoding for the Tutorial

Below is the complete CDL encoding for the simplified version of the Breen Diary used in this tutorial. This encoding is color-coded to facilitate interpretation.

 

<?xml version='1.0' standalone='no'?>
<!DOCTYPE ArchObj SYSTEM 'CDL.DTD'>
<ArchObj OBJID='BANC MSS C-E 176' TYPE='diary' LABEL='[Patrick Breen Diary November 20, 1846 - March 1, 1847]' >

<!-- Descriptive Metadata: the <DescMD> element -->
<DescMD>
<DMDRef LOCTYPE='URL' DMDTYPE='MARC'>http://sunsite2.berkeley.edu:8000/WebZ/Authorize:sessionid=0:bad=html/authofail.html:next=NEXTCMD%22/WebZ/QUERY:next=html/results.html:format=B:numrecs=20:entitytoprecno=1:entitycurrecno=1:tempjds=TRUE:entitycounter=1:entitydbgroup=Glad:entityCurrentPage=SearchRecentAcq:dbname=Glad:entitycountAvail=0:entitycountDisplay=0:entitycountWhere=0:entityCurrentSearchScreen=html/search.html:entityactive=1:indexA=gl%3A:termA=167968461:next=html/Cannedresultsframe.html:bad=error/badsearchframe.html
</DMDRef >

<DMDRef LOCTYPE='URL' DMDTYPE='FINDAID'>http://sunsite2.berkeley.edu/cgi-bin/oac/calher/breen
</DMDRef >
</DescMD>

<!-- File Inventory: the <FileGrp> and <File> elements -->
<FileGrp VERSDATE ='12/4/1998'>
<File ID ='FID1' MIMETYPE ='text/sgml' SEQ = '1' CREATED = '12/4/1998' ADMID = 'ADM4 ADM4 ADM6' GROUPID = 'GID1' USE = 'ARCHIVE' >
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/sgml/breen2.sgm
</FLocat>
</File>
</FileGrp>

<FileGrp VERSDATE ='4/3/1998'>
<FileGrp>
<File ID ='FID2' MIMETYPE='image/TIF' SEQ = '1' X ='4184' Y ='6606' UNIT='PIXELS' CREATED = '4/3/1998' ADMID ='ADM1 ADM4 ADM7' GROUPID = 'GID2' USE = 'ARCHIVE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018236A.tif
</FLocat>
</File>
<File ID ='FID3' MIMETYPE='image/TIF' SEQ = '2' X ='4184' Y ='6606' UNIT='PIXELS' CREATED = '4/3/1998' ADMID ='ADM1 ADM4 ADM8' GROUPID = 'GID3' USE = 'ARCHIVE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018237A.tif
</FLocat>
</File>
</FileGrp>
<FileGrp >
<File ID ='FID4' MIMETYPE='image/TIF' SEQ = '3' X ='4184' Y ='6606' UNIT='PIXELS' CREATED = '4/3/1998' ADMID ='ADM1 ADM4 ADM5' GROUPID = 'GID31' USE = 'ARCHIVE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018266A.tif
</FLocat >
</File>
<File ID ='FID5' MIMETYPE='image/TIF' SEQ = '4' X ='4184' Y ='6606' UNIT='PIXELS' CREATED = '4/3/1998' ADMID ='ADM1 ADM4 ADM6' GROUPID = 'GID32' USE = 'ARCHIVE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018267A.tif
</FLocat>
</File>
</FileGrp>
</FileGrp>

<FileGrp VERSDATE ='4/3/1998' >
<FileGrp >
<File ID ='FID6' MIMETYPE='image/jpg' SEQ = '1' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM2 ADM4 ADM11' GROUPID = 'GID2' USE = 'REFERENCE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018236B.jpg
</FLocat>
</File>
<File ID ='FID7' MIMETYPE='image/jpg' SEQ = '2' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM2 ADM4 ADM12' GROUPID = 'GID3' USE = 'REFERENCE' >
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018237B.jpg
</FLocat>
</File>
</FileGrp>
<FileGrp>
<File ID ='FID8' MIMETYPE='image/jpg' SEQ = '3' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM2 ADM4 ADM9' GROUPID = 'GID31' USE = 'REFERENCE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018266B.jpg
</FLocat>
</File>
<File ID ='FID9' MIMETYPE='image/jpg' SEQ = '4' X ='512' Y = '768' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM2 ADM4 ADM10' GROUPID = 'GID32' USE = 'REFERENCE'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018267B.jpg
</FLocat>
</File>
</FileGrp>
</FileGrp>

<FileGrp VERSDATE ='4/3/1998'>
<FileGrp>
<File ID ='FID10' MIMETYPE='image/GIF' SEQ = '1' X ='128' Y = '192' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM3 ADM4 ADM11' GROUPID = 'GID2' USE = 'THUMBNAIL'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018236A.gif
</FLocat>
</File>
<File ID ='FID11' MIMETYPE='image/GIF' SEQ = '2' X ='128' Y = '192' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM3 ADM4 ADM12' GROUPID = 'GID3' USE = 'THUMBNAIL'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018237A.gif
</FLocat>
</File>
</FileGrp>
<FileGrp>
<File ID ='FID12' MIMETYPE='image/GIF' SEQ = '3' X ='128' Y = '192' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM3 ADM4 ADM9' GROUPID = 'GID31' USE = 'THUMBNAIL'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018266A.gif
</FLocat>
</File>
<File ID ='FID13' MIMETYPE='image/GIF' SEQ = '4' X ='128' Y = '192' UNIT = 'PIXELS' CREATED = '4/3/1998' ADMID ='ADM3 ADM4 ADM10' GROUPID = 'GID32' USE = 'THUMBNAIL'>
<FLocat LOCTYPE = 'URL'>http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018267A.gif
</FLocat>
</File>
</FileGrp>
</FileGrp>

<!-- Administrative Metadata: the <AdminMD> element -->
<AdminMD ID='ADM1'>
<FileMgmt>
<Image>
<Compression>UNKNOWN
</Compression>
<BitDepth BITS='24' />
<ColorSpace>RGB
</ColorSpace>
<ColorProfile CPLOCAT='FILE' CPFILE='UNKNOWN' />
<Resolution>450
</Resolution>
<LgtSource>Tungsten
</LgtSource>
</Image>
</FileMgmt >
</AdminMD>

<AdminMD ID='ADM2'>
<FileMgmt >
<Image>
<Compression>JPEG
</Compression>
<BitDepth BITS='24' />
<ColorSpace>RGB
</ColorSpace>
<Resolution>90
</Resolution>
</Image>
</FileMgmt>
</AdminMD >

<AdminMD ID='ADM3' >
<FileMgmt >
<Image >
<Compression>LZW
</Compression>
<BitDepth BITS='0' />
<ColorSpace>RGB
</ColorSpace>
<Resolution>90
</Resolution>
</Image >
</FileMgmt>
</AdminMD>

<AdminMD ID='ADM4'>
<Rights >
<Owner>U. C. Berkeley
</Owner >
<Credit>UNKNOWN
</Credit>
<CopyRest>Copyright has not been assigned to The Bancroft Library. All requests for permission to quote from the diary must be submitted in writing to the Curator of the Bancroft Collection of Western Americana.
</CopyRest>
<DispRest>UNKNOWN
</DispRest>
<License BEGINDATE='UNKNOWN' ENDDATE='UNKNOWN'>UNKNOWN
</License>
</Rights>
</AdminMD>

<AdminMD ID = 'ADM5' >
<Source SOURCEID='BANC MSS C-E 176: Ltr., G. McKinstry, p. 1'>
<Type>page
</Type>
<Details>
</Details>
<SrcDimen>
<OrgDimen X ='12' Y ='17' UNIT ='CM' />
<ScanDimen X ='11.7701149425287' Y ='17.6551724137931' UNIT='IN' />
</SrcDimen >
</Source>
</AdminMD>

<AdminMD ID = 'ADM6'>
<Source SOURCEID='BANC MSS C-E 176: Ltr., G. McKinstry, p. 2' >
<Type>page
</Type>
<Details>
</Details>
<SrcDimen>
<OrgDimen X ='12' Y ='17' UNIT ='CM' />
<ScanDimen X ='11.7701149425287' Y ='17.6551724137931' UNIT='IN' />
</SrcDimen>
</Source>
</AdminMD>

<AdminMD ID = 'ADM7'>
<Source SOURCEID='BANC MSS C-E 176: Nov. 20th 1846'>
<Type>entry
</Type>
<Details>
</Details>
<SrcDimen>
<OrgDimen X ='12' Y ='17' UNIT ='CM' />
<ScanDimen X ='11.7701149425287' Y ='17.6551724137931' UNIT='IN' />
</SrcDimen>
</Source>
</AdminMD>

<AdminMD ID = 'ADM8'>
<Source SOURCEID='BANC MSS C-E 176: sat. 21st' >
<Type>entry
</Type>
<Details>
</Details>
<SrcDimen>
<OrgDimen X ='12' Y ='17' UNIT ='CM' />
<ScanDimen X ='11.7701149425287' Y ='17.6551724137931' UNIT='IN' />
</SrcDimen>
</Source>
</AdminMD>

<AdminMD ID = 'ADM9'>
<Source SOURCEID='http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018266A.tif' >
<Type>image/TIF
</Type>
</Source>
</AdminMD>

<AdminMD ID = 'ADM10'>
<Source SOURCEID='http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018267A.tif'>
<Type>image/TIF
</Type>
</Source >
</AdminMD>

<AdminMD ID = 'ADM11'>
<Source SOURCEID='http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018236A.tif'>
<Type>image/TIF
</Type>
</Source>
</AdminMD>

<AdminMD ID = 'ADM12'>
<Source SOURCEID='http://sunsite.berkeley.edu/~jmcdonou/BREEN/figures/I0018237A.tif'>
<Type>image/TIF
</Type>
</Source>
</AdminMD>

<!-- Structural Metadata: the <StructMap> element -->
<StructMap TYPE='logical' >
<div N = '1' TYPE = 'diary' LABEL = '[Patrick Breen Diary November 20, 1846 - March 1, 1847]' >
<div N = '1' TYPE = 'entry' LABEL = 'Friday Nov. 20th 1846' >
<fptr FILEID = 'FID2' MIMETYPE = 'image/tif' />
<fptr FILEID = 'FID6' MIMETYPE = 'image/jpg' />
<fptr FILEID = 'FID10' MIMETYPE = 'image/GIF' />
<fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'entry1' />
</div>
<div N = '2' TYPE = 'entry' LABEL = 'sat. 21st' >
<fptr FILEID = 'FID3' MIMETYPE = 'image/tif' />
<fptr FILEID = 'FID7' MIMETYPE = 'image/jpg' />
<fptr FILEID = 'FID11' MIMETYPE = 'image/GIF' />
<fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'entry2' />
</div>
<div N = '1' TYPE = 'letter' LABEL = 'Letter by George McKinstry, tipped into original diary'>
<div N = '1' TYPE = 'page' LABEL = 'Letter, G. McKinstry, page 1' >
<fptr FILEID = 'FID4' MIMETYPE = 'image/tif' />
<fptr FILEID = 'FID8' MIMETYPE = 'image/jpg' />
<fptr FILEID = 'FID12' MIMETYPE = 'image/GIF' />
<fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'GMletter1' />
</div>
<div N = '2' TYPE = 'page' LABEL = 'Letter, G. McKinstry, Page 2' >
<fptr FILEID = 'FID5' MIMETYPE = 'image/tif' />
<fptr FILEID = 'FID9' MIMETYPE = 'image/jpg' />
<fptr FILEID = 'FID13' MIMETYPE = 'image/GIF' />
<fptr FILEID = 'FID1' MIMETYPE = 'text/sgml' TAGID = 'GMletter2' />
</div>
</div>
</div>
</StructMap>
</ArchObj>
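Because an encoding like the one above depends on many internal cross-references, a quick consistency check can be useful before a document is distributed. The Python sketch below is one possible approach and not part of the CDL tooling: the fragment is hypothetical, dangling_references is a name invented here, and the check is limited to the ADMID and FILEID attributes discussed in this tutorial. Python's ElementTree parser does not validate against the DTD, so the lookup is done by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with two deliberate mistakes: ADM9 and FID2 are never declared.
SAMPLE = """
<ArchObj OBJID='demo'>
  <FileGrp>
    <File ID='FID1' ADMID='ADM1 ADM9'/>
  </FileGrp>
  <AdminMD ID='ADM1'/>
  <StructMap>
    <div TYPE='entry'>
      <fptr FILEID='FID1'/>
      <fptr FILEID='FID2'/>
    </div>
  </StructMap>
</ArchObj>
"""


def dangling_references(root):
    """Return (tag, name) pairs for ADMID/FILEID values with no matching ID."""
    declared = {el.get("ID") for el in root.iter() if el.get("ID")}
    missing = []
    for el in root.iter():
        # ADMID may hold several space-separated names; FILEID holds one.
        refs = el.get("ADMID", "").split()
        if el.tag == "fptr" and el.get("FILEID"):
            refs.append(el.get("FILEID"))
        missing.extend((el.tag, ref) for ref in refs if ref not in declared)
    return missing


problems = dangling_references(ET.fromstring(SAMPLE))
for tag, ref in problems:
    print(f"<{tag}> refers to undeclared ID {ref}")
```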

 

 

 

Appendix B: CDL Digital Object Document Type Definition

<!-- CDL Document Type Definition -->

<!-- Version 2.0 (BETA 1.3) -->

<!-- December 21, 2000 -->

<!-- California Digital Library, UC Office of the President -->

<!-- 1111 Franklin Blvd -->

<!-- Oakland, CA 94607 -->

<!-- -->

<!-- Copyright (c) 1998 - 2005 The Regents of the University of -->

<!-- California -->

<!-- All rights reserved. -->

<!-- -->

<!-- Permission is hereby granted, without written agreement and -->

<!-- without license or royalty fees, to use, copy, modify, and -->

<!-- distribute this document type definition for any purpose, -->

<!-- provided that the above copyright notice and the following -->

<!-- two paragraphs appear in all copies of this document. -->

<!-- -->

<!-- IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO -->

<!-- ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR -->

<!-- CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS DOCUMENT -->

<!-- TYPE DEFINITION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN -->

<!-- ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -->

<!-- -->

<!-- THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY -->

<!-- WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -->

<!-- WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR -->

<!-- PURPOSE. THE DOCUMENT TYPE DEFINITION PROVIDED HEREUNDER IS -->

<!-- ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO -->

<!-- OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, -->

<!-- ENHANCEMENTS, OR MODIFICATIONS. -->

<!-- PURPOSE OF THE CDL DTD -->

<!-- This DTD is intended to provide a transfer syntax for -->

<!-- electronic reproductions of archival documents. -->

<!-- Developed as part of the Making of America II -->

<!-- Project (http://sunsite.berkeley.edu/MOA2/), the DTD provides -->

<!-- a mechanism for specifying: 1. all of the component files for -->

<!-- multiple versions of an electronic reproduction of an archival -->

<!-- object; 2. a hierarchical structure for the electronic -->

<!-- reproduction of an archival object; 3. equivalent -->

<!-- locations within the various electronic versions for the -->

<!-- hierarchical structure delineated; and 4. administrative -->

<!-- metadata regarding the production of the electronic versions -->

<!-- of the original archival object. A document produced in -->

<!-- accordance with the CDL DTD should provide all of the -->

<!-- structural and administrative metadata necessary to display, -->

<!-- navigate, evaluate, and manage an electronic reproduction of an -->

<!-- archival object. -->

<!-- =========================================================== -->

<!-- -->

<!-- BEGINNING OF ACTUAL DOCUMENT TYPE DECLARATION -->

<!-- -->

<!-- =========================================================== -->

 

<!-- global parameter entities -->

<!-- All elements within this DTD carry the attribute(s) listed here. -->

<!ENTITY % a.global 'ID ID #IMPLIED' >

<!-- other parameter entities -->

<!ENTITY % loctype

'LOCTYPE (URN|URL|PURL|HANDLE|DOI|PDI) "URL"' >

<!ENTITY % Dimensions

'X CDATA #IMPLIED

Y CDATA #IMPLIED

UNIT CDATA #IMPLIED' >

<!-- CDL.DTD -->

<!-- ArchObj (Archival Object) -->

<!-- 0.0 TOP LEVEL -->

<!-- ============= -->

<!-- The ArchObj is the root node for a CDL document. As such, it -->

<!-- bears little information in and of itself, other than an ID -->

<!-- value, which should be a unique identifying string assigned to -->

<!-- the object by the institution in control of the object. -->

<!-- An ArchObj contains three major parts: 1. file lists of all -->

<!-- the files comprising each electronic version of the archival -->

<!-- object; 2. Administrative metadata regarding the production and -->

<!-- maintenance of all of the files in the various file lists; and -->

<!-- 3. a structural map delineating a hierarchical structure for the -->

<!-- archival object, with mapping to each of the various electronic -->

<!-- version files (e.g., chapter 1 of this book may be found at this -->

<!-- point in this TEI encoded text file, or in this JPG image file). -->

<!-- The ArchObj content model has been made recursive so that an -->

<!-- archival object can be defined as consisting of several other -->

<!-- archival objects as a group. -->

<!-- -->

<!-- The ArchObj's attributes are: -->

<!-- OBJID - A unique identifying string (presumably a URN) -->

<!-- assigned to this CDL object -->

<!-- LABEL - A string identifying this archival object to the -->

<!-- user, e.g., its title/name. -->

<!-- TYPE - a description of the object type. Within CDL, -->

<!-- this includes ledger, image, photoalbum, journal, -->

<!-- book, and correspondence -->

 

<!ELEMENT ArchObj ((DescMD, FileGrp+, AdminMD*, StructMap+)?, ArchObj*) >

<!ATTLIST ArchObj %a.global;

OBJID CDATA #REQUIRED

LABEL CDATA #IMPLIED

TYPE CDATA #IMPLIED >

<!-- 1.0 DescMD (Descriptive Metadata) -->

<!-- The descriptive metadata section allows you to point at all -->

<!-- relevant pieces of descriptive metadata describing this object, -->

<!-- as well as provides for a minimal encoding of descriptive -->

<!-- metadata within the object itself. The -->

<!-- DescMD element contains a series of pointers to various external -->

<!-- metadata descriptions of the CDL object and/or it contains -->

<!-- embedded descriptive metadata either in a CDL DTD specified -->

<!-- format or a user specified format. -->

<!ELEMENT DescMD (DMDRef*, DMD?) >

<!ATTLIST DescMD %a.global; >

 

<!-- 1.1 DMDRef (Descriptive Metadata Reference) -->

<!-- A pointer to descriptive metadata for this CDL object, such as -->

<!-- a MARC record, Finding Aid, etc. The element itself should contain a -->

<!-- network location (e.g., URN/URL/PURL/etc.). Element has the -->

<!-- following attributes: -->

<!-- LOCTYPE - the type of identifier or location used to point -->

<!-- to the desc. metadata. Valid values are URN, -->

<!-- URL, PURL, HANDLE, DOI, PDI. -->

<!-- DMDTYPE - the type of metadata. Valid values are MARC, -->

<!-- FINDAID, RDF, PICS and OTHER. If OTHER is -->

<!-- used, the MIMETYPE attribute should be used -->

<!-- to allow software to know what form the metadata -->

<!-- takes. -->

<!-- MIMETYPE - MIME type for referenced descriptive metadata. -->

<!-- Should only really be necessary for desc. -->

<!-- metadata of type 'OTHER' -->

<!-- LABEL - A label for the desc. metadata that can be shown -->

<!-- to the user. -->

<!-- TAGID - In the case of a Finding Aid desc. metadata ref. -->

<!-- (or other desc. metadata in SGML/XML format), -->

<!-- CDL document authors can include a TAGID to -->

<!-- specify a particular location within an EAD -->

<!-- Finding Aid that pertains to this archival -->

<!-- object. This is meant to allow tool developers -->

<!-- to produce software that will drop the user in -->

<!-- the appropriate location in the Finding Aid if -->

<!-- the user wishes to determine this object's -->

<!-- context as part of a larger collection. -->

<!ELEMENT DMDRef (#PCDATA) >

<!ATTLIST DMDRef %a.global;

%loctype;

DMDTYPE (MARC|FINDAID|RDF|PICS|OTHER) "FINDAID"

MIMETYPE CDATA #IMPLIED

LABEL CDATA #IMPLIED

TAGID CDATA #IMPLIED >

<!-- 1.2 DMD (Descriptive Metadata) -->

<!-- Used to incorporate descriptive metadata internally. Such -->

<!-- descriptive metadata can either use the generic descriptive -->

<!-- metadata subelements defined for the GDM element within this -->

<!-- CDL.DTD. Or it can be expressed through another -->

<!-- user defined text format. If the latter, its type -->

<!-- *must* be declared, and it should be wrapped in a CDATA section -->

<!-- to ensure that it does not interfere with parsing of the -->

<!-- document. -->

<!ELEMENT DMD (GDM*, wrapper*) >

<!ATTLIST DMD %a.global; >

<!-- 1.2.1 GDM (Generic Descriptive Metadata) -->

<!-- This element provides an XML encoding for the descriptive -->

<!-- metadata elements associated with particular subobjects within -->

<!-- the Berkeley Generic Database. -->

<!ELEMENT GDM (Admin*, AltDate*, Content*, Core*, Creator*,

General*, PhysDesc*, Related*, dmSource*, Subject*) >

<!ATTLIST GDM %a.global; >

<!-- 1.2.1.1 Admin (Administrative Information) -->

<!-- Administrative information regarding the source object (not -->

<!-- its electronic encapsulation) -->

<!-- This element has the following attributes: -->

<!-- FieldType: A more precise specification of the kind of -->

<!-- administrative information contained within. -->

<!-- The value of the attribute is limited to -->

<!-- values specified in the GenericDB. -->

<!-- Public: Specifies whether the information within this -->

<!-- element can be shown to the public. -->

<!-- Seq: Sequence of this admin. info. note, if more -->

<!-- than one. -->

<!ELEMENT Admin (#PCDATA) >

<!ATTLIST Admin %a.global;

FieldType (admInstitutionName|admProcessInfo|

admInstitutionAddr|admGeneral|

admFunding|admAcquisition|

admAltForm) "admGeneral"

Public (Yes|No) "Yes"

Seq CDATA #IMPLIED >

<!-- 1.2.1.2 AltDate (Alternative Date Information) -->

<!-- Date information associated with the subobject; Element contains -->

<!-- date note information; actual dates stored in attributes -->

<!-- This element has the following attributes: -->

<!-- Date: Secondary date as displayed on material -->

<!-- EndDate: Date completed (normalized to YYYY-MM-DD/YYYY-MM/ -->

<!-- YYYY) -->

<!-- BeginDate: if date range, beginning date, normalized as above -->

<!-- Seq: Sequence of this date info. note, if more -->

<!-- than one. -->

<!ELEMENT AltDate (#PCDATA) >

<!ATTLIST AltDate %a.global;

Date CDATA #IMPLIED

EndDate CDATA #IMPLIED

BeginDate CDATA #IMPLIED

Seq CDATA #IMPLIED >

<!-- 1.2.1.3 Content (Object's Contents) -->

<!-- Information regarding the subobject's content. -->

<!-- This element has the following attributes: -->

<!-- FieldType: A more precise specification of the kind of -->

<!-- information contained within the note. -->

<!-- The value of the attribute is limited to -->

<!-- values specified in the GenericDB. -->

<!-- Public: Specifies whether the information within this -->

<!-- element can be shown to the public. -->

<!-- Seq: Sequence of this note, if more than one. -->

<!ELEMENT Content (#PCDATA) >

<!ATTLIST Content %a.global;

FieldType (conAbstract|

conGeneral|

conScopeContent|

conStylePeriod) "conGeneral"

Public (Yes|No) "Yes"

Seq CDATA #IMPLIED >

<!-- 1.2.1.4 Core (Object's core description) -->

<!-- Core descriptive metadata regarding the subobject -->

<!ELEMENT Core (coreDate*, Caption*, Dimensions*, EADLevel*,

LocalID*, Origin*, SOType*, Title*) >

<!ATTLIST Core %a.global; >

<!-- 1.2.1.4.1 coreDate (Subobject date) -->

<!-- Primary date associated with subobject -->

<!-- The element itself should be used for PrimeDateNote information -->

<!-- from the database; the three attributes correspond with their -->

<!-- matching field type. -->

<!ELEMENT coreDate (#PCDATA) >

<!ATTLIST coreDate %a.global;

beginDateNorm CDATA #IMPLIED

endDateNorm CDATA #IMPLIED

primaryDate CDATA #IMPLIED >

<!-- 1.2.1.4.2 Caption (Subobject Caption) -->

<!-- Caption appearing on the subobject -->

<!ELEMENT Caption (#PCDATA) >

<!ATTLIST Caption %a.global; >

<!-- 1.2.1.4.3 Dimensions (Subobject Dimensions) -->

<!-- Physical dimensions of the subobject -->

<!ELEMENT Dimensions EMPTY >

<!ATTLIST Dimensions %a.global;

height CDATA #IMPLIED

width CDATA #IMPLIED

depth CDATA #IMPLIED

units CDATA #IMPLIED >

<!-- 1.2.1.4.4 EADLevel (EAD Level for subobject) -->

<!ELEMENT EADLevel (#PCDATA) >

<!ATTLIST EADLevel %a.global; >

<!-- 1.2.1.4.5 LocalID -->

<!-- Call number, accession number, shelf location, etc. -->

<!ELEMENT LocalID (#PCDATA) >

<!ATTLIST LocalID %a.global;

LocalIDType CDATA #IMPLIED >

<!-- 1.2.1.4.6 Origin (Place of origin for material) -->

<!-- where the material was created, published, found, etc. -->

<!ELEMENT Origin (#PCDATA) >

<!ATTLIST Origin %a.global; >

<!-- 1.2.1.4.7 SOType (Subobject Type) -->

<!ELEMENT SOType (#PCDATA) >

<!ATTLIST SOType %a.global; >

<!-- 1.2.1.4.8 Title (Subobject title) -->

<!-- title of material. -->

<!ELEMENT Title (#PCDATA) >

<!ATTLIST Title %a.global; >

<!-- 1.2.1.5 Creator (Subobject creator) -->

<!-- Element should contain creator's name; all other information is -->

<!-- in attributes: -->

<!-- NameType: personal, corporate, etc. -->

<!-- Dates: Life, death, active dates for creator -->

<!-- Nationality: Nationality of creator -->

<!-- Source: Source for creator information -->

<!-- SrcCheck: Whether or not source is confirmed -->

<!-- Role: Role creator played in creation of subobject -->

<!-- Seq: Sequence for creator -->

<!ELEMENT Creator (#PCDATA) >

<!ATTLIST Creator %a.global;

NameType CDATA #IMPLIED

Dates CDATA #IMPLIED

Nationality CDATA #IMPLIED

Source CDATA #IMPLIED

SrcCheck (Yes|No) "Yes"

Role CDATA #IMPLIED

Seq CDATA #IMPLIED >

<!-- 1.2.1.6 General (General Notes) -->

<!-- General notes on the subobject -->

<!-- This element has the following attributes: -->

<!-- FieldType: A more precise specification of the kind of -->

<!-- information contained within the note. -->

<!-- The value of the attribute is limited to -->

<!-- values specified in the GenericDB. -->

<!-- Public: Specifies whether the information within this -->

<!-- element can be shown to the public. -->

<!-- Seq: Sequence of this note, if more than one. -->

<!ELEMENT General (#PCDATA) >

<!ATTLIST General %a.global;

FieldType (genAltTitle|

genAppraisal|

genBibliography|

genBiblioHist|

genBiographical|

genCitation|

genConsHist|

genEdition|

genExhibitHist|

genGeneral|

genOriginal|

genProvenance|

genScale|

genSeries|

genValue) "genGeneral"

Public (Yes|No) "Yes"

Seq CDATA #IMPLIED >

<!-- 1.2.1.7 PhysDesc (Physical description) -->

<!-- Physical description of the subobject -->

<!-- This element has the following attributes: -->

<!-- FieldType: A more precise specification of the kind of -->

<!-- information contained within the note. -->

<!-- The value of the attribute is limited to -->

<!-- values specified in the GenericDB. -->

<!-- Public: Specifies whether the information within this -->

<!-- element can be shown to the public. -->

<!-- Seq: Sequence of this note, if more than one. -->

<!ELEMENT PhysDesc (#PCDATA) >

<!ATTLIST PhysDesc %a.global;

FieldType (phyCondition|

phyDecorationDetails|

phyDimensionNote|

phyDimensions|

phyDuration|

phyExtent|

phyGeneral|

phyGenreform|

phyLanguage|

phyMarksInscriptions|

phyMediumMaterials|

phyOrganization|

phyPhysDesc|

phyPlaceOfOrigin|

phyPresentation|

phyProcessTechnique|

phyScript|

phySubstrateSupport) "phyGeneral"

Public (Yes|No) "Yes"

Seq CDATA #IMPLIED >

<!-- 1.2.1.8 Related (Related material) -->

<!-- information on material related to this subobject. Element -->

<!-- contains name/title of related material -->

<!-- The element has the following attributes: -->

<!-- RelIDNumber: ID Number for related material -->

<!-- RelInst: related material's institution -->

<!-- RelURL: URL for electronic version of related material -->

<!-- RelType: Type of relationship between materials -->

<!ELEMENT Related (#PCDATA) >

<!ATTLIST Related %a.global;

RelIDNumber CDATA #IMPLIED

RelInst CDATA #IMPLIED

RelURL CDATA #IMPLIED

RelType CDATA #IMPLIED >

<!-- 1.2.1.9 dmSource (Subobject Source) -->

<!-- Source material from which subobject derives -->

<!-- This element has the following attributes: -->

<!-- FieldType: A more precise specification of the kind of -->

<!-- information contained within the note. -->

<!-- The value of the attribute is limited to -->

<!-- values specified in the GenericDB. -->

<!-- Public: Specifies whether the information within this -->

<!-- element can be shown to the public. -->

<!-- Seq: Sequence of this note, if more than one. -->

<!ELEMENT dmSource (#PCDATA) >

<!ATTLIST dmSource %a.global;

FieldType (srcCharacteristics|

srcDimensions|

srcGeneral|

srcLocalID|

srcReproduction|

srcType) "srcGeneral"

Public (Yes|No) "Yes"

Seq CDATA #IMPLIED >

<!-- 1.2.1.10 Subject (Subobject subject) -->

<!-- Subject headings applied to subobject. This element has the -->

<!-- following attributes: -->

<!-- Source: Authoritative source for subject headings -->

<!-- SrcCheck: indicates if term has been checked in an -->

<!-- authoritative source or thesaurus -->

<!-- Definition: topical, geographic, personal name, etc. -->

<!ELEMENT Subject (#PCDATA) >

<!ATTLIST Subject %a.global;

Source CDATA #IMPLIED

SrcCheck CDATA #IMPLIED

Definition CDATA #IMPLIED >

<!-- 1.2.2 wrapper (Descriptive Metadata wrapper) -->

<!-- The wrapper element is intended to allow users to include -->

<!-- non-XML forms of descriptive metadata within a CDL object. -->

<!-- Such metadata should always be enclosed within a CDATA section -->

<!-- within the wrapper element, unless it is absolutely certain not -->

<!-- to conflict with parsing the CDL document. -->

<!-- DMDTYPE - the type of metadata. Valid values are MARC, -->

<!-- FINDAID (finding aid), RDF, PICS and OTHER. If OTHER is -->

<!-- used, the MIMETYPE attribute should be used -->

<!-- to allow software to know what form the metadata -->

<!-- takes. -->

<!-- MIMETYPE - MIME type for descriptive metadata. -->

<!-- Should only really be necessary for desc. -->

<!-- metadata of type 'OTHER' -->

<!-- LABEL - A label for the desc. metadata that can be shown -->

<!-- to the user. -->

<!-- ENCODING - Indicates whether or not the included metadata is -->

<!-- encoded. If encoded, must be Base64 -->

<!-- encoding to ensure XML compatibility. -->

<!ELEMENT wrapper (#PCDATA) >

<!ATTLIST wrapper %a.global;

DMDTYPE (MARC|FINDAID|RDF|PICS|OTHER) "OTHER"

MIMETYPE CDATA #IMPLIED

LABEL CDATA #IMPLIED

ENCODING (None|Base64) "None" >

<!-- 2.0 FileGrp (File Group) -->

<!-- The file group tag allows you to group together all of the -->

<!-- individual files which comprise a particular version of an -->

<!-- archival document. For example, you could group all of the -->

<!-- individual page image files that are in JPG format in one -->

<!-- file list, all of the page image files in TIFF in another file -->

<!-- list, etc. The FileGrp element has an IDREF attribute to an -->

<!-- AdminMD section, to reference AdminMD relevant to all files in -->

<!-- this group. If individual files within the group *also* have -->

<!-- AdminMD references, the individual file information should be -->

<!-- assumed to take precedence over administrative metadata input -->

<!-- for a FileGrp. -->

<!-- -->

<!-- The FileGrp tag has the following attributes: -->

<!-- -->

<!-- VERSDATE - The date of creation for this electronic -->

<!-- version of the archival object. Should be -->

<!-- given in the ISO format of YYYY-MM-DD. -->

<!-- ADMID - IDREFS to the Administrative Metadata section(s) -->

<!-- for this FileGrp -->

<!ELEMENT FileGrp (FileGrp | File)+ >

<!ATTLIST FileGrp %a.global;

VERSDATE CDATA #IMPLIED

ADMID IDREFS #IMPLIED >

<!-- 2.1 File (File) -->

<!-- Specifies a file comprising part or all of a digital -->

<!-- reproduction of an archival object. The file may be specified -->

<!-- by providing: 1. a PURL or URL to retrieve the file, 2. the -->

<!-- encoded contents of the file itself, or 3. both. As Base64 -->

<!-- appears to be the only encoding format which guarantees that -->

<!-- content may be transferred within an XML document without -->

<!-- the use of character entities to replace characters such as the -->

<!-- left angle bracket in the encoded byte stream, its use is -->

<!-- STRONGLY encouraged if you wish to include content within a -->

<!-- CDL document. IT IS THE RESPONSIBILITY OF THE CDL DOCUMENT -->

<!-- AUTHOR TO ENSURE THAT ENCODED FILE CONTENT DOES NOT INTERFERE -->

<!-- WITH CDL DOCUMENT PARSING! -->

<!-- -->

<!-- The File tag has the following attributes: -->

<!-- -->

<!-- MIMETYPE - the MIME type (see RFCs 2045-2049) for the -->

<!-- file's contents -->

<!-- SEQ - The sequence number of this file within this -->

<!-- particular file list. In the case of page image -->

<!-- files, the sequence specified will typically -->

<!-- match the order of pages. For groups of SGML -->

<!-- or XML files (for a transcription of a work), -->

<!-- sequence would typically be used to specify the -->

<!-- order of processing of the files to ensure -->

<!-- successful parsing of the entire document. -->

<!-- SIZE - The total number of bytes for the file -->

<!-- CREATED - The original date of creation for this file, -->

<!-- given in ISO format YYYY-MM-DD. -->

<!-- OWNERID - A number or alphanumeric string uniquely -->

<!-- identifying this image as belonging to the -->

<!-- owner (ID number, barcode, filename, etc.). -->

<!-- ADMID - IDREFS to the administrative metadata for this -->

<!-- file. If the FileGrp containing a File -->

<!-- also has an ADMID reference, administrative MD -->

<!-- for the file should be assumed to take -->

<!-- precedence over that for the whole group. -->

<!-- GROUPID - A common identifier applied to several different -->

<!-- <File>s to indicate they are of the same thing. -->

<!-- Note that two different files having the same -->

<!-- <Source> does not necessarily mean they are of -->

<!-- the same thing. A page image and a detail may -->

<!-- have the same source, if a page scan was cropped -->

<!-- to produce a detail. -->

<!-- USE - Intended to capture the ultimate intended use -->

<!-- for this file, e.g., whether a thumbnail, -->

<!-- reference, archival master, etc. -->

<!ELEMENT File (FLocat?, FContent?) >

<!ATTLIST File %a.global;

MIMETYPE CDATA #REQUIRED

SEQ CDATA #REQUIRED

SIZE CDATA #IMPLIED

%Dimensions;

CREATED CDATA #REQUIRED

OWNERID CDATA #IMPLIED

ADMID IDREFS #IMPLIED

GROUPID CDATA #IMPLIED

USE (THUMBNAIL|REFERENCE|ARCHIVE) "REFERENCE" >

<!-- 2.1.1 FLocat (File Location) -->

<!-- The location from which a file may be retrieved, or an -->

<!-- identifier which can resolve to a location, e.g., URN, URL, -->

<!-- PURL, Handle, etc. -->

<!-- -->

<!-- The FLocat element has the following attribute: -->

<!-- -->

<!-- LOCTYPE - The type of identifier or location. Valid -->

<!-- values are URN, URL, PURL, HANDLE, PDI -->

<!-- -->

<!ELEMENT FLocat (#PCDATA) >

<!ATTLIST FLocat %a.global;

%loctype; >

<!-- 2.1.2 FContent (File Content) -->

<!-- The encoded content of a file. The use of Base64 as an -->

<!-- encoding format is *STRONGLY* encouraged, as Base64 encoded -->

<!-- content should not interfere with parsing of the CDL XML -->

<!-- document. -->

<!-- -->

<!-- The FContent element has the following attribute(s): -->

<!-- -->

<!-- ENCODE - the encoding format for the content (Base64, -->

<!-- uuencode, etc.). Be aware that it is the -->

<!-- responsibility of the document author to ensure -->

<!-- that any characters in an encoded byte stream -->

<!-- which might interfere with parsing of the CDL -->

<!-- document are replaced with appropriate character -->

<!-- entities. -->

<!-- -->

<!ELEMENT FContent (#PCDATA) >

<!ATTLIST FContent %a.global;

ENCODE CDATA #REQUIRED >

<!-- 3.0 AdminMD (Administrative Metadata) -->

<!-- Administrative metadata regarding either a single file or a -->

<!-- group of files. Administrative metadata is considered to be -->

<!-- any information necessary to the long term management of a -->

<!-- digital collection, including data regarding the creation of -->

<!-- electronic images, intellectual property rights, and any -->

<!-- additional information needed to identify an instantiation/ -->

<!-- version of a file and determine what is needed to view or use -->

<!-- it. -->

<!ELEMENT AdminMD (FileMgmt?, Rights?, Source*) >

<!ATTLIST AdminMD %a.global; >

<!-- 3.1 FileMgmt (Creation/Nature of file) -->

<!-- Administrative metadata relating to the creation and properties -->

<!-- of a file or files. -->

<!ELEMENT FileMgmt (Image | Text) >

<!ATTLIST FileMgmt %a.global; >

<!-- 3.1.1 Image (Image Creation Data ) -->

<!-- Information regarding the creation of a particular image or -->

<!-- images, such as compression algorithm, dimensions, etc. -->

<!ELEMENT Image (Compression, BitDepth, ColorSpace,

CLUT*, ColorProfile?, Resolution?, LgtSource?) >

<!ATTLIST Image %a.global; >

<!-- 3.1.1.1 Compression (Image Compression Format) -->

<!-- Type of algorithm needed to decompress the image, with note of -->

<!-- software package used to apply the format, and degree/percent -->

<!-- of compression used when such options exist. -->

<!ELEMENT Compression (#PCDATA) >

<!ATTLIST Compression %a.global; >

<!-- 3.1.1.2 BitDepth (Image Bit-depth) -->

<!-- color depth. Should indicate both number of bits and color or -->

<!-- grey scale, e.g., 24 bit color, 8 bit grey, etc. -->

<!ELEMENT BitDepth EMPTY >

<!ATTLIST BitDepth %a.global;

BITS CDATA #REQUIRED >

<!-- 3.1.1.3 ColorSpace (Image's Color Space) -->

<!-- Color space used by image, e.g., CMYK, RGB, Lab, etc. -->

<!ELEMENT ColorSpace (#PCDATA) >

<!ATTLIST ColorSpace %a.global; >

<!-- 3.1.1.4 CLUT (Color Lookup Table) -->

<!-- Lookup table employed to map from a low to a high (e.g., 8-bit -->

<!-- to 24-bit) color space. CLUT has the following, additional -->

<!-- attribute: -->

<!-- -->

<!-- FResident - File Resident (i.e., whether the CLUT resides -->

<!-- in the actual image file(s) covered by this -->

<!-- AdminMD). -->

<!-- ENCODE - Encoding format for CLUT in CDL Document. Must -->

<!-- be either Base64 or Text, with Base64 used -->

<!-- for encoding actual, binary CLUT, and text used -->

<!-- for a text version of the values in the CLUT. -->

<!ELEMENT CLUT (#PCDATA) >

<!ATTLIST CLUT %a.global;

FResident (YES|NO) "YES"

ENCODE (Base64|Text) "Text" >

 

<!-- 3.1.1.5 ColorProfile (Color Profile for the Scanning Device) -->

<!-- Color profile for the Scanning Device originally used to capture -->

<!-- the image. Element itself is empty; attributes are as follows: -->

<!-- -->

<!-- CPLOCAT - specifies whether the color profile resides in -->

<!-- a separate file (FILE), in the image itself -->

<!-- (IMAGE), or in both. -->

<!-- CPFILE - if the color profile resides in a separate file, -->

<!-- this attribute provides a PURL or other network -->

<!-- location from which the file can be retrieved. -->

<!-- -->

<!ELEMENT ColorProfile EMPTY >

<!ATTLIST ColorProfile %a.global;

CPLOCAT (FILE|IMAGE|BOTH) "FILE"

CPFILE CDATA #IMPLIED >

<!-- 3.1.1.6 Resolution (Scanning Resolution) -->

<!-- Actual optical input scanning resolution of scanning device, -->

<!-- e.g., 600 dpi, 400 dpi interpolated to 600 dpi, etc. -->

<!ELEMENT Resolution (#PCDATA) >

<!ATTLIST Resolution %a.global; >

<!-- 3.1.1.7 LgtSource (Scanning Device's light source) -->

<!-- Light source used by a particular scanner, e.g., 3400K tungsten, -->

<!-- infrared, Osram Delux L fluorescent, etc. -->

<!ELEMENT LgtSource (#PCDATA) >

<!ATTLIST LgtSource %a.global; >

<!-- 3.1.2 Text (Text Creation Data) -->

<!-- Administrative Metadata relating to the creation and properties -->

<!-- of a text file or files. -->

<!ELEMENT Text (Encoding?, Transcriber?) >

<!ATTLIST Text %a.global; >

<!-- 3.1.2.1 Encoding (Character Encoding) -->

<!-- Character encoding scheme used within the document, e.g., Unicode -->

<!-- 2.0, ISO-8859-1, etc. -->

<!ELEMENT Encoding (#PCDATA) >

<!ATTLIST Encoding %a.global; >

<!-- 3.1.2.2 Transcriber -->

<!-- Person or entity responsible for transcription -->

<!ELEMENT Transcriber (#PCDATA) >

<!ATTLIST Transcriber %a.global; >

<!-- 3.2 Rights (Intellectual Property Rights Data) -->

<!-- -->

<!-- The Rights element has the following attributes: -->

<!-- COPYRIGHT - the copyright date for the image file(s), -->

<!-- given as a 4 digit year (e.g., 1998) -->

<!ELEMENT Rights (Owner+, Credit?, CopyRest?, DispRest?, License?) >

<!ATTLIST Rights %a.global;

COPYRIGHT CDATA #IMPLIED >

<!-- 3.2.1 Owner (Intellectual Property Rights Holder) -->

<!-- Owner of intellectual property rights for the *electronic -->

<!-- image or text*. -->

<!ELEMENT Owner (#PCDATA) >

<!ATTLIST Owner %a.global; >

<!-- 3.2.2 Credit (Credit Line) -->

<!-- Text required to be displayed whenever the text or image is -->

<!-- displayed, e.g., Copyright Berkeley Art Museum, 1978. All -->

<!-- Rights Reserved. -->

<!ELEMENT Credit (#PCDATA) >

<!ATTLIST Credit %a.global; >

<!-- 3.2.3 CopyRest (Copying and Distribution Restrictions) -->

<!-- Any copyright restrictions pertaining to the copy and distrib. -->

<!-- of the file(s), e.g., Copy and distribution of this file is -->

<!-- prohibited without the express consent of.... -->

<!ELEMENT CopyRest (#PCDATA) >

<!ATTLIST CopyRest %a.global; >

<!-- 3.2.4 DispRest (Display and Transmission Restrictions) -->

<!-- Any copyright restrictions pertaining to the display and -->

<!-- transmission of the file(s), e.g., This file may be displayed or -->

<!-- transmitted across a network only by person(s) who have signed -->

<!-- a license agreement with .... -->

<!ELEMENT DispRest (#PCDATA) >

<!ATTLIST DispRest %a.global; >

<!-- 3.2.5 License (Licensing information) -->

<!-- Any information regarding licensing arrangements covering the -->

<!-- file(s). -->

<!-- -->

<!-- The License element has the following attributes: -->

<!-- -->

<!-- BEGINDATE - Start date for the licensing agreement -->

<!-- covering this file given in ISO format of -->

<!-- YYYY-MM-DD -->

<!-- ENDDATE - End date for the licensing agreement -->

<!-- covering this file given in ISO format of -->

<!-- YYYY-MM-DD -->

<!ELEMENT License (#PCDATA) >

<!ATTLIST License %a.global;

BEGINDATE CDATA #IMPLIED

ENDDATE CDATA #IMPLIED >

<!-- 3.3 Source (Source of file data, i.e., original archival -->

<!-- object) -->

<!-- The Source element has the following attributes: -->

<!-- -->

<!-- SOURCEID - a number or alphanumeric string uniquely -->

<!-- identifying the source of this file, e.g., -->

<!-- local catalog unique ID for a book, accession -->

<!-- number for a special collections item, etc. -->

<!-- ID number for the original archival object. -->

<!ELEMENT Source (Type, Details?, SrcDimen?, AdminMD?) >

<!ATTLIST Source %a.global;

SOURCEID CDATA #REQUIRED >

<!-- 3.3.1 Type (Source Type) -->

<!-- Identifies the type of material from which the electronic file -->

<!-- was created. Should also mention if conversion source is -->

<!-- already a reformatted version of the original (i.e., a 35 mm -->

<!-- slide of a painting). -->

<!ELEMENT Type (#PCDATA) >

<!ATTLIST Type %a.global; >

<!-- 3.3.2 Details (Source Details) -->

<!-- Relevant descriptive details of the source material which may -->

<!-- impact on scanning or transcription, e.g., film type, print -->

<!-- type, tightly bound volume, etc. -->

<!ELEMENT Details (#PCDATA) >

<!ATTLIST Details %a.global; >

<!-- 3.3.3 SrcDimen (Source Dimensions) -->

<!-- Used to record both the physical dimensions for the source -->

<!-- object, and the actual physical dimension scanned -->

<!ELEMENT SrcDimen (OrgDimen?, ScanDimen) >

<!-- 3.3.3.1 OrgDimen (Original's Dimensions) -->

<!-- Dimensions of the original source object that was scanned. -->

<!ELEMENT OrgDimen EMPTY >

<!ATTLIST OrgDimen %a.global;

%Dimensions; >

<!-- 3.3.3.2 ScanDimen (Actual Physical Dimension Scanned) -->

<!-- Actual physical dimension scanned. Needed for facsimile output. -->

<!ELEMENT ScanDimen EMPTY >

<!ATTLIST ScanDimen %a.global;

%Dimensions; >

<!-- 4.0 StructMap (Structural Map) -->

<!-- -->

<!-- A Structural Map provides a hierarchical, structural definition -->

<!-- of a particular archival object. As there may be more than one -->

<!-- view of an object's structure, multiple StructMap elements are -->

<!-- allowed for a single object. A StructMap has two attributes: -->

<!-- ID - as per all CDL elements -->

<!-- TYPE - indicates whether the structure captured in the -->

<!-- map describes a logical or physical view of the -->

<!-- object, e.g., book/chapter/subchapters vs. -->

<!-- book/pages. -->

<!ELEMENT StructMap (div+) >

<!ATTLIST StructMap %a.global;

TYPE (logical|physical) "logical" >

<!-- 4.1 div (divisions) -->

<!-- Very similar in intent to unnumbered divisions in TEI. A -->

<!-- structural map delineates a hierarchical structure imposed on -->

<!-- or derived from the archival object. This hierarchy is shown -->

<!-- in the CDL document as a series of nested div elements. Any -->

<!-- div element may contain, in addition to subsidiary divs, a -->

<!-- pointer out to various file locations that match this location -->

<!-- in the document hierarchy. So, for example for a div element -->

<!-- that matches the beginning of chapter 2 in a book, the div -->

<!-- element may contain both: A. divs for subchapters, and B. a -->

<!-- series of pointers to files listed previously in the CDL -->

<!-- document that match the beginning of chapter 2. Typically, -->

<!-- such a pointer will either be to a page image file, or to an -->

<!-- XML/SGML text file. In the case of XML/SGML files, the file -->

<!-- pointer will also contain an unambiguous reference to a -->

<!-- particular tagged element in the file that matches the specified -->

<!-- div. -->

<!-- -->

<!-- The div element has the following attributes: -->

<!-- -->

<!-- N - numeric sequence within this level of div -->

<!-- TYPE - type of division, e.g., book, chapter, subchapter -->

<!-- LABEL - A label for this particular division within a -->

<!-- CDL document, e.g. Chapter I: The Question at -->

<!-- Issue -->

<!-- DESCMD - IDREFs to one or more descriptive metadata -->

<!-- elements which apply at this particular level of -->

<!-- div. Descriptive metadata linked to the root div -->

<!-- is assumed to describe the entire CDL object -->

<!-- -->

<!ELEMENT div (mptr*, fptr*, div*) >

<!ATTLIST div %a.global;

N CDATA #IMPLIED

TYPE CDATA #IMPLIED

LABEL CDATA #IMPLIED

DESCMD IDREFS #IMPLIED >

<!-- 4.1.1 mptr (CDL Object Pointer) -->

<!-- The mptr has two purposes in life: -->

<!-- 1. Allow a CDL object to point to another CDL object which -->

<!-- contains it. So, for example, a diary might contain a letter -->

<!-- which had been tipped in. The letter CDL object could contain -->

<!-- an mptr to the CDL object for the diary, to identify the diary -->

<!-- as a containing resource. -->

<!-- 2. Allow a CDL object to point to another CDL object as a -->

<!-- subsidiary resource. One could have a 15 minute quadrangle map -->

<!-- represented as a CDL object, which could contain <div> elements -->

<!-- for each individual quad which pointed to CDL objects for the -->

<!-- 7.5 minute maps representing those individual quads. -->

<!-- The mptr is based on the XLink standard, and uses the following -->

<!-- attributes: -->

<!-- xmlns:xlink - the XLink namespace declaration -->

<!-- xlink:type - the XLink type, in this case, simple. -->

<!-- xlink:href - the URI for the resource. -->

<!-- xlink:role - a machine-readable description of the role played -->

<!-- by the resource identified by the XLink. The -->

<!-- following convention has been established for -->

<!-- describing the roles of CDL resources: -->

<!-- "container" - indicates a resource which -->

<!-- can be abstractly considered to -->

<!-- contain this object, as in the map -->

<!-- example. -->

<!-- "content" - indicates a subsidiary resource -->

<!-- contained by this object -->

<!-- "related" - indicates a resource for which neither -->

<!-- container nor contained provides an -->

<!-- adequate description of the -->

<!-- relationship to the current object -->

<!-- Additionally, it is conventional to encode an mptr pointing to -->

<!-- a container at the root <div> for a CDL object. Similarly, -->

<!-- 'content' mptrs will typically be used at leaf <div>s. -->

<!ELEMENT mptr EMPTY >

<!ATTLIST mptr %a.global;

xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"

xlink:type CDATA #FIXED "simple"

xlink:href CDATA #REQUIRED

xlink:role CDATA #IMPLIED

xlink:title CDATA #IMPLIED >

<!-- 4.1.2 fptr (File Pointer) -->

<!-- The file pointer identifies a file (or position within a file) -->

<!-- listed in the FileGrp component of a CDL document which -->

<!-- corresponds to a particular point in the hierarchy for the CDL -->

<!-- document's structural map. So, for a div corresponding with -->

<!-- chapter 2 in the structural map, the fptr element should specify -->

<!-- either the image file matching the beginning of chapter 2, or -->

<!-- the SGML/XML document which includes chapter 2 along with a -->

<!-- reference to the tagged element beginning chapter 2 within that -->

<!-- document. -->

<!-- -->

<!-- The fptr element has the following attributes: -->

<!-- -->

<!-- FILEID - the value for the ID attribute for the File -->

<!-- element matching the div for this fptr -->

<!-- MIMETYPE - specifies the mime type for the file being -->

<!-- pointed at -->

<!-- TAGID - to be used only with fptrs whose MIMETYPE -->

<!-- attribute indicates an SGML/XML text file, the TAGID provides -->

<!-- the value of an attribute of type ID within -->

<!-- the document pointed to by the fptr. -->

<!-- -->

<!ELEMENT fptr EMPTY >

<!ATTLIST fptr %a.global;

FILEID IDREF #REQUIRED

MIMETYPE CDATA #REQUIRED

TAGID CDATA #IMPLIED >

<!-- =========================================================== -->

<!-- -->

<!-- END OF DOCUMENT TYPE DEFINITION -->

<!-- -->

<!-- =========================================================== -->

 

Appendix C: Metadata for Digital Objects (Required elements highlighted in grey)

 

Metadata and Encoding Tables

These tables are intended to be comprehensive and are recommendations for the full set of metadata elements that may be useful in the management of a digital image collection. These tables include both minimal and maximal values; identify required and repeatable fields; and identify which field values may be automatically generated or supplied manually. The columns of the tables are:

  1. Feature: a descriptive name of the metadata element
  2. Example: examples of this element’s content
  3. Description/Comments: a definition of this metadata element
  4. Req: Shows whether the metadata element is required for the specified type of digital object. If no type is specified, the element is applicable to all types of digital objects. There are currently 20 required metadata elements, two of which need to be manually input for each object. The other elements would normally be automatically generated or inherited from "default" fields set in digitization management software. The required fields are shaded gray in the following metadata tables.
  5. Rep: Shows whether the element is repeatable
  6. Source: Given reasonable digitization management software, this column describes how the element is created (e.g., manually supplied, automatically generated)
  7. Element / Attribute: Shows how this feature is encoded in the XML DTD. It may be represented by an <element> and/or by an attribute or group of attributes.

These metadata elements provide descriptive information regarding the entirety of a digital object. The descriptive metadata actually stored within a digital library object is minimal; most of the descriptive metadata regarding the object is stored externally to the object and is only referenced (or, in Warwick Framework terms, is an indirect package).

| Feature | Example | Description/Comments | Req. | Rep. | Source | Element/Attribute |
| --- | --- | --- | --- | --- | --- | --- |
| Unique identifier reference | urn:ucb:I0182A, 10.1000/I0182A, http://purl.berkeley.edu/I0182A | This element uniquely identifies a particular digital object. | Yes | No | Automatically generated | OBJID attrib. of ArchObj element |
| Label | Patrick Breen diary : ms., 1846 Nov. 20-1847 Mar. 1. | Name or title for object, not necessarily unique, for display to user. | No | No | Manually supplied in encoding | LABEL attrib. of ArchObj element |
| Genre | diary, ledger, photoalbum, stereograph, etc. | Class of work of which this digital object is an instance. Analogous to a MARC 655 field. | No | No | Automatically generated from defaults | TYPE attrib. of ArchObj element |
| Descriptive Metadata Reference | http://sunsite2.berkeley.edu:28008/dynaweb/oac/calher/breen/ | An identifier or location for descriptive metadata regarding this object. | Yes | Yes | Manually supplied in data capture | DMDRef Element and possibly TAGID Attribute of DMDRef element |
| Descriptive Metadata Type | MARC, EAD, RDF, PICS, OTHER | The form of descriptive metadata associated with this object. | Yes | No | Automatically generated for the most part. Distinction between EAD and OTHER finding tools must be set manually in data capture | DMDTYPE Attribute of DMDRef element |

 

Content File Inventory

The content file inventory of a digital library object contains a listing of all of the files containing digital content derived from the primary source. The files are grouped within <FileGrp> elements. Root <FileGrp> elements encapsulate particular digital versions of the material (so that the files comprising an SGML transcription would be in one <FileGrp>, for example, while those comprising a series of page images would be in another). Subsidiary <FileGrp> elements may be used to divide files within a version according to the nature of their content (so that all page image files would be in one subsidiary <FileGrp>, while details from individual pages would be in another). The metadata elements within the content file inventory contain the most basic information needed to identify, retrieve, and display the content files which compose a digital object.

| Feature | Example | Description/Comments | Req. | Rep. | Source | Element/Attribute |
| --- | --- | --- | --- | --- | --- | --- |
| Version | | A digital library object may encapsulate several different electronic expressions of the original work which has been digitized in different formats. A version within a digital library object consists of all files necessary to process and display a particular expression to a user (e.g., an SGML transcription + DTD + DSSSL style sheet). Files within a single, root <FileGrp> element constitute a digitized version of the object. | Yes | Yes | Automatically derived from sub-object hierarchy | Root FileGrp element |
| Version Date | 1963-05-23 | Date the version referred to in VERSION was created; could be somewhat redundant if this data is used in the VERSION field. ISO Date format of YYYY-MM-DD is recommended. | No | No | Automatically generated from defaults | VERSDATE attribute on FileGrp element |
| File Subset | | Within a particular version of a work contained within a digital object, object designers may wish to subdivide files into particular subsets (e.g., all page image files vs. details of illustrations on pages). Such subdivisions can themselves contain further divisions. | No | Yes | Automatically derived from sub-object hierarchy | FileGrp element |
| File ID | <File ID="I0182A"> | A unique identifier, internal to the object, for referencing this particular File from the Structural Map. | Yes | No | Automatically generated | ID attribute on File element |
| File Type | text/sgml, text/xml, image/tiff, etc. | Used to inform client software regarding the file's data format, and hence what general viewer type will be needed. | Yes | No | Automatically generated from defaults | MIMETYPE attribute on File element |
| File Sequence | 23rd of 42 page images | Relative position of a particular file within its encapsulating subset of files. | Yes | No | Automatically generated | SEQ attribute on File element |
| File Size | 1,047,245 bytes | The file size of an object sent to an intermediary such as a client or tool. | No | No | Not supported | SIZE attribute on File element |
| File Date | 1999-05-13 | The date the file was created, expressed in ISO date format YYYY-MM-DD. | Yes | No | Automatically generated from defaults | CREATED attribute on File element |
| File Owner ID | BANC PIC 1963.002:0449--C | A number or alphanumeric string uniquely identifying this image as belonging to the owner (ID number, barcode, filename, etc.). | No | No | Manually supplied in encoding | OWNERID attribute on File element |
| Administrative Metadata Reference | <File ADMID="A125 A137"> | This attribute carries information necessary to locate all administrative metadata relevant to this file. In the digital library object, this consists of an IDREFS attribute referring to one or more tagged sections within the Administrative Metadata portion of the digital library document. | No | No | Automatically generated | ADMID attributes on File element and FileGrp element |
| File Equivalents | | This element provides the ability to indicate that two separate images may be considered equivalent in some sense. Typically, this will be used to indicate that a derivative image in one version of the object corresponds with a particular master image in a reference version of the object. | No | No | Automatically derived from the sub-object hierarchy | GROUPID attribute on File element |
| File Use | ARCHIVE, REFERENCE, THUMBNAIL | Used to describe generic instances of an image. | Yes | No | Automatically generated from Master/Derivative distinctions in database combined with image resolution when available | USE attribute on File element |
| File Dimensions | 1024 x 1028 pixels | Dimension information such as the resolution offered by the object (i.e., not the captured resolution) may be provided. This element documents the forms of the image object that can be requested from the repository (i.e., in order to assist an intermediary in navigation, manipulation, etc.). For images of all types (i.e., bitonal and continuous tone), this is resolution and pixel dimensions. The element is not applicable for text. | No | No | Automatically generated from defaults | X, Y and UNIT Attributes of File element |
| File Locator | urn:ucb:I0182A, http://purl.berkeley.edu/I0182A.jpg | A unique identifier or locator which may be used by client software to retrieve the file in question. | Yes | No | Automatically generated from defaults | FLocat element |
| File Encoding | Base64 | Character encoding standard used for embedding actual contents of a file within a digital library document. As any such encoding scheme must avoid producing a result which a parser might interpret as markup in examining a digital library document, Base64 encoding is recommended. | No | No | Not supported | ENCODE Attribute of FContent element |
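
To illustrate the features listed for the content file inventory, the following fragment sketches how two versions of a single page image might be recorded. It is a minimal sketch, not a normative example: the identifiers I0182M and G0182, the URN for the master file, and the pairing of an ARCHIVE master with a REFERENCE derivative are assumptions made for illustration. The ID and LOCTYPE attributes are assumed to be supplied by the a.global and loctype parameter entities of the DTD in Appendix B, and the ADMID values are assumed to match the IDs of AdminMD elements elsewhere in the same document.

```xml
<!-- Illustrative fragment only: identifiers, dates, and locators are hypothetical. -->

<!-- Version 1: archival master images -->
<FileGrp VERSDATE="1999-05-13" ADMID="A125">
  <File ID="I0182M" MIMETYPE="image/tiff" SEQ="23" CREATED="1999-05-13"
        OWNERID="BANC PIC 1963.002:0449--C" GROUPID="G0182" USE="ARCHIVE">
    <FLocat LOCTYPE="URN">urn:ucb:I0182M</FLocat>
  </File>
</FileGrp>

<!-- Version 2: reference derivatives, divided into subsets -->
<FileGrp VERSDATE="1999-05-13" ADMID="A125">
  <!-- Subset: full page images -->
  <FileGrp>
    <File ID="I0182A" MIMETYPE="image/jpeg" SEQ="23" SIZE="1047245"
          CREATED="1999-05-13" ADMID="A137" GROUPID="G0182" USE="REFERENCE">
      <FLocat LOCTYPE="PURL">http://purl.berkeley.edu/I0182A.jpg</FLocat>
    </File>
  </FileGrp>
</FileGrp>
```

In this sketch the shared GROUPID marks the two File elements as equivalents across versions. The ADMID on the second File element would take precedence over the ADMID on its enclosing FileGrp, as the comments in the DTD describe.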

Structural Metadata Table

Structural metadata records the abstract structure of the work from which the digital object is derived. Digital library objects are all structured as some type of hierarchy. Structural metadata is crucial for display and navigation of a digital object, as well as for indicating the relationships which exist between different digital versions of the same work.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Structural Type

Logical, physical, etc.

Structural type is used to indicate whether the internal structure of the object is best described as a logical structure (e.g., this is a diary consisting of entries) or a physical structure (e.g., this is a book consisting of pages).

Yes

Yes

Automatically generated

TYPE Attribute of StructMap element

Structural Divisions/Sub-object Relationships

Parent div of diary, with two child divs of type entry, which are siblings:

 

<div TYPE='diary'>

<div TYPE='entry'>

</div>

<div TYPE='entry'>

</div>

</div>

A digital object may be logically divided into parts (e.g., letters in a diary). If resources are made available to support some level of encoding, structural divisions are encoded with the TEI element DIV. Many of the attributes of the Digital Object will be applicable to the Structural Divisions.

DIVs provide information on sub-object relationships. A diary entry in a diary section (e.g., a year) would have as its parent the section, and would have as siblings the previous and next diary entries. If, for example, it was an unusually long diary entry with sections of its own, its "children" would be the sections within the entry.

Yes

Yes

Automatically derived from sub-object hierarchy

div element
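
A slightly fuller sketch may help show how the div hierarchy connects to the content file inventory. The fragment below is illustrative only: the FILEID values (I0182A, I0183A, T0001), the TAGID value, and the labels are hypothetical, and each FILEID is assumed to match the ID of a File element listed in the inventory of the same document.

```xml
<!-- Illustrative fragment only: all identifiers and labels are hypothetical. -->
<StructMap TYPE="logical">
  <div N="1" TYPE="diary" LABEL="Patrick Breen diary">
    <div N="1" TYPE="entry" LABEL="Entry 1">
      <!-- Page image on which this entry begins -->
      <fptr FILEID="I0182A" MIMETYPE="image/jpeg"/>
    </div>
    <div N="2" TYPE="entry" LABEL="Entry 2">
      <fptr FILEID="I0183A" MIMETYPE="image/jpeg"/>
      <!-- Transcription file; TAGID names the tagged element where the entry starts -->
      <fptr FILEID="T0001" MIMETYPE="text/xml" TAGID="entry2"/>
    </div>
  </div>
</StructMap>
```

Because the DTD declares the content model of div as (mptr*, fptr*, div*), any fptr elements in a div must appear before its nested div elements.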

 

 

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Sub-object Type

Table of contents, entry, illustration, etc.

Similar to the genre for an object, sub-object type specifies a class of material of which this sub-object is a particular instance, such as entries in a diary, pages in a photoalbum, etc.

No

No

Automatically generated from defaults

TYPE attribute on div element.

Sub-object sequence

1 - N

Pages require a sequence indicator (e.g., this is the third page in the sequence of pages contained in this book).

No

No

Automatically generated

N attribute on div element

Sub-object Label

Page 3, "Fit the Second - The Bellman's Speech"

Name or title for the sub-object, not necessarily unique, for display to user.

No

No

Manually supplied in data capture

LABEL attribute on div element

Descriptive Metadata ID Reference

<div DESCMD="DMD2">

This attribute carries the information necessary to locate embedded descriptive metadata that pertains to the particular sub-object (or <div>). It consists of an IDREF type attribute referring to a particular GDM element.

No

No

Automatically generated

DESCMD attribute on a div element

Sub-object Format

text/sgml, text/xml, image/tiff, etc.

Images of all types (e.g., page images and continuous tone images) require format information. The contents of the Sub-object Format element are coordinated with the Content Type element (see above). While Content Type declares the available formats for a particular "type" of information (e.g., encoded text), the Sub-object Format element refers to these declarations to inform the intermediary of the available formats for the object at hand. For example, a page image may be said to be available as a GIF image, a PDF file, and a TIFF G4 image.

Yes

No

Automatically generated from defaults

MIMETYPE attribute on fptr element

Sub-object reference

<fptr FILEID="I0182A">

This attribute carries information needed to locate the sub-object. In the digital library object, this consists of an IDREF attribute referring to a particular file within the File Inventory section, possibly combined with a reference to a tagged item within a file.

Yes

Yes

Automatically generated

FILEID attribute and possibly TAGID attribute of fptr element

Contained or Containing CDL object pointer

xlink:href='http://sunsite.berkeley.edu/~jmcdonou/BREEN/breen.letter.xml'

Contains an xlink:href style reference to another CDL object. If associated with the root <div>, this refers to a CDL object that contains the current CDL object. If associated with a subsidiary <div>, this refers to a CDL object that is a subsidiary resource.

No

No

Not supported

xlink:href attribute of <mptr> element

Contained or containing CDL object role

xlink:role='contained'

This attribute clarifies the relationship between the current CDL object and the CDL object pointed to by the href link (see above).

No

No

Not supported

xlink:role attribute of <mptr> element

Contained or containing CDL object title

xlink:title='Letter by George McKinstry, tipped into original diary'

Name or title of the contained or containing CDL object. This would be displayed to the user, most likely in the form of a hot link to the related object.

No

No

Not supported

xlink:title attribute of <mptr> element
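
As a rough illustration of how the structural elements in this table fit together, the following Python sketch parses a small structure map and lists each div with its parent type, its sequence number, and the files its fptr elements point to. The fragment is an assumption assembled from the element and attribute names listed above (StructMap, div, fptr, TYPE, N, LABEL, FILEID, MIMETYPE); the actual nesting and any further required attributes are governed by the DTD in Appendix B. The names SAMPLE and describe_divs and the file ID I0182B are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment built from the element/attribute names in the table
# above; the DTD in Appendix B remains the authority on the real structure.
SAMPLE = """
<StructMap TYPE='logical'>
  <div TYPE='diary' LABEL='Diary'>
    <div TYPE='entry' N='1' LABEL='Entry 1'>
      <fptr FILEID='I0182A' MIMETYPE='image/tiff'/>
    </div>
    <div TYPE='entry' N='2' LABEL='Entry 2'>
      <fptr FILEID='I0182B' MIMETYPE='image/tiff'/>
    </div>
  </div>
</StructMap>
"""


def describe_divs(struct_map):
    """Return one record per div: its type, sequence, parent type and file IDs."""
    records = []

    def walk(div, parent_type):
        records.append({
            "type": div.get("TYPE"),
            "n": div.get("N"),
            "parent": parent_type,
            # Only fptr elements directly under this div belong to it.
            "files": [f.get("FILEID") for f in div.findall("fptr")],
        })
        # Child divs are the sub-objects of this div; they are siblings
        # of one another.
        for child in div.findall("div"):
            walk(child, div.get("TYPE"))

    for top in struct_map.findall("div"):
        walk(top, None)
    return records


root = ET.fromstring(SAMPLE)
for rec in describe_divs(root):
    print(rec)
```

The parent and sibling relationships are never stored explicitly here; they fall out of the nesting of the div elements, which is consistent with the "Automatically derived from sub-object hierarchy" source noted in the table.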

 

 

 

Administrative Metadata Table - General

Administrative metadata encompasses all information necessary for objects' long term use and management. It includes information on the technical features of content files (sometimes called technical metadata), intellectual property rights information, and source and provenance information.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Administrative Metadata ID

<AdminMD ID="AM183">

A unique identifier, internal to a digital library object, which allows this metadata to be referenced by other portions of the object.

Yes

No

Automatically generated

ID Attribute of AdminMD element

 


 

Administrative Metadata Table - Technical

Technical administrative metadata elements include that information necessary to document the technical processes employed in both digitizing primary source material and storing the digitization for future use (e.g., photographic and imaging processes and file formats used for storage).

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

(Lossless) Compression Format

LZW

Type of algorithm needed to decompress the image, with note of software package used to apply the format, and degree/percentage of compression used where options exist.

Yes / Image

No

Automatically generated from defaults

Compression element

Bit Depth

1, 8, 24, etc.

Color depth; often needed by the viewer, and serves as an indication of quality to the user.

No

No

Automatically generated from defaults

BITS attribute on BitDepth element

Color Space

CMYK, RGB, CIELab

Color space used; often needed by the viewer, and indicates whether the image was initially created for onscreen display or for pre-press output. (Some color space parameters, such as white point, may require individual tags.)

Yes / Image

No

Automatically generated from defaults

ColorSpace element

Color Lookup Table

(usually a binary table of RGB values)

Color values actually used in the image, often needed by some file formats, especially GIF.

No

Yes

Automatically generated from defaults

CLUT element

ICC Scanner Profile

Describes the color artifacts introduced by the scanning device. Necessary to map the images into standard color space and to adjust for display and printing devices.

No

No

Automatically generated from defaults

ColorProfile element

Resolution

600 dpi; 400 dpi interpolated to 600 dpi

The settings on the input scanning device (cameras usually measure these in dimensions, other devices in dpi). Note where the device does its own interpolation.

No

No

Automatically generated from defaults

Resolution element

Light Source

Example: 3400K Tungsten, infrared, Osram Delux L fluorescent

Should be specific to settings for this scan (f-stop, electronic shutter speed, filtering, illumination level); may be necessary in later evaluation of color capture. Again, may be specific to each image or apply by inheritance to collections of images via a separate descriptive file (with anomalies indicated per image as needed).

No

No

Automatically generated from defaults

LgtSource element

Character Encoding

Unicode 2.1 UTF-8, ISO-8859-1, etc.

Identification of the character encoding standard used for production of text files at sufficient detail to allow software to know how character data should be interpreted. Identifying the character encoding as ISO 10646 using UCS-4, for example, would be preferable to simply identifying the character set as ISO 10646.

No

No

Not supported.

Encoding element

Transcriber

"Ford Prefect"

The individual or entity responsible for producing the transcription.

No

No

Manually supplied in data capture

Transcriber element

 

 

Administrative Metadata Table - Rights

Rights administrative metadata elements include information regarding the intellectual property rights relevant to the digital object's storage, transmission and use.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Copyright Date

1999

Date of copyright expressed as yyyy; in current approaches to interpretation of copyright law, the year is sufficient information.

No

No

Automatically generated from defaults

COPYRIGHT Attribute on Rights element

Owner

Saskia

Owner(s) of the copyright on the digital image file, which MAY be the creator of the digital image file, or the person(s) from whom the digital image file was purchased or licensed. It should contain the name(s) of the person(s) from whom copy/distribution and display/transmission rights may be secured. Note: this refers to the copyright on the digital image only, not the work(s) represented in the digital image.

No

Yes

Automatically generated from defaults

Owner element

Credit Line

Copyright Berkeley Art Museum, 1978. All rights reserved.

The text required to be displayed whenever the image/data appears.

No

No

Automatically generated from defaults

Credit element

Copying & Distribution Restrictions

Copy and distribution of this file is prohibited without the express written consent of...

Text that spells out any copyright restrictions pertaining to the copying and distribution of this image file.

No

No

Automatically generated from defaults

CopyRest element

Display & Transmission Restrictions

This file may be displayed or transmitted across a network only by person(s) who have signed a license agreement with ...

Text that spells out any copyright restrictions regarding the transmission and display of this image file.

No

No

Automatically generated from defaults

DispRest element

 

License Information

This material licensed for use by the University of California for a period....

Specifies the terms of any licensing arrangements covering the use of the file.

No

No

Automatically generated from defaults

License element

License Begin Date

1999-01-01

Start date of any licensing agreement covering this image, expressed in ISO date format YYYY-MM-DD.

No

No

Automatically generated from defaults

BEGINDATE attribute on License element

License End Date

2003-12-31

End date of any licensing agreement covering this image, expressed in ISO date format YYYY-MM-DD.

No

No

Automatically generated from defaults

ENDDATE attribute on License element
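
The date formats given in this table (yyyy for the copyright date, YYYY-MM-DD for the license dates) lend themselves to a simple automated check. The following Python sketch is one possible way to do that; the function name check_rights_dates is hypothetical, and the rule that a license should not end before it begins is an assumption that this standard does not state.

```python
import re
from datetime import date


def check_rights_dates(copyright_year, begin_date, end_date):
    """Return a list of problems found in the three date values (empty if none)."""
    problems = []
    # Copyright Date is expressed as a four-digit year (yyyy).
    if copyright_year is not None and not re.fullmatch(r"\d{4}", copyright_year):
        problems.append("copyright date is not yyyy")

    parsed = {}
    for name, value in (("begin", begin_date), ("end", end_date)):
        if value is None:
            continue  # both license dates are optional in the table above
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            problems.append(f"license {name} date is not YYYY-MM-DD")
            continue
        try:
            parsed[name] = date.fromisoformat(value)
        except ValueError:
            problems.append(f"license {name} date is not a real calendar date")

    # Assumption (not stated in the standard): a license should not end
    # before it begins.
    if "begin" in parsed and "end" in parsed and parsed["end"] < parsed["begin"]:
        problems.append("license end date precedes begin date")
    return problems


print(check_rights_dates("1999", "1999-01-01", "2003-12-31"))
print(check_rights_dates("99", "1999-01-01", "12/31/2003"))
```

The values passed in would come from the COPYRIGHT attribute of the Rights element and the BEGINDATE and ENDDATE attributes of the License element.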

 

 

 

Administrative Metadata Table - Source

Source administrative metadata elements are intended to record all information necessary to determine the origin of the current file, including both the sources used to produce the current file and any transformations which were applied to the content of the file in moving from an earlier version to the current resource. By preference, source information will be chained together so that an unbroken path from the existing file back to the original primary source material from which it derives can be traced.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Source Item ID

a local catalog number plus page number for a book; an accession number (and possibly a page or part number) for a special collections item

A number or alphanumeric string uniquely identifying the source of this file (recursively).

Yes

No

Manually supplied

SOURCEID Attribute of Source element

Source Type

Photographic print, slide, manuscript, printed page(s), another digital image

To identify the material from which the digital file was created - the item on hand, even if it itself is a reformatted version, e.g. the scan of a 35mm slide of a painting would be entered here as a 35mm slide.

Yes

No

Automatically generated from defaults

Type subelement of the Source element

Source Characteristics

Print or film type, tightly bound volume, Kodak Q60 Color Input Target included in image, etc.

Relevant additional descriptive details which may impact on scanning quality or scholar's ability to evaluate file in hand.

No

No

Automatically generated from defaults

Details subelement of Source element

Physical Dimensions of Source

10.2cm x 18.4cm

Actual physical dimension of source. Needed for appropriate facsimile output.

Yes, if avail / Image

No

Automatically generated from defaults

X, Y and UNIT Attribute of OrgDimen element

Physical Dimensions of Area Scanned

8.3cm x 11.2cm

Physical dimensions of area actually scanned. This will be different from "source physical dimensions" if only a detail of the source was scanned. Needed for appropriate facsimile output.

No

No

Automatically generated

X, Y and UNIT Attribute of ScanDimen element
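
The idea of chaining source information back to the original primary source can be sketched as follows. This is only an illustration under stated assumptions: the standard does not prescribe how the chain is stored, so the dictionary layout, the function name trace_provenance, and all of the IDs below are hypothetical.

```python
def trace_provenance(item_id, sources):
    """Follow source links from item_id back to the original source material.

    `sources` maps an item ID to a (source_id, source_type) pair; an item
    with no entry is treated as the original primary source.
    """
    path = [item_id]
    seen = {item_id}
    while path[-1] in sources:
        source_id, _source_type = sources[path[-1]]
        if source_id in seen:
            # A loop would mean the path back to the original is broken.
            raise ValueError(f"circular source chain at {source_id!r}")
        path.append(source_id)
        seen.add(source_id)
    return path


# Hypothetical IDs: a derivative JPEG made from a master TIFF, which was
# itself scanned from a 35mm slide.
SOURCES = {
    "jpeg-0001": ("tiff-0001", "another digital image"),
    "tiff-0001": ("slide-0457", "slide"),
}

print(trace_provenance("jpeg-0001", SOURCES))
```

Each link in the path corresponds to one Source Item ID (SOURCEID) and its Source Type, so an unbroken path requires that every intermediate file records its own source.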

 

 

 

 

Descriptive Metadata Table - Generic

Generic descriptive metadata elements are intended to record descriptive metadata internally. Such descriptive metadata may pertain to the object as a whole, or just to a particular division (or subobject) within the object. For example, in the case of a photographic collection, the descriptive metadata elements could be used to record descriptive metadata both about the entire collection and about each photograph in the collection. While external descriptive metadata (in the form of a MARC record, for example) will often be available for describing entire objects, it is less likely to be available for the divisions (or subobjects) comprising that object. Generic Descriptive Metadata elements are intended to meet descriptive metadata needs not filled by externally available descriptive metadata. Thus, embedded descriptive metadata may supplement external descriptive metadata pointed to by DMDRef elements.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Descriptive Metadata ID

<GDM ID="DM3">

<wrapper ID="DM1">

A unique identifier, internal to a digital library object, which allows this descriptive metadata to be referenced by other portions of the object, specifically by the <div> elements of the <StructMap>.

Yes

No

Automatically generated in case of GDM.

ID Attribute of a GDM or wrapper element

 

 

Generic Descriptive Metadata - Core

The core generic descriptive metadata elements are intended to capture key descriptive information about an object as a whole or about specific divisions or subobjects comprising the object. The elements defined closely correspond to descriptive metadata fields in the Berkeley Generic Database.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Date

First printing; 1895

May include a date note, along with the primary date or the beginning and ending dates for the object or division represented.

No

Yes

Manually supplied

coreDate subelement of Core element, beginDateNorm, endDateNorm and/or primaryDate attributes of the coreDate. Note that the Date element proper may contain a note clarifying the meaning of the date or dates; but the date or dates themselves appear as attributes.

Caption

Caption appearing on the object or the division of the object represented.

No

Yes

Manually supplied

Caption subelement of Core element

Dimensions

15.0 cm. x 20.0 cm

Physical dimensions of the object or division of the object represented.

No

Yes

Manually supplied.

height, width, depth, and units attributes of the Dimensions subelement of the Core element

EADLevel

EAD level of the object or division of the object represented.

No

Yes

Manually supplied

EADLevel subelement of the Core element

LocalID

xF1207 R66 vault

Call number, accession number, shelf location or other locally defined identifier associated with the object or division of the object represented.

No

Yes

Manually supplied

LocalID subelement of Core element

Place of origin for material

Place where the object or division of the object represented was created, published, found, etc.

No

Yes

Manually supplied

Origin subelement of Core element

Type

diary, entry, page, photo

The type of the object or the division of object represented by the descriptive metadata. In the case of a photoalbum, the type of the object as a whole would be "photoalbum", the type for its first level divisions (or subobjects) would be "page", and the type for its second level divisions (or subobjects) would be "photo".

No

Yes

Manually supplied

SOType subelement of Core element

Title

Breen Diary, entry 1: January 5, page 1

The title or an appropriate label for the object or the division of the object represented by the descriptive metadata.

No

Yes

Manually supplied

Title subelement of Core element
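
To show how a <div> in the structure map might be connected to its embedded descriptive metadata, the following Python sketch resolves a DESCMD reference to the GDM element with the matching ID and reads the Title from its Core subelement. The enclosing <object> element, the name SAMPLE, and the function title_for_div are hypothetical; only GDM, Core, Title, StructMap, div, ID and DESCMD come from the tables in this appendix, and their exact placement is defined by the DTD in Appendix B.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; <object> is a placeholder wrapper, not a DTD element.
SAMPLE = """
<object>
  <GDM ID='DM3'>
    <Core>
      <Title>Breen Diary, entry 1: January 5, page 1</Title>
    </Core>
  </GDM>
  <StructMap TYPE='logical'>
    <div TYPE='entry' DESCMD='DM3'/>
  </StructMap>
</object>
"""


def title_for_div(root, div):
    """Look up the Core Title of the GDM element a div's DESCMD points to."""
    target = div.get("DESCMD")
    if target is None:
        return None  # DESCMD is optional, so a div may carry no reference
    for gdm in root.iter("GDM"):
        if gdm.get("ID") == target:
            return gdm.findtext("Core/Title")
    return None  # dangling reference: no GDM element has this ID


root = ET.fromstring(SAMPLE)
div = root.find("StructMap/div")
print(title_for_div(root, div))
```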

 

 

Generic Descriptive Metadata - non-Core

The non-core generic descriptive metadata elements are intended to capture the secondary descriptive information pertaining to the object or to divisions of the object. The elements defined closely correspond to descriptive metadata fields in the Berkeley Generic Database.

Feature

Example

Description/Comments

Req.

Rep.

Source

Element/ Attribute

Administrative Information

Administrative information regarding the source of an object or the division of the object represented (not its electronic encapsulation). The following kinds of administrative information notes are supported: institution name, institution address, processing info, funding info, acquisition info, alternate form, and general info.

No

Yes

Manually supplied.

Admin subelement of the GDM element. The type of Administrative information represented is specified in the FieldType attribute.

Alternative Date Information

Reprinted: 1546

Secondary date information (such as reprint date). May be just a single date or a date range.

No

Yes

Manually supplied

AltDate subelement of the GDM element. The alternate date or dates themselves are specified in the Date, BeginDate and/or EndDate attributes. The AltDate content expresses the nature of the alternate dates (e.g., "Reprinted").

Contents

Information regarding the content of the object or the division of the object represented. The following kinds of Contents notes are supported: abstract, scope/content, style/period, and general.

No

Yes

Manually supplied

Content subelement of the GDM element. The type of Content information represented is specified in the FieldType attribute.

Creator

Publisher: Cumarraga, Juan de

Printer: Pablos, Juan

Creator of the object or division of the object represented. In addition to the creator name, this may include name type, role and date information pertaining to the persons and/or organizations who contributed to the creation of the material represented.

No

Yes

Manually supplied

Creator subelement of the GDM element. Name type, nationality, dates and role are expressed in the NameType, nationality, dates and role attributes respectively.

General notes

Scale: 1:62500

General note: Surveyed in 1892, 93, 94

General notes about the object or the division of the object represented. Numerous kinds of General notes are supported: appraisal, bibliography, biblio-history, biographical, citation, conservation history, edition, exhibition history, original, provenance, scale, series, value and general.

No

Yes

Manually supplied

General subelement of the GDM element. The specific type of note represented is specified in the FieldType attribute.

Physical description

Physical description of the object or the division of the object represented. Numerous kinds of physical description notes are supported: condition, decoration details, dimension note, dimensions, duration, extent, genre form, language, marks/inscriptions, medium/materials, organization, physical description, place of origin, presentation, process/technique, script, substrate support, and general.

No

Yes

Manually supplied

PhysDesc subelement of the GDM element. The type of physical description information represented is specified in the FieldType attribute.

Related material

Related: Oakland West 7.5-minute Quadrangle: http://sunsite.berkeley.edu/xdlib/servlet/archobj?DOCCHOICE=maps/brk00010.00000008.xml

Information, including URL, on material related to the object or the division of the object represented.

No

Yes

Manually supplied

Related subelement of the GDM element. The subelement contents specify the name or title of the related material. The URL, type, institution, and idnumber are expressed as attributes.

Source

Notes about the source material from which the object or division of the object represented derives. Numerous kinds of source notes are supported: characteristics, dimensions, local id, reproduction, source type and general.

No

Yes

Manually supplied

Source subelement of the GDM element. The type of source note represented is specified in the FieldType attribute.

Subject

Subject headings applied to the subobject. May include the source of the subject heading, its type (topical, geographic, personal name, etc.), and an indication of whether the source has been checked.

No

Yes

Manually supplied

Subject subelement of the GDM element. The source of the subject heading can be expressed as a Source attribute, and the type as a Definition attribute. The SrcCheck attribute indicates whether the source has been checked.