CDL ingests content in the form of METS (Metadata Encoding and Transmission Standard) encoded digital objects. CDL depends on METS profiles to successfully process submitted objects.
METS profiles describe classes of METS digital objects that share common characteristics, such as content file formats (e.g., digital images, TEI texts) or metadata encoding formats (e.g., MODS or Dublin Core). Profiles should include enough details to enable METS creators and programmers to create and process METS-encoded digital objects conforming with a particular profile. A METS profile itself is an XML document that should adhere to the METS XML Profile Schema. For information about METS profiles, see the METS website.
METS files must conform to valid METS profiles, which must be declared during pre-submission discussions with CDL staff.
The METS top-level <mets> element must have an OBJID attribute containing an ARK identifier for the digital object. For more information about ARKs, visit the Archival Resource Key (ARK) page.
If an ARK is not supplied (permitted only for objects submitted to the CDL at the Basic Service Level), a unique local identifier must be supplied as the OBJID. In this case, CDL will generate an ARK when ingesting the object, use that ARK as the primary identifier, and treat the supplied local identifier as the equivalent of the <metsHdr><altRecordID> element.
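The OBJID rule above can be checked before submission. The following is a minimal sketch, not CDL's own validation: the function name `classify_objid` is hypothetical, and the ARK pattern is a deliberately loose assumption (`ark:/`, a numeric NAAN, a slash, and a name) rather than a full implementation of the ARK syntax.

```python
import re
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Assumption: a loose ARK shape, "ark:/<numeric NAAN>/<name>".
# This is NOT a complete ARK syntax check.
ARK_PATTERN = re.compile(r"^ark:/\d{5,}/\S+$")


def classify_objid(mets_xml: str) -> str:
    """Return "ark" or "local" depending on the OBJID of the root <mets>."""
    root = ET.fromstring(mets_xml)
    if root.tag != f"{{{METS_NS}}}mets":
        raise ValueError("root element is not <mets> in the METS namespace")
    objid = (root.get("OBJID") or "").strip()
    if not objid:
        raise ValueError("OBJID attribute is missing or empty")
    return "ark" if ARK_PATTERN.match(objid) else "local"
```

A result of "local" would be acceptable only at the Basic Service Level, where CDL generates the ARK at ingest.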
To support the orderly transmission and ingest of digital objects, the CDL strongly recommends submission of checksum (MD5, SHA-1, or CRC32) and byte size values in the METS File <file> element.
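As an illustration of the recommended checksum and byte size values, the sketch below computes them for a content file's bytes using the Python standard library. The function name `file_fixity` is hypothetical; the returned keys follow the METS <file> attribute names CHECKSUM, CHECKSUMTYPE, and SIZE.

```python
import hashlib
import zlib


def file_fixity(data: bytes, checksum_type: str = "MD5") -> dict:
    """Compute CHECKSUM, CHECKSUMTYPE, and SIZE values for a METS <file>."""
    if checksum_type == "MD5":
        digest = hashlib.md5(data).hexdigest()
    elif checksum_type == "SHA-1":
        digest = hashlib.sha1(data).hexdigest()
    elif checksum_type == "CRC32":
        # zlib.crc32 returns an unsigned 32-bit integer; format as 8 hex digits.
        digest = format(zlib.crc32(data) & 0xFFFFFFFF, "08x")
    else:
        raise ValueError(f"unsupported checksum type: {checksum_type}")
    return {
        "CHECKSUM": digest,
        "CHECKSUMTYPE": checksum_type,
        "SIZE": str(len(data)),  # byte size of the content file
    }
```

In practice the bytes would come from reading the content file in binary mode; the values can then be written as attributes on the corresponding <file> element.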
The METS Content File Section <fileSec> element must contain links to network-exposed (i.e., online) content files using File Location <FLocat> elements. Each <FLocat> element must contain an xlink:href attribute that identifies a link to its associated content file.
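The link requirement can be verified with a short script. This is a sketch under stated assumptions: the function name `files_missing_links` is hypothetical, and files that embed their content in <FContent> are treated as exempt from the link requirement.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def files_missing_links(mets_xml: str) -> list:
    """Return the IDs of <file> elements that have no usable <FLocat> link."""
    root = ET.fromstring(mets_xml)
    missing = []
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        # Collect non-empty xlink:href values from the file's direct <FLocat> children.
        hrefs = [
            flocat.get(XLINK_HREF, "").strip()
            for flocat in file_el.findall(f"{{{METS_NS}}}FLocat")
        ]
        # Assumption: embedded content (<FContent>) needs no link.
        embedded = file_el.find(f"{{{METS_NS}}}FContent") is not None
        if not embedded and not any(hrefs):
            missing.append(file_el.get("ID", "(no ID)"))
    return missing
```

An empty list means every non-embedded <file> carries at least one xlink:href; the script does not test whether the linked file is actually reachable.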
The METS file and associated content files must be well formed and uncorrupted.
Although METS allows for linking to external metadata using <mdRef>, the DPR ingest process will not capture this information. If you want to preserve external metadata, link to the file in the <fileSec> using <file><FLocat>.
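Because <mdRef> targets are not captured at ingest, it can help to list them before submission so that each one can be relinked through the <fileSec>. A minimal sketch follows; the function name `external_metadata_refs` is hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def external_metadata_refs(mets_xml: str) -> list:
    """Return the xlink:href of every <mdRef> in the METS document."""
    root = ET.fromstring(mets_xml)
    return [
        md_ref.get(XLINK_HREF, "")
        for md_ref in root.iter(f"{{{METS_NS}}}mdRef")
    ]
```

Any location returned here points to metadata that would need a <file><FLocat> entry to be preserved.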
The Basic Service Level does not require any metadata, but the CDL strongly encourages you to supply the following kernel metadata:
Descriptive Metadata Recommendations (Summary)
[NOTE: See Appendix A for detailed descriptions of each element.]
- Identifier
- Title
- Creator (or Contributor or Publisher)
- Date
- Description
- Format/Physical Description
The descriptive metadata mappings provided in Appendix A are for MODS and qualified Dublin Core. Other descriptive metadata schemas may be used, but must be defined as part of the pre-submission negotiation and will require either A) a mapping of the metadata to Dublin Core, or B) an XSL style sheet that performs the mapping.
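Option A, a mapping of local metadata to Dublin Core, can be pictured as a simple lookup table. The sketch below is illustrative only: the local field names in the usage note, the helper names `map_to_dc` and `missing_kernel`, and the reading of Format/Physical Description as the Dublin Core `format` element are all assumptions, not part of any CDL specification.

```python
# Recommended kernel elements as Dublin Core names. A tuple lists
# acceptable alternatives (Creator, Contributor, or Publisher).
KERNEL = [
    ("identifier",),
    ("title",),
    ("creator", "contributor", "publisher"),
    ("date",),
    ("description",),
    ("format",),
]


def map_to_dc(record: dict, mapping: dict) -> dict:
    """Translate a local record to Dublin Core using a field-name mapping."""
    dc = {}
    for local_name, value in record.items():
        dc_name = mapping.get(local_name)
        if dc_name and value:  # skip unmapped fields and empty values
            dc.setdefault(dc_name, []).append(value)
    return dc


def missing_kernel(dc: dict) -> list:
    """Return kernel elements (first alternative) absent from a DC record."""
    return [alts[0] for alts in KERNEL if not any(dc.get(a) for a in alts)]
```

For example, a record with hypothetical local fields `call_no`, `caption`, and `printer`, mapped to `identifier`, `title`, and `publisher`, would satisfy three kernel elements, and `missing_kernel` would report `date`, `description`, and `format` as still absent.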
The following data are generated by the CDL during the DPR ingest process, and can identify and provide access to digital objects submitted with no descriptive metadata. Only the most basic and fundamental of DPR services will be available for such objects. CDL-generated data:
The CDL generates the technical metadata required to support the orderly management of digital objects in its repositories. Currently, the CDL utilizes the JSTOR/Harvard Object Validation Environment (JHOVE) tool to derive technical metadata for accepted content file types.
You are encouraged to submit any additional technical metadata associated with a particular digital object (such as checksum [MD5, SHA-1, or CRC32] and byte size values in the METS <file> element, or information based on NISO's Data Dictionary: Technical Metadata for Still Images), but are not required to do so. CDL preservation services will store any supplied additional metadata with the object.
Note that all supplied technical metadata should be encoded using valid XML extension schemas as specified by CDL-supported METS profiles (for example, the NISO Metadata for Images in XML Schema (MIX)). If a given set of metadata does not conform to a valid XML extension schema, then you should create a schema to embed the metadata and facilitate validation of the METS file. Otherwise, the metadata should be stored independently of the METS file and referred to using the METS <mdRef> Metadata Reference from within the METS file.
The following content file formats are currently supported by the DPR:
New or unknown file formats may be submitted to the DPR, but must be agreed upon as part of the pre-submission negotiation. In addition, DPR administrators do not guarantee that all DPR services (e.g., migration or transformation processes) will be available for unknown file formats; only preservation of the original bitstream is guaranteed.
All content files must be online or exposed over a network for the DPR software to be able to retrieve them during the ingest process. The exception is when content files are embedded within the METS wrapper using the <FContent> File Content element.
Each content file should have a file name that is unique to your institution (i.e., not necessarily globally unique); often the unique identifier is used to name the content file itself.
Examples: