There are two primary objectives of copyrightMD. One is to express factual information that allows users to make an informed copyright assessment of a given work. The other is to point users to sources of further information about the copyright status, or to a person or institution that can be a resource for permissions. CopyrightMD is user-oriented and is not intended to serve as a recordkeeping mechanism for other copyright-related information that institutions gather for internal management (such as donor agreement information, copyright permission request histories, etc.).
One of the chief characteristics of digital resources--whether digitally reformatted or "born digital"--is the ease with which they can be re-used either in whole or in part. Reuse rights are determined based on the copyright status of an item as well as the nature of the intended use. Libraries, archives, and cultural memory institutions increasingly acquire, collect, create, provide access to, and preserve digital objects and collections. Recent Association of Research Libraries reports indicate that over 30% of member libraries' collections are digital.
It is incumbent on the users of materials to assess the copyright status of an item and to understand whether use is permitted by law or whether it may be necessary to acquire permission. To make this copyright assessment, users need to have certain information about the item. In the case of digitally reformatted materials in particular, this information may be inherent in the context of the archive or collection in which the physical items reside. But the digitization of works often removes that context, and connections to the original works may weaken over time. CopyrightMD provides users with the information necessary to make an assessment of copyright status. Historically, however, the recording of copyright-related information has often not been part of the tradition of cataloging or metadata creation in libraries, museums, and archives, and thus this information has not been readily available to users. To complicate matters further, since 1978, copyright law no longer requires a notice of copyright to protect a work, making it more difficult to identify creators and track down copyright owners.
Part of the fundamental mission of libraries is to provide users with a means to access and use information resources. Providing the user with all available information relating to the copyright status of the work increases the user's ability to access and appropriately use the work, and as such is a part of any digital library’s service and mission. Current thinking is that copyright status and contact information will be provided to users through the user interface. Systems could also be configured so that users could scope searches and browses to collections with particular copyright statuses. For example, a search could be limited to "public domain" works only, or only to materials where the rights holder and contact information are known.
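As an illustration of scoping a search by copyright status, the following is a minimal sketch in Python. It assumes each item has a copyrightMD-style record whose top-level <copyright> element carries a copyright.status attribute. The item identifiers, the RECORDS mapping, the scope_by_status function, and the status values shown are all hypothetical; the values actually permitted by the schema may differ, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical records keyed by item id. The status values are
# illustrative only and may not match the schema's enumerated values.
RECORDS = {
    "item-1": '<copyright copyright.status="copyrighted" publication.status="published"/>',
    "item-2": '<copyright copyright.status="public domain" publication.status="published"/>',
    "item-3": '<copyright copyright.status="unknown" publication.status="unknown"/>',
}


def scope_by_status(records, wanted_status):
    """Return the sorted ids of records whose copyright.status matches."""
    matches = []
    for item_id, xml_text in records.items():
        root = ET.fromstring(xml_text)
        if root.get("copyright.status") == wanted_status:
            matches.append(item_id)
    return sorted(matches)


public_domain_items = scope_by_status(RECORDS, "public domain")
```

A real system would more likely index the status value at ingest time rather than parse each record per query; the sketch only shows that the attribute is enough to drive such a filter.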
A second functional objective of the schema is to explicitly associate item-level copyright information with discrete digital objects. By preserving copyright metadata with individual digital objects as they are created, we help prevent the creation of "orphaned works" in the future.
The goal is to provide information that supports a copyright assessment. Assertions give users some idea of the presumed copyright status of the work, but do not take the place of factual copyright metadata. In fact, assertions could potentially be automatically generated by display systems, given parsed copyright metadata.
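To make the idea of automatically generated assertions concrete, here is a minimal sketch of how a display system might derive a human-readable assertion from already-parsed copyright metadata. The function name make_assertion, the status values, and the wording of the assertions are all assumptions made for illustration; none of them comes from the schema itself.

```python
def make_assertion(copyright_status, rights_holder=None):
    """Build a display assertion from parsed copyright metadata.

    The status strings and the assertion wording are hypothetical.
    """
    if copyright_status == "public domain":
        return "This work is believed to be in the public domain."
    if copyright_status == "copyrighted":
        if rights_holder:
            return (
                "This work is believed to be under copyright. "
                f"Rights holder: {rights_holder}."
            )
        return (
            "This work is believed to be under copyright. "
            "The rights holder is not recorded."
        )
    # Anything else, including "unknown", gets the cautious message.
    return "The copyright status of this work is unknown."
```

Because the assertion is computed from the factual metadata at display time, it can change when the underlying data changes, without anyone editing the stored record.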
Copyright status is not a substitute for the copyright metadata, for several reasons. First, copyright status isn't sufficient to tell the user what to do for works that are under copyright. The copyright data elements add information that could inform a fair use assessment, as well as name and contact information should the user seek to obtain permission. Next, copyright status is a moving target (albeit not moving very quickly), so an item with a status of "copyrighted" will eventually have a status of "public domain." Third, copyright metadata should be seen as part of the archival package. Among other things, it helps future users understand how the copyright assessment was determined.
Any XML authoring tool can be used to create copyrightMD records. Other tools, such as standard databases, could also be configured to produce copyrightMD output.
The CDL hopes to work with the digital library community to promote the creation of open-source tools and solutions for generating metadata that adheres to copyrightMD. The CDL also hopes to provide data transformation services whereby metadata records in one encoding format could be converted into schema-compliant encoding, using solutions such as eXtensible Stylesheet Language Transformations (XSLT) stylesheets.
It's true that copyrights can be transferred, so the copyright holder can change, and an unpublished work may be published at a later date. Metadata creators are understandably concerned about having to update this information for stored works. Because one cannot guarantee that this maintenance will take place, we recommend that copyright metadata be clearly date-stamped so that future users can know that the data in the record represents the copyright data at a particular moment in time. Even though some particulars may have changed, this gives the user a solid starting point for investigating current copyright status.
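The date-stamping recommendation can be sketched as follows. This is only an illustration of keeping the recording date alongside the copyright data; the class name StampedCopyrightRecord, its fields, and the wording of the note are hypothetical and do not reflect how the schema itself stores dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StampedCopyrightRecord:
    """Copyright data plus the date on which it was recorded (hypothetical)."""

    copyright_status: str
    rights_holder: str
    recorded_on: date

    def display_note(self):
        """Describe the data as a snapshot taken on a particular date."""
        return (
            f"Copyright status '{self.copyright_status}' "
            f"(rights holder: {self.rights_holder}) "
            f"as recorded on {self.recorded_on.isoformat()}; "
            "details may have changed since then."
        )
```

Showing the date next to the data tells the user how old the snapshot is, which is the starting point the recommendation is meant to provide.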
copyrightMD is designed to express copyright metadata within an XML environment, and therefore serves as an extension schema for the Metadata Encoding and Transmission Standard (METS).
The CDL Rights Management Group analyzed the functional requirements related to copyright metadata and, as part of that analysis, reviewed existing metadata schemas. The results indicated that no existing metadata schema included all necessary data elements and that, in fact, most merely contained broad buckets for collecting and recording copyright metadata, with no support for machine actionability.
For more information, see Assessment of Common Rights Metadata Encoding Schemes and Gap Analysis.
The metadata schemas mentioned above do not adequately accommodate -- both in terms of extent and granularity -- the range of essential descriptive information necessary to assess the copyright status of a given object. Dublin Core, Metadata Object Description Schema (MODS), and MARCXML, for example, are designed to capture descriptive metadata and have relatively few data elements for capturing copyright-specific data. METSRights was designed to capture copyright, contract, and license data, but it only has data elements for rights holder name and contact information. Even if multiple schemas such as MODS and METSRights are combined into a composite metadata record for a given digital object, essential copyright information will still be lacking.
Digital license schemas such as XrML (eXtensible rights Markup Language) and ODRL (Open Digital Rights Language) may be able to contain copyright-specific data, but as they are used today their emphasis is on license conditions, not on copyright status.
For more information, see Assessment of Common Rights Metadata Encoding Schemes and Gap Analysis.
Descriptive metadata serves somewhat different purposes than copyright metadata: the former is primarily geared towards helping users locate and identify items via searching and browsing, while the latter is geared towards assisting users with determining the copyright status of a given item.
Common cataloging practices and content standards for descriptive cataloging are not principally geared towards recording important copyright information. For example, knowing who created a work does not tell you who currently holds the copyright in the work. At the same time, the creator of a work is a key element in copyright law, even if the creator is no longer the copyright holder. Information about the creator is necessary for determining when the rights expire. Another difference between descriptive practice and copyright definition is that copyright relies heavily on the creator's death date, since that is the date that determines the length of the copyright. In common descriptive cataloging practice, dates are only included with authors' names when they are needed to distinguish between two authors with the same name, and only the date of birth is needed for that purpose. Occasionally, the only known date is the death date, and therefore that is the date used, with “d.” before it to indicate that it is the date of death. In practice, records with an author's date of birth are rarely updated to include the date of death, so this key piece of information is missing from most cataloging.
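The role of the death date can be shown with a small worked example. The sketch below assumes the general U.S. rule of the life of the author plus 70 years, with terms running through the end of the calendar year; that rule does not cover every work (for instance, works made for hire or older published works follow different rules), so this is a simplification and not legal guidance. The function name first_public_domain_year is hypothetical.

```python
def first_public_domain_year(death_year, term_years=70):
    """Estimate the first year a work is in the public domain.

    Assumes a term of life plus `term_years` that runs to the end of
    the calendar year. Returns None when the death year is not known,
    which is the common case described in the text above.
    """
    if death_year is None:
        return None
    return death_year + term_years + 1


# A creator who died in 1950: the term runs through the end of 2020.
example_year = first_public_domain_year(1950)
```

When the death date is missing from the record, the function has nothing to compute from, which is exactly why birth-date-only cataloging is insufficient for copyright assessment.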
It is possible, however, that some data values present in a descriptive metadata record will directly overlap with data values in a copyright metadata record. copyrightMD will seek to provide a mechanism to support referential linking between related data elements, in order to reduce data redundancy in a single record.
Yes. In fact, we assume that it will be used in conjunction with other metadata. For example, a copyrightMD record could be combined with other descriptive (e.g., Dublin Core, Metadata Object Description Schema (MODS), or MARCXML), structural, administrative, and technical metadata records in a METS wrapper. Our main interest is in the data elements of copyrightMD, and these could presumably be expressed in formats other than XML schema. If you do use the data elements in other formats, please let us know so we can add your implementation to our web site.
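A rough sketch of embedding a copyrightMD record in a METS wrapper is shown below, using the standard METS pattern of an amdSec containing a rightsMD section with an mdWrap and xmlData. The copyrightMD namespace URI used here is an assumption and should be checked against the published schema; the function name wrap_in_mets, the ID value, and the status values are hypothetical. The result is a fragment for illustration, not a complete METS document (it has no structMap, for example).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
# Assumed namespace URI for copyrightMD; verify against the schema.
CMD_NS = "http://www.cdlib.org/inside/diglib/copyrightMD"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("copyrightMD", CMD_NS)


def wrap_in_mets(copyright_status, publication_status):
    """Return a METS fragment holding a minimal copyrightMD record."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    amd_sec = ET.SubElement(mets, f"{{{METS_NS}}}amdSec")
    rights_md = ET.SubElement(amd_sec, f"{{{METS_NS}}}rightsMD", {"ID": "RMD1"})
    md_wrap = ET.SubElement(
        rights_md,
        f"{{{METS_NS}}}mdWrap",
        {"MDTYPE": "OTHER", "OTHERMDTYPE": "copyrightMD"},
    )
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    ET.SubElement(
        xml_data,
        f"{{{CMD_NS}}}copyright",
        {
            "copyright.status": copyright_status,
            "publication.status": publication_status,
        },
    )
    return mets


fragment = wrap_in_mets("copyrighted", "published")
serialized = ET.tostring(fragment, encoding="unicode")
```

Descriptive, technical, and other administrative metadata would sit in their own sections of the same METS document, so the copyright record stays separate but travels with the object.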
U.S. copyright law maintains some differences between published works and unpublished works. For a useful summary of the differences, see Peter Hirtle's Copyright Term and the Public Domain in the United States. The law itself defines the act of publication as: "... the distribution of copies or phonorecords of a work to the public by sale or other transfer of ownership, or by rental, lease, or lending." (see US Code, Title 17, section 101). It is admittedly often not a simple matter to determine whether an item is or is not published, and creators of the metadata have the option of coding this element "unknown."
The CDL Rights Framework was developed to provide a broad framework for understanding the digital rights environment of the CDL, and provides general policies that will guide the decisions that must be made as resources are created, acquired, and shared in a digital environment. The CDL Rights Framework follows the Joint Information Systems Committee (JISC) work describing six stages of rights that correspond approximately to the digital resource workflow. The first three stages (recognition of rights, assertion of rights, and expression of rights) make up the policy creation phase of digital rights management; the last three stages (dissemination of rights, exposure of rights, and enforcement of rights) make up the policy projection phase. The creation of the copyright metadata schema falls under the third stage--expression of rights.
All of the copyright data elements can be coded as "unknown," which is the appropriate coding for this situation. This alerts users to the fact that they will not be able to rely on the provided metadata to aid in copyright determination, and suggests that they should contact your institution (or the appropriate services contact) for further information.
The copyright data elements may be used at any level of description, and should be used at the level that is most appropriate to document the copyright status of a resource. This may be at an aggregate-level or at an item-level, depending on the degree to which the components of the resource share given copyright information. In some cases, the copyright data elements could be used at the component level for a particular item; for example, in the case of a photograph album comprising images created by different photographers, each page of the album may need to reflect a different copyright status.
The copyright data elements can be used at the component level for a particular item; for example, in the case of a multi-media work comprising images, music, and textual material, each component may need to reflect a different copyright status. A future version of copyrightMD will seek to address more fully issues with documenting copyright statuses for complex objects. It isn't clear yet if we can just add roles to creators of different components of these complex objects, or if we need to add additional data structures to the schema. If you have interesting materials to provide as test cases, we'd be happy to hear from you.
The copyrightMD schema reflects the full suite of elements that could be utilized in any given record. A valid copyrightMD record requires the top-level <copyright> element with associated @copyright.status and @publication.status attributes; however, no additional elements or attributes are required.
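The minimal requirement described above can be checked with a few lines of Python. This sketch only tests for the top-level <copyright> element and its two required attributes; it is not a substitute for validating against the schema, and it does not check namespaces or attribute values. The function name missing_requirements and the example records are hypothetical.

```python
import xml.etree.ElementTree as ET

REQUIRED_ATTRIBUTES = ("copyright.status", "publication.status")


def missing_requirements(xml_text):
    """List what a record lacks relative to the minimal requirements."""
    root = ET.fromstring(xml_text)
    problems = []
    # Strip a namespace prefix of the form "{uri}" if one is present.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "copyright":
        problems.append("top-level element is not <copyright>")
    for name in REQUIRED_ATTRIBUTES:
        if root.get(name) is None:
            problems.append(f"missing attribute {name}")
    return problems


minimal = '<copyright copyright.status="unknown" publication.status="unknown"/>'
incomplete = '<copyright copyright.status="unknown"/>'
```

Calling missing_requirements(minimal) returns an empty list, while the incomplete record is reported as lacking publication.status.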
The CDL currently does not specify requirements for a minimal record (i.e., which elements must be present in any given copyright metadata record), but will be developing policies as the schema is developed. In principle, the CDL's goal is for copyright information to be recorded when it is known by contributing institutions.