Doing EAD Locally

Section 3. Encoding Strategies and Tools

This chapter describes different strategies, tools, and services that can be utilized to get your EAD encoding done. While some tools may be easier to implement within the context of your institution, all of them will require some degree of staff, resource, training, workflow analysis, and time commitment in order to be used effectively. Note also that some tools will allow you to simultaneously create a printed version of a finding aid for local users and an encoded version for online users; other tools may necessitate managing two different versions of a finding aid -- a printed version and an encoded version.

Regardless of the strategy or tool you use, you should have some basic familiarity with the EAD encoding standard, as codified in the EAD Application Guidelines and Tag Library. A good place to start is with the Application Guidelines' preliminary section and Chapters 1 and 2 (the Tag Library is primarily a reference tool for use of particular EAD elements). Some strategies require more or less engagement with EAD encoding than others, however. For example, if you are encoding manually using a simple text editor, you should be very familiar with EAD. If you are encoding using the OAC EAD Web Template, however, you do not necessarily need to be familiar with EAD.

Some knowledge of the EAD encoding standard will also allow you to proof, edit, and troubleshoot encoding problems more effectively. You will also be better prepared to check the quality of your encoding, whether done in house or outsourced to a vendor.

You should also be familiar at this point with the OAC BPG EAD. Regardless of the tool you use, you should ensure that it produces OAC BPG EAD-compliant finding aids.

Although the best practice guidelines set requirements for encoding, creating OAC BPG EAD-compliant finding aids will be much easier and more efficient if your pre-EAD descriptions (whether in word processed format, in a database, etc.) are consistently structured, ordered, and prepared in such a way that EAD tags can be easily "wrapped" around those descriptions. This includes ensuring that all OAC BPG EAD-required descriptive elements are present in your finding aid. An understanding of the best practice guidelines will help you determine whether or not it will be necessary to restructure or amplify description, or even reprocess parts of the collection.

3.1. Encoding Strategies

The OAC BPG EAD categorizes finding aids into two possible encoding schemes:

  • OAC Basic: The "OAC Basic" encoding scheme is the minimal scheme allowable for new finding aids added to the OAC database. It reflects single-level descriptive outputs at any level, but typically for large accumulations such as collections, record groups, fonds, or record series. It can, however, only describe materials at one explicitly articulated level and does not support multilevel encoding of subsequent lower levels (for that, see the "OAC Full" encoding scheme).

    The "OAC Basic" encoding scheme is appropriate for the following kinds of collections:

    • Small collections or single items
    • Large homogeneous collections (e.g., the minutes of a committee, and nothing else)
    • Collections not yet fully processed or not expected to be processed for some time

    In such instances, the collection may not warrant component description or a detailed listing of files or items. The OAC recommends, however, using the "OAC Full" encoding scheme for collections demonstrating greater complexity.

  • OAC Full: The "OAC Full" encoding scheme reflects multilevel descriptive outputs. Multilevel descriptive outputs can describe archival material beginning at any level, and must include at least one other level than the one at which they begin. Typically multilevel descriptive outputs begin at the level of large accumulations such as collections, record groups, fonds, or record series. Multilevel finding aids represent the deepest encoding supported by the OAC BPG EAD.
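
The difference between the two schemes can be illustrated with a short Python sketch. This is not an OAC tool: it assumes an XML (not SGML) EAD file without namespaces, and it treats the presence of at least one component (<c> or <c01> through <c12>) inside <dsc> as the sign of a multilevel description. The function name and both sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-level record: collection-level description only.
BASIC_SAMPLE = """<ead>
  <archdesc level="collection">
    <did><unittitle>Committee minutes</unittitle></did>
  </archdesc>
</ead>"""

# Hypothetical multilevel record: collection level plus one series.
FULL_SAMPLE = """<ead>
  <archdesc level="collection">
    <did><unittitle>Family papers</unittitle></did>
    <dsc type="in-depth">
      <c01 level="series"><did><unittitle>Correspondence</unittitle></did></c01>
    </dsc>
  </archdesc>
</ead>"""


def encoding_scheme(ead_xml: str) -> str:
    """Guess the scheme: 'OAC Full' if <dsc> holds a component, else 'OAC Basic'."""
    root = ET.fromstring(ead_xml)
    dsc = root.find("./archdesc/dsc")
    if dsc is None:
        return "OAC Basic"
    # Components are unnumbered <c> or numbered <c01>..<c12> elements.
    has_component = any(
        child.tag == "c" or (child.tag.startswith("c") and child.tag[1:].isdigit())
        for child in dsc
    )
    return "OAC Full" if has_component else "OAC Basic"


print(encoding_scheme(BASIC_SAMPLE))
print(encoding_scheme(FULL_SAMPLE))
```

A check like this only looks at structure; whether a collection warrants component description remains an archival decision.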

Based on these two possible encoding schemes, there are a wide variety of strategies, tools, and services that can be flexibly utilized to encode either extant finding aids, or new finding aids.

Iterative Encoding

  • Consider the authoring process as iterative, and do not attempt to initially create full EAD finding aids (i.e., collection-level record and container list) all at once. Instead, start with collection-level description and define a process and timeline for encoding lower levels of description, such as container lists, at a later time. Alternatively, start with your container lists, and append a collection-level description at a later time.
  • Consider presenting your finding aid in a series of stages, publishing what you currently have processed and adding to this finding aid as collection processing progresses.

Prioritizing

  • Which of your finding aids are the easiest to encode? Consider the formatting and condition of your finding aids.
  • Which are the most important to encode? Consider which collections are the most significant in the context of your repository. Which ones are heavily used? Would use increase if their finding aids were made available online?
  • Consider grouping together finding aids by their "levels" of description in order to prioritize encoding. For example, target a particular group of collections that need to be fully described, i.e. a collection-level record and container list, and consider the range of strategies for encoding these in EAD (see "Combining Tools" approach, below). Target a separate group of finding aids that may only need to be described at the collection-level and use an appropriate strategy, such as the OAC EAD Web Template, to generate collection-level EAD finding aids.

Combining Tools

  • Consider using a combination of encoding tools and services (see Encoding Strategies Diagram). For example, use the OAC EAD Web Template to generate a collection-level EAD record, and use a simple text editor to encode the container list, and attach that container list to the collection-level record.
  • Also consider grouping together finding aids that can be encoded using a particular strategy. For example, you might send a group of finding aids with long, detailed container lists to a vendor; you might encode a separate group of finding aids with collection-level descriptions only in house, utilizing the OAC EAD Web Templates.
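
Attaching a separately encoded container list to a collection-level record can be done by hand in a text editor, or scripted. The following Python sketch shows one possible approach under stated assumptions: both pieces are well-formed XML without namespaces, the container list is a complete <dsc> element, and the collection-level record does not yet contain one. The function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection-level record, e.g. as produced by a template.
COLLECTION_LEVEL = """<ead>
  <eadheader><eadid>example.xml</eadid></eadheader>
  <archdesc level="collection">
    <did><unittitle>Example Family Papers</unittitle></did>
  </archdesc>
</ead>"""

# Hypothetical container list encoded separately in a text editor.
CONTAINER_LIST = """<dsc type="in-depth">
  <c01 level="file">
    <did>
      <container type="box">1</container>
      <unittitle>Correspondence</unittitle>
    </did>
  </c01>
</dsc>"""


def attach_container_list(collection_xml: str, dsc_xml: str) -> str:
    """Append a <dsc> element to the end of <archdesc> and return the result."""
    root = ET.fromstring(collection_xml)
    archdesc = root.find("archdesc")
    if archdesc is None:
        raise ValueError("no <archdesc> element found")
    if archdesc.find("dsc") is not None:
        raise ValueError("record already contains a <dsc>")
    archdesc.append(ET.fromstring(dsc_xml))
    return ET.tostring(root, encoding="unicode")


merged = attach_container_list(COLLECTION_LEVEL, CONTAINER_LIST)
print(merged)
```

The merged file should still be validated afterwards, since a script like this checks only well-formedness, not conformance to the EAD DTD.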

3.2. Encoding Tools and Services

The following section provides information on EAD encoding tools and services that are commonly used by OAC contributing members. The Encoding Strategies Diagram keyed to this document illustrates the different ways these tools and services can be used in combination. Freeware and OAC-developed tools listed here can also be found in the EAD Toolkit. See also the University of Virginia EAD Help Pages ("EAD Sites Annotated" and "Helper Files Created for Specific Software" Webpages, in particular) for additional software and programs used to generate EAD encoding.

OAC EAD Web Templates

How it works:

The OAC EAD Web Templates consist of online forms into which encoders cut and paste segments of their non-EAD finding aids. The online form converts the text into an EAD file with the push of a button. Templates can be configured to produce either SGML or XML. For more information, see the EAD Toolkit.

Pros:

  • Very easy to create a collection-level EAD finding aid using a free Web-accessible tool that requires very little training to use.
  • You do not need to be particularly familiar with EAD to utilize the template.
  • The template will always produce a valid and best practice guideline-compliant EAD file.

Cons:

  • Any editing of the EAD file beyond immediate use of the template requires use of a different encoding tool.
  • The template does not support the creation of extensive file-/item-level container lists. These must be created using a different encoding tool and cut and pasted into the EAD file.
  • The template cannot be used to save files that are in process, nor can it be used to edit existing EAD files. Encoders must either recreate the file using the template, or use a text editor or SGML/XML authoring tool to directly edit the EAD file.
  • Your source file will initially be your word processed finding aid until you create the EAD version utilizing the template. Then you will have two versions of the finding aid to manage. Which one will be your source file, the word processed file or the EAD file? Any edits to the finding aid will require updating both versions: the word processed version and the raw EAD file.
  • Since you will be cutting and pasting from a non-EAD file into the template, you will need to ensure that the text can be converted into simple ASCII text without degradation, or marked up using EAD text formatting tags -- this applies to smart quotes, underlining, bolding, etc. Special symbols, diacritics, and non-Latin characters will need to be encoded directly in Unicode or using Unicode character references.
  • You will need to validate the EAD file using a validation tool since the template cannot do this. A simple text editor can be used in conjunction with the validation tool to correct encoding errors. For links to freeware and OAC-developed validation tools and simple text editors, see the EAD Toolkit.
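
The character clean-up described above can be partly automated before you cut and paste. The following Python sketch is one possible approach, not an OAC-provided tool: it replaces a small, assumed set of word processor punctuation with ASCII equivalents and turns any remaining non-ASCII characters into numeric Unicode character references. The mapping table and function name are hypothetical and would need to be extended for your own finding aids.

```python
# Assumed mapping of common word processor punctuation to plain ASCII.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}


def prepare_text(text: str) -> str:
    """Return ASCII-only text with character references for everything else."""
    for smart, plain in SMART_PUNCTUATION.items():
        text = text.replace(smart, plain)
    # Any character still outside ASCII becomes a decimal reference (&#NNN;).
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(prepare_text("\u201cPapers\u201d of Jos\u00e9"))
```

Underlining and bolding are not handled here; they carry no character of their own and must be marked up with EAD text formatting tags by hand or by another tool.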

Outsourcing

How it works:

A print or electronic finding aid is outsourced to a vendor to encode. The data submission package generally involves at least two items: the finding aid to be encoded and specifications for encoding the finding aid.

Pros:

  • You don't need to invest staff, time, or resources to encode in house.
  • You can derive a print finding aid from the encoded finding aid by printing a copy of the encoded finding aid when displayed online.

Cons:

  • You will need to develop clear OAC BPG EAD-compliant specifications for the vendor.
    • The more consistent your description is, the easier it will be to prepare encoding specifications that can apply to a broad group -- or all -- of your finding aids. If your descriptions are not consistent, you may need to rework them. Otherwise, your encoding specifications will need to be able to describe how to consistently apply EAD encoding to inconsistent descriptions. If you receive a poorly encoded product from a vendor, it may well be due to the nature of the legacy finding aid and clarity of the encoding specifications.
  • Your source file will initially be your word processed finding aid until you create the EAD version. Then you will have two versions of the finding aid to manage. Which one will be your source file, the word processed file or the EAD file? Any edits to the finding aid will require updating both versions: the word processed version and the raw EAD file. Will you edit the EAD file in house or resend it to the vendor?
  • Can be expensive, ranging from $3.00 to $5.00 per page of finding aid.
  • Although the vendor will ensure that the file is a valid EAD file, you should always check the quality of encoding in house.

Additional resources:

  • For an additional discussion on outsourcing, see the EAD AG. See also Section 1.8 of this document.
  • See the University of Virginia EAD Help Pages' EAD Sites Annotated Webpage "Preparation" section to learn how other repositories outsourced encoding.

Word Processors and Text Editors

How it works:

Manually create a word processed finding aid and manually mark up the text with EAD tags; alternatively, use macros to add EAD tags, or use search-and-replace to convert symbols into EAD tags. For links to text editing freeware, see the EAD Toolkit.
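
As an illustration of the macro or search-and-replace approach, the following Python sketch wraps EAD tags around a tab-delimited container list. It is a minimal example under stated assumptions: every line has exactly four fields in the order box, folder, title, date, and every entry is encoded as a file-level <c01>. The function names and the input layout are hypothetical.

```python
from xml.sax.saxutils import escape


def line_to_component(line: str) -> str:
    """Turn one 'box<TAB>folder<TAB>title<TAB>date' line into a <c01> element."""
    # Raises ValueError if the line does not have exactly four fields.
    box, folder, title, date = (field.strip() for field in line.split("\t"))
    return (
        '<c01 level="file"><did>'
        f'<container type="box">{escape(box)}</container>'
        f'<container type="folder">{escape(folder)}</container>'
        f'<unittitle>{escape(title)}</unittitle>'
        f'<unitdate>{escape(date)}</unitdate>'
        '</did></c01>'
    )


def container_list_to_dsc(text: str) -> str:
    """Wrap every non-blank line of a container list in EAD tags."""
    components = [line_to_component(l) for l in text.splitlines() if l.strip()]
    return '<dsc type="in-depth">\n' + "\n".join(components) + "\n</dsc>"


sample = "1\t3\tLetters to A. Smith & Co.\t1901-1905\n"
print(container_list_to_dsc(sample))
```

A line with a missing or extra field stops the conversion with an error, which is one reason consistently structured descriptions matter for this strategy.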

Pros:

  • Easy to initially create text of your finding aid using a widely used, familiar, and fairly inexpensive tool.
  • You will have a word processed, printable version of your finding aid that can be formatted in any way you like.

Cons:

  • In order to create an EAD file, you will literally be marking up your word processed text with EAD tags, or creating EAD encoded text directly. You must therefore be very familiar with EAD. This kind of EAD mark up can work for container lists, but it is not very feasible for collection-level description, which could be prepared using some other strategy, such as the OAC EAD Web Templates (see above).
  • If using macros to convert text into EAD encoded text, you will need to ensure your finding aids are consistently described.
  • Your source file will initially be your word processed finding aid until you create the EAD version. Then you will have two versions of the finding aid to manage. Which one will be your source file, the word processed file or the EAD file? Any edits to the finding aid will require updating both versions: the word processed version and the raw EAD file.
  • Since you will be converting word processed text into EAD, you will need to ensure that the text can be converted into simple ASCII text without degradation, or marked up using EAD text formatting tags -- this applies to smart quotes, underlining, bolding, etc. Special symbols, diacritics, and non-Latin characters will need to be encoded directly in Unicode or using Unicode character references.
  • You will need to validate the EAD file using a validation tool. A simple text editor can be used in conjunction with the validation tool to correct encoding errors. For links to freeware and OAC-developed validation tools and simple text editors, see the EAD Toolkit.

Additional resources:

  • For an additional discussion on using word processors, text editors, and text conversion tools, see the EAD Application Guidelines for Version 1.0, Section 4.2.1 and 4.2.3.
  • See the University of Virginia EAD Help Pages' EAD Sites Annotated Webpage ("Preparation" and "Mark-Up" sections) to learn how other repositories implemented these methods. See also the EAD Help Pages' Helper Files Created for Specific Software Webpage for additional tools (such as macros) developed by repositories to assist in transforming word processed files into EAD files.

Databases

How it works:

Use a relational database to manage elements of description in the finding aid. The database fields must be configured to correspond to EAD fields. It must also be configured to supply or attach EAD encoding to the description stored in the database. An EAD file is produced through exporting the descriptions out of the database.
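
The export step can be sketched as follows. This Python example uses an in-memory SQLite database purely for illustration; it does not reflect how DAMD or any particular repository's database is built. The table name, column names, and sample rows are hypothetical, and each column is assumed to correspond to one EAD element.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway database whose columns mirror EAD elements."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE components "
        "(box TEXT, folder TEXT, unittitle TEXT, unitdate TEXT)"
    )
    conn.executemany(
        "INSERT INTO components VALUES (?, ?, ?, ?)",
        [
            ("1", "1", "Minutes", "1950-1955"),
            ("1", "2", "Correspondence", "1956"),
        ],
    )
    return conn


def export_dsc(conn: sqlite3.Connection) -> str:
    """Export every row as a file-level <c01> inside a <dsc> element."""
    dsc = ET.Element("dsc", {"type": "in-depth"})
    rows = conn.execute(
        "SELECT box, folder, unittitle, unitdate FROM components ORDER BY rowid"
    )
    for box, folder, unittitle, unitdate in rows:
        c01 = ET.SubElement(dsc, "c01", {"level": "file"})
        did = ET.SubElement(c01, "did")
        ET.SubElement(did, "container", {"type": "box"}).text = box
        ET.SubElement(did, "container", {"type": "folder"}).text = folder
        ET.SubElement(did, "unittitle").text = unittitle
        ET.SubElement(did, "unitdate").text = unitdate
    return ET.tostring(dsc, encoding="unicode")


print(export_dsc(build_demo_db()))
```

Because the tags are supplied by the export routine rather than typed by hand, a change in encoding practice can be made once in the routine and applied to every finding aid on the next export.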

DAMD is a preconfigured FileMaker Pro 5.0 database used by MOAC contributing members for producing EAD files. The database can also be utilized to generate digital objects associated with the EAD finding aid. For more information about DAMD, see the EAD Toolkit.

Pros:

  • A database can be used to input, store, and generate a complete finding aid, or only portions of the finding aid, such as the container list. Other encoding strategies, such as the OAC EAD Web Template, can be utilized to generate a collection-level description. A database may also be used to import finding aid text in a different format, such as a tab-delimited text document.
  • You can generate multiple outputs using the software, including the following:
    • EAD file for the OAC
    • EAD file for local display (you may need additional hardware and software, such as a Web server and stylesheets, in order to do this)
    • Print copies (you may need other software applications or stylesheets to generate a finely formatted print document)
  • You have only one source file to manage. All edits and changes can be made in the database and you can generate revised print or EAD outputs as necessary.
  • Changes in the EAD DTD and best practice guidelines can be incorporated directly into the database structure or export routine in order to generate standards-compliant finding aids.
  • The interface can be customized to make encoding easier, using pull-down windows, prompts, etc.

Cons:

  • Non-preconfigured databases require a significant investment of resources, time, staffing, and technical expertise to create and implement. (Note, however, that DAMD is a preconfigured FileMaker Pro 5.0 database specifically designed for use by MOAC contributing members.)
  • You may need to validate the EAD file using a validation tool if your database cannot be configured to automatically validate the file. This includes files produced by DAMD.

Additional resources:

  • For an additional discussion on using databases, see the EAD Application Guidelines for Version 1.0, Section 4.2.4.
  • See the University of Virginia EAD Help Pages' EAD Sites Annotated Webpage ("Preparation" and "Mark-Up" sections) to learn how other repositories implemented a database.

SGML/XML Authoring Tools

How it works:

Manually create EAD finding aids directly in SGML/XML. Some products may have macros or batch encoding functions to automate encoding. These products have graphical user interfaces that offer different views of the text, such as text with markup, text without markup, split views of text and markup together, etc.

Pros:

  • Similar look and feel to a word processor, including many word processor-style functions (spell check, thesaurus, macros, etc.).
  • You can generate multiple outputs using the software, including the following:
    • EAD file for the OAC
    • EAD file for local display (you may need additional hardware and software, such as a Web server and stylesheets, in order to do this)
    • Print copies (you may need other software applications or stylesheets to generate a finely formatted print document)
  • You have only one source file to manage. All edits and changes can be made to the single EAD file and you can generate revised print or EAD outputs as necessary.
  • You can incorporate changes in the EAD DTD and best practice guidelines directly into software functions in order to generate standards-compliant finding aids.
  • Special non-Latin characters (such as diacritics, symbols, etc.) can be directly encoded.
  • You can continuously validate the EAD file through the authoring process, since these programs directly incorporate the EAD DTD.

Cons:

  • Similar to using a word processor, you will still be "marking up" text using EAD tags and will need to be familiar with EAD, although you will have access to various tools and views to make the encoding easier such as a graphical user interface to the tags (pull-down windows, prompts, etc.).

Additional resources:

  • For an additional discussion on using SGML/XML authoring tools, see the EAD AG.
  • The EAD Cookbook, available via the EAD Help Pages, is a free set of software tools, templates, stylesheets, and instructions for markup developed by Michael J. Fox that can be used in conjunction with Corel XMetal, Corel WordPerfect 9.0, or SoftQuad Author/Editor. See also the EAD Cookbook for NoteTab, developed by Chris Prom: this is a set of NoteTab clipbooks, batch files, and associated files allowing EAD markup and finding aid publication via the NoteTab Pro and NoteTab Light text editors in conjunction with the EAD Cookbook.
  • See the University of Virginia EAD Help Pages' EAD Sites Annotated Webpage ("Preparation" and "Mark-Up" sections) to learn how other repositories implemented these methods. See also the EAD Help Pages' Helper Files Created for Specific Software and Software for SGML/XML Webpages for additional tools developed by repositories to facilitate EAD finding aid creation and publishing using SGML/XML authoring tools.

3.3. Validating Your Encoding

Once you have encoded your finding aids, you will need to ensure that the markup is compliant with the EAD Document Type Definition (DTD). In order to do this, you will need a tool that can parse the data and check the data against the DTD; this process is sometimes referred to as "validating" the EAD file. A file that conforms to the EAD DTD is called a "valid" EAD file. If you are not using SGML/XML authoring software or other tools that automatically validate your EAD files, you will need to validate your EAD using other software. The EAD Toolkit provides links to freeware and OAC-developed validation tools.
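
Before running a validation tool on an XML EAD file, it can save time to confirm that the file is at least well formed (every tag closed and properly nested). The following Python sketch performs only that preliminary check using the standard library; it does not check the file against the EAD DTD, does not apply to SGML files, and is not a substitute for the validation tools in the EAD Toolkit. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(ead_xml: str):
    """Return None if the XML is well formed, else a message locating the problem."""
    try:
        ET.fromstring(ead_xml)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


print(well_formedness_error("<ead><archdesc></archdesc></ead>"))
print(well_formedness_error("<ead><archdesc></ead>"))
```

A file that passes this check may still be invalid; for example, it could contain elements in an order the DTD does not allow.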

In addition to making sure that your encoding conforms to the EAD DTD, you will also need to ensure that it conforms to the OAC BPG EAD. As described in Submitting and Editing Finding Aids, voroEAD will check your encoding, alert you to any errors or omissions it encounters, and provide instructions on correcting your encoding to conform with the OAC BPG EAD.
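
A similar check can be run locally before submission to catch obvious omissions early. The following Python sketch reports which elements from a checklist are missing or empty. The checklist shown is HYPOTHETICAL and for illustration only; the authoritative list of required elements is in the OAC BPG EAD, and this sketch does not reproduce what voroEAD actually checks. It assumes an XML EAD file without namespaces.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL checklist; replace with the elements the OAC BPG EAD requires.
REQUIRED_PATHS = [
    "./eadheader/eadid",
    "./archdesc/did/unittitle",
    "./archdesc/did/unitid",
    "./archdesc/did/repository",
]

# Hypothetical record that lacks a <unitid>.
SAMPLE = """<ead>
  <eadheader><eadid>example.xml</eadid></eadheader>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <repository><corpname>Example Library</corpname></repository>
    </did>
  </archdesc>
</ead>"""


def missing_elements(ead_xml: str, required=REQUIRED_PATHS) -> list:
    """Return the checklist paths that are absent or contain no text."""
    root = ET.fromstring(ead_xml)
    missing = []
    for path in required:
        element = root.find(path)
        # itertext() also collects text from child elements such as <corpname>.
        if element is None or not "".join(element.itertext()).strip():
            missing.append(path)
    return missing


print(missing_elements(SAMPLE))
```

Keeping the checklist in one place makes it straightforward to update when the best practice guidelines change.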



CDL Digital Special Collections Helpdesk
  • Need assistance? Contact us via e-mail: oacops @ cdlib . org