OAC BPG Validator

EAD Version 2002

About the Validator

The OAC BPG Validator is a tool to assist encoders in proofing their EAD-encoded finding aids and verifying conformance with the EAD Version 2002 DTD and the OAC Best Practice Guidelines for Encoded Archival Description (OAC BPG EAD). It provides the user with a list of warnings, errors, recommendations, and other observations, and it also parses the instance with a validating XML parser and reports any errors found. Every message in the report is linked to a specific line in the EAD instance, which users can click to view.

The validator is intended to assist encoders in proofing their encoded finding aids and should not be viewed as a proofer in and of itself. The validator may take note of patterns and strings of text within the finding aid that it finds unusual in some way, or that in its experience have been indicative of errors. The validator will report on a wide variety of things, but encoders should always rely on their own judgement, based on an understanding of the OAC BPG EAD. Even when the validator directly quotes requirements in the OAC BPG EAD, encoders must keep in mind that the OAC BPG EAD itself defers to the finding aid author's best judgement.

Installation

Download ead_bpgilyzer2002 (5 MB) to any folder and double-click it to begin the installation. You will be asked to specify an installation folder. The default is C:\sgml\eadapps\BPG Validator 2002, but you can change this to install the validator wherever you like.

Next you will be asked to choose a Start Menu Program group in which to place the BPG Validator. By default this will be Start -> Programs -> BPG Validator 2002. The Start Menu Program group provides an uninstall option, in addition to access to a "readme" file.

The installation program will ask if you want to add a new icon to your desktop after installation is complete. This option is strongly recommended: the OAC BPG Validator is used by dragging XML files onto this desktop icon. Without this desktop icon, it will be difficult to use the tool.

Instructions on Use

To use the validator, drag an XML file from Windows Explorer onto the OAC BPG Validator desktop icon and wait for it to display the results. The validator checks for many things and may take some time to generate its report, depending on the size of the EAD document (20 seconds or so for an 800 K document, to several minutes for a 6 MB file, to several hours for a 13 MB finding aid).

When the validator has finished it will open Internet Explorer and display a report.

View of OAC BPG Validator report

Report Warnings, Errors, and Recommendations

The report consists of a series of messages: warnings, errors, and recommendations. Each message begins with the name of the file to which the message applies, followed by a hyperlinked line number. You can click on the line number to go to the file and view the specific line (but see the section on line numbers below). For help on the message, consult the OAC BPG EAD. Please review and consider any warning or error message closely to determine which messages are accurate and which messages can be ignored. Look for words like “possibly” and “probably” in the text of the message. These indicate that the validator has noticed something puzzling and is merely bringing it to your attention.

The validator supports SGML-based validation and may not accommodate some UTF-8 character encoding conventions in XML documents, even when the XML is well-formed and validates against the EAD Version 2002 DTD. As a result, files that begin with a UTF-8 Byte Order Mark in the prolog will be reported as containing the following error in line 0: "character "" not allowed in prolog". If you receive this error, you can ignore it.
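If you want to confirm that a line 0 prolog error really is caused by a Byte Order Mark, you can inspect the first bytes of the file. The following is a minimal Python sketch and is not part of the validator; the function names and the sample data are illustrative assumptions. It relies only on the fact that a UTF-8 Byte Order Mark is the three-byte sequence EF BB BF at the very start of the file.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string begins with the UTF-8 Byte Order Mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Illustrative sample: a tiny document saved with a BOM in front of the prolog.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?><ead></ead>'

print(has_utf8_bom(sample))                  # BOM present in the sample
print(has_utf8_bom(strip_utf8_bom(sample)))  # BOM gone after stripping
```

To check a real finding aid, read it in binary mode (for example, `open(path, "rb").read()`) and pass the bytes to `has_utf8_bom`.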

About Line Numbers

The validator returns results accompanied by a line number in the EAD document where the error occurs. Users can click on the line number to view the specific line within the EAD file, then make the necessary correction in the EAD file using their preferred encoding tool or method (e.g., SGML/XML authoring software, EditPad, etc.).

Unfortunately, not all EAD XML files contain lines per se. That is, they contain no end-of-line ("EOL") characters, and the entire document consists of a single line, starting with the DOCTYPE declaration and ending with the </ead> closing tag. In such cases the usefulness of the validator is seriously reduced, as encoders may find it next to impossible to locate the text or tags in question within a large finding aid, particularly inside the container list.
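A quick way to tell whether a file has this problem is to count its lines and measure the longest one before running the validator. The sketch below is a hypothetical helper, not a feature of the validator; the function name and sample strings are assumptions made for illustration.

```python
from typing import Tuple


def line_stats(text: str) -> Tuple[int, int]:
    """Return (number of lines, length of the longest line) for the text."""
    lines = text.splitlines() or [""]
    return len(lines), max(len(line) for line in lines)


# A document with no end-of-line characters is reported as one long line.
single_line = "<ead><c01/></ead>"
multi_line = "<ead>\n<c01/>\n</ead>"

print(line_stats(single_line))
print(line_stats(multi_line))
```

A result of one line whose length is close to the size of the whole file indicates that the line numbers in the validator report will not help you locate a problem.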

View of OAC BPG EAD Validator

To XML parsers, editors, and viewers, lines are largely irrelevant. An XML file spread out over multiple lines is not any better than one with no line breaks at all. However, encoders may find that including line breaks at suitable points within their encoded files has administrative benefits for creating, editing, and otherwise maintaining their EAD documents. The BPG Validator is a case in point, relying as it does on line numbers to point users to specific places in their documents. Most XML editors provide the user with the option to preserve line breaks in their output or to place line breaks between elements that the user specifies. Users who generate their EAD from databases might want to consider adding line breaks to the output at appropriate points. Perl users, for example, can write the '\n' character into their EAD output.
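For files that have already been generated without line breaks, one option is to insert a line break before the start tag of selected elements. The following Python sketch illustrates the idea; the function name and the choice of elements are assumptions, and it is a simple text substitution, not an XML-aware tool. It will also match tag-like text inside comments or CDATA sections, and it should only be used with elements that sit in element-only content (such as components in a container list), where extra whitespace does not change the meaning of the document.

```python
import re
from typing import Iterable


def add_line_breaks(text: str, element_names: Iterable[str]) -> str:
    """Insert a newline before each start tag of the named elements.

    Tags already at the start of the text or preceded by a newline are left
    alone, so running the function twice gives the same result as once.
    """
    names = [re.escape(name) for name in element_names]
    if not names:
        return text
    # (?<=[^\n]) requires a preceding character that is not a newline;
    # (?=[\s/>]) ensures "c01" does not also match a longer name like "c010".
    pattern = re.compile(r"(?<=[^\n])<(?:" + "|".join(names) + r")(?=[\s/>])")
    return pattern.sub(lambda match: "\n" + match.group(0), text)


# Illustrative sample: a container list written as one line.
sample = (
    "<dsc><c01><did><unittitle>Letters</unittitle></did></c01>"
    "<c01><did><unittitle>Diaries</unittitle></did></c01></dsc>"
)

broken = add_line_breaks(sample, ["c01"])
print(broken)
```

Breaking before each component places every entry of the container list on its own line, so a line number in the validator report points to a single component and not to the whole document.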

Customizing the Validator: Choosing the Messages You See

By default the validator checks every rule and recommendation given in the OAC BPG EAD whenever possible. This can result in a large number of messages, including many that never apply to a given repository. It is therefore the user's responsibility to customize the validator by suppressing those messages that never apply.

Almost all of the validator's behavior is specified in the bpg.cfg configuration file. Users can open this file with a text editor (such as TextPad, NotePad, NoteTab, etc.) and make extensive changes, especially by suppressing warning messages. There are several ways to do this.

At the very top of the oac.cfg file, under the "[General Parameters]" section, are three quick settings for suppressing certain classes of messages. For example, the user may opt to suppress all recommendations (the BPG classifies its rules as required or recommended). OAC users will probably not want to do this, however, because some recommendations are more important than others.

View of OAC BPG Validator configuration file

Users may choose to disable individual messages. The last portion of the oac.cfg file, after the HTML templates, is a complete listing of every message generated by the validator.

View of OAC BPG Validator configuration file

To disable a selected message, locate it in this listing and change its "category" value to "VOID".