Information for Testers
What's in the Database?
How Was Merging Done?
What's in the Database?
Bibliographic data
The database is a selection of a little over half a million records from the MELVYL and PE catalogs. Although the records were chosen to be representative, this is not a random sample. The selection has been skewed slightly to include a higher proportion of records in English and a slightly lower proportion of records from the Regional Facilities. The database includes records for all types of materials, records for government documents, and records with CJK headings.
Authority data
Two LC authority files were loaded into this test database: LC names and LC subjects. Because there wasn't room to load the entire LC names file, we selected only those records with author names beginning with "B." For subjects, we selected all subject headings that contain cross references. Both files date from about 1997, so they are somewhat out of date, but they should suffice for this test.
How Was Merging Done?
Records were merged using a new program written by Ex Libris to simulate the merging on MELVYL. The results will sometimes differ because the database differs in some respects, but in general the principles of merging should be the same:
- The purpose of merging is to bring together records for the same work, as defined in AACR2.
- Merging does not take place across formats.
- Where it isn't clear whether records should merge, it is best to keep them apart rather than bring them together.
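The last two principles can be illustrated with a minimal Python sketch. This is not the Ex Libris program, whose actual rules are not described here. The function name should_merge, the format labels, and the three-valued comparison results (match, conflict, cannot tell) are all assumptions made for illustration.

```python
from typing import List, Optional


def should_merge(format_a: str, format_b: str,
                 fields_agree: List[Optional[bool]]) -> bool:
    """Hypothetical conservative merge decision (an illustration only).

    Each entry in fields_agree is the result of comparing one data element:
    True = the values match, False = they conflict, None = cannot tell
    (for example, the element is missing from one record).
    """
    # Merging does not take place across formats.
    if format_a != format_b:
        return False
    # With nothing to compare there is no evidence that the records match.
    if not fields_agree:
        return False
    # Assumption: any conflict or uncertainty keeps the records apart.
    return all(result is True for result in fields_agree)
```

Under these assumptions, should_merge("book", "book", [True, True, None]) returns False, because one comparison could not be decided. A real merging program may weigh individual fields differently.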
Merging uses almost the same data in MELVYL-T as it does in MELVYL:
- LCCN, ISBN or ISSN
- Date of publication
- Title
- Main entry
- Country of publication code (from the 008)
- Pagination (largest number in the 300 $a)
- Place of publication
- Publisher
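As a rough illustration of how these data elements might be prepared for comparison, the Python sketch below extracts the largest number from a 300 $a string and normalizes text fields. The helper names largest_page_number and normalize, and the normalization rules themselves (lowercasing, dropping punctuation), are assumptions; the actual program's handling is not documented here.

```python
import re
from typing import Optional


def largest_page_number(field_300a: str) -> Optional[int]:
    """Return the largest Arabic number found in a 300 $a string, or None.

    Roman-numeral preliminary pages such as "xii" are ignored in this sketch.
    """
    numbers = [int(n) for n in re.findall(r"\d+", field_300a)]
    return max(numbers) if numbers else None


def normalize(text: str) -> str:
    """Hypothetical normalization: lowercase, drop punctuation, collapse spaces."""
    without_punctuation = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(without_punctuation.split())
```

For example, largest_page_number("xii, 345 p.") yields 345, and a 300 $a with no digits yields None, which a cautious comparison would treat as "cannot tell" rather than as a match.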
Merging in MELVYL-T differs from merging in MELVYL in two ways. It does not look at the reproduction code, and therefore will not keep apart records for a print copy and a microform copy of the same item. It also does not take into account the edition statement from the 250 field, which had proven too variable for accurate merging; other data elements appear sufficient to distinguish editions in most cases.
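The effect of dropping the reproduction code can be shown with a small Python sketch. The field labels, the comparison_key helper, and the placeholder record values are all hypothetical; the sketch assumes only that MELVYL compares the listed elements plus the reproduction code and edition statement, while MELVYL-T compares the listed elements alone.

```python
# Hypothetical labels for the data elements listed above.
SHARED_FIELDS = (
    "identifier", "pub_date", "title", "main_entry",
    "country_code", "max_pagination", "place", "publisher",
)
# Elements that MELVYL considers but MELVYL-T does not.
MELVYL_ONLY_FIELDS = ("reproduction_code", "edition_statement")


def comparison_key(record: dict, fields: tuple) -> tuple:
    """Build a tuple of the record's values for the chosen fields."""
    return tuple(record.get(name) for name in fields)


# Placeholder records: identical except for the reproduction code.
print_copy = {
    "identifier": "example-id", "pub_date": "1950", "title": "example title",
    "main_entry": "example author", "country_code": "xxu",
    "max_pagination": 200, "place": "example place",
    "publisher": "example publisher",
    "reproduction_code": "print", "edition_statement": "",
}
microform_copy = dict(print_copy, reproduction_code="microform")

same_in_melvyl_t = (comparison_key(print_copy, SHARED_FIELDS)
                    == comparison_key(microform_copy, SHARED_FIELDS))
same_in_melvyl = (comparison_key(print_copy, SHARED_FIELDS + MELVYL_ONLY_FIELDS)
                  == comparison_key(microform_copy, SHARED_FIELDS + MELVYL_ONLY_FIELDS))
```

In this sketch the two records share a key when only the MELVYL-T elements are compared, so they would be candidates for merging, while adding the reproduction code keeps them apart.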