Information for Testers

What's in the Database? How Was Merging Done?

What's in the Database

Bibliographic data

The database is a selection of a little over half a million records from the MELVYL and PE catalogs. Although the records were chosen to be representative, this is not a random sample. The data has been skewed slightly to include a higher proportion of records in English and a slightly lower proportion of records from the Regional Facilities. There are records for all types of materials, as well as records for government documents and records with CJK headings.

Authority data

Two LC authority files were loaded into this test database: LC names and LC subjects. Because there wasn't room to load the entire LC names file, we selected only those records with author names beginning with "B." For subjects, we selected all subject headings that contained cross references. Both files date from about 1997, so they are somewhat out of date but should suffice for this test.

How Was Merging Done?

Records were merged using a new program written by Ex Libris to simulate the merging on MELVYL. The results will sometimes differ because the database has some differences, but in general the principles of merging should be the same. Merging uses almost the same data in MELVYL-T as it does in MELVYL.

Merging in MELVYL-T differs from merging in MELVYL in that it does not look at the reproduction code (and therefore will not keep apart records for a print copy and a microform copy of the same item), and it does not take into account the edition statement from the 250 field. The latter had proven too variable for accurate merging, and it appears that other data elements are sufficient to distinguish editions in most cases.
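
For testers who want a concrete picture of this difference, here is a minimal sketch in Python. It is not the Ex Libris program, and it does not reproduce the actual MELVYL match criteria, which this page does not list. The record layout, the function names, and the use of title, author, and publication year as match points are all assumptions made for illustration only. The sketch shows only how leaving the reproduction code and the 250 edition statement out of a match key changes which records fall together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BibRecord:
    """Hypothetical, simplified bibliographic record (not a real MARC layout)."""
    title: str
    author: str
    pub_year: str
    edition: str = ""        # edition statement, as carried in the 250 field
    reproduction: str = ""   # reproduction code; "" stands for a print copy


def match_key(rec, use_reproduction, use_edition):
    """Build the key that decides whether two records merge.

    Title, author, and year are placeholder match points (an assumption);
    the two flags switch on the elements that MELVYL-T ignores.
    """
    key = [rec.title.casefold().strip(), rec.author.casefold().strip(), rec.pub_year]
    if use_reproduction:
        key.append(rec.reproduction)
    if use_edition:
        key.append(rec.edition.casefold().strip())
    return tuple(key)


def merge(records, use_reproduction, use_edition):
    """Group records that share the same match key."""
    groups = {}
    for rec in records:
        key = match_key(rec, use_reproduction, use_edition)
        groups.setdefault(key, []).append(rec)
    return list(groups.values())


# Invented sample data: a print copy, a microform copy of the same item,
# and a later edition with a different publication year.
records = [
    BibRecord("Example Title", "Author, A.", "1990"),
    BibRecord("Example Title", "Author, A.", "1990", reproduction="microform"),
    BibRecord("Example Title", "Author, A.", "1995", edition="2nd ed."),
]

# MELVYL-like behavior: reproduction code and edition statement both count.
melvyl_groups = merge(records, use_reproduction=True, use_edition=True)

# MELVYL-T-like behavior: both elements are ignored.
melvyl_t_groups = merge(records, use_reproduction=False, use_edition=False)

print(len(melvyl_groups), len(melvyl_t_groups))
```

In this invented sample, the print and microform copies stay apart under the MELVYL-like settings and merge under the MELVYL-T-like settings, while the later edition remains separate in both cases because its publication year differs.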