CDL has submitted a Vertical Search Demo to the Digital Public Library of America (DPLA) Beta Sprint project. The Beta Sprint issued a call for ideas and prototypes for the DPLA, which is envisioned as a “large-scale digital public library that will make the cultural and scientific record available to all.” The demo shows how a vertical search—a search across a targeted or curated segment of online content—can provide one-stop access to digital cultural heritage materials distributed among many websites.
Try the Vertical Search Demo at: http://crawlspace.cdlib.org
The demo targets a range of websites with cultural heritage content. Currently, the index includes approximately 300,000 unique URLs from 100 sources. These include the collections that make up the Digital Collections and Content project (a registry of digital materials funded by the Institute of Museum and Library Services), as well as web resources from the University of California, other libraries and cultural heritage institutions, and aggregated content websites.
A major benefit of the vertical search approach is that it minimizes barriers for content contributors. The tool does not rely on metadata harvesting or federated search protocols, and it does not ingest or host digital objects; it requires only that resources be available online and open to crawling.
CDL built the demo entirely with existing open-source applications, including Apache Nutch, Apache Solr, and related technologies. More technical information is available on the demo homepage.
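To make the "available online and open to crawling" requirement concrete, here is a minimal Python sketch of the two checks a vertical crawl implies: is the URL on a curated source list, and does the site's robots.txt permit fetching it? This is an illustration only, not the demo's code or configuration. The host names, the `demo-crawler` user agent, and the function names are all hypothetical.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical curated source list; the demo's actual sources are not listed here.
CURATED_HOSTS = {"example-library.org", "collections.example.edu"}


def in_scope(url, curated_hosts=CURATED_HOSTS):
    """Return True if the URL's host is a curated host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in curated_hosts)


def allowed_by_robots(url, robots_txt, user_agent="demo-crawler"):
    """Return True if the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    # Parse robots.txt from a string so the sketch needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def should_crawl(url, robots_txt):
    """A URL is crawled only if it is in the curated scope and open to crawling."""
    return in_scope(url) and allowed_by_robots(url, robots_txt)
```

For example, with a robots.txt that disallows `/private/`, a page under `/items/` on a curated host would pass both checks, while a page under `/private/` or on an unlisted host would be skipped. Nothing is asked of the contributing site beyond publishing its pages and not blocking crawlers.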
The demo is the latest iteration of CDL’s vertical search work. CDL previously experimented with similar technologies for UCOP’s UC Portal Project and the Water Resources Center Archive. This demo shows how a vertical search might support searching across collections in a variety of contexts, including within UC’s own collections.
CDL’s Brian Tingle developed the vertical search demo, with input and support from the CDL Digital Special Collections team.