By Lisa Schiff, eScholarship Publishing Program Technical Lead
The California Digital Library (CDL) is pleased to announce the availability of an extensive self-guided tutorial for its eXtensible Text Framework (XTF) application. XTF is an open source, highly customizable piece of software supporting the search, browse, and display of heterogeneous digital content and offering efficient and practical methods for creating customized end-user interfaces for distinct digital collections. The tutorial provides guidance for implementing and customizing XTF, from core functionality to overall look and feel. Downloads for the Mac and Windows operating systems are available from the XTF Project page on SourceForge, along with the complete distribution and documentation.
The tutorial comes with a complete XTF package that is ready to run when uncompressed; no other installation is required. It contains nine modules spanning the most powerful and popular features, including how to:
- Add new content
- Change metadata
- Change logo and colors
- Increase significance of titles in ranking hits
- Customize and enable default status of advanced search
- Change fields displayed in search results
- Enable structural searching
- Create a hierarchical facet
- Change footnote behavior
XTF Background and Overview
Since first developing and deploying this indexing and display technology in 2005, the CDL has worked to build and maintain XTF as a highly customizable application built upon tested components already in use by the digital library and search communities – in particular the Lucene text search engine, Java, XML, and XSLT. By coordinating these pieces in a single platform that can be used to create multiple unique applications, the CDL has succeeded in dramatically reducing the investment in infrastructure, staff training, and development for new digital content projects.
XTF offers the following core features out of the box:
- Easy to deploy: Drops directly into a Java application server such as Tomcat or Resin; has been tested on Solaris, Mac, Linux, and Windows operating systems
- Easy to configure: Can create indexes on any XML element or attribute; entire presentation layer is customizable via XSLT
- Robust: Optimized to perform well on large documents (e.g., documents exceeding 10 MB of encoded text); scales to perform well on collections of millions of documents; provides full Unicode support
- Works well with a variety of authentication systems (e.g., IP address lists, LDAP, Shibboleth)
- Provides an interface for external data lookups to support thesaurus-based term expansion, recommender systems, etc.
- Can power other digital library services (e.g., XTF contains an OAI-PMH data provider that allows others to harvest metadata, and an SRU interface that exposes searches to federated search engines)
- Can be deployed as separate, modular pieces of a third-party system
- Powerful for the end user:
  - Spell checking of queries
  - Faceted displays for browsing
  - Dynamically updated browse lists
  - Session-based bookbags
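As a rough illustration of what the OAI-PMH data provider mentioned above makes possible, the following minimal sketch builds two harvesting request URLs. The verbs (Identify, ListRecords) and the oai_dc metadata prefix come from the OAI-PMH protocol itself; the base URL and the helper function name are hypothetical placeholders, not part of XTF, and the actual endpoint depends on how a given XTF instance is deployed.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a base URL, a verb, and optional arguments."""
    query = {"verb": verb}
    query.update(params)
    return f"{base_url}?{urlencode(query)}"


# Hypothetical endpoint; substitute the OAI-PMH base URL of a real deployment.
BASE_URL = "http://example.org/xtf/oai"

# Ask the repository to describe itself.
identify_url = oai_request_url(BASE_URL, "Identify")

# Ask for all records in unqualified Dublin Core.
list_records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_records_url)
```

A harvester would fetch these URLs over HTTP and parse the XML responses; that step is omitted here because it depends on the repository being harvested.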
A sampling of XTF-based applications includes:
- Mark Twain Project Online (http://www.marktwainproject.org), developed by the Mark Twain Papers Project, the CDL and the University of California Press.
- Calisphere (http://www.calisphere.org/), developed by the CDL.
- The Encyclopedia of Chicago (http://www.encyclopedia.chicagohistory.org/)
- The Chymistry of Isaac Newton (http://webapp1.dlib.indiana.edu/newton/) and The Swinburne Project (http://webapp1.dlib.indiana.edu/swinburne/www/swinburne/)
- Finding Aids at the New York Public Library (http://labs.nypl.org/2007/10/30/extensible-text-framework-xtf/)
- EECS Technical Reports (http://sunsite2.berkeley.edu:8088/xtf/servlet/org.cdlib.xtf.crossQuery.CrossQuery?rmode=btr)
For more information, visit http://xtf.cdlib.org/ .