December 22, 2014

Display Technology

Keyword in Context Pictures (KWIC Pics)

KWIC Pics is a display technology that provides visual cues to aid in information searches. After a user of the eScholarship site performs a search, a results page is displayed. By hovering the cursor over a matching keyword on the results page, the user can view an excerpted image of the actual PDF article page containing that keyword, which gives an early glimpse of the document.

From a technical perspective, hovering the cursor over a highlighted keyword activates a series of events that are largely invisible to the user:

  1. JavaScript code running in the user’s browser sends an AJAX request to the eScholarship server, specifying the PDF file, the page number, and the matching keywords to show.
  2. The server then renders just that part of the PDF file as an image. To do so, it must interpret the PDF file, find the right page, and turn it into image data (or “pixels”). PDF files are highly variable and eccentric, so eScholarship relies on a robust and well-tested PDF library called Poppler, which is also used within many PDF image display tools for Linux.
  3. The eScholarship server then modifies blocks of pixels to add yellow background and red foreground highlighting to the matching keywords, using text coordinates stored in the XTF full-text index.
  4. Next, the pixel data is compressed for quick transfer. If only a small number of colors or only shades of gray are present, PNG compression is chosen; for full-color images, JPEG compression is used instead. This choice logic ensures accurate color reproduction while reducing bandwidth (and thus transfer time) as much as possible.
  5. Finally, the compressed image is sent back to the user’s browser, where the JavaScript code displays it over the search results page.
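
Steps 1, 3, and 4 above can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not eScholarship's actual code: the `/kwicpic` endpoint, the parameter names, the RGBA buffer layout, the brightness rule used to tell text from background, and the 256-color threshold are all hypothetical.

```javascript
// Hypothetical sketch; endpoint, names, and thresholds are assumptions.

// Step 1: build the request URL for one PDF page and its matching keywords.
function buildRequestUrl(pdfId, pageNumber, keywords) {
  const params = new URLSearchParams({
    pdf: pdfId,
    page: String(pageNumber),
    terms: keywords.join(" "),
  });
  return "/kwicpic?" + params.toString();
}

// Step 3: highlight a rectangle in an RGBA pixel buffer (4 bytes per pixel).
// Dark pixels (assumed to be text) become red; all others become yellow.
function highlightRect(pixels, imageWidth, rect) {
  for (let y = rect.y; y < rect.y + rect.height; y++) {
    for (let x = rect.x; x < rect.x + rect.width; x++) {
      const i = (y * imageWidth + x) * 4;
      const brightness = (pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3;
      if (brightness < 128) {
        pixels[i] = 255; pixels[i + 1] = 0; pixels[i + 2] = 0;   // red foreground
      } else {
        pixels[i] = 255; pixels[i + 1] = 255; pixels[i + 2] = 0; // yellow background
      }
    }
  }
}

// Step 4: choose PNG for grayscale or few-color images, JPEG otherwise.
function chooseFormat(pixels, maxPaletteColors = 256) {
  const colors = new Set();
  let grayOnly = true;
  for (let i = 0; i < pixels.length; i += 4) {
    const r = pixels[i], g = pixels[i + 1], b = pixels[i + 2];
    if (r !== g || g !== b) grayOnly = false;
    colors.add((r << 16) | (g << 8) | b); // pack RGB into one integer key
  }
  return grayOnly || colors.size <= maxPaletteColors ? "png" : "jpeg";
}
```

In this sketch the keyword rectangle passed to `highlightRect` stands in for the text coordinates that the real system reads from the XTF full-text index. Counting distinct colors is one simple way to make the PNG-versus-JPEG decision; the source does not say how the server actually detects a "small number of colors".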

Rendering PDFs as Images

A very similar process occurs in eScholarship when displaying the full view of an article. The page is initially filled with blank images. JavaScript code monitors the scrollbar to find out which portion of the page is visible in the browser window. It then issues AJAX requests to render just those pages and displays them when they arrive from the server. The rendering and compression process on the server is exactly the same as above (steps 2-5).
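
The scroll-driven loading described above can be sketched as two small functions. This is an illustrative outline only: the function names, the assumption that pages are stacked vertically with known pixel heights, and the use of a `Set` to avoid duplicate requests are hypothetical and not taken from the eScholarship code.

```javascript
// Hypothetical sketch: given page heights (in pixels) stacked vertically,
// return the zero-based indexes of pages that overlap the visible window.
function visiblePages(scrollTop, viewportHeight, pageHeights) {
  const visible = [];
  const viewBottom = scrollTop + viewportHeight;
  let pageTop = 0;
  pageHeights.forEach((height, index) => {
    const pageBottom = pageTop + height;
    if (pageBottom > scrollTop && pageTop < viewBottom) visible.push(index);
    pageTop = pageBottom;
  });
  return visible;
}

// Return only the visible pages that have not been requested yet,
// and record them so each page is fetched from the server once.
function pagesToRequest(visible, requested) {
  const fresh = visible.filter((index) => !requested.has(index));
  fresh.forEach((index) => requested.add(index));
  return fresh;
}
```

In a browser, a scroll handler could call `visiblePages` with the current scroll offset and window height, pass the result through `pagesToRequest`, and issue one AJAX request per returned page, replacing the blank placeholder image when each response arrives.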

Last updated: August 26, 2013
Document owner: Justin Gonder