Best Practices for Image Capture

Version 1.0

Maintained by the CDL Technical Architecture and Standards Workgroup

This document is based upon the work of the Making of America II Project team. MOA II documents are available at http://sunsite.berkeley.edu/moa2.

This document outlines a set of "best practices" for libraries, archives, and museums that wish to create digital representations of parts of their collections. The recommendations here focus on the initial stages of capturing digital images and metadata, and do not cover other important issues such as systems architecture, search and retrieval, interoperability, and longevity. These recommendations are directed towards institutions that are considering the development of a large collection of digital surrogates which are expected to persist for a long period of time and easily migrate to new systems and technologies. Institutions with very small collections, or those that anticipate relatively short life-spans for their digital surrogates, may choose to follow a less rigorous set of guidelines. However, this should be done with the understanding that the resulting collection, which may become a valuable resource, may not be acceptable for long-term archival storage, and that the digitization process may need to be repeated at a later date.

These recommendations focus on reformatting existing works (such as handwritten manuscripts, typescript works on paper, bound volumes, slides, or photographs) into digital formats. Because collections differ widely in their types of material, audience, and institutional purpose, specific practices may vary from institution to institution as well as for different collections within a single institution. Therefore, the sets of recommendations we make here attempt to be broad enough to apply to most cases, and try to synthesize the differing recommendations previously made for specific target collections and audiences. For references to these previous documents, see the Bibliography at the end of this document.

Because image capture capabilities are changing so rapidly, this document covers the practices that we think are fairly universal, and will be usable for many years to come. This includes the notion of masters and derivatives, and some discussion about image quality and file formats. The Summary of General Recommendations, found near the end of this document, provides a list of these suggested best practices.

Digital Masters

Digital master files are created as the direct result of image capture. The digital master should represent as accurately as possible the visual information in the original object. The primary functions of digital master files are to serve as archival images and as a source for derivative files. Depending upon the collection, a digital master file may serve as a surrogate for the original, may completely replace originals or may be used as security against possible loss of originals due to disaster, theft and/or deterioration. Derivative files are created from digital master images for editing or enhancement, conversion of the master to different formats, and presentation and transmission over networks. Typically, one would capture the master file at a very high level of image quality, then would use image processing techniques (such as compression and resolution reduction) to create the derivative images (including thumbnails) which would be delivered to users.

Long-term preservation of digital master files requires a strategy of identification, storage, and migration to new media, as well as policies governing use of and access to the images. The specifications for derivative files used for image presentation may change over time; digital masters can serve an archival purpose, and can be processed by different presentation methods to create necessary derivative files without the expense of digitizing the original object again. Because the process of image capture is so labor intensive, the goal should be to create a master that has a useful life of at least fifty years. Therefore, collection managers should anticipate a wide variety of future uses, and capture at a quality high enough to satisfy these uses. In general, decisions about image capture should err towards the highest quality.

Some collections will need to perform image-processing on files for purposes such as removing blemishes on an image, restoring faded colors from film emulsion, or annotating an image. For these purposes we strongly recommend that a master be saved before any image processing is done, and that the enhanced image be used as a high resolution derivative to generate further derivatives. In the future, as we learn more about the side effects of image processing, and as new functions for color restoration are developed, the original master would still be available.

Capturing the Image

Appropriate scanning procedures are dictated by the nature of the material and the product one wishes to create. There is no single set of image quality parameters that should be applied to all documents that will be scanned. Decisions as to image quality typically take into consideration the research needs of users (and potential users), the types of uses that might be made of that material, as well as the artifactual nature of the material itself. The best situation is one where the source materials and project goals dictate the image quality settings and the hardware and software one employs. Excellent sources of information are available, including the experience of past and current library and archival projects (see Bibliography section entitled "Scanning and Image Capture"). The pure mechanics of scanning are discussed in Besser (Procedures and Practices for Scanning), Besser and Trant (Introduction to Imaging) and Kenney’s Cornell manual (Digital Imaging for Libraries and Archives). It is recommended that imaging projects consult these sources to determine appropriate options for image capture. Decisions of quality appropriate for any particular project should be based on best anticipation of use of the digital resource.

 

Film-based Formats

Digital masters should record the visual appearance of the original artifact in a well-defined way, so that later users can knowledgeably interpret and manipulate the data, and so that capture technicians can follow consistent procedures for capture and quality control. For most kinds of documents and flat artwork, this can be accomplished by defining the desired digital values for standard targets (primarily a grayscale) included in the imaging. However, originals in other formats such as slides and negatives present different challenges.

For direct digital capture of negative collections, the visual appearance of the negative would ordinarily be of little interest; it is the visual appearance of the positive image created from the negative that is usually central. Methods and materials for making negatives have changed repeatedly over the years, and importantly, so have the materials and practices used in making prints from them. Planners may choose to express their capture plans in terms of the properties of the negative (e.g. transmission density), or try to describe a positive master format to be captured directly from the negative. Either course has pitfalls. While the first may seem like the safer approach, fully describing a negative will probably require more data than 8 bits per channel can carry, because negatives routinely record a much wider range of tones than media such as prints. It would be advisable for the planners to fully consider each step of the image transformation, from the densities in the negative, to raw scanned data, to digital master, to derivative products, making sure that enough of the right information is available in the master to satisfy the needs of derivative production. Also, as an aside, it's useful to remember that even black and white negatives often contain color information such as stained regions; scanning such a negative in color often reveals that one color channel emphasizes the effect of the stain, while another color channel may hide it.

Planners contemplating a project to scan color slides, some of which are faded, may find they need to make choices about whether to try to correct for the fading at the time of scanning, or to store a "faded", realistic digital master and then possibly produce a "restored" derivative to satisfy a need for an unfaded product. The second method would be favored because it provides for both a realistic and a "restored" version, and offers the possibility that different, better "restored" versions can be created in the future, regardless of any ongoing changes in the condition of the slide. However, in some cases the scanner and its software may be better able to correct the faded color channels at the time of scanning by tailoring the information-gathering to the levels of each dye remaining. The purposes and priorities of the project can help make the necessary choices: if a purpose of the digital project is to give lecturers a tool for selecting lecture slides for projection, or to record the condition of the slides, then the project will require a product that faithfully represents the faded condition of the slide. Experimental trials can reveal whether the "restoration" is better if performed at the time of scanning for a particular scanner and level of fading; in some cases multiple scannings and multiple masters may be the solution.

Another question comes up when a collection of slides that depict documents or works of art is to be scanned: should the digital master strive to represent the slide, or the original the slide depicts? On one hand, if the slide itself is considered an original work, the digital master should probably represent the slide as accurately as possible. On the other hand, many digital image capture projects involve a film intermediate: the document to be captured is first photographed on film, then the film is scanned to create the master file. In this case, the film may be seen as a vehicle for recording the tonality of the original document, and the scanning of the film may be adjusted so as to correct for the color and tonal errors introduced in the filming. The inclusion of a gray card or grayscale, photographed together with the work of art, can make objective corrections possible.

 

Faded Documents

Digital capture of faded documents presents many of the same challenges as scanning faded slides: is the objective to show the appearance of the document in its present condition, to make it maximally legible, or perhaps to depict it as we imagine it appeared as new? Ordinarily the digital master would be made to depict the document as it exists, and the master would then be processed to create the legible or "reconstructed" derivatives. However, in some situations faded information may be better captured using extreme, non-realistic means such as narrow-band light filters, or invisible wavelengths such as infrared. In these cases, multiple captures and multiple digital masters may be appropriate.

 

Microfilm

Microfilm is a photographic medium designed not for natural, realistic tonal capture but for optimal legibility. A project to capture digital images from microfilm taken of manuscript originals might naturally choose to emphasize legibility over tonal accuracy in its masters and derivatives since the microfilm intermediate is already inclined that way; this would be an important consideration in choosing whether microfilm is an appropriate source for scanning for a particular purpose.

 

Image Quality

Image quality for digital capture from originals is a measure of the completeness and the accuracy of the capture of the visual information in the original. There is some subjectivity involved in determining completeness and accuracy. Sometimes the subjectivity relates to what is actually being captured (with a manuscript, are you only trying to capture the writing, or are the watermark and paper grain important as well?). At other times the subjectivity relates to how the informational content of what is captured will be used. (For example, should the digital representation of faded or stained handwriting show legibility or reflect the illegibility of the source material? Should pink slides be "restored" to their proper color? And if the digital image is made to look "better" than the original, what conflicts does that cause when a user comes in to see the original and it looks "worse" than the onscreen version? See Sidebar II for a more complete discussion of these problems.) Image quality should be judged in terms of the goals of the project, and ultimately depends on an understanding of who the users (and potential users) are, and what kinds of uses they will make of this material. In past projects, some potential uses have been inhibited because not enough quality (in terms of resolution and/or bit-depth) was captured during the initial scanning.

Image quality depends on the project's planning choices and implementation. Project designers need to consider what standard practices they will follow for input resolution and bit depth, layout and cropping, image capture metric (including color management), and the particular features of the capture device and its software. Benchmarking quality (see Kenney’s Cornell Manual) for any given type of source material can help one select appropriate image quality parameters that capture just the amount of information needed from the source material for eventual use and display. By maximizing the image quality of the digital master files, managers can ensure the on-going value of their efforts, and ease the process of derivative file production.

Quality is necessarily limited by the size of the digital image file, which places an upper limit on the amount of information that is saved. The size of a digital image file depends on the size of the original and the resolution of capture (number of pixels per inch in both height and width that are sampled from the original to create the digital image), the number of channels (typically 3: Red, Green, and Blue: "RGB"), and the bit depth (the number of data bits used to store the image data for one pixel).
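The relationship between these factors can be made concrete with a little arithmetic. The following is a minimal sketch, not a prescribed method: the function name and the sample dimensions (an 8 x 10 inch original at 600 pixels per inch) are our own illustrative assumptions, and the estimate ignores file headers and any row padding a particular format may add. Note that the function takes bits per channel, so a 24-bit RGB image is expressed as 3 channels of 8 bits each.

```python
def uncompressed_size_bytes(width_in, height_in, ppi, channels=3, bits_per_channel=8):
    """Estimate the uncompressed size, in bytes, of a scan of a flat original.

    width_in, height_in -- size of the original in inches
    ppi                 -- capture resolution in pixels per inch
    channels            -- e.g. 3 for RGB, 1 for grayscale or bitonal
    bits_per_channel    -- e.g. 8 for 24-bit RGB, 1 for bitonal
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * channels * bits_per_channel
    # Integer division: headers and format-specific row padding are ignored.
    return total_bits // 8


# Hypothetical example: an 8 x 10 inch print captured at 600 ppi.
rgb_size = uncompressed_size_bytes(8, 10, 600)
bitonal_size = uncompressed_size_bytes(8, 10, 600, channels=1, bits_per_channel=1)
print(f"24-bit RGB: {rgb_size:,} bytes")
print(f"bitonal:    {bitonal_size:,} bytes")
```

Doubling the resolution quadruples the pixel count, which is why resolution choices tend to dominate storage planning.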

Measuring the accuracy of visual information in digital form implies the existence of a capture metric (i.e., the rules that give meaning to the numerical data in the digital image file). For example, the visual meaning of the pixel data Red=246, Green=238, Blue=80 will be a shade of yellow, which can be defined in terms of visual measurements. Most capture devices capture in RGB using software based on the video standards defined in international agreements. A thorough technical introduction to these topics can be found in Poynton's Color FAQ: <http://www.inforamp.net/~poynton/ColorFAQ.html>. We strongly urge that imaging projects adopt standard target values for color metrics as Poynton discusses, so that the project image files are captured uniformly.
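As a rough illustration of how numerical pixel data acquires a visual meaning, the sketch below converts the example values above to a hue angle using Python's standard colorsys module. This is only a simple, device-dependent description and is our own illustrative choice; it is not a calibrated colorimetric measurement, and it does not replace the capture metric discussed by Poynton.

```python
import colorsys

# The example pixel from the text, as 8-bit values per channel.
red, green, blue = 246, 238, 80

# colorsys expects each channel scaled to the range 0.0-1.0.
hue, saturation, value = colorsys.rgb_to_hsv(red / 255, green / 255, blue / 255)
hue_degrees = hue * 360

# On the conventional hue circle, 0 is red, 60 is yellow, 120 is green.
print(f"hue = {hue_degrees:.1f} degrees, saturation = {saturation:.2f}, value = {value:.2f}")
```

The hue falls close to the 60-degree position conventionally labeled yellow, which agrees with the description in the text. What shade of yellow a viewer actually sees still depends on the rules used at capture and display.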

A reasonably well-calibrated grayscale target should be used for measuring and adjusting the capture metric of a scanner or digital camera. For capturing reflective copy, we recommend that a standard target consisting of grayscale, centimeter scale (useful for users to make sure that they are printing or displaying an image at the right size), and standard color patches be included along one edge of every image captured, to provide an internal reference within the image for linear scale and capture metric information. Kodak makes a set consisting of grayscale (with approximate densities), color patches, and linear scale which is available in two sizes: 8 inches long (Q-13, CAT 152 7654) and 14 inches long (Q-14, CAT 152 7662).

Bit depth is an indication of an image's tonal qualities. Bit depth is the number of bits of color data which are stored for each pixel; the greater the bit depth, the greater the number of gray scale or color tones that can be represented and the larger the file size. The most common bit depths are:

 

While it is desirable to be able to capture images at bit depths greater than 24 (which only allows 256 levels for each color channel), standard formats for storing and exchanging higher bit-depth files have not yet evolved, so we expect that (at least for the next few years) the majority of digital master files will be 24-bit. Project planners considering bitonal capture should run some samples from their original materials to verify that the information captured is satisfactory; frequently grayscale capture is desirable even for bitonal originals. 8-bit color is seldom suitable for digital masters.
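The number of tones a given bit depth can represent is simply two raised to the number of bits. The short sketch below works this out for the bit depths mentioned in this section; the labels and the function name are our own, chosen for illustration.

```python
def tone_count(bits):
    """Number of distinct values that can be stored in the given number of bits."""
    return 2 ** bits


# Bit depths discussed in this section (labels are illustrative).
examples = [
    ("bitonal", 1),
    ("8-bit grayscale or 8-bit color", 8),
    ("24-bit RGB color", 24),
]

for label, bits in examples:
    print(f"{label}: {bits} bits per pixel -> {tone_count(bits):,} possible values")

# A 24-bit RGB pixel divides its bits evenly among three channels.
levels_per_channel = tone_count(24 // 3)
print(f"24-bit RGB: {levels_per_channel} levels for each color channel")
```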

Useful image quality guidelines for different types of source materials are listed in Puglia & Roginski’s NARA Guidelines and in Kenney’s Cornell Manual (see Bibliography).

Formats

Digital masters should capture information using color rather than grayscale approaches when color is integral to the information conveyed by the object. Digital masters should never use lossy compression schemes and should be stored in internationally recognized formats. TIFF is a widely used format, but there are many variations of the TIFF format, and consistency in use of the files by a variety of applications (viewers, printers etc.) is a necessary consideration. In the future, we hope that international standardization efforts (such as ISO attempts to define TIFF/IT and SPIFF) will lead vendors to support standards-compliant forms of image storage formats. Proprietary file formats (such as Kodak’s Photo CD) and proprietary compression schemes (such as LZW) should be avoided for any long-term project. Most projects currently use uncompressed TIFF 6 images.

While it is our recommendation that no file compression be used at all for digital master files, we recognize that there may be legitimate reasons for considering it. Limited storage resources may force the issue by requiring the reduced file sizes that file compression affords. Those who choose to go this route should be careful to take into consideration digital longevity issues. As a general rule, lossless compression schemes should be preferred over lossy compression schemes. Lossless compression makes files smaller, and when they are decompressed they are exactly the same as before they were compressed. Lossy compression actually combines and throws away data (usually data that cannot be readily detected by the human eye), so decompressed lossy images are different from the original image, even though those differences may be difficult for our eyes to see. Typically, lossy compression yields far greater compression ratios than lossless. But unlike lossy compression, lossless compression will not eliminate data we may later find useful. Lossy compression is unwise, as we do not yet know how today’s lossy compression schemes (optimized for human eyes viewing a CRT screen) may affect future uses of digital images (such as computer-based analysis systems or display on future display devices). But even lossless compression adds a level of complexity to decoding the file many years hence. And many vendor products that claim to be lossless (primarily those that claim "lossless JPEG") are actually lossy.
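The difference between the two kinds of compression can be demonstrated in a few lines. The sketch below is illustrative only and rests on several assumptions of our own: the "image" is synthetic data rather than a real scan, zlib (DEFLATE) stands in for lossless compression generally, and discarding the low four bits of each sample is a crude stand-in for a lossy step. It is not JPEG or any scheme named in this document.

```python
import zlib

# Synthetic stand-in for raw 8-bit grayscale pixel data (64 x 64 samples).
pixels = bytes((x * 7 + y * 3) % 256 for y in range(64) for x in range(64))

# Lossless: compress, decompress, and confirm every byte is recovered.
packed = zlib.compress(pixels, 9)
restored = zlib.decompress(packed)
assert restored == pixels

# Crude lossy stand-in: throw away the low 4 bits of every sample, then compress.
quantized = bytes(sample & 0xF0 for sample in pixels)
lossy_packed = zlib.compress(quantized, 9)

# Count the samples that can no longer be recovered exactly.
changed = sum(1 for before, after in zip(pixels, quantized) if before != after)

print(f"original:          {len(pixels)} bytes")
print(f"lossless packed:   {len(packed)} bytes, restored exactly: {restored == pixels}")
print(f"lossy-step packed: {len(lossy_packed)} bytes, samples altered: {changed}")
```

Once the low bits are discarded, no decompression step can bring them back, which is the core of the argument against lossy compression for masters.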

 

Image Metadata

Metadata or data describing digital images must be associated with each image created, and most of this should be noted at the point of image capture. Image metadata is needed to record information about the scanning process itself, about the storage files that are created, and about the various pieces that might compose a single object.

The number of metadata fields may at first seem daunting. However, high proportions of these fields are likely to be the same for all the images scanned during a particular scanning session. For example, metadata about the scanning device, light source, date, etc. is likely to be the same for an entire session. And some metadata, about the different parts of a single object (such as the scan of each page of a book), will be the same for that entire object. This kind of repeating metadata will not require keyboarding each individual metadata field for each digital image; instead, these can be handled either through inheritance or by batch-loading of various metadata fields.
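Inheritance of repeating metadata can be sketched as layered defaults: session-level values are recorded once, object-level values once per object, and only the fields unique to an image are keyed for that image. The field names and values below are hypothetical examples of our own, not a recommended metadata schema.

```python
def build_image_record(session_defaults, object_defaults, image_fields):
    """Combine inherited and image-specific metadata into one record.

    More specific levels override more general ones:
    session < object < individual image.
    """
    record = dict(session_defaults)
    record.update(object_defaults)
    record.update(image_fields)
    return record


# Hypothetical values, entered once for the whole scanning session.
session = {"scanner": "example flatbed scanner", "light_source": "example lamp", "scan_date": "example date"}

# Hypothetical values, entered once for the object (for example, a book).
book = {"object_id": "book-001", "title": "Example bound volume"}

# Only these fields vary from image to image.
pages = [
    {"file": "book-001_0001.tif", "sequence": 1},
    {"file": "book-001_0002.tif", "sequence": 2},
]

records = [build_image_record(session, book, page) for page in pages]
for record in records:
    print(record)
```

The same layering works for batch-loading: a spreadsheet of image-specific fields can be merged with session and object defaults in one pass.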

Administrative metadata includes a set of fields noting the creation of a digital master image, identifying the digital image and what is needed to view or use it, linking its parts or instantiations to one another, and ownership and reproduction information. Structural metadata includes fields that help one reassemble the parts of an object and navigate through it.
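As a minimal sketch of what structural metadata makes possible, the example below uses a sequence number on each page image to put the parts of an object back in reading order and to find the previous and next page for navigation. The field names, file names, and labels are hypothetical and are not taken from any particular metadata standard.

```python
from operator import itemgetter

# Hypothetical structural metadata for three page images, stored out of order.
page_files = [
    {"file": "img_0003.tif", "sequence": 3, "label": "page 1"},
    {"file": "img_0001.tif", "sequence": 1, "label": "front cover"},
    {"file": "img_0002.tif", "sequence": 2, "label": "title page"},
]


def reading_order(parts):
    """Return the parts of an object sorted by their sequence number."""
    return sorted(parts, key=itemgetter("sequence"))


def neighbors(parts, file_name):
    """Return (previous_file, next_file) for navigation; None at either end."""
    names = [part["file"] for part in reading_order(parts)]
    index = names.index(file_name)
    previous_file = names[index - 1] if index > 0 else None
    next_file = names[index + 1] if index < len(names) - 1 else None
    return previous_file, next_file


for part in reading_order(page_files):
    print(part["sequence"], part["label"], part["file"])
print(neighbors(page_files, "img_0002.tif"))
```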

 

Derivative Images

Since the purpose of the digital master file is to capture as much information as is practical, derivative versions will almost always be needed for delivering to the end user via computer networks. In addition to speeding up the transfer process, another purpose may be to digitally "enhance" the image in one form or another (see discussion below of artifact v. content) to achieve a particular goal. Such enhancements should be performed on a submaster rather than on the digital master file (which should reflect what the particular digitization process has captured). Derivative versions are typically not files that will be preserved, as the digital master file is for that purpose.

Derivative images for web-based delivery might be pre-computed in batch mode from masters early on in a project, or could be generated on demand from web-resident submasters on-the-fly as part of a progressive decompression function or through an application such as MrSID (Multi-resolution Seamless Image Database).

Sizes

Typical derivative versions include a small preview or "thumbnail" version (usually no more than 200 pixels for the longest dimension) and a larger version that mostly fills the screen of a computer monitor (640 pixels by 480 pixels fills a monitor set at a standard PC VGA resolution). Depending on the need for users to detect detail in an image, a higher resolution version may be required as well. The full set and sizes of derivative images required will depend upon a variety of factors, including the nature of the material, the likely uses of that material, and delivery system requirements (such as user interface). Derivative files should be created using software to reduce the resolution of the master image, not by adjusting the physical dimensions (width and height). After reducing the resolution, it may be necessary to sharpen the derivative image to produce an acceptable viewing image (e.g., by using "unsharp mask" in Adobe Photoshop). It is perfectly acceptable to use image processing on derivative images, but this should never be done to masters.
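The pixel dimensions of a derivative follow from the master's dimensions and the size limit for that derivative. The sketch below only computes target dimensions while preserving the aspect ratio; the actual resampling would be done by image processing software. The function name and the sample master size (4800 x 6000 pixels) are our own assumptions.

```python
def fit_within(width, height, max_width, max_height):
    """Return (width, height) scaled down to fit a bounding box.

    The aspect ratio is preserved and the image is never enlarged.
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical master: 4800 x 6000 pixels.
master_width, master_height = 4800, 6000

# Thumbnail: no more than 200 pixels on the longest dimension.
thumbnail = fit_within(master_width, master_height, 200, 200)

# Screen-sized version: fits a 640 x 480 display.
screen = fit_within(master_width, master_height, 640, 480)

print("thumbnail:", thumbnail)
print("screen:   ", screen)
```

For this portrait-oriented master, height is the limiting dimension in both cases, so the screen-sized derivative is narrower than 640 pixels.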

Derivative images will frequently be compressed using lossy compression. Compression algorithms are usually optimized for a particular type of image (e.g. JPEG achieves high compression ratios for pictorial images, but cannot compress images of text very much without introducing compression artifacts), and one should be careful not to use the wrong type of compression scheme for a particular image. Compression algorithms such as JPEG involve a spectrum of options (ranging from high compression ratios that involve visible loss to low compression ratios that involve little visible loss). Each software implementation labels these options differently (on scales of 1-3, scales of 1-10, options high/medium/low, etc.), and there is currently no objective and interoperable way to declare which of a range of options one has chosen.

Artifactual v. Enhanced

Many historical images are faded, yellowed, or otherwise decayed or distorted. Image enhancement techniques can in some cases result in a much improved image for viewing the content of the image rather than the decayed condition of the artifactual print or transparency. In situations where such an enhanced viewing version is desired, it should in most cases be offered in addition to a version that more closely depicts the condition of the artifact. By having both images available, users will understand the condition of the original while having a more useable version for online viewing.

Production of artifactual derivatives can often be automated by using software that can perform a series of standard operations. Production of enhanced versions, however, usually cannot be automated, because no single standard transformation procedure applies equally well to all images in a particular project. If automated procedures for image enhancement are not effective, the costs of creating these images individually will need to be considered in the overall project cost.

Color Management

The objective of color management is to control the capture and reproduction of color in such a way that an original print can be scanned, displayed on a computer monitor, and printed, with the least possible change of appearance from the original to the monitor to the printed version. This objective is made difficult by the limits of color reproduction: input devices such as scanners cannot "see" all the colors of human vision, and output devices such as computer monitors and printers have even more limited ranges of colors they can reproduce. Most commercial color management systems are based on the ICC (International Color Consortium) data interchange standard, and are often integrated with image processing software used in the publishing industry. They work by systematically measuring the color properties of digital input devices and of digital output devices, and then applying compensating corrections to the digital file to optimize the output appearance. Although color management systems are widely used in the publishing industry, there is no consensus yet on standards for how (or whether) color management techniques should be applied to digital master files. Though projects may experiment with color management systems for derivative files, until a clear standard emerges it is not recommended that digital master files be routinely processed by color management software.

Strategies for Digital Preservation

How to keep digital files alive and accessible over time is a key issue facing our field, and as yet little consensus has emerged (Waters et al.). The two general approaches are Emulation and Migration. Emulation assumes that we will keep files in their current encoding format, and that software will eventually be written for future computers and operating systems that will emulate current viewing and manipulation environments. Migration requires that files be periodically copied from one encoding format to a newer format (such as from WordStar to Microsoft Word 3 to Word 6 to Word 8, ...). Both approaches assume that files will be periodically transferred from one physical storage medium to another (to avoid deterioration of the tape, disk, or other medium). Appropriate metadata is critical to both approaches: it is needed to know what file formats each file requires, and to make sure that all related files are emulated or migrated together. Metadata will also be crucial for future environments which may include features such as automatic color control, and measurement tools to compare image sizes.
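To illustrate why format and relationship metadata matter for migration, the sketch below takes a list of file records and finds every object that has at least one file in a format being retired, returning all of that object's files so that related files are handled together. The record layout, identifiers, and format labels are hypothetical assumptions made for this example.

```python
def objects_needing_migration(records, outgoing_format):
    """Map object_id -> all of its files, for objects using outgoing_format.

    Every file of an affected object is returned, not only the files in the
    outgoing format, so that related files can be reviewed as a group.
    """
    by_object = {}
    for record in records:
        by_object.setdefault(record["object_id"], []).append(record)

    affected = {}
    for object_id, object_records in by_object.items():
        if any(r["format"] == outgoing_format for r in object_records):
            affected[object_id] = [r["file"] for r in object_records]
    return affected


# Hypothetical administrative metadata records.
records = [
    {"object_id": "ms-12", "file": "ms-12_master.tif", "format": "TIFF 6"},
    {"object_id": "ms-12", "file": "ms-12_notes.doc", "format": "old word processor"},
    {"object_id": "ph-07", "file": "ph-07_master.tif", "format": "TIFF 6"},
]

print(objects_needing_migration(records, "old word processor"))
```

Without the format and object identifier recorded for each file, neither the selection nor the grouping would be possible.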

 

Summary of General Recommendations

 

 

Bibliography

Metadata

Making of America II White Paper, Part III, Structural and Administrative Metadata, http://sunsite.berkeley.edu/MOA2

Metadata Working Group Report to Senior [Library] Management, Cornell University Library, July 1996, http://www.library.cornell.edu/DLWG/MWGReporA.htm and the related work "Distillation of [Cornell UL] Working Group Recommendations," November 1996, http://www.library.cornell.edu/DLWG/DLMtg.html

"Information Warehousing: A Strategic Approach to Data Warehouse Development" by Alan Perkins, Managing Principal of Visible Systems Corporation (White Paper Series) http://www.esti.com/iw.htm

SGML as Metadata: Theory and Practice in the Digital Library. Session organized by Richard Gartner (Bodleian Library, Oxford) http://users.ox.ac.uk/~drh97/Papers/Gartner.html

"A Framework for Extensible Metadata Registries" by Matthew Morgenstern of Xerox, a visiting fellow of the Design Research Institute at Cornell http://dri.cornell.edu/Public/morgenstern/registry.htm

Using the Library of Congress Repository model, developed and used in the National Digital Library Program: The Structural Metadata Dictionary for LC Repository Digital Objects http://lcweb.loc.gov:8081/ndlint/repository/structmeta.html which then leads to further documentation of their Data Attributes http://lcweb.loc.gov:8081/ndlint/repository/attribs.html with a list of the attributes http://lcweb.loc.gov:8081/ndlint/repository/attlist.html and their definitions http://lcweb.loc.gov:8081/ndlint/repository/attdefs.html The same site then gives examples of using this model for a photo collection http://lcweb.loc.gov:8081/ndlint/repository/photo-samp.html a collection of scanned page images http://lcweb.loc.gov:8081/ndlint/repository/timag-samp.html and a collection of scanned page images and SGML encoded, machine-readable text http://lcweb.loc.gov:8081/ndlint/repository/sgml-samp.html

 

Organization of Information for Digital Objects

The article "An Architecture for Information in Digital Libraries" by William Arms, Christophe Blanchi and Edward Overly of the Corporation for National Research Initiatives, published in D-Lib Magazine, February 1997 issue. http://www.dlib.org/dlib/february97/cnri/02arms1.html

Howard Besser. Digital Longevity (website) http://sunsite.Berkeley.edu/Longevity

Repository Access Protocol – Design Draft – Version 0.0 by Christophe Blanchi of CNRI is found at http://titan.cnri.reston.va.us:8080/pilot/locdesign.html and begins "This document describes the repository prototype for the Library of Congress. This design is based on version 1.2 of the Repository Access Protocol (RAP) and the Structural Metadata Version 1.1 from the Library of Congress."

"The Warwick Framework: A Container Architecture for Aggregating Sets of Metadata" by Carl Lagoze, Digital Library Research Group, Computer Science Department, Cornell University; Clifford A. Lynch, Office of the President, University of California, and Ron Daniel Jr., Advanced Computing Lab, Los Alamos National Laboratory (July, 1996) http://cs-tr.cs.cornell.edu/Dienst/Repository/2.0/Body/ncstrl.cornell/TR96-1593/html

Waters, Don, et al. Preserving Digital Information: Report of the Task Force on the Archiving of Digital Information, Washington DC: Commission on Preservation and Access and Research Libraries Group, 1996 http://www.rlg.org/ArchTF/

 

Scanning And Image Capture

 

Adobe Systems Incorporated, TIFF Revision 6: ftp://ftp.adobe.com/pub/adobe/devrelations/devtechnotes/pdffiles/tiff6.pdf

This textbook-length standard defines the detailed structure of TIFF (Tagged Image File Format) graphics files for all standard TIFF formats (except TIFF/IT, see next entry). If you want to decipher the header information in a TIFF file, this provides the technical documentation. It is also interesting to see the variety of options (many rarely implemented) available to programmers within the standard, including features like JPEG compression, and metadata tags such as Copyright notice.

Howard Besser and Jennifer Trant. Introduction to Imaging. Getty Art History Information Project. http://www.gii.getty.edu/intro_imaging/

Still the best overview of electronic imaging available for the beginner and should be considered recommended reading for any level. Starts with a basic description of what a digital image is, and continues with a discussion of the basic elements that need to be considered before, during, and after the capture process. Includes a detailed discussion of the image capture process, compression schemes, uses, as well as access to and documentation of the final product. Covers selection of scanning equipment, image-database software, network delivery, and security issues. Includes a top-notch glossary and links to many useful resources on the WWW. Highly recommended. -JE

Howard Besser. Procedures & Practices for Scanning, Procedures and Processes for Scanning. Canadian Heritage Information Network (CHIN), http://sunsite.Berkeley.edu/Imaging/Databases/Scanning

Posed as a series of questions to ask about the task at hand, this document outlines, in a very straightforward way, all the basic decisions that must be made before, during, and after the scanning process. Includes both abbreviated and longer explanations of each point, making it useful to a wide audience with varying levels of technical sophistication.

Electronic Text Center at Alderman Library, University of Virginia. "Image Scanning: A Basic Helpsheet" http://etext.lib.virginia.edu/helpsheets/scanimage.html

A very straightforward, basic, how-to document that outlines the image scanning process at the Electronic Text Center at Alderman Library, University of Virginia. Includes a good, concise discussion of image types, resolution, and image file formats, as well as a brief discussion about "Archival Imaging" and associated metadata. Also includes more specific recommendations for using Adobe Photoshop and DeskScan software with an HP Scanjet flatbed scanner.

Electronic Text Center at Alderman Library, University of Virginia. "Text Scanning: A Basic Helpsheet" http://etext.lib.virginia.edu/helpsheets/scantext.html

A very concise description of the optical character recognition process that converts scanned images into text at the Electronic Text Center at Alderman Library, University of Virginia. Outlines the process in a step-by-step fashion, assuming the use of the Etext Center equipment, which consists of a Pentium PC, an HP Scanjet flatbed scanner, and OmniPage Pro version 8 OCR software.

Michael Ester. Digital Image Collections: Issues and Practice. Washington, D.C. , Commission on Preservation and Access (December, 1996).

This pithy report is packed with useful insights about how and why digital image collections are created. Ester is especially effective at pointing out the hidden complexities of image capture and project planning, without ever getting too technical for a general audience.

Carl Fleischhauer. Digital Historical Collections: Types, Elements, and Construction. National Digital Library Program, Library of Congress, http://lcweb2.loc.gov/ammem/elements.html.

One of three articles that cover the Library of Congress digital conversion activity as of August 1996. Discussion covers the types of collections converted, the access aids established for online browsers, types of digital reproductions made available (images, searchable text, sound files, etc.), and supplementary programs known as "Special Presentations" on the Library website. Describes the developmental approach to assigning names to digital elements, and how those elements are identified as items and aggregates. Brief description of how digital reproductions are being used in preservation efforts.

Carl Fleischhauer. Digital Formats for Content Reproductions. National Digital Library Program, Library of Congress. http://lcweb2.loc.gov/ammem/formats.html

A clear explanation of capturing digital representations of different types of materials.

One of three articles that cover the Library of Congress digital conversion activity as of August 1996. Discussion documents the Library's selection of digital formats for pictorial materials, textual materials, maps, sound recordings, moving-image materials, and computer file headers. Includes specific target values for tonal depth, file format, compression, and spatial resolution for the image and text categories. Good discussion of various mitigation measures that might be employed to reduce or eliminate the moiré patterns that result from scanning printed halftone illustrations, and of the MrSID approach to map files. Sound recording and moving-image files are documented with the caveat that these will likely change in the near future as the technology evolves.

Image Quality Working Group of ArchivesCom, a joint Libraries/AcIS Committee. Technical Recommendation for Digital Imaging Projects, http://www.columbia.edu/acis/dl/imagespec.html

Contains a good Quick Guide, in table format, that recommends a specific conversion method, capture resolution, archive file format, screen format, and presentation format for 6 media types: 1) black and white text documents, 2) illustrations, maps & manuscripts, 3) 3-D objects, 4) 35mm slides & negatives, 5) medium to large format photos, negatives and transparencies or color microfiche, and 6) black and white microfilm. Clear, brief explanations for when to use film intermediaries, and more specific recommendations for bit-depth, capture resolution, and file formats for archival storage and presentation. Includes estimated file sizes for different file formats to facilitate the calculation of physical storage requirements.
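To show the kind of arithmetic behind such file size estimates, here is a minimal sketch that computes the uncompressed size of a scanned image from its physical dimensions, capture resolution, and bit depth. The function name and the sample values are our own assumptions for illustration and are not taken from the Columbia recommendation; real files will differ because of headers, compression, and format overhead.

```python
def uncompressed_image_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Estimate raw image data size in bytes (hypothetical helper, no header or compression)."""
    # Convert physical dimensions in inches to pixel dimensions at the capture resolution.
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    # Total bits of pixel data, rounded up to whole bytes.
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


# An 8 x 10 inch photograph captured at 300 ppi in 24-bit color.
photo = uncompressed_image_bytes(8, 10, 300, 24)

# An 8.5 x 11 inch text page captured at 600 ppi as a 1-bit (bitonal) image.
page = uncompressed_image_bytes(8.5, 11, 600, 1)

print(photo, page)
```

Multiplying a per-image estimate like this by the number of items in a collection gives a rough upper bound on the storage required for uncompressed master files.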

International Color Consortium: http://color.org/

The ICC is a consortium of major companies involved with digital imaging, formed to create industry-wide standards for digital color management. Systems such as Apple's ColorSync, Agfa's FotoTune, and Microsoft's Image Color Management (II), which follow the standards, are said to be "ICC compliant" and may be able to exchange information. The ICC web site offers a downloadable version of the standards along with other technical papers.

International Organization for Standardization, Technical Committee 130 (n.d.). ISO/FDIS 12639: Graphic technology – Prepress digital data exchange – Tag image file format for image technology (TIFF/IT). Geneva: International Organization for Standardization (ISO).

This standard defines an extension of the TIFF 6.0 standard specifically intended to apply to graphics files used by prepress applications in the printing industry. In addition to standardizing the formats used for electronic dissemination of page layouts, etc., it calls for backward-compatibility with common TIFF 6-compatible applications such as Adobe Photoshop, to minimise proliferation of TIFF formats that can’t be opened by many applications.

Anne R. Kenney. Digital Imaging for Libraries and Archives. Cornell University Library, June 1996.

A very thorough resource for digital imaging projects. The section on hardware is a bit dated, but most of the publication still holds up. The explanations of digital imaging technology are clear, and the advice is valuable for anyone tackling a digitization project for the first time. Text is VERY dense but is a good in-depth technical overview of the issues involved in the scanning process. The formulas are useful (if complex) and are better suited to text scanning than to image scanning. Emphasis on benchmarking. Not for the faint of heart.

Picture Elements, Inc. Guidelines for Electronic Preservation of Visual Materials (revision 1.1, 2 March 1995). Report submitted to the Library of Congress, Preservation Directorate.

Poynton's Color FAQ: http://www.inforamp.net/~poynton/ColorFAQ.html

This is one of several papers at this site by Charles Poynton introducing the technical issues of color and tonal reproduction in the digital realm for general audiences. Some of the explanations are accessible to everyone, such as how to adjust the brightness and contrast on a PC monitor; others include some math, such as how to transform from one color space to another. Many useful references are cited. Included are discussions of human visual response, primary colors, and the television-based standards underlying digital imaging.

Steven Puglia and Barry Roginski. NARA Guidelines for Digitizing Archival Materials for Electronic Access, College Park: National Archives and Records Administration, January 1998. http://www.nara.gov/nara/vision/eap/digguide.pdf

An exhaustive and specific set of guidelines for digitizing textual, photographic, maps/plans, and graphic materials. Includes specific guidelines about resolution and image size for master files, access files, and thumbnail files. Also includes specific guidelines for scanner/monitor calibration, and file header information tags. Handy quick reference chart at http://www.nara.gov/nara/vision/eap/digmatrx.pdf.

Reilly, James M. and Franziska S. Frey, "Recommendations for the Evaluation of Digital Images Produced from Photographic, Microphotographic, and Various Paper Formats" Report to the Library of Congress, National Digital Library Project by Image Permanence Institute. May, 1996 http://lcweb2.loc.gov/ammem/ipirpt.html

This report is intended to guide the Library of Congress in setting up systematic ways of specifying and judging digital capture quality for LC's digital projects. It includes some interesting discussion about digital resolution, but discussion of tonality and color is brief.