Organizing your data

File Formats for Long-Term Access

The file format in which you keep your data is a primary factor in whether you and others will be able to use the data in the future.

As technology continually changes, researchers should plan for both hardware and software obsolescence. How will your data be read if the software used to produce them becomes unavailable?

Formats more likely to be accessible in the future are:

  • Non-proprietary
  • Based on an open, documented standard
  • In common use by the research community
  • Encoded with a standard character encoding (ASCII, UTF-8)
  • Unencrypted
  • Uncompressed

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.
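As an illustration of migrating while keeping the original, the following is a minimal Python sketch that writes a UTF-8 copy of a text file next to the source file. It assumes you already know the encoding the file was written in. The function name copy_as_utf8 and the ".utf8" naming scheme are hypothetical choices made for this example and are not part of these guidelines.

```python
from pathlib import Path


def copy_as_utf8(src: Path, source_encoding: str) -> Path:
    """Write a UTF-8 copy of a text file next to the original.

    The original file is left untouched. The ".utf8" marker in the new
    name is an assumption of this sketch, not a recommended convention.
    """
    # Decode using the encoding the file was originally written in.
    text = src.read_text(encoding=source_encoding)
    # "notes.txt" becomes "notes.utf8.txt" in the same directory.
    dest = src.with_name(f"{src.stem}.utf8{src.suffix}")
    dest.write_text(text, encoding="utf-8")
    return dest
```

For example, copy_as_utf8(Path("notes.txt"), "latin-1") would leave notes.txt in place and create notes.utf8.txt beside it. If the stated source encoding is wrong, the copy may contain garbled characters, so check the result before relying on it.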

Examples of preferred format choices:

  • Images: JPEG, JPEG 2000, PNG, TIFF
  • Texts: HTML, XML, PDF/A, UTF-8, ASCII
  • Audio: AIFF, WAVE
  • Containers: GZIP, ZIP

For more information on supported formats, see the CDL Guidelines for Digital Images and the recommendations for encoded text from the CDL Structured Text Working Group.

Directories, Files and Version Naming Conventions

Directory Structure Naming Conventions

When organizing files, give the top-level directory (folder) a name that includes the project title, a unique identifier, and a date (yyyy or yyyy.mm.dd).
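One way to apply this consistently is to generate the name rather than type it by hand. The Python sketch below is only an illustration: the function name top_level_dir_name, the underscore separator, the order of the parts, and the rule for replacing spaces and other characters with hyphens are all assumptions made for this example.

```python
import re
from datetime import date


def top_level_dir_name(title: str, identifier: str, when: date) -> str:
    """Build a directory name from project title, identifier, and date.

    Assumptions of this sketch: parts are joined with underscores, and any
    run of characters other than ASCII letters, digits, "." and "-" is
    replaced by a single hyphen.
    """
    def clean(part: str) -> str:
        return re.sub(r"[^A-Za-z0-9.-]+", "-", part.strip()).strip("-")

    stamp = when.strftime("%Y.%m.%d")  # the yyyy.mm.dd form described above
    return f"{clean(title)}_{clean(identifier)}_{stamp}"
```

With these assumptions, a project titled "Soil Moisture Survey" with the identifier "SMS-001" and the date March 14, 2014 would get the directory name Soil-Moisture-Survey_SMS-001_2014.03.14.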

The sub-directory structure should have clear, documented naming conventions. Separate files or directories could apply, for example, to each run of an experiment, each version of a dataset, and/or each person in the group.

File Naming Conventions

  • Reserve the 3-letter file extension for application-specific codes that identify the format, such as .wrl, .mov, and .tif.
  • Identify the activity or project in the file name.
  • Identify separate versions of files and datasets using file or directory naming conventions. Record all changes to a file no matter how small. Discard obsolete versions after making backups.
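The version-naming advice above can be sketched in Python as follows. The "_vNN" suffix placed before the extension is a hypothetical convention chosen for this example, as is the function name next_version; adapt the pattern to whatever convention your group documents.

```python
import re
from pathlib import Path

# Hypothetical pattern: <project>_<description>_v<NN>.<ext>
_VERSION_RE = re.compile(r"^(?P<base>.+)_v(?P<num>\d+)$")


def next_version(path: Path) -> Path:
    """Return the path for the next version of a file.

    "survey_results_v03.csv" becomes "survey_results_v04.csv". A name with
    no version suffix gets "_v01" appended before the extension.
    """
    match = _VERSION_RE.match(path.stem)
    if match is None:
        return path.with_name(f"{path.stem}_v01{path.suffix}")
    width = len(match.group("num"))      # keep the existing zero padding
    number = int(match.group("num")) + 1
    return path.with_name(
        f"{match.group('base')}_v{number:0{width}d}{path.suffix}"
    )
```

The function only computes a name; it does not copy or modify any file. The project name stays at the front of the file name and the extension stays reserved for the format, in line with the conventions listed above.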

File Renaming

Tools to help you:

File Naming Conventions for Specific Disciplines

Many disciplines have recommendations, for example:

Credit to MIT Libraries for permission to use and adapt their pages and to members of the UC3 community.
Please send us any comments about these guidelines.

Creative Commons License

Last updated: March 14, 2014
Document owner: Perry Willett