Research Data Management

Learn about best practices in research data management

Data Inventories, Data Dictionaries, and README files

A data dictionary and/or data inventory records the specifics of your data collection: what the variables, terms, experimental conditions, and other recorded observations mean. For more information and resources, see the NNLM Data Thesaurus Data Dictionary term.

A README file is a text file (.txt or other open format) that describes the contents of a folder, or gives more specific information about the files and data that are part of your experiment. A README file template and more information can be found at the Harvard Medical School data management website. Notice that the types of information include file naming structures, file formats, and the column headings for any tabular data. This is similar to your data inventory, but a README should also contain more general information, such as contact information for someone who can answer questions about the data and where the data is stored.
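As a rough illustration of the kinds of information listed above, the short Python sketch below assembles a plain-text README from a handful of sections. It is not the Harvard Medical School template; the function name render_readme, the section titles, and every value in the example are hypothetical placeholders to replace with your own.

```python
def render_readme(info):
    """Render a plain-text README from {section title: text or list of lines}."""
    parts = []
    for title, body in info.items():
        parts.append(title.upper())
        if isinstance(body, str):
            parts.append(body)
        else:
            # Lists become simple indented bullet lines.
            parts.extend(f"  - {item}" for item in body)
        parts.append("")  # blank line between sections
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical example content; every value here is a placeholder.
example = {
    "Contact": "Name and email of someone who can answer questions about the data",
    "Storage location": "Where the data is stored",
    "File naming": "YYYYMMDD_experiment_sample.ext",
    "File formats": [".csv (tabular data)", ".txt (notes)"],
    "Column headings": ["subject_id", "weight_kg", "group"],
}

if __name__ == "__main__":
    print(render_readme(example))
```

Saving the returned text as README.txt in the data folder keeps the description in an open format next to the files it describes.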

What kinds of information do you want to include in a data inventory? 

  • What are the types of data you are collecting?
  • What are the file types generated from this data?
  • How stable is the data in its original form?
  • How much data will you be generating?
  • What other outputs will be created that are necessary to understand the data?
    • Code
    • Protocols, Templates, Lab notebooks
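
Two of the questions above (which file types exist, and how much data there is) can be answered in part by scanning the project folder. The following is a minimal Python sketch that assumes all data sits under a single folder; the function names build_inventory and format_inventory are illustrative, and the result is only a starting point, since it says nothing about data stability or related outputs such as code and protocols.

```python
from collections import defaultdict
from pathlib import Path


def build_inventory(root):
    """Summarize files under `root` by extension: file count and total bytes."""
    summary = defaultdict(lambda: {"count": 0, "bytes": 0})
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Treat .CSV and .csv as the same file type.
            ext = path.suffix.lower() or "(no extension)"
            summary[ext]["count"] += 1
            summary[ext]["bytes"] += path.stat().st_size
    return dict(summary)


def format_inventory(summary):
    """Return the summary as tab-separated text, one line per file type."""
    lines = ["extension\tfiles\tbytes"]
    for ext in sorted(summary):
        lines.append(f"{ext}\t{summary[ext]['count']}\t{summary[ext]['bytes']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Scans the current working directory; pass your own project folder instead.
    print(format_inventory(build_inventory(".")))
```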

A data dictionary should include detailed information about tabular (spreadsheet) data so that it's clear what you're entering or looking at.  

  • Variable names and descriptions
  • Variable types (string, scale, integer, categorical)
  • Units of measure
  • Computed/derived variables
  • Missing value definitions
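
A first draft of a data dictionary can be generated from the spreadsheet itself and then completed by hand. The Python sketch below reads CSV text, lists each column, guesses a variable type, and counts missing values. It rests on several assumptions: the missing-value codes in MISSING_CODES are examples only, the type guess is crude (it cannot tell a categorical code from an integer), and the sample data and function names are hypothetical. Descriptions, units, and notes on computed variables still have to be written by someone who knows the data.

```python
import csv
import io

# Assumption: example missing-value codes; replace with the ones your project uses.
MISSING_CODES = {"", "NA", "-999"}


def infer_type(values):
    """Guess 'integer', 'scale' (numeric), or 'string' from non-missing values."""
    if not values:
        return "unknown"
    try:
        for v in values:
            int(v)
        return "integer"
    except ValueError:
        pass
    try:
        for v in values:
            float(v)
        return "scale"
    except ValueError:
        return "string"


def draft_data_dictionary(csv_text):
    """Return one draft entry per column, with placeholders to fill in by hand."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Short rows yield None for absent cells; treat those as empty strings.
        raw = [(row[name] or "").strip() for row in rows]
        present = [v for v in raw if v not in MISSING_CODES]
        entries.append({
            "variable": name,
            "type": infer_type(present),
            "missing": len(raw) - len(present),
            "description": "TODO",
            "units": "TODO",
        })
    return entries


# Hypothetical sample data for illustration only.
sample = "subject_id,weight_kg,group\n1,70.5,control\n2,NA,treated\n3,68,control\n"

if __name__ == "__main__":
    for entry in draft_data_dictionary(sample):
        print(entry)
```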

Protocols and Software Settings

In addition to documenting spreadsheet variables and other types of data you are collecting, remember to document any experimental protocols and software settings. 

If you are using a lab instrument that has multiple settings, take a screenshot or photo of the settings screen. With the settings documented, it is easy to restore them if anything is changed. This will also help when writing up your experimental protocol.
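
Where settings can also be typed out or exported as text, a small structured record makes it easier to spot what has changed. The Python sketch below is one possible approach to use alongside a screenshot or photo, not as a replacement for it; the function names, instrument name, and setting names are all hypothetical.

```python
import json
from datetime import datetime, timezone


def settings_record(instrument, settings, recorded_by):
    """Bundle instrument settings with who recorded them and when (UTC)."""
    return {
        "instrument": instrument,
        "recorded_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "recorded_by": recorded_by,
        "settings": settings,
    }


def changed_settings(baseline, current):
    """Return {name: (baseline value, current value)} for settings that differ."""
    names = sorted(set(baseline) | set(current))
    return {
        name: (baseline.get(name), current.get(name))
        for name in names
        if baseline.get(name) != current.get(name)
    }


if __name__ == "__main__":
    # Hypothetical instrument and setting names for illustration only.
    baseline = {"gain": 2, "temp_c": 37, "filter": "green"}
    current = {"gain": 4, "temp_c": 37, "filter": "green"}
    record = settings_record("plate reader (example)", baseline, "your name")
    print(json.dumps(record, indent=2))
    print(changed_settings(baseline, current))
```

Saving each record as a dated JSON file next to the data gives you a baseline to compare against and a source for the settings section of your protocol.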