Documenting and organising data
Researchers must ensure that sufficient documentation and metadata (i.e. information about the data) are created and maintained so that research data can be found, used and managed throughout the project and into the future.
Project-level documentation covers the management, ownership, and organisation of the research data. This might include:
- Name and contact details for the primary owner or custodian of the project’s research data (e.g. Chief Investigator). This person will be responsible for future decisions about the data (e.g. access or disposal).
- Where the data will be stored.
Project-level documentation is created during the planning stage.
The Australian Code for the Responsible Conduct of Research (2018) and, increasingly, funding agencies require all researchers to create and maintain a list of their research data.
Documentation at the data-level is important for:
- identifying the data
- helping to find the data
- associating the data with its owners and creators
- creating links between the data and other related data or publications
- providing context for the data, e.g. by locating its collection or creation at a certain time and place
- enabling the quality of the data to be assessed, and research results to be validated.
Data that has been poorly documented will be difficult (or impossible) to find. Even if the data can be found, its value will be diminished if it is hard to interpret the contents, and to judge the quality or validity of the data. If it is not possible to determine when, where, how, and by whom the data was originally produced, there is also the risk that the data could be exploited inappropriately, or even accidentally destroyed.
What information should you document about your data?
Data documentation provides provenance or context for the data and ensures that the data can be understood in the long term. It may include information such as:
- a list or index of the research data files collected or created
- name and contact details of the primary owner or custodian of the research data
- why the data was collected, e.g. information about the research project aims and objectives
- how the data was collected, e.g. instruments and processes used
- confidentiality and consent agreements
- how the data is structured, e.g. names, labels and descriptions for data elements, and any rules relating to the values that are in them (coding schemes, classification schemes)
- what software, programs and processes are required to access and read the data
- quality control measures to ensure the integrity of research outcomes
- modifications or changes to the data over time
- a registry of access to the data (including collaborators and external parties)
- retention and disposal obligations
- any other information aimed at helping data users to analyse and interpret the data, e.g. user guides or manuals.
How to create documentation
When creating your data-level documentation, you can choose your preferred method. However, we recommend either a spreadsheet or a text file (e.g. a README file).
When recording the information, try to use both the standard terminology of your discipline and plain language, so that people outside your discipline who access your research data can also understand it.
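As an illustration only, a README text file could be generated from a small template so that no item is forgotten. The sketch below is one possible approach, not a prescribed format: the field names, the `build_readme` function, and the example title, custodian and software values are all hypothetical, loosely based on the items listed earlier.

```python
# Hypothetical README fields, loosely based on the documentation items
# listed earlier. Adjust the names and hints to suit your discipline.
README_FIELDS = [
    ("Files", "list or index of the research data files"),
    ("Custodian", "name and contact details of the primary owner or custodian"),
    ("Purpose", "why the data was collected"),
    ("Methods", "how the data was collected (instruments, processes)"),
    ("Structure", "names, labels and coding schemes for data elements"),
    ("Software", "software required to access and read the data"),
    ("Changes", "modifications or changes to the data over time"),
    ("Retention", "retention and disposal obligations"),
]


def build_readme(title: str, values: dict[str, str]) -> str:
    """Return README text; fields without a value are marked as TODO."""
    lines = [title, "=" * len(title), ""]
    for field, hint in README_FIELDS:
        value = values.get(field, "").strip()
        lines.append(f"{field}: {value or 'TODO - ' + hint}")
    return "\n".join(lines) + "\n"


# Example usage with made-up project details.
readme = build_readme(
    "Soil moisture survey",
    {
        "Custodian": "Dr A. Example <a.example@example.org>",
        "Software": "Any spreadsheet program that opens CSV files",
    },
)
print(readme)
```

Fields left empty appear as `TODO` lines, which makes gaps in the documentation easy to spot before the data is shared or archived.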
Where should you keep the documentation?
Organising your research data and files
It is important that your folders, documents and records are named in a consistent and logical manner so they can be located, identified and retrieved as quickly and easily as possible. This is crucial whether you are the sole researcher on a project or working with collaborators.
When deciding on digital file naming conventions, consider the following guidelines:
- Use capital letters to delimit words, not spaces.
- Avoid punctuation where possible. Where a separator is needed, use hyphens or underscores rather than spaces, especially where files may be accessed using a web browser.
- Make file names short but meaningful.
- If you need to incorporate a date in a file or folder name, always state it as YYYY, YYYYMM or YYYYMMDD.
- If you need to incorporate a number in a file name, write it with two or three digits (e.g. 01 or 001). Never use a single digit.
- When using version numbering, put it at the end of the name.
- Do not create names containing these characters, even where your system allows them: : / < > | " ? ; = + & * $
- Avoid initials, abbreviations and codes that are not commonly understood.
- Avoid common words such as ‘draft’ or ‘letter’ at the start of file names, unless this will make it easier to retrieve the record.
- Avoid unnecessary repetition and redundancy in file names and file paths.
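The guidelines above can be applied consistently with a small helper script. The following Python sketch is a minimal illustration under stated assumptions, not a required tool: the function names, the underscore separator, the date-first ordering, and the `v02` version style are all choices made for this example. The single-digit check is a rough heuristic and will also flag legitimate names such as `Co2`.

```python
import re
from datetime import date

# Characters the guidelines say to keep out of file names.
FORBIDDEN = set(':/<>|"?;=+&*$')


def to_capitalised_words(text: str) -> str:
    """Delimit words with capital letters instead of spaces."""
    words = re.split(r"[^A-Za-z0-9]+", text)
    return "".join(word[:1].upper() + word[1:] for word in words if word)


def build_file_name(description: str, when: date, sequence: int,
                    version: int, extension: str) -> str:
    """Assemble a name as date, description, padded number, version.

    The ordering and the underscore separator are assumptions made
    for this sketch; only "version at the end" comes from the guidelines.
    """
    stem = (f"{when:%Y%m%d}_{to_capitalised_words(description)}"
            f"_{sequence:03d}_v{version:02d}")
    return f"{stem}.{extension.lstrip('.')}"


def check_file_name(name: str) -> list[str]:
    """Return a list of guideline problems found in an existing name."""
    problems = []
    if " " in name:
        problems.append("contains spaces")
    bad = sorted(FORBIDDEN.intersection(name))
    if bad:
        problems.append("contains forbidden characters: " + " ".join(bad))
    stem = name.rsplit(".", 1)[0]  # ignore digits in the extension
    if re.search(r"(?<!\d)\d(?!\d)", stem):
        problems.append("contains a single-digit number")
    return problems


# Example usage with made-up values.
name = build_file_name("soil samples", date(2024, 3, 15), 7, 2, "csv")
print(name)  # 20240315_SoilSamples_007_v02.csv
print(check_file_name("draft report v2?.docx"))
```

Generating names from one function, rather than typing them by hand, keeps dates, number padding and version suffixes uniform across a project and among collaborators.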