Organise Your Data
Ensure your data will still make sense to you and others in six months or even ten years by keeping it organised from the start.
Electronic Research Notebook (LabArchives)
LabArchives is the University’s Electronic Research Notebook solution. It is provided to improve many aspects of managing research data, including security, collaboration, and accessibility.
All current staff and students can access the University's instance of LabArchives. Because LabArchives is cloud-based, you can access it anywhere and at any time via the internet or mobile app. Data in LabArchives is stored in Australia and there is no limit to the size or number of notebooks that you can create and share.
Storing your research data
Refer to the Storage for Staff guide on the Information Technology and Digital Services website.
Common University tools for storing research data include:
- University network drives (S: and R:)
- Cloud-based storage such as Box
- Tools developed for specific uses, such as the LabArchives electronic research notebook, the Figshare online digital repository, and the Genomics Repository
Be aware of where your data is stored (local or cloud-based, in or out of Australia, on a medium that can be lost or stolen), if and how often it is backed up, and whether your choices are endorsed and supported by the University.
If you are unsure about the best way to store your data, contact the Service Desk on 831 33000 or email firstname.lastname@example.org.
Whether you are a student or staff member, any research data you collect must be stored in line with the Australian Code for the Responsible Conduct of Research.
You can save time on your research project if your data is organised and easy to find.
A naming convention can be applied to folders, and the information below in "File names" is also relevant to naming folders. By prefixing folder names with numbers, you can force them to be ordered according to the steps in your workflow.
For example, the structure below has a four-step workflow:
- 20190502 Morgan Survey (top level folder)
Then include a "readme" file in the same top level folder to describe the folder layout right where the information will be needed.
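The numbered-prefix idea can be sketched in a few lines of Python. This is only an illustration: the four step names are hypothetical placeholders, and the function names and the readme file name (readme.txt) are assumptions rather than University conventions.

```python
from pathlib import Path

# Hypothetical step names; replace them with the steps in your own workflow.
WORKFLOW_STEPS = ["Raw data", "Cleaned data", "Analysis", "Outputs"]


def numbered_folder_names(steps):
    """Prefix each step with a two-digit number so folders sort in workflow order."""
    return [f"{index:02d} {step}" for index, step in enumerate(steps, start=1)]


def create_project_folders(top_level, steps=WORKFLOW_STEPS):
    """Create the top level folder, one numbered subfolder per step, and a readme."""
    top = Path(top_level)
    names = numbered_folder_names(steps)
    for name in names:
        (top / name).mkdir(parents=True, exist_ok=True)
    readme = top / "readme.txt"
    if not readme.exists():
        # Describe the layout right where the information will be needed.
        lines = [f"Folder layout for {top.name}", ""] + names
        readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return names
```

Calling create_project_folders("20190502 Morgan Survey") would create the top level folder, four numbered subfolders and a short readme listing them. Two-digit prefixes keep the order correct even if the workflow grows past nine steps.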
File names
Name your files in a way that describes what they contain and how they relate to other files. Do this from the start of your project to save time later. Consistent, meaningful naming of files and folders helps keep your data organised and makes everyone’s lives easier.
If records are named consistently, logically and in a predictable way, they will be easy to find and compare with other versions. Follow your discipline’s conventions so that people not involved in the project will be able to make sense of them.
For example, a file name format may look like this: Date collected_Location_Sensor. File names would start to look more like this as the format is applied: YYYYMMDD_SiteA_SensorB.CSV.
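If you generate file names in a script, a small helper can apply the format consistently. The sketch below assumes the Date collected_Location_Sensor format from the example; the function name build_file_name and the rule that fields may not contain spaces or underscores are illustrative assumptions.

```python
from datetime import date


def build_file_name(collected, location, sensor, extension="csv"):
    """Return a name in the form YYYYMMDD_Location_Sensor.ext."""
    parts = [collected.strftime("%Y%m%d"), location, sensor]
    for part in parts:
        # Underscores separate the fields, so they must not appear inside one.
        if "_" in part or " " in part:
            raise ValueError(f"Field {part!r} must not contain spaces or underscores")
    return "_".join(parts) + "." + extension.lstrip(".")


# Prints 20190502_SiteA_SensorB.csv
print(build_file_name(date(2019, 5, 2), "SiteA", "SensorB"))
```

Writing the date as YYYYMMDD means that sorting files by name also sorts them by collection date.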
Have a look at the University of Edinburgh’s 13 Rules for file naming conventions.
What is metadata?
Metadata means "data about data." It is information about an object that describes characteristics such as where it can be found, what it contains, its format, its quality, and contact information for people who are knowledgeable about it.
Metadata can be used to describe physical items (primary resources), as well as digital items (datasets, documents, audio-visual files, images, etc.).
Metadata can take many different forms, from free text (such as readme files) to standardised, structured, machine-readable content.
Metadata can serve several different purposes, and the included information will vary accordingly. For example:
- descriptive metadata describes data for the purposes of discovery and identification
- technical metadata describes things such as file types, and how the data was collected
- access and rights metadata describes who can access the data, and what they can do with it
- preservation metadata describes actions taken to preserve or sustain the data for later access and use.
Metadata can describe a whole collection of data; for example, all data associated with a particular research project. It can also describe individual fields; for example, stating that a value is a measurement of temperature in degrees Celsius.
Ideally, a controlled vocabulary will be used to create consistent metadata. A controlled vocabulary is an organised set of words used for consistent indexing, and subsequent optimised retrieval through browsing or searching.
Think about your own data. What information would you need to provide in order to comprehensively describe it? Is there a controlled vocabulary for your research area that provides information about the preferred terms and definitions for your data?
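As a rough illustration of structured, machine-readable metadata, the Python sketch below builds a small record and checks one field against a controlled vocabulary. The vocabulary, the field names and the record layout are all hypothetical; a real project would follow the metadata standard and vocabulary used in its discipline.

```python
import json

# Hypothetical controlled vocabulary for a "unit" field; a real project would
# take its preferred terms from the vocabulary used in its discipline.
UNIT_VOCABULARY = {"degrees Celsius", "kelvin"}


def make_field_metadata(name, description, unit):
    """Describe a single data field, rejecting units outside the vocabulary."""
    if unit not in UNIT_VOCABULARY:
        raise ValueError(f"Unit {unit!r} is not in the controlled vocabulary")
    return {"name": name, "description": description, "unit": unit}


record = {
    # Descriptive metadata: supports discovery and identification.
    "title": "Example temperature survey",
    # Technical metadata: file type and field-level detail.
    "file_format": "CSV",
    "fields": [make_field_metadata("temp", "Air temperature", "degrees Celsius")],
    # Access and rights metadata: who can use the data.
    "access": "Project team only",
}

print(json.dumps(record, indent=2))
```

Rejecting terms outside the vocabulary at the point of entry is what keeps the metadata consistent enough to search and browse later.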
Version control is a system of naming files that makes it possible to recover and view earlier versions of files.
Versioning is particularly important if you are collaborating and multiple people are working on the same file at once.
Some file-sharing apps (e.g. Box) have automatic versioning: if you save a newer copy of a file in Box, it will create a new version.
Turn on versioning or tracking in collaborative works or storage spaces such as wikis or Google Docs.
Files can also be versioned manually. Include a version number at the end of the file name such as v01. Change this version number each time the file is saved. For the final version, substitute the word FINAL for the version number (this is especially important if files are being shared).
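Manual versioning is easy to get wrong by hand, so it can help to script it. The sketch below assumes the version marker is joined to the name with an underscore (for example survey_results_v01.csv); that separator and the function names are illustrative assumptions, not a prescribed convention. The functions work on file names only and do not rename anything on disk.

```python
import re
from pathlib import Path

# Matches a trailing version marker such as "_v01" at the end of the file stem.
VERSION_PATTERN = re.compile(r"_v(\d+)$")


def next_version_name(file_name):
    """Return the file name with its version number increased by one."""
    path = Path(file_name)
    match = VERSION_PATTERN.search(path.stem)
    if match is None:
        # No version yet: start numbering at v01.
        return f"{path.stem}_v01{path.suffix}"
    digits = match.group(1)
    # Keep the same zero padding, so v01 becomes v02 and v09 becomes v10.
    bumped = str(int(digits) + 1).zfill(len(digits))
    return f"{path.stem[:match.start()]}_v{bumped}{path.suffix}"


def final_version_name(file_name):
    """Return the file name with the version number replaced by FINAL."""
    path = Path(file_name)
    stem = VERSION_PATTERN.sub("", path.stem)
    return f"{stem}_FINAL{path.suffix}"
```

For example, next_version_name("survey_results_v01.csv") gives survey_results_v02.csv, and final_version_name("survey_results_v03.csv") gives survey_results_FINAL.csv.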
Read more information about data versioning on the ANDS website.
Consider the file format and the overall storage requirements before you start your project. It can be very time-consuming, expensive or even impossible to make changes after the completion of the project.
Ideally, file formats should be open and non-proprietary, follow a documented standard, be unencrypted and uncompressed, and be widely used in your discipline.
Is it likely the file format will be obsolete in 5, 10 or 50 years? How long will the data be stored for?
High-resolution data may require conversion to another format. Plans for long-term preservation should therefore take into account how the data will be stored, displayed, visualised, converted and re-used.
Common durable formats include:
- Text documents - .TXT; .DOCX
- Spreadsheets/tabular data - .CSV; .XLSX
- Web pages - .HTML; .XML/.XSLT
- Images - .PNG; .JPG; .TIFF
- Audio files - .FLAC; .MP3; .WAV
- Video files - .MP4; .AVI; .MPG
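A quick way to spot files that may need converting is to compare their extensions against the list above. The following Python sketch does only that; the function name files_needing_review is an assumption, and an extension match says nothing about whether a file is actually valid or well preserved.

```python
from pathlib import Path

# Extensions taken from the list above, lower-cased for comparison.
DURABLE_EXTENSIONS = {
    ".txt", ".docx", ".csv", ".xlsx", ".html", ".xml", ".xslt",
    ".png", ".jpg", ".tiff", ".flac", ".mp3", ".wav", ".mp4", ".avi", ".mpg",
}


def files_needing_review(file_names):
    """Return the names whose extension is not in the durable formats list."""
    return [
        name for name in file_names
        if Path(name).suffix.lower() not in DURABLE_EXTENSIONS
    ]
```

For example, files_needing_review(["a.CSV", "b.sav", "c.png"]) returns ["b.sav"], flagging the one file whose format is not on the list.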