Storage and backup
Understanding your storage options; the Monash Large Research Data Store, and how to get access to it; risks associated with personal storage media (CDs, DVDs, USBs, portable hard drives); assessing cloud storage options.
- Digital data storage options
- Network storage
- Personal storage - hard drives
- Removable media
- Cloud storage
- Backups of digital data
- Storage and copies: other data formats
- Volumes of data
When you are selecting where to store digital data, you should consider the following:
- How likely is it that the hardware, software or media will fail or become obsolete?
- What would be the impact of any failure?
- What security systems are in place?
- What disaster recovery procedures are in place?
- What is the availability of support by professional IT staff?
To help staff and students select the most appropriate storage for their specific needs, the Digital Data Storage Options at Monash chart [pdf 5.97Kb] and the accompanying Benefits and Limitations table provide information on storage services at Monash.
It is highly recommended that you store your research data on networked drives that are managed by professional IT staff (eSolutions).
The benefits of having your research data stored on networked drives include:
- data is readily available to you and other authorised users
- data can be made available via remote access on request
- standard security and access controls are in place to prevent loss, theft or unauthorised use
- all your research data can be stored in a single place
- automated systems are in place for backups and integrity checks.
Note: To gain access to storage for large or complex data, contact the eSolutions Service Desk or the Monash eResearch Centre (MeRC) - firstname.lastname@example.org. To help streamline the process, complete the questions listed in the Storage request - pre-application information document before contacting eSolutions or MeRC.
It is not recommended that you store files on individual desktop or laptop PCs.
Under the Windows 7 SOE (Standard Operating Environment), local hard drives do not support write access to the C: drive. Users do have write access to C:\users\username for storing temporary working copies of data, but should not use this location to permanently store master copies of research data. Local hard drives fail from time to time and are often not backed up. Local machines are also regularly replaced, upgraded, and/or re-allocated to other staff members, at which time data on those machines may be lost or at risk of being inappropriately accessed.
If research data needs to be stored temporarily on a desktop or laptop PC (e.g. if you need to work off-site in environments where you cannot access Monash services remotely via the internet), data saved in the 'Documents' or 'Pictures' libraries, or on your Windows Desktop, are synchronised with the server and remain available offline whenever the Monash Network is not available. Treat these locations as convenient working areas, not as permanent storage. Where possible, connect to the Monash Network using the VPN instead. [Note: Windows SOE users will not require a VPN connection to access Monash services on completion of the "mobility project", the aim of which is to provide users with an always-on connection to the Monash Network.] Connecting will synchronise your personal storage with networked storage and provide access to research shares (S: drive). The S: drive is suitable for large datasets that need to be stored on a network location.
For clarification and advice on the most appropriate personal storage and backup solutions for your research needs, contact the eSolutions Service Desk.
The low cost and portability of removable media such as CDs, DVDs and flash memory devices (e.g. USB memory sticks) make them an attractive option for storage. However, they are rarely suitable for long-term retention of your research data, especially master copies:
- Removable media are often not big enough for all the research data, so multiple disks or drives are needed. This can make accessing your data later on difficult, especially if you do not have good systems in place for identifying and describing the data.
- Although use of CDs, DVDs and USB sticks is common, their longevity is not guaranteed, especially if they are not stored correctly (ideally in a steady range of about 18-22 degrees Celsius and 35 to 45% relative humidity). The estimated life of a CD/DVD stored at above 28 degrees and 50% humidity is as low as two years, far short of the minimum retention periods that apply to most research data.
- In addition to being environmentally sensitive, removable media can be easy to damage physically (e.g. through magnetism or shocks). Errors with writing to the media ('burning') are also quite common.
- Because they are so portable and data can be easily copied from them, removable media pose a risk in terms of data security. Devices are easy to misplace or lose, and often the data on them does not have access controls.
The Large Research Data Store (LaRDS) offers researchers large-scale networked storage for digital research data. Given the availability of this storage in addition to faculty-level options for networked storage, it should not be necessary for most researchers to use removable media for storing master copies of digital research data.
If you choose to use CDs, DVDs and USB sticks (e.g. for working data or extra backup copies), you should:
- choose high quality products from reputable manufacturers
- follow the instructions provided by the manufacturer for care and handling, including environmental conditions and labelling
- regularly check the media to make sure that they are not failing, and periodically 'refresh' the data (i.e. copy to a new disk or new USB stick)
- ensure that any private or confidential data is password-protected and/or encrypted.
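One way to check that removable media are not failing is to record a checksum for every file when the media is written and to re-compute the checksums at each later check. The following is a minimal sketch in Python, not a Monash-provided tool; the function names (`sha256_of`, `build_manifest`, `find_problems`) are hypothetical and chosen for illustration only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose contents have changed."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {name}")
    return problems
```

Build the manifest when the disk or USB stick is first written, keep it somewhere other than the media itself, and run `find_problems` at each check. Any reported file is a signal to restore from another copy and refresh the media.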
"Cloud storage" is essentially data storage that is made available as a service via the internet, generally hosted by third parties. Commercial cloud storage provides an alternative to local network and physical storage devices for storing and backing up your data, and provides remote access to that data from any computer with an internet connection. Commercial cloud storage solutions may provide tools for managing files and organising your virtual storage space.
There are a multitude of cloud storage services available. Monash provides Google Drive as its enterprise solution for small-scale data storage. The eSolutions Security & Risk team recommends the use of Google Drive in place of other cloud storage services such as Dropbox, because Monash University has a formal agreement in place with Google that ensures compliance with our legal and privacy obligations. This agreement also provides visibility of Google security arrangements and recourse in the event that breaches occur. No similar agreement is in place with Dropbox or other cloud storage services.
Monash-supported cloud-based services, including LabArchives, can be used for research data that are critical, confidential, or otherwise sensitive in nature, such as clinical data.
Monash policy does not allow "SENSITIVE DATA", "HIGHLY SENSITIVE DATA" or critical information to be hosted outside Monash-supported data storage. This includes administration or staff-related data such as credit card numbers, tax file numbers and health information, and research data that are classified by Human and Animal Ethics Committees. A privately organised commercial cloud-based service is not suitable for research data that are critical, confidential, or otherwise sensitive in nature, such as clinical data.
An appropriate exemption must be obtained before this data can be hosted externally. Refer to Electronic Information Security: Responsibilities, Classifications and Standards Procedures for further information.
Importantly, if using a cloud-based service other than Google Drive, service level agreements should be read carefully before committing your research data to the Cloud, as there are risks:
- Who owns the IP in your data when held in the Cloud? Can the cloud service provider claim copyright ownership or a broad licence to use your stored material? Does data held outside Australia need to comply with transborder copyright laws?
- What level of access does the service provider have to your cloud?
- What data protection practices are in place? Assessing a cloud service provider's data management practices can be difficult. To mitigate this risk seek out reputable service providers.
- What protocols are used in the transfer of large amounts of data between a remote system and cloud infrastructure? Is there a virtual private network (VPN) or similarly secure connection to ensure data security, whilst simultaneously safeguarding the network from cyber attacks?
- Can data be comprehensively deleted when required? The design of some cloud storage means that additional copies may be stored across virtual servers, and these copies cannot be erased because the disk that would need to be formatted also stores other data.
- Viability of the service provider - if a service provider shuts down unexpectedly, can your research data be retrieved?
For clarification and advice on the most appropriate cloud storage solutions for your research needs, contact the eSolutions Service Desk.
As a researcher, you are responsible for ensuring that digital research data is backed up regularly.
Digital data stored in LaRDS is automatically backed up nightly to a tape library in two physical locations.
Backup regimes for digital data stored on faculty drives may vary. If you plan to use faculty storage facilities, you should confirm with local IT staff that the backup regime in place is appropriate in terms of the frequency of backups, the number of copies, and the number of locations in which copies are held.
If you store research data on personal hard drives, you should investigate the many free and commercial tools and services available for automatically backing up your system to an external hard drive. Online remote backup services are available, but you should be aware that the privacy and security policies of these services may not meet the legal and ethical standards expected of responsible researchers.
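To illustrate what an automated backup to an external hard drive involves, the sketch below copies only new or changed files from a working folder to a backup folder. It is a simplified illustration in Python, not a replacement for a dedicated backup tool; the function names `needs_copy` and `back_up` are hypothetical, and the sketch assumes the backup folder is not located inside the source folder.

```python
import shutil
from pathlib import Path


def needs_copy(source: Path, target: Path) -> bool:
    """True if target is absent, a different size, or older than source."""
    if not target.exists():
        return True
    src, dst = source.stat(), target.stat()
    return src.st_size != dst.st_size or src.st_mtime > dst.st_mtime


def back_up(source_dir: Path, backup_dir: Path) -> list[str]:
    """Copy new or changed files into backup_dir; return what was copied."""
    copied = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(source_dir)
        target = backup_dir / relative
        if needs_copy(source, target):
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)  # copy2 also preserves timestamps
            copied.append(relative.as_posix())
    return copied
```

This sketch never deletes anything from the backup folder, so files removed from the working folder remain in the backup. It also keeps only the latest version of each file, whereas dedicated backup tools usually retain earlier versions as well.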
Secure storage of non-digital data during the statutory retention period (usually a minimum of 5 years after publication) is the responsibility of the researcher and their academic unit.
Consider the suitability of the storage area for non-digital data, and how the data are organised. The conditions under which data are stored significantly affect their longevity. Data should be well-organised, clearly and appropriately labelled, easily located and physically accessible. Storage areas should be structurally sound, suitably climate stable, and free from risks such as pests, flood and, as far as is practicable, fire.
Finding space for storing non-digital data can be difficult. Converting research data to a digital format can reduce physical storage requirements, but this can be a complex and expensive process. The Library can provide advice to researchers about scanning standards and likely costs for digitisation.
You should estimate the amount of data your project will generate as early as possible, and consider including costs for data storage (including storage of backup copies) in funding proposals.
Notifying central or faculty storage providers of upcoming storage requirements helps with planning and avoiding delays.
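A rough estimate can be made by multiplying the expected number of files by their average size and the project duration, then allowing for backup copies. The sketch below shows this arithmetic in Python; the function name, the parameters, and the default of two backup copies are illustrative assumptions, not Monash requirements.

```python
def estimate_storage_gb(
    files_per_year: int,
    average_file_mb: float,
    project_years: float,
    backup_copies: int = 2,
) -> float:
    """Estimate total storage in GB for master data plus backup copies."""
    master_gb = files_per_year * average_file_mb * project_years / 1024
    return master_gb * (1 + backup_copies)


# Hypothetical project: 2000 files a year at about 50 MB each, for 3 years.
print(f"{estimate_storage_gb(2000, 50, 3):.1f} GB")
```

Actual requirements depend on file formats, compression, and how many versions of each file are kept, so treat the result as a starting point for discussion with your storage provider.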