Working with your Data

This page covers the following topics to help manage your data as you work with it in the course of your research project:

  • Organising data through file and folder naming and version control
  • Creating documentation for your data
  • Storage and security as you work on your files
  • Dealing with sensitive, confidential and private data

Organising your data

Data organisation is a key part of research data management: it helps you to find and access your data more easily and helps to prevent data loss. Being consistent in your approach to data organisation is crucial if you’re working collaboratively and need to share your files with others in a shared file space.

The following practices can help you to keep your data well-organised:

Using consistent and meaningful file and folder names is key to making your data files easy to retrieve. Good file names provide useful cues to the content, status and version of a file. If you’re working with others, it’s a good idea to decide on the naming conventions you’re going to adopt for your files when you start working together so that you can all easily access the files you need.

Your naming conventions could include:

  • Date
  • Topic
  • Project name/number
  • Name of researcher
  • Version number

Abbreviating these elements helps to keep filenames short.

Other aspects to consider when creating names for your files:

  • Keep filenames short and relevant (ideally 25 characters or fewer).
  • Avoid using spaces, dots and special characters.
  • If using dates, make sure you format them consistently. The YYYY-MM-DD format at the start of a filename ensures files are listed chronologically.
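
As an illustration of the points above, the following Python sketch assembles a filename from a date, a few short elements and a version number. The function name `build_filename`, the underscore separator and the example values are assumptions made for this example, not a convention prescribed by this guidance.

```python
import re
from datetime import date


def build_filename(day: date, elements: list[str], version: int, ext: str) -> str:
    """Join a date, naming elements and a version number with underscores."""
    def clean(part: str) -> str:
        # Keep letters and digits only (drops spaces, dots and special characters).
        return re.sub(r"[^A-Za-z0-9]", "", part)

    parts = [day.isoformat()]            # YYYY-MM-DD first, so files sort by date
    parts += [clean(element) for element in elements]
    parts.append(f"V{version}")          # 'V' followed by the version number
    return "_".join(parts) + "." + ext.lstrip(".")


# Hypothetical project number "P12" and abbreviated topic "Svy" (survey).
name = build_filename(date(2024, 3, 5), ["P12", "Svy"], 2, "csv")
print(name)  # 2024-03-05_P12_Svy_V2.csv
```

The example name is 25 characters long including the extension. Adding further elements, such as the researcher's name, will quickly exceed that length, so you may prefer to pick only the elements your project really needs.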

Over time your data files may go through a series of revisions, and you may wish to keep track of the various versions of your files. Version control helps you to identify which version of a file is the most up-to-date and in collaborative projects can also help you to keep track of who has made changes to the files and when.

The version number can be indicated in the file name by including ‘V’ followed by the version number. If a file is likely to change often, consider using a single integer for major changes and a second number for minor changes, e.g. V1, V2 for major versions and V1.1, V1.2 for minor revisions.
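
If you have several versions of a file in one folder, a short script can help to identify the most recent one. The sketch below assumes that each filename ends with a version in the form V1 or V1.2 just before the extension. That pattern, and the function names `version_of` and `latest`, are assumptions for this example. Versions are compared as numbers rather than as text, so V1.10 is correctly treated as later than V1.2.

```python
import re
from pathlib import Path

# Matches a trailing "V<major>" or "V<major>.<minor>" at the end of the name.
VERSION_RE = re.compile(r"V(\d+)(?:\.(\d+))?$")


def version_of(filename: str) -> tuple[int, int]:
    """Return (major, minor) for a filename ending in V1, V1.2, etc."""
    stem = Path(filename).stem            # remove the extension first
    match = VERSION_RE.search(stem)
    if match is None:
        raise ValueError(f"no version number found in {filename!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)      # a bare V2 is treated as V2.0
    return major, minor


def latest(filenames: list[str]) -> str:
    """Return the filename with the highest version number."""
    return max(filenames, key=version_of)


files = ["survey_V1.csv", "survey_V1.2.csv", "survey_V1.10.csv"]
print(latest(files))  # survey_V1.10.csv
```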

If you’re keeping multiple copies of your files in different locations, remember to synchronise your files so they are all updated.

If you need to record more detail about the changes made to different versions of your data files, then it is good practice to create a version control table. Information recorded in the table may include the version numbers, date of change, person making the change, and the reason for the change.
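
A version control table can be kept as a simple CSV file alongside your data. The following sketch records the four items mentioned above. The column names and the function `log_change` are hypothetical and can be adapted to your own project.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Optional

# Assumed column names for the version control table.
FIELDS = ["version", "date", "changed_by", "reason"]


def log_change(table: Path, version: str, changed_by: str, reason: str,
               day: Optional[date] = None) -> None:
    """Append one row to the table, writing the header if the file is new."""
    is_new = not table.exists()
    with table.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "version": version,
            "date": (day or date.today()).isoformat(),   # defaults to today
            "changed_by": changed_by,
            "reason": reason,
        })
```

For example, `log_change(Path("versions.csv"), "V1.1", "JS", "Corrected coding errors")` would add a dated row for version V1.1 to a file called versions.csv.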

Examples of version control tables can be found in the UK Data Service guidance on versioning.

Good folder organisation makes it easier to manage your data. When working with a research team, an orderly structure is even more important, so try to agree on how you’re going to structure your folders early in your research project.

You may want to consider the following when choosing how to organise your data:

  • Use folders to group files with common properties and apply meaningful folder names
  • Structure your folders hierarchically
  • Separate current and completed work
  • Keep your raw data separate from the data you are working on
  • Keep your documentation separate from your data files
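
One possible way to set up such a structure at the start of a project is sketched below. The folder names in `LAYOUT` are purely illustrative assumptions and should be replaced with names that are meaningful for your own research.

```python
from pathlib import Path

# Hypothetical folder names; adapt them to your own project.
LAYOUT = [
    "data/raw",           # original data, kept untouched
    "data/working",       # copies that you clean and analyse
    "documentation",      # readme files, codebooks, protocols
    "outputs/current",    # work in progress
    "outputs/completed",  # finished work
]


def create_layout(root: Path) -> list[Path]:
    """Create the folder hierarchy under root and return the folders."""
    folders = [root / relative for relative in LAYOUT]
    for folder in folders:
        # parents=True builds intermediate folders; exist_ok=True makes
        # the function safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `create_layout(Path("MyProject"))` would create the five folders beneath a MyProject folder in the current location.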

Documenting your data

If you would like to share your data with others, re-use the data yourself at a later date, or include information about methods in your publications, then creating documentation for your data is essential. Data documentation provides context and clarifies the nature of the data so that anyone can make sense of the data and re-use it in their own research. Documenting data makes re-using and interpreting it easier for everyone, which is in keeping with the ideals of making data FAIR (Findable, Accessible, Interoperable, and Re-usable).

Ideally you should start creating documentation for your data as early as possible in order to reduce the burden of trying to assemble it at the end of your project. Procedures for documenting data should be part of your data management plan.

When deciding on what to document, think about providing all the information that a researcher might need in order to use your data without having to contact you. The sorts of things you might document include:

  • Background to the project
  • Research methodology (including research design, data collection methods, use of secondary sources)
  • Information about the research setting, who collected the data and when
  • The content of the dataset (type of data, number and structure of data files)
  • Details of how the data was processed and analysed, including anonymisation and data cleaning
  • Variables and codes used in data analysis
  • Quality assurance procedures
  • Details of any equipment used and how it was calibrated
  • The text of questionnaires or interview schedules
  • How the data can be accessed (including licensing, conditions for re-use and confidentiality considerations)

There are a number of options for creating documentation for your data. What you choose will depend on whether the data is qualitative or quantitative as well as the nature of your discipline. Some ways your documentation could be presented include:

  • Embedded within the data file itself (e.g. as a header or summary, in code or field descriptions; or in the ‘document properties’ section of a file)
  • In a ‘readme’ file
  • In a lab notebook (electronic or physical)
  • As a working paper
  • As an article in a data journal
  • As supporting documentation files (e.g. focus group guides, questionnaires, interview transcripts)

Useful guidance on writing readme files is provided by Cornell University's Research Data Management Service Group. They also provide a template for writing a readme file which can be downloaded and adapted for your own data.  

Storage and security

The data that you create are a valuable resource, so you’ll need to think about where you are going to keep them so that you and others can access them when needed, and how you will keep them secure. When planning your research project, think about how much data you expect to create or use and how much storage space you will need, both during the active phase of your project and once it has ended. (For information on where to share your data once your project has ended, please see the page on Preserving and Sharing your Data.) You should also consider how sensitive your data are and how this will affect your storage needs as you work with your data.

The University provides access to OneDrive which is a cloud-based file storage space. Researchers have access to 5TB of storage which can be accessed from anywhere. You can share access and collaborate on files with others both within and outside the university. Contact Digital Services for more information or if you require more than the allocated storage.

There are other places researchers may choose to store data, e.g. laptops, external hard drives, memory sticks, DVDs and filing cabinets for hard copies. All of these present risks to your data as they are vulnerable to loss, theft, corruption and failure. Because of these risks, if you do need to use any of these portable devices or physical storage solutions, make sure that they do not hold the only copy of your data.

Backing up your data regularly is really important for keeping yourself protected against data loss. To keep data secure, keep multiple copies of any important data in a number of locations and make sure you update backups regularly so that all copies are kept up to date.
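
One way to check that a backup copy is still identical to the original is to compare checksums of the two files. The sketch below uses SHA-256 from Python's standard library. The function names are assumptions for this example, and it checks a single pair of files rather than a whole folder.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MB chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True if the backup has exactly the same content as the original."""
    return sha256_of(original) == sha256_of(backup)
```

If `backup_matches` returns False, the backup is either out of date or corrupted and should be refreshed from the copy you know to be correct.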

There are a number of policies that support cyber security at the university that can help with keeping your data safe. Please see the IT webpages on Information Security for guidance: www.wlv.ac.uk/its/information-security/

Sensitive data

If you are working with any personal or sensitive data then you will need to take extra precautions when it comes to storing it and keeping it secure. The university outlines three categories of data which staff need to be aware of when they create, store and share any information: public/open, restricted, and confidential. Further information can be found on the data classification webpage.

Research data may be deemed sensitive if it includes:

  • Data that identifies an individual living human subject
  • Confidential data, generated or used under a commercial agreement
  • Data that may adversely affect rare or endangered species or ecologies
  • Data that may harm an individual or community, or have negative public impact

Sensitive data about participants (special category personal data covered by the Data Protection Act 2018) includes: race, ethnic origin, politics, religion, trade union membership, genetics, biometrics, health, sex life, and sexual orientation.

If you are handling personal or sensitive data then you will need to consider how to manage it throughout the course of the research lifecycle.

Anonymising data allows the data to be used and shared whilst maintaining the privacy of your participants. Once data has been fully anonymised it is no longer deemed sensitive and can be shared without risk. Ideally you should anonymise data as early as possible in the data collection and analysis process, and make sure to keep any anonymised data separate from identifiable data. The UK Data Service provides comprehensive guidance on how to anonymise quantitative and qualitative data.
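
As a first step towards keeping identifiable data separate, direct identifiers can be split from the rest of each record and replaced with a participant code. The sketch below illustrates the idea with made-up records. The function name, field names and code format are assumptions for this example. Note that this step alone is pseudonymisation rather than full anonymisation: the remaining fields may still identify someone indirectly, and the key table must be stored securely and separately.

```python
def split_identifiers(records: list[dict], id_fields: list[str]):
    """Separate direct identifiers from the rest of each record.

    Returns (coded, key_table): coded records carry only a participant
    code, and key_table links each code back to the identifiers.
    """
    coded, key_table = [], []
    for number, record in enumerate(records, start=1):
        code = f"P{number:03d}"                      # e.g. P001, P002
        key_row = {"code": code}
        coded_row = {"code": code}
        for field, value in record.items():
            # Identifiers go to the key table, everything else stays.
            target = key_row if field in id_fields else coded_row
            target[field] = value
        key_table.append(key_row)
        coded.append(coded_row)
    return coded, key_table


# Made-up example record for illustration only.
records = [{"name": "Ann Example", "email": "ann@example.org", "age": 34}]
coded, key_table = split_identifiers(records, ["name", "email"])
print(coded)  # [{'code': 'P001', 'age': 34}]
```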

Encryption involves the encoding of data so that it can only be accessed by those who have been authorised to do so. You can encrypt individual data files or folders or even whole devices such as USB sticks and laptops. When storing, transferring and disposing of sensitive data you should make sure that it is encrypted. For advice on encryption, please contact Digital Services.

The Data Protection Act 2018 and the UK General Data Protection Regulation state that personal data should only be accessible to those who are authorised to access it, so you should make sure that access to sensitive data is restricted. When working with collaborators, ensure you encrypt sensitive data before sharing it. Ideally, e-mail should not be used to share sensitive data, but if you do use it, send encryption passwords separately from the encrypted files.

Further help and information

If you would like advice or guidance on any of the topics above relating to working with your data, please contact the Scholarly Communications Team at wire@wlv.ac.uk.

Images used on this page

Photo by Jason Goodman on Unsplash

Photo by FLY:D on Unsplash