Why should you manage your working data properly?
Proper working data management…
- allows you to retrieve datasets more quickly;
- ensures data safety and prevents data loss;
- reduces the risk of accidental errors in the data;
- saves time and effort when preparing the data for publication or sharing.
The following tips will help you manage your working data more efficiently:
1. Organise your data.
A good way to organise digital research data is to use a hierarchical folder structure.
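As an illustration, a hierarchical structure can be set up once with a short script. The sketch below is only one possible layout: the folder names and the `create_project_folders` helper are assumptions made for this example, not part of the guidance.

```python
from pathlib import Path

# Hypothetical layout: these folder names are illustrative only.
LAYOUT = [
    "data/raw",
    "data/processed",
    "documentation",
    "scripts",
    "results",
]

def create_project_folders(root: Path, layout=LAYOUT) -> list[Path]:
    """Create each folder of the layout under root and return the folder paths."""
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_project_folders(Path("my-project"))` would create the five folders under `my-project`, leaving any that already exist untouched.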
2. Create a naming scheme for your files and stick to it.
- Decide on the information you want to be in the file name. The file name should reflect its content.
- Create several naming templates for the most frequently used file types. This will make the process of coming up with a name for a new file much faster.
- If several people are working on the same set of files, they should all agree on the naming rules to be applied.
- If you wish to include the date in the file name, the recommended format is YYYYMMDD.
- Avoid special characters (other than hyphens and underscores) and letters that are not part of the basic Latin alphabet.
- Instead of using spaces, use underscores or hyphens.
- When numbering 10 or more files, use leading zeros: start with 01 instead of 1. Similarly, start with 001 when numbering 100 or more items.
- Keep file names as short as possible. One way of shortening file names is to use abbreviations or codes. If you do, keep a record of their meanings in a separate document.
Examples of informative file names:
Tardigrada-domestication_Lit-review_v1.0
CA12345_Data-summary_20220102_NN
Protein-synthesis_20220102_Experiment01_specimen001
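The naming rules above can be applied automatically. The following is a minimal sketch, assuming that name parts are joined with underscores and that spaces inside a part become hyphens; the helpers `clean_part` and `build_file_name` are hypothetical names introduced for this example.

```python
import re
from datetime import date
from typing import Optional, Sequence

def clean_part(text: str) -> str:
    """Turn spaces into hyphens and drop everything except A-Z, a-z, 0-9, '.' and '-'.

    Underscores are dropped too, because this sketch reserves them as
    the separator between name parts.
    """
    text = text.strip().replace(" ", "-")
    return re.sub(r"[^A-Za-z0-9.\-]", "", text)

def build_file_name(parts: Sequence[str],
                    on: Optional[date] = None,
                    number: Optional[int] = None,
                    total: int = 1) -> str:
    """Join cleaned parts, an optional YYYYMMDD date and a zero-padded number."""
    pieces = [clean_part(p) for p in parts]
    if on is not None:
        pieces.append(on.strftime("%Y%m%d"))
    if number is not None:
        # Pad to the number of digits in the expected total: 01 for 10-99 items, 001 for 100-999.
        pieces.append(str(number).zfill(len(str(total))))
    return "_".join(pieces)
```

For example, `build_file_name(["Protein synthesis", "Experiment01"], on=date(2022, 1, 2), number=1, total=100)` returns `Protein-synthesis_Experiment01_20220102_001`.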
3. Do not forget to back up your files.
Data loss can be prevented by saving backup copies. When doing so, it is advisable to follow the 3-2-1 rule:
You should have at least 3 copies of your data, stored on 2 different storage media, with at least 1 copy kept in a different location or building from the others.
Tip: You can store up to 100 GB (with an option to request more) of research data in the National Open Access Research Data Archive (MIDAS). All data deposited in MIDAS are automatically stored in triplicate: a master copy and two backup copies, one of which is kept in a geographically remote standby data centre.
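A backup plan can be checked against the 3-2-1 rule with a few lines of code. This is a sketch under stated assumptions: the `BackupCopy` record and the `satisfies_3_2_1` function are hypothetical, and "another location" is interpreted here as the copies being spread over at least two distinct locations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    medium: str    # e.g. "laptop SSD", "external HDD", "institutional server"
    location: str  # e.g. "office", "home", "remote data centre"

def satisfies_3_2_1(copies) -> bool:
    """Return True if there are >= 3 copies, on >= 2 media, in >= 2 locations."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    # Assumption: two or more distinct locations means at least one copy is kept elsewhere.
    enough_locations = len({c.location for c in copies}) >= 2
    return enough_copies and enough_media and enough_locations
```

A laptop copy and an external-drive copy in the same office, plus a third copy on a server in another building, would pass the check; three copies on the same drive would not.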
4. Practise version control.
Every time you make any significant changes to a data file, save it as a new version of the file. We recommend one of the following two versioning systems:
- Version numbering: one or more numbers separated by dots are added at the end of the file name. The numbers change every time the file is modified. The first number documents major changes, while the numbers after the dots reflect minor changes (the further to the right a number is, the less significant the change).
e.g. Data-table_v2.0.5
- Documenting the modification date: the date on which the file was modified is included in the file name in YYYYMMDD format. If more than one significant change is made on the same day, append an extra alphanumeric character to the date to distinguish between the versions saved that day.
e.g. Data-table_20220202b
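Both versioning systems can be sketched in code. The helpers `bump_version` and `dated_name` below are hypothetical. Two conventions are assumptions of this example rather than part of the guidance: numbers to the right of the one being increased are reset to 0, and the first version saved on a given day carries no suffix, with later ones receiving a, b, c and so on.

```python
import string
from datetime import date

def bump_version(version: str, level: int = 0) -> str:
    """Increase the number at position `level` (0 = major change).

    Assumption: all numbers to the right of `level` are reset to 0.
    """
    numbers = [int(n) for n in version.lstrip("v").split(".")]
    numbers[level] += 1
    numbers[level + 1:] = [0] * (len(numbers) - level - 1)
    return "v" + ".".join(str(n) for n in numbers)

def dated_name(base: str, on: date, existing: set) -> str:
    """Return base_YYYYMMDD, adding a letter if that name is already taken."""
    stem = f"{base}_{on:%Y%m%d}"
    if stem not in existing:
        return stem
    for letter in string.ascii_lowercase:
        candidate = stem + letter
        if candidate not in existing:
            return candidate
    raise ValueError("no unused single-letter suffix left for this date")
```

With these conventions, `bump_version("v2.0.5", level=2)` gives `v2.0.6`, and `dated_name("Data-table", date(2022, 2, 2), {"Data-table_20220202", "Data-table_20220202a"})` gives `Data-table_20220202b`.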
5. Describe your data.
Try to preserve as much information about your research data as you can. It is best to describe the data directly after it has been generated, while your knowledge of it is still fresh. Information that describes data is called metadata.
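One lightweight way to record metadata is to save a small description file next to each data file. The sketch below is an assumption-laden example, not a metadata standard: the `write_metadata` helper, the `.metadata.json` suffix and the field names are all hypothetical.

```python
import json
from pathlib import Path

def write_metadata(data_file: Path, **fields) -> Path:
    """Save the given fields as JSON next to data_file and return the new file's path."""
    # Hypothetical convention: append ".metadata.json" to the data file's name.
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(fields, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar
```

A call such as `write_metadata(Path("Data-table_v2.0.5.csv"), creator="NN", created="20220102", description="Summary table")` would write `Data-table_v2.0.5.csv.metadata.json` in the same folder. Which fields to record depends on your discipline and on the repository you plan to use.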
Need some advice?
Questions on topics related to research data management can be directed to Dr Gintė Medzvieckaitė from the Scientific Information and Data Division.
Scholarly Communication and Information Centre
Saulėtekio al. 5 (Block B, 4th floor, Room 403)
Phone: +370 5 219 5062
Email:
Contact via MS Teams
The Scientific Information and Data Division also offers training events on topics related to Open Science and research data management. If you would like to request a training activity in English, please contact the Head of the Scientific Information and Data Division, Gitana Naudužienė (a group of 5 or more attendees is required).