Best Practices When Creating Tabular Data (Continued)

Here are additional best practices to consider when creating tabular data:

  • Fill in all cells
  • Create a data dictionary
  • Make data more interpretable
  • Save the data in plain text files
  • No calculations in the raw data files
  • Make it a single big rectangle

8. Fill in All Cells
Do not leave cells empty. Empty cells in a column are likely to distort your dataset, which may result in inaccurate data. To prevent empty cells in an Excel column, see this link. See the good and bad practice examples below, which use a table from Zwar’s (2021) study.

Good Practice:

Outcome Variables                 Devaluing Feelings        Appreciative Feelings     Accusing Statements
                                  working      non-working  working      non-working  working      non-working
Caregiver’s gender (Ref. female)  0.02         0.01         −0.04        −0.01        0.04         0.05
Constant                          1.63         1.81         3.19         3.30         1.83         2.07
Observations                      515          513          512          515          511          516
R2                                0.071        0.076        0.048        0.076        0.027        0.016

Bad Practice:

Outcome Variables                 Devaluing Feelings        Appreciative Feelings     Accusing Statements
                                  working      non-working  working      non-working  working      non-working
Caregiver’s gender (Ref. female)  0.02                      −0.04        −0.01        0.04         0.05
Constant                          1.63                      3.19         3.30         1.83
Observations                      515                       512          515          511          516
R2                                0.071        0.076        0.048                     0.027        0.016
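
One way to catch empty cells before sharing a file is to scan it programmatically. The following is a minimal sketch using Python's standard csv module. The function name find_empty_cells and the column names in the sample are hypothetical, and the sample is a small excerpt with one value deliberately left blank.

```python
import csv
import io


def find_empty_cells(csv_text):
    """Return (row_number, column_name) pairs for every blank cell."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    empty = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(body, start=2):
        for index, name in enumerate(header):
            # A row that is shorter than the header also counts as blank.
            value = row[index] if index < len(row) else ""
            if value.strip() == "":
                empty.append((row_number, name))
    return empty


# Hypothetical excerpt with one value deliberately left blank.
sample = (
    "variable,devaluing_working,devaluing_nonworking\n"
    "Constant,1.63,\n"
    "Observations,515,513\n"
)

print(find_empty_cells(sample))  # [(2, 'devaluing_nonworking')]
```

A check like this only reports where the blanks are. Deciding what belongs in each cell remains a judgment for the person who collected the data.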

9. Create a Data Dictionary
A data dictionary describes each variable in your data tables. In each data table, include a header row with a short name (without spaces) for each variable. Use the data dictionary to link each short name to a longer text label, a description of the data, the data type (such as “integer” or “string”), the possible values, and the units of measurement. See the good and bad practice examples below, taken from Glenn (2020).

Good Practice:

PatientId  PatientAge  PatientSex  RiskFactors
           34                      Obesity
           -999                    Cancer
           45                      Cancer
           38                      Smoking
           -999                    NULL
           39                      Obesity
           48                      Smoking

Data Dictionary

Variables    Definition                             Type of Data  Possible Values
Patient Age  Age of Users                           Integer       30-50
Patient Sex  Sex of patient                         String        M, F
Risk Factor  Risk factor classification of patient  String        Obesity; Cancer; Smoking; NULL

Bad Practice:

PatientId  PatientAge  PatientSex  RiskFactors
           34                      Obesity
           -999                    Cancer
           45                      Cancer
           38                      Smoking
           -999                    -999
           39                      Obesity
           48                      Smoking
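
A data dictionary can also be kept in machine-readable form so that rows can be checked against it. The sketch below is one possible way to do this in Python, not a standard format. The structure of DATA_DICTIONARY and the function check_row are hypothetical. The allowed values come from the data dictionary table above, the missing-value code -999 for PatientAge is read off the example rows, and the PatientSex values in the sample rows are made up for illustration.

```python
# Hypothetical machine-readable version of the data dictionary above.
DATA_DICTIONARY = {
    "PatientAge": {"type": "integer", "min": 30, "max": 50, "missing": "-999"},
    "PatientSex": {"type": "string", "allowed": {"M", "F"}},
    "RiskFactors": {
        "type": "string",
        "allowed": {"Obesity", "Cancer", "Smoking", "NULL"},
    },
}


def check_row(row):
    """Return a list of problems found in one row (empty list if none)."""
    problems = []
    for name, spec in DATA_DICTIONARY.items():
        value = row[name]
        if value == spec.get("missing"):
            continue  # documented missing-value code for this variable
        if spec["type"] == "integer":
            try:
                number = int(value)
            except ValueError:
                problems.append(f"{name}: {value!r} is not an integer")
                continue
            if not spec["min"] <= number <= spec["max"]:
                problems.append(
                    f"{name}: {number} is outside {spec['min']}-{spec['max']}"
                )
        elif value not in spec["allowed"]:
            problems.append(f"{name}: {value!r} is not an allowed value")
    return problems


# Sample rows; the PatientSex values are made up for illustration.
good_row = {"PatientAge": "34", "PatientSex": "F", "RiskFactors": "Obesity"}
bad_row = {"PatientAge": "-999", "PatientSex": "M", "RiskFactors": "-999"}

print(check_row(good_row))  # []
print(check_row(bad_row))   # ["RiskFactors: '-999' is not an allowed value"]
```

In this sketch, -999 passes for PatientAge because the dictionary documents it as a missing-value code there, but it is flagged in RiskFactors, where the dictionary lists NULL instead.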

10. Save the Data in Plain Text Files
Keep a copy of the data files in a plain text format with comma or tab delimiters, such as comma-separated values (CSV) files (Broman & Woo, 2018). Researchers ought to do this because plain text makes it easier to exchange data between programs built on different architectures. Below is an example spreadsheet and a screen capture of a CSV-formatted file. We use a table from Zwar’s (2021) study as an example.

Outcome Variables                 Devaluing Feelings        Appreciative Feelings     Accusing Statements
                                  working      non-working  working      non-working  working      non-working
Caregiver’s gender (Ref. female)  0.02         0.01         −0.04        −0.01        0.04         0.05
Constant                          1.63         1.81         3.19         3.30         1.83         2.07
Observations                      515          513          512          515          511          516
R2                                0.071        0.076        0.048        0.076        0.027        0.016

[Screen capture: plain text file]
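
As a rough illustration of what the plain text version can look like, the sketch below writes the table above as CSV with Python's standard csv module. The single-row column names (for example devaluing_working) are hypothetical. They flatten the two-level spreadsheet header so that each column has exactly one name. Minus signs are written as plain ASCII hyphens, which is an assumption made here so that other programs can parse the numbers.

```python
import csv
import io

# Hypothetical single-row header: the two-level spreadsheet header is
# flattened so that every column has exactly one name.
HEADER = [
    "variable",
    "devaluing_working", "devaluing_nonworking",
    "appreciative_working", "appreciative_nonworking",
    "accusing_working", "accusing_nonworking",
]

# Values are kept as strings so trailing zeros such as "3.30" survive.
ROWS = [
    ["Caregiver's gender (Ref. female)",
     "0.02", "0.01", "-0.04", "-0.01", "0.04", "0.05"],
    ["Constant", "1.63", "1.81", "3.19", "3.30", "1.83", "2.07"],
    ["Observations", "515", "513", "512", "515", "511", "516"],
    ["R2", "0.071", "0.076", "0.048", "0.076", "0.027", "0.016"],
]


def to_csv_text(header, rows):
    """Serialize a header and rows as comma-separated plain text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(HEADER, ROWS)
print(csv_text)
```

To save the result to disk instead of printing it, the same writer can be pointed at a file opened with open(path, "w", newline="", encoding="utf-8").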

11. No Calculations in the Raw Data Files (Broman & Woo, 2018)
The best strategy here is to make a copy of your files and do your calculations in the copy. Raw data files should present the data as originally collected, free from interpretation and analysis. In the bad practice example below, the table has been shared with only a fully computed scale score, and the raw data for each scale item are missing.

Good Practice:

Id  Gender  Age  Personality1  Personality2  Personality3  Personality4  Personality5
            18
            18
            19
            18

Bad Practice:

Id  Gender  Age  Personality-scale-score-for-5-extraversion-items
            18   20
            18   17
            19   15
            18   16
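
The idea of calculating in a copy can be sketched in Python as follows. This is only an illustration under stated assumptions: the Id values and item scores are made up, the derived column name PersonalityScore is hypothetical, and the scale score is assumed to be a simple sum of the five items.

```python
import copy

# Made-up raw data: one row per participant, one column per scale item.
raw_rows = [
    {"Id": 1, "Age": 18, "Personality1": 4, "Personality2": 4,
     "Personality3": 4, "Personality4": 4, "Personality5": 4},
    {"Id": 2, "Age": 19, "Personality1": 3, "Personality2": 3,
     "Personality3": 3, "Personality4": 3, "Personality5": 3},
]

ITEM_COLUMNS = [f"Personality{i}" for i in range(1, 6)]


def with_scale_score(rows):
    """Return a copy of the rows with a summed scale score added."""
    derived = copy.deepcopy(rows)  # the raw rows are never modified
    for row in derived:
        # Assumption: the scale score is the plain sum of the five items.
        row["PersonalityScore"] = sum(row[column] for column in ITEM_COLUMNS)
    return derived


analysis_rows = with_scale_score(raw_rows)

print(analysis_rows[0]["PersonalityScore"])  # 20
print("PersonalityScore" in raw_rows[0])     # False
```

Because the score is added only to the copy, anyone who receives the raw rows can still recompute it or score the items differently.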

12. Make it a Single Big Rectangle
The best layout for your data within a spreadsheet is a single rectangle, with participants/samples/patients in rows and variables in columns (Broman & Woo, 2018). The bad practice example below shows a spreadsheet with a non-rectangular layout from Broman & Woo (2018).

Good Practice:

   A        B     C       D     E     F
1  weight   100   93      99    87    78
2  sex      male  female  male  male  male
3  glucose  134   120     124   83    105
4  insulin  0.60  1.18    1.23  1.16  0.73

Bad Practice:

   A        B     C       D     E     F
1
2           101   102     103   104   105
3  sex      male  female  male  male  male
4
5           101   102     103   104   105
6  glucose  134   120     124   83    105
7
8           101   102     103   104   105
9  insulin  0.60  1.18    1.23  1.16  0.73
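
A stacked layout like the bad practice example can be rearranged into a single rectangle with one row per participant and one column per variable. The sketch below shows one way to do this in plain Python. The function name blocks_to_rectangle is hypothetical. It assumes each block consists of a row of participant IDs (with an empty first cell) followed by one variable row, separated by blank rows, and it uses NA as the placeholder for any value that is not present.

```python
def blocks_to_rectangle(grid):
    """Turn stacked (id row, variable row) blocks into one rectangle."""
    records = {}    # participant id -> {variable name: value}
    variables = []  # variable names in order of appearance
    ids = []        # ids of the block currently being read
    for row in grid:
        if not any(cell.strip() for cell in row):
            continue  # blank spacer row between blocks
        label, values = row[0].strip(), row[1:]
        if label == "":
            ids = values  # a row with an empty first cell lists the ids
        else:
            variables.append(label)
            for participant, value in zip(ids, values):
                records.setdefault(participant, {})[label] = value
    header = ["id"] + variables
    body = [
        [participant] + [records[participant].get(name, "NA") for name in variables]
        for participant in records
    ]
    return [header] + body


blank = ["", "", "", "", "", ""]
id_row = ["", "101", "102", "103", "104", "105"]
grid = [
    blank,
    id_row,
    ["sex", "male", "female", "male", "male", "male"],
    blank,
    id_row,
    ["glucose", "134", "120", "124", "83", "105"],
    blank,
    id_row,
    ["insulin", "0.60", "1.18", "1.23", "1.16", "0.73"],
]

for line in blocks_to_rectangle(grid):
    print(line)
```

The first two lines printed are ['id', 'sex', 'glucose', 'insulin'] and ['101', 'male', '134', '0.60'], so each participant now occupies a single row.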