annakrystalli
10/2/2016 - 6:59 AM

Crowdsourcing ideas for rmacroRDM basic data quality check strategy

rmacroRDM issue #10


Various checks are performed in a number of areas of the package, but we could do with being more strategic about them. This should link in with developing tests (#14).

Q: What do you consider the most important elements of ensuring the quality of your data? For example:

Check all that apply, and add your own.

  • handling of missing values
  • checking data types against expectations
  • handling of whitespace and blank lines
  • consistency of variable names throughout file.system
  • consistency of species names throughout file.system
  • identifying typos
  • identifying outliers
  • identifying duplicates
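
To make the discussion concrete, here is a minimal base R sketch of how several of the checks listed above might be bundled into one report. It is only a starting point for ideas: `basic_quality_report()`, its `expected_classes` argument, and the `traits` example data are all hypothetical and not part of rmacroRDM. Outliers are flagged with the 1.5 * IQR boxplot rule, which is just one possible choice.

```r
# Hypothetical helper (not part of rmacroRDM): summarise basic
# data quality issues in a data frame using base R only.
basic_quality_report <- function(df, expected_classes = NULL) {
  chr_cols <- names(df)[vapply(df, is.character, logical(1))]
  num_cols <- names(df)[vapply(df, is.numeric, logical(1))]

  # Missing values per column
  n_missing <- vapply(df, function(x) sum(is.na(x)), integer(1))

  # Character columns containing leading or trailing whitespace
  has_ws <- vapply(df[chr_cols],
                   function(x) any(x != trimws(x), na.rm = TRUE),
                   logical(1))

  # Columns whose class differs from the expected class
  class_mismatch <- character(0)
  if (!is.null(expected_classes)) {
    actual <- vapply(df[names(expected_classes)],
                     function(x) class(x)[1], character(1))
    class_mismatch <- names(expected_classes)[actual != expected_classes]
  }

  # Numeric outliers, flagged with the 1.5 * IQR boxplot rule
  n_outliers <- vapply(df[num_cols],
                       function(x) length(boxplot.stats(x)$out),
                       integer(1))

  list(
    n_missing        = n_missing,
    whitespace_cols  = chr_cols[has_ws],
    class_mismatch   = class_mismatch,
    n_duplicate_rows = sum(duplicated(df)),
    n_outliers       = n_outliers
  )
}

# Made-up example data containing one of each problem
traits <- data.frame(
  species   = c("Parus major", "Parus major ", "Parus major", "Parus major",
                "Turdus merula", "Turdus merula", "Turdus merula"),
  year      = c("2014", "2015", "2016", "2016", "2015", "2016", "2016"),
  body_mass = c(18, 17.5, 18.2, 18.2, NA, 17.9, 180),
  stringsAsFactors = FALSE
)

# Assumed expectations: year should have been read in as an integer
expected <- c(species = "character", year = "integer", body_mass = "numeric")

report <- basic_quality_report(traits, expected)
str(report)
```

A report like this could be run at import, before any merging, so that problems are caught before they propagate. Typos and inconsistent species names would need a reference list to match against, so they are left out of this sketch.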



Q: Which tools in R have you found most effective for these checks, and at what stage of data processing?


Please feel free to fork and share thoughts and ideas!