I've been spending this week in Melbourne at the 2012 eResearch Conference and as we reach the close of day two I thought I would share some of my notes from sessions I've found interesting at the conference so far. I hope you find them interesting too...
Repository as App: Functionality to attract dark data - Prof. Bryan Heidorn
This presentation looked at the long tail of data: the small, non-blockbuster projects that collectively make up the bulk of the data that should be available to people. Small data is big science because it is high volume, information rich and high entropy, but also because the needs of this wide field are not really understood by researchers, support staff or vendors.
Where will you find dark data? Literature, museum specimens, field notes, (un)experimental data sets, citizen observations. A great deal of the dark data we are using now is material whose value is only just being discovered, many years after it was created. For example, journal and diary notes from citizens that mention when flowers bloomed are now being used in comparisons to identify the effects of climate change.
To better manage this issue, the tools and business of science need to change so that they seamlessly support the management and communication of data. We need professional development and training to open up the field.
Creating opportunities for data to move automatically to supported storage in the cloud, or at least to backup systems with built-in sharing functionality, will be vital to avoiding growth in the quantity of dark data. This is all about moving away from a world of computer overlords who control access to functional support, towards understanding how we can recycle data for reuse and embracing a democratization of data.