The Value of Data Management Plans

A major news item from the Digital Infrastructure Summit, held in Ottawa on January 28-29, 2014, was the announcement that Canada’s federal research councils will introduce policy changes over the next 24 months requiring applicants to include data management plans in their funding proposals. This announcement came quickly on the heels of a Fall 2013 consultation conducted by these same councils on Capitalizing on Big Data. In the background material prepared for this consultation, the councils were challenged to adopt “agency-based and focused data stewardship plans” (p. 8), with data management plans (DMPs) seen as an integral part. The push toward this policy change will now likely face some opposition, although momentum currently seems to be with those promoting policies in support of a Canadian data stewardship culture.

Some research councils in other countries already require DMPs. For example, one guideline among the data principles of Research Councils UK (RCUK) specifically encourages its members to develop data management plans:

Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice. Data with acknowledged long-term value should be preserved and remain accessible and usable for future research.

Because these principles are provided as an umbrella framework, each of the seven research councils of RCUK is independently responsible for its own data policies. For example, the Economic and Social Research Council (ESRC) describes its reasons for requiring data management plans as follows:

We believe that a structured approach to data management results in better quality data that is ready to deposit for further sharing.

This single sentence is very revealing about the expected returns on DMPs. To begin, a DMP is seen as contributing structure to the handling of data within a project. This structured approach is believed to result in higher quality data. Furthermore, the data will be better prepared for deposit with an organization that will make them available to others.

On the surface, data management plans appear to be a very straightforward policy tool. They simply lengthen current funding applications by another page or two. However, the purposes they fulfill and the processes they embody will enrich the production and custodial care of research data. The ESRC's anticipation of higher quality data for sharing also implies collaboration with data curation services and with data repositories. Ultimately, a DMP should engage researchers in conversations with those providing such services. In this context, a DMP becomes a document of relationships that should be shared, edited, and monitored by those contributing to a project. From this viewpoint, a DMP functions as a dynamic document of agreements.

To serve the multiple purposes just described, DMPs should be designed for easy digital exchange across a variety of applications. The best way to approach this in today’s complex world of information technology is through a metadata standard describing a data model of the elements that constitute a DMP. CASRAI, a community-based standards body for research administrative information, is well positioned to do this. In fact, the U.K. chapter of CASRAI has already begun work on a set of elements for a DMP data model. In conjunction with this, it would be helpful if the Standards and Interoperability Committee of Research Data Canada developed a basic flowchart representing the interplay of purposes, uses, and relationships expressed in a DMP. Such a flowchart would both inform the CASRAI working group developing specifications for DMPs and help validate the completeness of a DMP data model.
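To make the idea of a DMP data model more concrete, here is a minimal sketch in Python. It is purely illustrative: the element names (`project_title`, `funder`, `contributors`, `standards`, `repository`, `version`) and the `missing_elements` helper are hypothetical and are not drawn from the CASRAI work or from any council's policy. The sketch only shows how a small set of agreed elements could let a plan be exchanged between applications as JSON and checked for gaps.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Contributor:
    """A party to the plan (hypothetical element, for illustration only)."""
    name: str
    role: str  # e.g. "researcher", "data curator", "repository"


@dataclass
class DataManagementPlan:
    """A toy DMP record; every field name here is an assumption."""
    project_title: str
    funder: str
    contributors: List[Contributor] = field(default_factory=list)
    standards: List[str] = field(default_factory=list)
    repository: str = ""
    version: int = 1  # incremented as the plan is edited over time

    def to_json(self) -> str:
        # Serialize to JSON so other applications can read the plan.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "DataManagementPlan":
        # Rebuild the nested Contributor objects from plain dictionaries.
        raw = json.loads(text)
        raw["contributors"] = [
            Contributor(**c) for c in raw.get("contributors", [])
        ]
        return cls(**raw)

    def missing_elements(self) -> List[str]:
        # Report which optional elements have not been filled in yet.
        missing = []
        if not self.contributors:
            missing.append("contributors")
        if not self.standards:
            missing.append("standards")
        if not self.repository:
            missing.append("repository")
        return missing
```

A plan built this way could be passed from a funding application system to a repository's intake system and back without retyping, and the `version` field hints at how a shared, edited, and monitored document might be tracked. A real standard would of course define many more elements and their permitted values.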