Research Data Management Infrastructure III

In earlier entries to this Blog, Research Data Management Infrastructure (RDMI) was defined as the mix of technology, services, and expertise organized locally or globally to support research data activities across the research lifecycle.  The context for RDMI has already been discussed in terms of the research lifecycle and of the two additional components making up research infrastructure: Canada’s high speed research network and high performance computing services.  This essay will address the elements of data infrastructure and how they are organized.

In developing its Cyberinfrastructure program, the U.S. National Science Foundation funded a project to investigate how best to build successful infrastructure.  Coming out of this study was the report, Understanding Infrastructure. The authors establish early in their work the significant connection between social organization and the use of communication technology.  Regarding cyberinfrastructure, they stress that it “is about more than just pipes and machines” (p. 5) and emphasize the importance of social organizational factors in shaping solutions.  They note that in developing cyberinfrastructure, solutions can be social, technical, or a combination.  They feel that the distribution of solutions is central to building infrastructure.  Using the diagram by Millerand, solutions are portrayed as being distributed across two dimensions: technical-social and local-global.

“[C]yberinfrastructure is the set of organizational practices, technical infrastructure and social norms that collectively provide for the smooth operation of scientific work at a distance. All three are objects of design and engineering; a cyberinfrastructure will fail if any one is ignored.” (Understanding Infrastructure, p. 6)

A Textbook Example

Earlier this year I experienced a textbook example of this conceptual model of infrastructure while visiting Bryn Mawr College just as they were changing the way they provide campus wireless services to guests.  When I arrived on campus, I was given a sheet of paper containing the name of the campus wireless service, an account ID and password to log into this service, and a set of instructions for different devices and operating systems.  I was required to obtain a separate account for each device on which I wished to use campus wireless services.

This approach to providing guests with wireless access to the campus network and the Internet falls under the social-local set of solutions in the above infrastructure model.  The procedures were organized around human intervention, i.e., having to find and speak with a person who could provide me with the information sheet, and around social norms requiring me to sign an agreement statement confirming my acceptance of the rules for using their wireless.  The wireless technology, however, was typical industry-standard Wi-Fi.

On the second day of my visit, a new wireless service was launched for guests on their campus: Eduroam.  This is the international service that allows academic guests from university members of Eduroam to gain access to secure wireless networking while visiting another Eduroam site.  Because my home institution is an Eduroam member and can authenticate my credentials through this service, I simply open my wireless device, go to the list of available wireless services where I am, and if Eduroam is among them, I select it.  The system behind the scenes allows the local Eduroam host to verify my credentials with my home institution and to provide me with selective network services on their campus.  For example, if the Library has a license for a database that does not allow guests access, the local implementation of Eduroam can hide this database from my guest access.
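For readers who think in code, the flow just described can be sketched as a toy model in Python.  This is only an illustration under stated assumptions, not how Eduroam is actually implemented: real deployments rely on 802.1X and a hierarchy of RADIUS servers, and never compare plain-text passwords.  The class names, the realm "home.example", and the service names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HomeInstitution:
    """Toy identity provider that knows only its own users (hypothetical)."""
    realm: str
    passwords: dict[str, str] = field(default_factory=dict)

    def verify(self, username: str, password: str) -> bool:
        # Plain-text comparison is for illustration only.
        return self.passwords.get(username) == password


@dataclass
class HostCampus:
    """Toy visited campus that forwards guest logins to the home realm."""
    name: str
    federation: dict[str, HomeInstitution]  # realm -> home institution
    services: set[str] = field(default_factory=set)
    guest_blocked: set[str] = field(default_factory=set)

    def connect(self, identity: str, password: str) -> set[str]:
        """Return the services a guest may use, or an empty set if refused."""
        username, _, realm = identity.partition("@")
        home = self.federation.get(realm)
        # The host never checks the password itself; it asks the home realm.
        if home is None or not home.verify(username, password):
            return set()
        # Local policy decides what an authenticated guest may see.
        return self.services - self.guest_blocked
```

A host campus built with the services {"internet", "licensed-database"} and with "licensed-database" blocked for guests would return only {"internet"} to a visitor whose home institution confirms the login.  The point of the sketch is the division of labour: authentication is settled globally, while access policy remains a local decision.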

This service approach falls under the technology-global set of solutions.  My credentials are validated through my home institution using technology, allowing me to connect to wireless services at a member Eduroam campus without having to go through another person or obtain temporary authentication credentials.  Eduroam has given me easy guest access to wireless services in the United States, Germany, and Canada.  Higher education institutions in over fifty-five nations now support Eduroam.  It truly is a global solution to providing guest access to secure wireless networking.

Cyberinfrastructure and RDMI

How does this particular Cyberinfrastructure (CI) model relate to Research Data Management Infrastructure?  First, the CI model provides a conceptual framework for the definition of RDMI.  The RDMI elements of technology, services, and expertise are part of CI, although not expressed in exactly the same terms.  Applied to RDMI, organizational practices and social norms are aspects of the services supporting data management across the research lifecycle.  Services embody organizational responses to data management.  For example, offering researchers assistance with data management plans requires organizing resources to deliver such a service.  Social norms and expectations are also expressed in services.  A funding agency may require data management plans to get researchers to describe how they will share the data from their project, setting an expectation to share data.  In the context of RDMI, services thus combine the CI characteristics of social norms and organizational practices.

Expertise is another component of CI and RDMI.  Data management activities span the research lifecycle and involve many different skills, drawing upon a variety of expertise.  The demands for data management expertise depend on the scale of the research project.  A small project may involve only a couple of people, who can manage with a general set of skills.  A much larger project may require a team of experts with each team member responsible for a specific specialization.  Expertise also is aligned with responsibilities for data management activities, which were identified as aspects of data stewardship in a previous Blog discussion.

Place is significant in CI and RDMI.  Research is increasingly conducted in collaborative, inter-institutional teams that span nations.  High speed optical research networks are vital for researchers who work at a distance from one another.  Whether working together in real time or asynchronously in different places, the network allows them to organize their workflow so each can contribute.  Similarly, researchers may require access to high performance computing (HPC) but not be located at an HPC site.  Over a research network they may gain access to the computing resources they require.  Distance also comes into play with RDMI.  Data may be gathered in one location, processed at another site, analyzed at yet another place, and preserved in an institution separate from these other locations.  Through a collaborative initiative, such as the Canadian Polar Data Network, an institution may offer preservation services for research data that behind the scenes consist of a distributed dark archive shared among several institutions.  The scope of some research data infrastructure requires global solutions.  One example is the need for infrastructure that will overcome barriers to the free exchange of scientific data across national borders.

The implementations of RDMI will vary from institution to institution but the set of solutions will be distributed locally or globally across technology, services, and expertise.

The next Blog entry will focus on the question:  Who are Canada’s international peers in Research Data Management Infrastructure?

[The views expressed in this Blog are my own and do not necessarily represent those of my institution.]