The RDC Federated Pilot for Data Ingest and Preservation

Posted on: Jan 19, 2015

RDC has been co-ordinating a major effort among organizations active in RDC to develop a pilot that will allow researchers to deposit their research data in repositories housed on Compute Canada storage facilities. The purpose of the pilot is to determine what is required to meet the needs of researchers from disciplines without developed data infrastructure. The organizations active in this RDC Pilot include Compute Canada, CANARIE, Canadian Association of Research Libraries, CANFAR, C-Brain, Scholar’s Portal, SFU Libraries, and University of PEI.

The efforts to mount the pilot have focused on connecting existing infrastructure and services via the CARL Portage library network and Compute Canada, and on augmenting those resources with insights and experiences from disciplinary repositories. On a preliminary basis, SFU has begun to ingest data that will be stored on Compute Canada resources. The SFU approach integrates the Islandora repository platform with Archivematica preservation software to cover both data deposit and archiving for preservation. A complementary solution that integrates Dataverse and Archivematica is under development at Scholar’s Portal. These activities will ramp up significantly over the next several months. At the same time, Compute Canada will use a Globus solution to introduce data replication to multiple Compute Canada sites across the country, and the group will begin discussions about the role of the disciplinary repositories in this network.

Project participants will meet in person on February 11th and 12th to review progress, document lessons learned, and chart next steps. The meeting will place strong emphasis on beginning to model how the pilot solutions could scale to meet widespread needs, and on defining the services and infrastructure that must be in place.

