Feedback on the Tri-Council Draft DM Policy
Submitted by the Research Data Canada Policy Committee
On behalf of the RDC Steering Committee and Stakeholder Community
Research Data Canada is pleased to provide feedback on the Tri-Agencies’ Draft Data Management Policy on behalf of the RDC stakeholder community. The feedback was gathered through discussions of the RDC Policy Committee, and includes comments from an April 13 meeting of university Vice-Presidents Research in Toronto (facilitated by RDC), from RDC’s Policy and Steering Committee meetings, and directly from the RDC stakeholder community.
In all of RDC’s conversations regarding the draft policy, there was general agreement that the direction of the policy is good, and that helping drive towards open science and FAIR (Findable, Accessible, Interoperable, Reusable) data is very positive. There was also recognition that funder policies like this are critical to changing research culture, as demonstrated by similar policy contexts in other jurisdictions.
The most common feedback point was the need for clarity from the federal government, and the Tri-Agencies in particular, around the funding and resources available to respond to the Policy. While some suggest that institutions would not require additional resources to respond to the Policy, the RDC community was clear in its expectation that the response will be resource intensive for many institutions, especially those without an existing data management program. Recognizing that there is a cost to the researcher (e.g. in terms of time) and the institution (e.g. in terms of support resources, especially HQP and systems) would be more beneficial to the launch of the Policy than leaving it out of the document completely. For example, it would be useful to highlight any existing or planned guidelines and funding that will be available to researchers to facilitate a response. Ideally, this would include details about specific grants and amounts eligible for activities like data management or preparation of data for deposit. It is critical for the Tri-Agencies to understand that only a fraction of the funding for research undertaken by Canada’s research institutions comes from the agencies, and that the majority of funders in the Canadian ecosystem have implemented, or are considering implementing, policies similar to this one. A full and impactful response by researchers and their institutions will require an equally full and impactful response from the funder community: this Policy is the ideal opportunity for the Tri-Agencies to play a leadership role.
Another common comment concerned the use of language that would most likely lead to a lack of action, rather than a more substantial and complete response. The word “encourage”, used in a number of sections (e.g. data management plans), should be replaced with stronger language, for example “require”. Another option is to highlight that the eventual goal is to bring the funder policy in line with those of other private and public funders, and that this initial version is intended to facilitate the conversation and move the bar forward without jumping directly to an explicit mandate. There is a disconnect between the use of the term “critical” in a preamble that suggests:
The ability to store, access, reuse and build upon digital research data has become critical to the advancement of science and scholarship, supports innovative solutions to economic and social challenges, and holds tremendous potential for Canada’s productivity, competitiveness and quality of life.
and the subsequent launch of a policy which merely encourages Canada to join international leaders in this context.
To underscore this even more, RDC proposes that the preamble to the policy (and the accompanying material) should be more direct in highlighting the advantages of data sharing, including specific exemplars from the international and scholarly publishing communities. Canada’s recently released draft on Open Government, and in particular the section on Open Science, should also be referenced as an example of Canada’s response. As part of the final Policy, RDC recommends implementing a directory of examples for each of the three main sections, ideally as part of an associated FAQ, and linked to from the Policy document. The policy would also be well served by adding mention of the FAIR rubric, both in the policy itself and as part of the supporting materials. Making a strong scholarly case for a policy of this nature, and including a few examples in the preamble or supporting material, would resonate with the research community.
While we recognize that the policy implementation date cannot be more specific at this stage in the process, we do recommend that the Agencies consider proposing an explicit multi-year timeline for each of the three components as part of the final Policy release. This will help institutions a great deal in formulating their own internal timelines, especially given the substantial effort needed to synchronize this Policy with the various internal research policies. There may also be an opportunity with the launch of the final policy to suggest that the agencies will work with the stakeholder community to develop a framework for obtaining feedback and measuring impacts.
The statement in the Policy preamble that it “does not apply to scholarship, fellowship or Chair holders” is incongruous with the overarching intent of the policy. While RDC recognizes that there may be internal challenges to applying such a policy to all publicly-funded researchers at the same time, we recommend at the very least that this statement be modified to highlight that the Policy is intended to eventually apply to all researchers receiving Tri-Agency funds, although the application may be staged as agencies and departments formulate their response and documentation. This could be done either by inserting the word “current” before “scholarship”, or by being more explicit about the longer-term intention.
RDC proposes that the Policy highlight, in the preamble, that the term Open Science is widely accepted to describe research in all domains, including the social sciences and humanities.
The Institutional Strategy part of the policy was viewed in a positive light, and as a useful precursor to the delivery of services to support the 2nd and 3rd pillars of the Policy. Feedback from the RDC community suggested that the linear nature of the policy and its three pillars leads nicely to institutional preparation and the ability to respond effectively to the policy. This is highlighted in additional comments regarding the use of a more explicit staged timeline for the Policy components.
This section refers to “data as an important research output”, which is also highlighted in Section 3.3 with a focus on research outputs. It would be useful to highlight that the value of data materializes as an input, an interim work product, and an output of the research enterprise, and that the benefits of effective stewardship and sharing of data accrue at each of these stages. Referring to data as an asset in the Policy document would support this goal. A useful way to phrase this concept is that good data stewardship is equally concerned with data assets for, during, and from research.
Given the value of these outputs to the broader Canadian landscape, we recommend that the Tri-Agencies work with stakeholders to create a review process that would bring a sense of national coherence to the approaches reflected in the various Institutional Strategy documents.
Data Management Plans (DMP)
A key comment from the RDC community concerns the gap around compliance with respect to the recommendations in this section. For example:
- How will DMPs be received and tracked in those cases where they are requested as part of the adjudication?
- Who retains copies of the DMPs after submission?
- How will they be tracked in the adjudication process?
As with other key components, it would be useful to have additional language around this issue, or at the very least, an indication that these important details would be developed prior to the final policy’s release. Providing a sense of minimal expectations (see the comment later regarding a Data Availability Statement) would help researchers respond to the Policy. This is also an important example for showing how a staged implementation of the key Policy elements could work.
This section is likely to receive the greatest amount of feedback from the research community, given the perception of DMPs as complex and lengthy. DMPs can be seen by researchers as unnecessary and laborious, despite the clear advantage of having a plan in place for any research program. One possible approach would be to highlight that the preparation of a DMP is a flexible and dynamic conversation that typically happens at the level of the institution, and helps ensure that the appropriate resources are available to researchers. Recent efforts to craft dynamic DMPs that reflect a full life cycle approach, capturing the most critical information initially and evolving into a more detailed DMP, can help ease the perceived burden of preparing a detailed DMP.
The following text from this section should be reworded to indicate that work done in response to an Agency request for a DMP will be considered as part of the adjudication process. Given the effort required to prepare a DMP on request, knowing that it will be used in the adjudication process is critical to acceptance by the researcher. It also recognizes the institutional effort required to support the researcher in preparing a DMP for adjudication.
For specific funding opportunities, the agencies may require DMPs to be submitted to the appropriate agency at time of application; in these cases, the DMPs may be considered in the adjudication process.
This section also reinforces previous comments that the overuse of language such as “may” is generally not useful in a document of this nature.
RDC recommends that this section be modified to require researchers to create a Data Availability Statement as a minimal approach to a full DMP. This mirrors the approach in the Data Deposit section regarding deposit and open data. A Data Availability Statement is a critical part of, or precursor to, the more detailed DMP exercise. It can also lead naturally into the Data Deposit pillar, encouraging a conversation about data sharing, and allowing the researcher to make an informed decision about whether or not to openly share their data. Creating a Data Availability Statement is a simple requirement to respond to, and is increasingly required by publishers before a manuscript will be considered.
A very positive aspect of the Data Deposit section is the requirement
to deposit into a recognized digital repository all digital research data, metadata and code that directly support the research conclusions in journal publications, pre-prints, and other research outputs that arise from agency-supported research.
This section should add language stating that the digital research data, metadata and code that are deposited in a repository should reflect a complete and appropriate collection of material (e.g. algorithms, specifications, methodological guides, data dictionaries, etc.), not just the tables or data summary from the research conclusions. A clear definition of what constitutes data would be beneficial in this section, or in the accompanying material. In most cases, the more detailed datasets and material are what is of interest to the broader research community.
Given that the repository community is very much in a dynamic stage of development, it would be beneficial to provide more detail on what a recognized repository is, and to expand on how a researcher would respond to this part of the policy, most likely in the accompanying material. Resources like Recommended Data Repositories and FAIRsharing.org are good examples of guides for where to deposit.
Since this part of the Policy is a requirement, the Policy should also include an indication that compliance will be reviewed in some way. The Policy need not outline exactly how that will be done, but should indicate that an appropriate and meaningful process will be established in consultation with the research community.
Earlier comments regarding the cost of responding to the draft Policy are especially relevant to this section. Any activity that redirects research funds will be seen as a negative, so highlighting that there is new support for this will help promote the goal of data deposit.
This section would also benefit from additional language referring to the growing support from publishers and domains of practice to facilitate (and in some cases require) data deposit as part of the publication process, as well as from promotion and tenure committees for the institutional context. This would be especially important in highlighting that the Policy is a reflection of emerging standard practice in scholarly communication.
Given the emerging framework of support agencies and services in Canada, it is highly recommended that the Tri-Agencies find a way to highlight those that are recommended to institutions and researchers as the Policy is implemented. There is a wealth of excellent material from the international community in particular, which has been developing a response to similar policy frameworks and mandates for close to a decade. RDC will soon launch the RDC-DRC Best Practices Designation, which will serve to highlight these resources using a flexible framework that will facilitate integration into other efforts. We also understand that the dynamic nature of the Canadian landscape presents challenges in how this is kept up to date and presented to the community. We recommend that the Tri-Agencies continue to work closely with organizations like RDC, CARL Portage, and others, to maintain a guide to these resources, and in such a way that they are linked to, and a key part of, the official Policy documents.