Research Data Management Support in the Humanities: Challenges and Recommendations

Date

2023-01-10

Authors

Higgins, Stefan
Goddard, Lisa
Khair, Shahira

Abstract

This report summarizes proceedings from Research Data Management for Digitally-Curious Humanists, a virtual event sponsored by the Social Sciences and Humanities Research Council (SSHRC) on Research Data Management Capacity Building. This event was held as a Digital Humanities Summer Institute 2021 aligned conference, and was led by the University of Victoria Libraries and the Electronic Textual Cultures Lab on June 14th, 2021. The program, presentations, and related resources are openly available on the project site: https://osf.io/6vepj/wiki/home/

The following recommendations reflect conversations with humanist researchers and students before, during, and after the Connections event, based on pre- and post-surveys of attendees, and on presentations and discussion during the event. We have included the recommendations at the beginning of this report for easy reference, but please read further sections for much more detail.

1) Many humanists are uncertain about what constitutes “data” in the context of their research projects. Better guidance on defining research data must be developed in consultation with both digital and non-digital humanists from a variety of disciplines.
2) As RDM policies become more mature, it is imperative to spend time examining edge cases, including analogue scholarship and fine arts research processes. Directed effort must be made to engage researchers who identify their research as “not fitting into current data management policy” rather than focussing on successful, masthead DH projects, which generally already have institutional support, funding, and technical capabilities. This recommendation can be summarized as looking at the boundaries and edge cases of policies, as well as the centre.
3) Humanist researchers are not necessarily convinced about the relevance or value of “data” in their disciplines. This fosters a reluctance to engage with data management planning, and a tendency to see RDM as a bureaucratic burden. Clear examples of the value of high quality, sustainable, reusable humanities data sets are necessary to convince humanist researchers of the importance of RDM work.
4) Humanist researchers continue to feel that they need support for data management at all stages of the research process: on conceptual and theoretical approaches to data; on guidance for meeting new Tri-Agencies funding requirements; on making choices about data infrastructure; on defining appropriate metadata frameworks; on capturing and recording metadata according to standards; and on how to ensure their research does not change in kind in order to meet data policy.
5) Humanist researchers would like to receive funding increases that reflect the additional cost of research data management to projects, including the need to hire and train team members who can oversee the design and creation of data and metadata to ensure that practice aligns with the data management plan.
6) Humanists require RDM support and training sessions over the course of their whole careers, and not simply when they are ready to apply for funding. Ideally, data management concepts and basic skills will be developed at the undergraduate and graduate level. Asking researchers to try to absorb and apply all of this information at the point of grant application is likely to generate frustration and shallow engagement, as material becomes outdated or forgotten over the award timespans.
7) Many senior humanities researchers and instructors do not feel that they have enough RDM knowledge to confidently teach the necessary concepts and tools. A great deal of RDM instruction is aimed at experienced researchers, but it is also necessary to develop instructional resources that are aimed at undergraduate and graduate audiences. Ideally these instructional materials will include asynchronous options, and hands-on learning exercises that can be evaluated in a for-credit context.
8) Humanist data are extremely diverse. Most data are not highly structured or machine-generated, and a significant amount of what might be considered data is not digital. For funding bodies, institutions, and humanities researchers, one central task of research data management will be developing infrastructures that achieve a measure of standardization that supports widespread access, while ensuring researchers do not lose the ability to critically engage with different theories, methods, and practices of categorization in their own work. Research software, publishing platforms, and data repositories need to be flexible enough to support humanist research objects and processes without unduly constraining them.
9) Avoid applying over-standardized solutions for diverse research across different disciplines and fields. Although some measure of standardization is necessary for any RDM work at scale, an overemphasis on standardization risks conflating and confusing different types of research and their needs.
10) Research data management and digital research infrastructure (DRI) are closely connected. Ideally the Tri-Agencies will work closely with the Digital Research Alliance of Canada (the Alliance) to ensure that digital infrastructure and research software are developed in ways that incorporate RDM principles, and facilitate the production of good data and metadata that can easily be exported for ingest into repositories that will provide long-term access and preservation.
11) Platform, software, and tool choices will significantly affect the way in which project data are organized, described, and accessed. In order to produce good data, humanists will need expert guidance on the way in which their technology and tool choices will impact their ability to export data and related metadata for deposit into repositories. This is closely related to the kinds of general data research infrastructure needs that humanists have articulated in several of the 2020 NDRIO white papers. There is a critical need for improved access to research tools and infrastructure, but technology alone cannot fully address researcher needs. Human experts who can provide support and guidance are equally important.
12) Humanist researchers continue to struggle with project sustainability, but are often loath to divorce back-end data from its front-end context for the purposes of preservation. One way to address this is to include more contextual and interpretive information in metadata, and to design projects from inception so that data can stand alone, outside the context of the user interface. Not only will this produce more reusable data, but it will help a great deal with the problem of project preservation. Humanist researchers require much more theoretical and practical training on metadata creation, ideally beginning at the undergraduate and graduate level.
13) Data work will not always be perfect from the beginning, and so data as practice involves a willingness to experiment, and to be prepared for the changes that projects undergo and the contingencies they may encounter. It is extremely unlikely that humanist researchers will be able to create accurate and detailed plans at the application stage. Data management planning tools must support the evolving nature of data management plans with document versioning, alerts to remind researchers to revisit data plans periodically, and authorization tools that can accommodate changing team membership.
14) The Tri-Agencies should clearly articulate how DMPs will be evaluated during the application review process. Given the lack of DMP expertise among many humanists, it is imperative that clear direction to reviewers is provided about how to evaluate the DMP component of an application. There is some risk that SSHRC reviewers who do not accept the importance of “data” in their disciplines will not place weight on data management as a criterion for evaluation.
15) The Tri-Agencies should clearly articulate the oversight process and reporting requirements related to Data Management Plans. Without some kind of formal follow-up, there is a strong chance that DMPs created at the point of funding application will never again be consulted, updated, or put into practice.
16) Humanist researchers strongly agree that research projects involving human subjects must prioritize consultation with communities of practice. Ethical concerns must trump data-sharing benefits in all cases. Indigenous data is out of the scope of the current policy, which is appropriate, but funding for community-designed and -owned solutions is also necessary so that Indigenous people are able to control, access, and use their data over time.

Description

The program, presentations, and related resources from the 2021 event Research Data Management for Digitally-Curious Humanists are openly available on the project site: https://osf.io/6vepj/wiki/home/

Keywords

humanities, research data, Research Data Management (RDM)

Citation