Every two years, the ACRL Research Planning and Review Committee publishes an article in College & Research Libraries News on the top trends and issues affecting academic libraries and the changes our institutions are experiencing. We will be highlighting some of these trends in a series of blog posts over the next few weeks.
Research data management (or RDM) refers to the organization, storage, preservation, and sharing of data collected and used in a research project. Many academic libraries offer research data services (RDS) that cover all aspects of the data management lifecycle (figure 1). Examples include assisting researchers with data management plans, using file naming conventions, and exploring data repository options. Specific issues may cover the type of data collected and its format, backup policies for the storage of data, accessing and sharing data, and the issues of privacy, consent, intellectual property, and security that pervade RDM.
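As an illustration of what a file naming convention can look like in practice, here is a minimal Python sketch. The pattern itself (project, description, collection date, and version joined by underscores) and the function names `slug`, `build_name`, and `follows_convention` are assumptions made for this example, not a standard prescribed by any of the organizations mentioned in this post. A research team would agree on its own convention and adapt the pattern accordingly.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
# Words inside a component are joined by hyphens; components by underscores.
NAME_PATTERN = re.compile(
    r"[a-z0-9]+(?:-[a-z0-9]+)*"    # project
    r"_[a-z0-9]+(?:-[a-z0-9]+)*"   # description
    r"_\d{8}"                      # collection date, YYYYMMDD
    r"_v\d{2}"                     # two-digit version
    r"\.[a-z0-9]+"                 # file extension
)


def slug(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_name(project: str, description: str, collected: date,
               version: int, ext: str) -> str:
    """Assemble a file name that follows the hypothetical convention above."""
    extension = ext.lower().lstrip(".")
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected:%Y%m%d}_v{version:02d}.{extension}")


def follows_convention(filename: str) -> bool:
    """Report whether an existing file name matches the convention."""
    return NAME_PATTERN.fullmatch(filename) is not None


if __name__ == "__main__":
    name = build_name("Soil Survey", "pH readings", date(2020, 3, 5), 2, "CSV")
    print(name, follows_convention(name))
    print(follows_convention("final data (2).xlsx"))
```

Putting the date in year-month-day order means files sort chronologically in a directory listing, and avoiding spaces and special characters keeps names portable across operating systems and repository platforms.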
Research data management is important for several reasons. First, data is a scholarly product, yet it is a fragile commodity that is easily lost. Second, an investment in RDM saves time and resources over the course of the data lifecycle. Third, good management increases the quality and accessibility of the data, which supports valid reproduction and replication of results. Last, making data easier to share allows other researchers to make valuable discoveries.
Recent trends have focused on these themes: awareness and education initiatives, implementation of data standards, and training the next generation of librarians and data specialists.
Awareness and Education Initiatives
Governments and organizations are coordinating efforts toward open access, open data, and open science. Examples include the Canadian Government (Canada’s 2018-2020 National Action Plan on Open Government), the European platform OpenAIRE, and the National Academies of Sciences, Engineering, and Medicine (Open Science by Design). All are striving to shift scholarly communication toward transparency and openness in communicating and monitoring research; specifically, “coordinate open science and research data efforts, to align science with societal values and strategically plan for public access of data” (ACRL Research Planning and Review Committee).
The State of Open Data Report 2019 by Digital Science discusses the trend for adopting and accepting open data, with the point that “the research community is now demanding more enforcement of the mandates that have been adopted by many governments, funders, publishers and institutions around the world.” The report even states that “the majority of researchers want funding withheld and penalties for a lack of data sharing.”
Open Science by Design: Realizing a Vision for 21st Century Research is a report published in 2018 by the National Academies of Sciences, Engineering, and Medicine. As the report says: “Open science aims to ensure the free availability and usability of scholarly publications, the data that result from scholarly research, and the methodologies, including code or algorithms, that were used to generate those data.” The benefits of open science can include new standards that support reliability, reproduction, and replication; addressing new questions from multiple perspectives and expanding interdisciplinary collaboration; disseminating knowledge quickly and inclusively; and making publicly funded research available to everyone.
Implementation of Data Standards
The maturation of guidelines illustrates how data standards are permeating the scholarly communication process. The FAIR data principles (findability, accessibility, interoperability, and reusability) were published in 2016 (Wilkinson et al.), are promoted by initiatives such as GO FAIR, and have been widely adopted in research data management. Even so, most researchers are still unaware of the FAIR principles, although this is changing slowly.
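To make the four FAIR principles more concrete, the following Python sketch checks a dataset's descriptive record for a few fields loosely associated with each principle. This is a deliberately simplified toy checklist, not an official FAIR assessment: the mapping of fields to principles, the field names (`identifier`, `access_url`, `file_format`, `license`, and so on), and the function name `fair_gaps` are all assumptions made for illustration.

```python
# Hypothetical mapping of metadata fields to FAIR principles.
# A real assessment covers far more than the presence of a few fields.
CHECKS = {
    "findable": ("identifier", "title"),
    "accessible": ("access_url",),
    "interoperable": ("file_format",),
    "reusable": ("license", "description"),
}


def fair_gaps(record: dict) -> dict:
    """Return, for each principle, the fields that are missing or empty."""
    gaps = {}
    for principle, fields in CHECKS.items():
        missing = [
            field for field in fields
            if not str(record.get(field) or "").strip()
        ]
        if missing:
            gaps[principle] = missing
    return gaps


if __name__ == "__main__":
    # Placeholder record; the identifier is not a real DOI.
    record = {
        "identifier": "doi:10.0000/placeholder",
        "title": "Soil survey pH readings",
        "access_url": "",
        "file_format": "csv",
    }
    for principle, missing in fair_gaps(record).items():
        print(f"{principle}: missing {', '.join(missing)}")
```

A checklist like this can only flag absent metadata. Whether an identifier is truly persistent or a license truly permits reuse still calls for a curator's judgment.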
Solutions to this awareness gap include inter-institutional collaboration with a focus on networks for data curation. Organizations such as the Data Curation Network have developed workflows and checklist resources to educate librarians and establish best practices. These resources were developed by data curators, data management experts, data repository administrators, disciplinary subject experts, and scholars, many with professional librarian training. National initiatives such as the Canadian Data Curation Forum, held in 2019, are designing a national data curation network. Workshops at the forum included ethical data management and curating data for reproducibility. Internationally, CODATA, the Committee on Data of the International Science Council (ISC), coordinates a wide variety of initiatives, task groups, and working groups.
Responsible RDS is maturing, but adoption remains slow. Barriers to developing RDS at academic institutions include long-term financing, a shortage of qualified staff and specialists, the wide range of data science skills needed across the institution, and general researcher indifference. An examination of 114 Association of Research Libraries (ARL) institutions by the Data Curation Network found that while 44% had an established data repository, very few of their websites had information about data curation support.
Training Next Gen Librarians and Data Specialists
Beyond the broad institutional efforts described above, professional library schools have introduced training of their own. Data science courses have expanded the LIS curriculum in the last two years, and full-fledged online courses have been introduced. The Research Data Management Librarian Academy (RDMLA) started in 2018 and offers a complete overview of RDM best practices.
The course was developed by a team of librarians and LIS faculty at several U.S. universities and the publisher Elsevier in order to promote RDM best practices. Modules include navigating research culture, advocating for RDM in libraries, launching data services, project management and assessment, data analysis, data visualization, coding tools, and platform tools. The curriculum will expand in fall 2020 with two new modules: “Data Copyright, Licensing, and Privacy,” and “Delivering Data Management Training: A Guide to DataONE.” In late 2021, a Chinese version of the Academy will be released.
Even for trained data librarians, gaps remain, as the National Library of Medicine determined in a 2019 workshop. The resulting report, Developing the Librarian Workforce for Data Science and Open Science, identified seven skill categories in which data librarians need to improve, including “data skills, computational skills, research and subject matter knowledge, traditional library skills, skills for developing programs and services, interpersonal skills, and skills for lifelong learning.” It is a given that one data librarian cannot master all of these skill sets.
Uneven open access, skill shortages, knowledge gaps among practitioners, and still-maturing best practices and standards all impede progress. Even so, the overall trend is optimistic: there is a growing understanding of how vital Research Data Services are to the scientific enterprise. Researchers, data managers, and librarians are managing data resources more effectively, and cooperation and collaboration among institutions, organizations, and networks is increasingly seen as an integral part of research data management.
The Catholic University of America offers a Master’s degree and a certificate in data analytics in the School of Engineering and courses in data science in the Department of Library and Information Science. These courses have appeared in the last couple of years to address the data skills gap in the workforce. Students, faculty, and researchers interested in discussing Research Data Services for their projects can check the library’s Digital Scholarship website for additional information.
Cox, A. M., M. A. Kennan, L. Lyon, S. Pinfield, and L. Sbaffi. 2019. “Maturing Research Data Services and the Transformation of Academic Libraries.” Journal of Documentation. https://www.emerald.com/insight/content/doi/10.1108/JD-12-2018-0211/full/pdf?title=maturing-research-data-services-and-the-transformation-of-academic-libraries
GO FAIR. “FAIR Principles.” https://www.go-fair.org/fair-principles/
Federer, Lisa, Sarah C. Clarke, and Maryam Zaringhalam. 2020. “Developing the Librarian Workforce for Data Science and Open Science.” January 16, 2020. https://doi.org/10.31219/osf.io/uycax.
Johnston, Lisa R., and Liza Coburn. 2020. “Data Sharing Readiness in Academic Institutions.” Data Curation Network. https://datacurationnetwork.org/data-sharing-readiness-in-academic-institutions
National Academies of Sciences, Engineering, and Medicine. 2018. Open Science by Design: Realizing a Vision for 21st Century Research. Washington, DC: The National Academies Press. https://doi.org/10.17226/25116.
Wilkinson, Mark D., et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3. https://doi.org/10.1038/sdata.2016.18