Report on Enhancing Services to Preserve New Forms of Scholarship: Changes in Scholarly Publishing

Changes in Scholarly Publishing

The proliferation of digital tools in scholarly publishing over the past 25 years has created a challenge for the preservation of scholarship. The preservation of print materials has traditionally been the responsibility of libraries, which, often under consortial, national, and regional agreements, steward materials that they have acquired. Digital materials, which are sometimes acquired outright and sometimes licensed to libraries, have occasioned the development of new organizations and structures to support the ongoing preservation of the scholarly record. In addition to national libraries, CLOCKSS, Portico, and HathiTrust have been among the largest providers of digital preservation services for scholarship. In order to operate at scale, these organizations have developed processes for preserving material in common digital formats: PDF, XML, EPUB, HTML, and others. This network appears successful at preserving publications in those forms that hew closely to the conventions of scholarly publishing: linear, text- and image-centric works that could be expressed well in printed formats. These are the vast majority of journal articles and books, many of which are published simultaneously in print and digitally.

The amount of scholarship published using digital technologies beyond these conventions has increased steadily over the past two decades. Beginning with the hypertext and interactive web-based scholarship of the 1990s, scholars and publishers have expanded the range of forms that scholarship can take. But whereas major preservation service providers now have clear and scalable processes for text-based journal articles and books expressed in XML, PDF, or EPUB, they do not currently have reliable, scalable processes to preserve publications that have been further enhanced with multimedia or interactive features, or that encourage web-based, non-linear navigation. Even web-archiving–based workflows, such as that used by LOCKSS, often fail to preserve many features when they diverge from expected patterns.

Because print books published by university presses and other scholarly publishers were so widely preserved by libraries in the twentieth century, scholars and other library patrons have an expectation that books will be preserved, i.e., that future researchers will have access to them. Digital preservation services have emerged to meet this expectation for digital publications. Preserving the scholarly record is now a collaborative activity among publishers, libraries, and these preservation services. Publishers and content aggregators use standards in creating digital objects, and then engage preservation services to safeguard them and make them available in the event that the publication becomes unavailable in the future. Libraries preserve digital outputs that they have stewardship over, such as scholarship in their institutional repository and digital projects and web sites sponsored by the library or its parent institution. This collaboration is currently constrained: first, by shrinking library and publisher budgets, and second, by the continued expansion of the forms and features that researchers and publishers give their publications. The expectation of preservation remains, or at least it remains unexamined. As more research outputs make use of complex web technologies, third-party software, and remote resources, the gap between expectations and reality widens.

Enhancing Services to Preserve New Forms of Scholarship, led by New York University Libraries in collaboration with Portico, CLOCKSS, Michigan Publishing, the University of Minnesota Press, Stanford University Press, UBC Press, and NYU Press, aimed to narrow the gap between reality and expectations for the preservation of digital scholarship. This goal was met in two ways: first, Portico and CLOCKSS conducted exploratory preservation processes for a wide range of complex digital projects from university presses in order to determine what could be preserved and how scalable the process might be; and second, we developed guidelines that publishers, authors, preservation services, and platform developers can use to improve the preservability of complex digital publications. This report situates the project and describes our methods and findings. The guidelines have been published as a companion document to this report and are available on a sortable website at https://preservingnewforms.dlib.nyu.edu.

New questions

The ongoing development of digital technologies, and the concurrent development of scholarly research methods, will continue to pose challenges for preservation. To an extent, this is an inherent feature of digital scholarship. As new tools for conveying ideas become available, and as scholars find new ways to build on and incorporate existing digital resources, the tools that libraries and preservation services have built will never fully accommodate all elements of digital scholarship. In order to develop services and workflows that will meaningfully preserve scholarship for the future, several theoretical questions require consideration.

In the print context, research outputs are generally ontologically well-defined and stable. This allows for stable services for preservation; it also makes the process of defining the work itself easier. In contrast, digital research outputs sometimes make definition of the work difficult. If an author has embedded a video from YouTube in the body of the text, is that video necessarily part of the work, or could it sometimes be considered supplementary? What if the YouTube-hosted video was created for the project? Conversely, when might digital resources marked as supplementary in fact be necessary for some future users’[1] understanding of the author’s argument? In short, how do we define the extent of the work that is to be preserved?

This might also be viewed in terms of the scholarly record: When research outputs are well-defined and uniform, scholarly communities can more easily define the scope of outputs that form the core of scholarly communication. As output formats proliferate, and boundaries between disciplines and between audiences blur, libraries and preservation services can no longer take long-standing assumptions about the scholarly record for granted. When do data sets, visualizations, audiovisual clips, and other interactive features become part of the corpus necessary for maintaining academic communities over time? And what about user-generated annotations or contributions?

Even when questions about objects of preservation can be answered, organizations that publish, collect, or preserve scholarly work often cannot provide reliable stewardship over new forms of scholarship. One premise of Enhancing Services to Preserve New Forms of Scholarship was that libraries, publishers, and preservation services must work together more than they have in the past in order to preserve complex digital publications. This premise tacitly acknowledges that in the past, each of these players was able to fulfill its goals with only a modest level of ongoing collaboration. Libraries asked publishers to use materials and production methods that would aid in preservation and conservation, but publishers themselves needed little knowledge about preservation in their usual workflow. As publishers now create publications with digital features that vary in complexity and technology, they need to work more closely with preservation services and libraries to make their products preservable. How can publishers, libraries, and preservation services adapt to work together?

Questions of scale were crucial to this project. While we were interested in improving digital preservation for a range of preservation methods and services, our primary concern was with scalable processes such as those with which Portico and CLOCKSS operate. If individual publications require days of software development in order to be harvested by a web harvester or converted to a sustainable format, preservation will only be feasible for well-funded projects. Services such as Portico and CLOCKSS, whose business models require scale, may not be able to process such publications. Still, the more complex a digital publication, the more labor it is likely to require to preserve; and the project sought to improve preservation services for those publications as well. So, we asked, what compromises must publishers or preservation services make to preserve complex publications at scale? Are there solutions that satisfy both the publisher and the needs of preservation services?

Participants knew from the start that preserving some of the more complex digital publications from university presses was a significant challenge. There are limits to what can be preserved, especially in a world where new applications, protocols, and standards are constantly being introduced. But we didn’t know what those limits were. So we set out to understand what is possible under the current business and technological constraints.


  1. Because those who interact with enhanced digital scholarship do much more than read, we use the term “user” rather than “reader” throughout this report.

CC BY 4.0