Memory Without Origin: Why Research Libraries and Archives Need Governance Infrastructure for the AI Training Era
Article
orcid.org/0000-0001-5043-7575
This essay presents the rationale for the development of the UVA Archival AI Protocol, a governance framework for evaluating AI training requests directed at archival and special collections. AI foundation model training permanently eliminates an institution's ability to govern how its collections are used unless contractual protections are established in advance. The essay argues that such requests constitute stewardship decisions, not access decisions, and require a correspondingly higher standard of institutional justification. It distinguishes between retrieval-based systems, which preserve provenance and permit removal, and current foundation model training, which forecloses both. Drawing on the SAA Core Values, the SAA Code of Ethics, and the CARE Principles for Indigenous Data Governance, the essay examines the obligations institutions hold to donors, communities, and records creators whose materials were entrusted under conditions that predate and cannot anticipate commercial AI training. It describes the specific risks posed by digitization-for-training exchanges and unconstrained downstream use clauses, and argues that shared professional standards reduce the information asymmetry institutions currently face at the negotiating table. Without provenance conditions established before training occurs, the predictable result is knowledge that circulates without traceable origins, a condition the archival profession exists to prevent.
AI governance, archival ethics, provenance, foundation model training, cultural collections, stewardship, donor obligations, Indigenous data governance, digitization, UVA Archival AI Protocol
University of Virginia
April 02, 2026
This essay accompanies the UVA Archival AI Protocol (https://doi.org/10.18130/5dqf-9w86) and Adoption Kit (https://doi.org/10.18130/jbeg-a995), deposited separately in this repository. All materials are available under CC BY 4.0.