Bag Names
Root directories for bags are named using a combination of the institutional ID, as determined by the institutional profile inside APTrust, and the unique identifier of the item to be preserved. Dots in the bag root name serve as delimiters between the name parts described above, and any dots or other special characters normally found in either the institution ID or the item unique ID should be truncated or converted to dashes or underscores. Multipart bag names must end with ‘b###.of###’, where the first ### is the number of that bag in the sequence and the second ### is the total bag count. Bag count sequences begin at 001.

For example, if the University of Virginia has the institutional code ‘virginia.edu’ and is creating a bag for an item with the unique ID ‘uva-lib:1229365’, then the bag root directory should be named ‘virginia.edu.uva-lib_1229365’. If this were a 200-part multipart bag, the first bag root directory would be named ‘virginia.edu.uva-lib_1229365.b001.of200’, the second ‘virginia.edu.uva-lib_1229365.b002.of200’, and the last ‘virginia.edu.uva-lib_1229365.b200.of200’. When tarred, these carry the .tar extension, for example ‘virginia.edu.uva-lib_1229365.b016.of200.tar’.

We enforce bag naming conventions because we untar bags in a staging area to validate their contents, and we don't want bags untarring to the same directory and overwriting each other.
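The naming rule above can be sketched in a few lines of Python. This is only an illustration, not an APTrust tool: the function names `sanitize_item_uid` and `bag_root_name` are hypothetical, the choice of underscore as the replacement character follows the ‘uva-lib_1229365’ example, and the institution ID is passed through unchanged because the example keeps ‘virginia.edu’ intact. The three-digit limit on the bag count is likewise an assumption drawn from the ‘###’ pattern.

```python
import re


def sanitize_item_uid(item_uid: str) -> str:
    """Replace every character other than letters, digits, dash, and
    underscore with an underscore (assumed replacement character)."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", item_uid)


def bag_root_name(institution_id, item_uid, bag_number=None, bag_count=None):
    """Build a bag root directory name of the form
    institution_id.item_uid[.b###.of###].

    The institution ID is used as-is, mirroring the 'virginia.edu' example.
    """
    name = f"{institution_id}.{sanitize_item_uid(item_uid)}"

    # Single-part bag: no sequence suffix.
    if bag_number is None and bag_count is None:
        return name

    # Multipart bag: both values are needed, and sequences start at 001.
    if bag_number is None or bag_count is None:
        raise ValueError("bag_number and bag_count must be given together")
    if not 1 <= bag_number <= bag_count <= 999:
        raise ValueError("expected 1 <= bag_number <= bag_count <= 999")

    return f"{name}.b{bag_number:03d}.of{bag_count:03d}"


if __name__ == "__main__":
    print(bag_root_name("virginia.edu", "uva-lib:1229365"))
    print(bag_root_name("virginia.edu", "uva-lib:1229365", 16, 200) + ".tar")
```

Run as a script, this prints the single-part name and the tarred name of bag 16 of 200 from the example above.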
File and Directory Names
File and folder names must follow POSIX conventions:
Bag Structure
Bags must have the following structure. Items in bold are required. Others are optional. Additional notes appear below. Note the new rules on manifests!

<institution_id.item_uid[.b###.of###]>/
    |   aptrust-info.txt
    |   bag-info.txt
    |   bagit.txt
    |   manifest-md5.txt and/or manifest-sha256.txt
    |   tagmanifest-md5.txt
    |   tagmanifest-sha256.txt
    |   [custom tag files]
    \----data/
        |   [payload files]
    \----[custom_tag_dir]/
        |   [custom tag files]
    \----[custom_tag_dir]/
        |   [custom tag files]

Manifests
Before March 29, 2016, APTrust accepted and verified only the manifest-md5.txt file. We now accept either manifest-md5.txt or manifest-sha256.txt. If you supply both, we will validate both.

Tag Manifests
Tag manifests are optional, but we will validate any that are present. We will accept tag files not listed in the tag manifests, though we cannot validate their checksums.

Custom Tag Files
As of March 29, 2016, we preserve all tag files except bagit.txt, which will be recreated when you restore a bag. Custom tag files may be in any format, including binary. We will not try to parse them, but we will validate their checksums if they are listed in the tag manifests.

Required Tag Files

bagit.txt
This is required by the BagIt specification, and should contain the following:

    BagIt-Version: 0.97
    Tag-File-Character-Encoding: UTF-8

bag-info.txt
Valid APTrust bags MUST contain a bag-info.txt file with the following fields, which may be blank:
aptrust-info.txt
This file MUST be present and MUST contain the following tag fields.
Bag Serialization
Bags serialized for use by APTrust MUST use TAR as their serialization format, MUST NOT use compression, MUST follow the file and folder naming restrictions, and MUST end with the .tar extension.

Bag Size
Initially, bags sent to APTrust should be limited to 250 GB for the final tarred bag. Space available for temporary file processing puts a practical limit on total bag sizes in APTrust. We expect this limit to grow over time, but the initial performance data will help determine the final limits for the service.

Quick Checklist
Valid bags meet all of the following criteria: