File Format Flair for the Holidays! Updates to the Sustainability of Digital Formats
Today’s guest post is from Kate Murray, Liz Caringola, Genevieve Havemeyer-King and Liz Holdzkom of the Digital Collections Management & Services Division at the Library of Congress.
Happy new and updated FDDs (format description documents) to all who celebrate! It’s time for one of our favorite holiday traditions – an update about recent file format research and documentation from your friends (elves?) behind the Sustainability of Digital Formats! This is the ninth entry in the File Format Friends series, which began in December 2021 and has included semiannual updates since.
New and Updated Format Analysis
Since our last update in June 2025, we’ve been in a bit of a gap-filling mode. We’ve revisited some existing areas of focus to complete, augment or update essential entries of interest to the Library of Congress. These efforts support strategic planning regarding digital content formats, and ensure the long-term preservation of digital content.
One format group is related to the Material Exchange Format (MXF). MXF has a long history at the Library as one of the preferred formats for file-based moving images. It’s also been a focus for the Federal Agencies Digital Guidelines Initiative (FADGI) Audiovisual Working Group, with sponsorship of the SMPTE MXF Archive and Preservation Format Registered Disclosure Document (RDD 48) and SMPTE RDD 48:2018 Amendment 1:2022 (which maps FFV1 to the MXF container), as well as the free open source application embARC (metadata embedded for archival content). embARC enables users to manage (i.e., audit, validate and correct) embedded metadata in Digital Picture Exchange (DPX) files, either as individual files or as an entire DPX sequence, and in RDD 48 compliant MXF files, including FFV1 in MXF, all without impacting the image data.
To help augment our MXF OP1a (Operational Pattern 1a) set, we’ve added three entries: MXF_OP1a_UNC (MXF File, OP1a, Uncompressed Images in Generic Container); MXF_OP1a_JP2_LSY (MXF File, OP1a, Lossy JPEG 2000 in Generic Container); and MXF_OP1a_JP2_LL (MXF File, OP1a, Lossless JPEG 2000 in Generic Container).
This also might be a good time to remind our dear readers that we don’t aim to be comprehensive or to cover every possible variation of a format. As we describe in Fun with File Formats, “It’s a bit of a complicated decision [what we research] but the gist is that we focus on formats that are of interest to The Library of Congress because we have them in our collections or are getting them in or sometimes we see a newish format on the horizon and want to get ahead of the game. We also prioritize formats listed in the Recommended Formats Statement (RFS) because if we have a preferred or acceptable format, we want to be informed about it.” So we know that there are many other variants of MXF out there in the format-verse, but we are scoping our work to a subset of them.
It’s a similar situation with a recent significant update to PNG (Portable Network Graphics). While the original specification dates from 1996, the W3C recommendation for Version 3 was released on 24 June 2025, so we decided to take another look at our PNG entry (FDD153) and make some changes to reflect the new version. Because we know the suspense is killing you, for a list of publication dates and changes between Version 2 and Version 3, see the Third Edition Change List. For changes between Version 1 and Version 2, see the Second Edition Change List. W3C also makes available the pre-release drafts for the Second Edition and the First Edition.

This also brought us to a new entry for APNG (Animated Portable Network Graphics), which extends the PNG raster image format beyond static images to support frame-based animation. Since the Library has 16 TB of PNG files (over 52 million files) as of June 2025 and PNG is a preferred format for still image content, these updates were bumped up the priority list.
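For the format nerds who want to see the PNG/APNG relationship in practice, here is a minimal Python sketch. It relies on one fact from the APNG extension: an animated file carries an animation control chunk (acTL) that appears before the first image data chunk (IDAT). The function names (iter_chunks, is_animated_png, make_chunk) and the tiny synthetic byte strings are our own illustrative assumptions, not part of any FDD or Library tooling, and the synthetic "animated" bytes are only structurally minimal, not a complete, renderable APNG.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data: bytes):
    """Yield (chunk_type, chunk_body) for each chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG: signature mismatch")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length


def is_animated_png(data: bytes) -> bool:
    """Return True if an acTL chunk appears before the first IDAT chunk."""
    for ctype, _body in iter_chunks(data):
        if ctype == b"acTL":
            return True
        if ctype == b"IDAT":
            return False
    return False


def make_chunk(ctype: bytes, body: bytes = b"") -> bytes:
    """Hypothetical helper: assemble one chunk with its CRC (for demo data only)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Synthetic 1x1 grayscale image, built here only so the sketch is self-contained.
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
idat = make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
iend = make_chunk(b"IEND")
actl = make_chunk(b"acTL", struct.pack(">II", 1, 0))  # num_frames=1, num_plays=0

static_png = PNG_SIGNATURE + ihdr + idat + iend
animated_png = PNG_SIGNATURE + ihdr + actl + idat + iend

print(is_animated_png(static_png))
print(is_animated_png(animated_png))
```

A check like this only inspects chunk order. It does not validate CRCs or frame data, so treat it as a quick triage step rather than format validation.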
Next in our ‘room to improve’ series are new entries for the PDF (Portable Document Format) family! The Library has A LOT of PDFs – many hundreds of thousands of terabytes of different flavors of PDF – and has been involved with the development and maintenance of the PDF family of specifications for a long time through membership in ISO TC 171 SC 2 and its Committee Manager, the vendor-neutral PDF Association. (Fun fact: the PDF Association provides free access to ISO specifications for both the PDF 2.0 and PDF/UA family of standards, so go get them as a holiday treat for your favorite format nerd!)
To help fill out our PDF coverage, we’ve added entries for PDF/A-4e for Engineering, which is defined in ISO 19005-4 (PDF/A-4), Annex B, and is the successor to PDF/E, as well as PDF/A-4f for Embedded Files, which is defined in ISO 19005-4 (PDF/A-4), Annex A, and is a synonym for ISO 19005-4 Level F conformance. We have more PDF descriptions coming, including EA-PDF, which is a PDF/A compatible, vendor- and platform-neutral format for the long-term preservation of email, so stay tuned for that.
In addition, as part of our ongoing work to support digital accessibility (see More Formats and More About Formats: New Entries, Format Accessibility Features and Other Updates), we also went back in time a bit to document Synchronized Multimedia Integration Language (SMIL), Version 2.1, a legacy W3C format from 2005. SMIL supports accessibility features such as the MediaAccessibility and MediaDescription modules, which support alt text, as well as a new accessibility attribute, readIndex, which “Allows explicit ordering for controlling assistive technology.”
Our last new FDD is certainly the most fun to say: YAML Ain’t Markup Language (YAML), a human-readable data serialization language and, as of version 1.2, a strict superset of JSON, meaning that valid JSON is also valid YAML. And because we know you love a good origin story as much as we do, please allow us to explain: “YAML (rhymes with ‘camel’) originally stood for ‘Yet Another Markup Language’ (see YAML 1.0 working drafts from 2001), but by the time the YAML 1.0 Final Draft was published, the developers changed it to the recursive acronym ‘YAML Ain’t Markup Language.’ Developer Ingy döt Net replied to a Stack Overflow post in 2013 explaining the reason for the name change soon after he joined the development team: ‘After a few months of us working together, I pointed out that YAML (which most definitely stood for Yet Another Markup Language at that time) was not really a markup language (marking up various elements of a text document) but a serialization language (textual representation of typed/cyclical data graphs). We all liked the name YAML, so we backronymed it to mean YAML Ain’t Markup Language.’” So YAML it is!
What’s on Tap for 2026 and Beyond
We have a variety of projects planned for 2026 and beyond, including a focus on formats related to AI and ML. For direct links to all our new FDDs and to see what we have planned for the coming year, take a look at our draft workplan, which includes a publication log.
As we say often on the Sustainability of Digital Formats site, comments welcome!