Firefox points the way to eradicating one of the rudest words online: PDF
The ghosts of dead trees haunt us still
Comment It's not sexy but it is good. Mozilla deserves our love for implementing a better PDF reader in the new Firefox browser, 106. It takes away the pain, just a bit, by doing in-browser renderings that can be annotated, decreasing the chance you'll have to find a third-party reader that does what you need.
Because, yes, it's 2022 and we're still expected to choose between different readers and download them like peasants. Bad enough on the desktop, infuriating on mobile. Which one will work with which documents? Which one is most secure? Which is going to stop halfway down a document until you watch an advert or pay for some Pro version? All you wanted to do was read a document that someone sent you in PDF, the world's least favorite format.
The list of PDF's sins puts Las Vegas to shame, and they all stem from one of the 1990s' most grievous misconceptions, that the digital would be a more exploitable clone of the physical. That assumption is far from dead – it's why corporate VR is so awful – even though PDF has been a glaring demonstration of that folly for decades.
PDF is skeuomorphic, intended to carry the character of an old entity into a new one. It is designed to produce an exact replica of a printed document. Great if you want printed documents, terrible if you don't. But PDF works from the assumption that because human-readable data has always been distributed in formats that can't change, from cuneiform to Caxton to Agatha Christie, that's how it's going to be in computer land.
You don't need to be told how that has worked out. The digital realm has evolved at unseemly speed to a rich environment of screens from one inch to 70. Content can flow and shift, it's interactive, responsive, shareable, searchable, translatable – as fluid as the bits it rides on.
PDF is good at precisely none of this, and fiercely resistant to most of it. You want to cut and paste? We turned that switch off, you mad fool. Extract the text from that two-column formatted research paper? Have fun with that, and don't think too long about the idea that research papers exist to share data. You'll only smudge your makeup with the tears.
It didn't have to be like this. PDF, while unwieldy and anachronistic, has all sorts of internal features that should help, such as internal tags to help hoick out document components so they can be rebuilt in better ways. As much container as content format, it can incorporate what used to be called multimedia, even 3D. But you won't get any telemetry out of it, no user behavioral data, which is arguably a blessing but blights good and bad hats equally.
Nobody uses the good stuff, because by promising that computerized content is dead trees with digital halos, PDF has seduced organizations into taking the easy track of not bothering to update their thinking. The result is one of the classic paradoxes of IT: people continue to implement bad technology even though they know how bad it is through their own everyday lives. Thus, there continues to be a market for PDF-based document workflow systems that demand PDF inputs and produce PDF outputs, no matter how much the humans at either end grind their teeth in frustration. You work around it, because what else do you do, and we all party like it's 1993.
The results are lost productivity, lost opportunity, even lost users. Websites with important documentation only available in PDF and accessibility "help" that says: "You may need to download a special screen reader if you can't see this. Ask your IT department for advice." Not forgetting: "Please print out this form, fill it in, and send back a scan." But mostly, it's just endless clunky docs you have to fight to read, just because. Is that a desirable experience for anyone?
Mozilla's decision to help soothe the madness is a beacon of hope. The days when users have to know there's even such a thing as a file format, let alone how to handle one, should be long gone. That's the computer's job; making it so is our job. Mozilla knows this, even Google knows this – with no anti-user commercial pressure to lock down an ecosystem, GDocs is fairly good at ingesting all sorts of formats and converting them to an internal format the user never sees. It's not perfect, it's a hard job to ride that herd, and hard to make a business case for it, but it's the future. Even if, by now, it should be the present.
Here's the challenge: Build a decent online document creation, workflow and life cycle management system that only cares about formats when you tell it to. Otherwise, produce a container with all the data and all the layout, all the interactivity, all the workflow metadata, in a way that lets the renderer make all the right decisions for what the human wants. We could have had this years ago, the technology is long invented. The innovation, the magic twist that will liberate us from this particular jail, is making the people who use it realize they're not lost in the pulp fiction of ink on paper. That has no place on our desktops, in the cloud, or in our minds. ®