Koralatov
February 20, 2011 at 8:02pm

Plaintextism

John Sparks, in his post “The Joy of Text”:

There’s something to be said for the use of plain text files. Text is simple. Text files are easy to read on any computer running any operating system and don’t require any proprietary word processor to interpret. Even more important, text files can be read by humans. Keeping your writings in text makes them digitally immortal.

Moreover, text is internet friendly. The files are small and can jump among connected devices with poor connections like hopped up Disney faeries. It is really easy to work with your text files on any device from anywhere.

There’s more than just “something to be said” for plaintext. There’s everything to be said for it. It’s been the lingua franca of computing since at least the tail end of the ’60s; as our intrepid author notes, it’s 100% portable and totally past- and future-proof. It’s nearly everything one could want, and more than that, it’s nearly everything one could need.

Sparks, to his credit, examines the matter in some depth, and details his workflow and applications; Patrick Rhone merely resorts to his usual glib bleating[1]:

Those that have been following along for any length of time know this is something I very much believe in.

What’s perhaps saddest about the emerging cult of plaintextism is how little actual thought its adherents put into it before the emergence of Simplenote/Notational Velocity/Elements/&c. (Sparks admits this himself: “[t]he watershed event, however, was the iPad. Very quickly after using the iPad, I realized I didn’t need a full blown word processor on my iPad as much as I needed a way to enter, edit, and manipulate text”. No such admission is forthcoming from others.) It’s almost as if they needed someone else to do their thinking for them.

(It would also be interesting to see whether their email is in plaintext, or if they succumb to Gmail’s bad-habit default and send HTML mail. Given the apparent lack of thought on the plaintext thing, I’m guessing it’s probably HTML.[2])

Another common theme in this orgy of love for plaintext is the celebration of Markdown. In some ways, it’s totally justified, and I finally started to use it myself recently[3]. Purely from the perspective of someone who is writing, it’s almost perfect, but (there’s always a but) the structure it supplies lacks the future-proofing for which plaintext is being celebrated. It’s quite widely supported now, but there’s no guarantee that it won’t fall out of use at some point in the future. Obviously, this isn’t a major issue: the plaintext files themselves will always be openable, and the syntax isn’t overbearing or messy, so even someone who doesn’t use Markdown can (largely) understand what’s written. It still leaves the potentially huge task of converting it into another structured language.
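To give a feel for what that conversion involves, here is a minimal sketch in Python. It is an illustration only, not a real Markdown implementation: the function name `markdown_subset_to_html` is made up for this example, and it assumes a document that uses nothing but `#` headings, `*emphasis*`, and paragraphs separated by blank lines. Lists, links, block quotes, and code would each need their own handling.

```python
import html
import re


def markdown_subset_to_html(text: str) -> str:
    """Convert a tiny, assumed subset of Markdown to HTML.

    Handles only '#'-style headings, *emphasis*, and blank-line-separated
    paragraphs. Everything else passes through as paragraph text.
    """
    out = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        block = block.strip()
        if not block:
            continue
        # A single-line block starting with 1-6 '#' characters is a heading.
        heading = re.fullmatch(r"(#{1,6})\s+(.+)", block)
        if heading:
            tag = f"h{len(heading.group(1))}"
            body = heading.group(2)
        else:
            tag = "p"
            # Rejoin hard-wrapped lines into one paragraph.
            body = " ".join(block.split())
        # Escape &, < and > before adding any markup of our own.
        body = html.escape(body, quote=False)
        body = re.sub(r"\*(.+?)\*", r"<em>\1</em>", body)
        out.append(f"<{tag}>{body}</{tag}>")
    return "\n".join(out)
```

Calling `markdown_subset_to_html("# Title\n\nSome *plain* text\nhere.")` yields an `<h1>` followed by a `<p>` containing an `<em>`. Even this toy version needs decisions about escaping and line wrapping, which is where the work piles up for a real archive.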

What all the newfound adherents of plaintext are missing is one very simple truth: the real archival language of any writing is, of course, HTML. It’s only a hair behind plaintext in its portability — most phones now display it, and any computer that doesn’t is an antique — and it has one advantage that plaintext doesn’t: it retains structure. In most cases, of course, structure is a nicety, but not a requirement; your shopping list, for example, will probably get by just fine in plaintext. In some cases, though, it is absolutely essential.

Joe Clark[4] covered this exact topic beautifully several years ago:

Cory [Doctorow] overstates the advantages of his kind of electronic text. To do what Cory suggests is effortless actually requires a lot of effort.

If I give you unstructured ASCII text, you may be able to use grep or human bloody-mindedness to mark up the text into something that can be turned into a “formatted” two-column PDF. But it’s not gonna happen automatically. You must go through a stage in which the unstructured text is given structure.

I know this from first-hand experience. I recently finished the laborious process of typesetting a book where most of the text was from older books that had been scanned into plaintext using OCR. This left two problems: mangled characters, and a total lack of structure. I supplied the bloody-mindedness that converted the unstructured text into structured text.
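A rough sketch in Python shows how little of that structure can be recovered automatically. This is not the process I used on the book; the function name `ocr_text_to_paragraphs` is hypothetical, and it assumes OCR output in which paragraphs are separated by blank lines and words are sometimes hyphenated across line breaks.

```python
import html
import re


def ocr_text_to_paragraphs(text: str) -> str:
    """Wrap blank-line-separated blocks of plaintext in <p> elements.

    Assumes paragraphs are separated by blank lines. Headings, quotes,
    and lists are NOT detected: they all come out as plain paragraphs.
    """
    out = []
    for block in re.split(r"\n\s*\n", text):
        if not block.strip():
            continue
        # Rejoin words split across a line break: "struc-\nture" -> "structure".
        # Caveat: this also fuses genuine hyphens ("well-\nknown" -> "wellknown").
        block = re.sub(r"(\w)-\n(\w)", r"\1\2", block)
        # Collapse the remaining hard line breaks and runs of spaces.
        block = " ".join(block.split())
        out.append(f"<p>{html.escape(block, quote=False)}</p>")
    return "\n".join(out)
```

Feed it `"CHAPTER ONE\n\nThe struc-\nture of the text\nwas lost."` and the chapter title comes back as just another `<p>`. Deciding that it is really a heading, and catching the hyphens the regex gets wrong, is the part that still needs a human.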

Once I’d finished setting the book in InDesign, and the project was finished, I exported it into two archival formats: PDF, to make an exact digital copy, and HTML, to ensure an editable, structured version would always exist. (One further example is the essays I wrote in Uni; I finally went through and converted them into an archivable format, only it wasn’t .doc — I used HTML.)

The takeaway is simple: for most of what you’re doing, and for the countless temporary, throwaway documents that we all create, plaintext is the ideal format to work in. For your important projects, ones you genuinely want to keep and to survive, the format you need to use is HTML.


  1. I touched on this in a previous post: “…there’s only so much self-righteous, preachy ‘Not What We Believe In’ chanting I can stomach, and Minimal Mac has become increasingly focussed on that kind of empty, mantra-esque crap in the recent months.”

  2. I’m not even going to mention top-posting. 

  3. I know, I know; what took me so long? 

  4. It was Clark who made me realise the sense in using HTML as my archive format of choice; prior to reading his blog a couple of years back, I, too, thought plaintext was the best archival format.
