Future of Text

Let me try to briefly describe a certain future of text that has largely been abandoned since the advent of Word, the Web and DTP. In his book “Track Changes”, Matthew G. Kirschenbaum reconstructs the otherwise largely forgotten history of word processing: in contrast to today’s use of the term, it initially referred to a model for organizing the office, then to electronic typewriters and later to a wide array of software packages reflecting every imaginable combination of generally useful features and affordances for manipulating text on a computer. From print-perfect corporate letters to authors revising their manuscripts over and over again instead of relying on the services of an editor, electronic writing had to develop around purely textual aspects because of the pressing hardware limitations of the time. Naturally, the early hypertext pioneers expected a new era of powerful tools and instruments to come about that would augment reading, writing and curation far beyond anything humankind had built for itself for that purpose so far. Today we know that this isn’t the future that happened.

Text by its very nature is a universal cultural technique – and so must be the tools and conventions involved in its production and consumption. Consider a whole infrastructure for text composed of standards, capabilities, formats and implementations that follow a particular architecture analogous to POSIX, the OSI reference model and the DIKW pyramid. Such a system would need to be organized in separate layers specifically designed to bootstrap semantics from byte order (endianness) and character encoding, advancing through syntactical format primitives up to the functional meaning of a portion of text. Similar to ReST with its HATEOAS or what XHTML introduced to Web browsers, overlays of semantic instructions would drive capabilities, converters, interface controls and dynamic rendering in an engine that orchestrates the synthesis of such interoperable components. Users could customize their installation quite flexibly or just import different preexisting settings from a repository maintained by the community – text processing a little like Blender with its flow-/node-based approach.
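
To make the layering a little more concrete, here is a minimal sketch in Python, assuming entirely hypothetical layer interfaces and a toy “role: content” line convention for the format primitives – none of this corresponds to an existing standard, it only illustrates how each layer could bootstrap just enough semantics for the next one:

    from dataclasses import dataclass

    @dataclass
    class Portion:
        text: str
        role: str  # functional meaning of a portion of text, e.g. "title" or "quote"

    def layer_bytes(raw: bytes) -> bytes:
        # Layer 0: settle byte order questions (here: strip a UTF-8 byte-order mark).
        return raw[3:] if raw.startswith(b"\xef\xbb\xbf") else raw

    def layer_encoding(raw: bytes, encoding: str = "utf-8") -> str:
        # Layer 1: bootstrap characters from bytes via a declared character encoding.
        return raw.decode(encoding)

    def layer_format(text: str) -> list[Portion]:
        # Layer 2: toy syntactical format primitive, one "role: content" entry per line.
        portions = []
        for line in text.splitlines():
            role, separator, content = line.partition(": ")
            if separator:
                portions.append(Portion(text=content, role=role))
            else:
                portions.append(Portion(text=line, role="plain"))
        return portions

    def layer_semantics(portions: list[Portion], capabilities: dict) -> None:
        # Layer 3: overlays of semantic instructions drive whatever capability the
        # user has plugged in, a little like nodes wired together in Blender.
        for portion in portions:
            capabilities.get(portion.role, print)(portion.text)

    capabilities = {"title": lambda text: print(text.upper())}
    layer_semantics(layer_format(layer_encoding(layer_bytes(b"title: Future of Text"))), capabilities)

Each layer could be swapped out or extended without the others noticing, which is the whole point of such an architecture.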

Is this the insanity of a madman? Maybe, but we have seen variations of this working before, with some bits and pieces still in operation here and there. This is not rocket science, this is not too hard; it’s just a lot to do, and few are actively contributing because there’s no big money in foundational text technology anymore. By now, much better hypertext and hypermedia tools are urgently needed as an institutional and humanitarian cause. The future of text and its supporting infrastructure can’t be a single product, trapped in a specific execution environment, corrupted by economic interests or restricted by legal demands. Instead, imagine a future in which authors publish directly into the Great Library of everything that has ever been written, which is constantly curated by crowds of knowledge workers, for everybody to have a complete local copy, presented and augmented in any way any reader could ever wish for. If this is too big for now, a decent system to help with managing your own personal archive and the library of collected canonical works would be a good start as well.

After cheap paper became available and the printing press was invented, it still took many generations of intelligent minds to eventually figure out what the medium could and wanted to be. Likewise, the industrial revolution called for a long and ongoing struggle to establish workers’ rights in order to counter boundless exploitation. With our antiquated mindsets and mentality, there’s a real risk that we simply won’t allow digital technology to realize its full potential in our service for another 100-300 years, and the future of text might be no exception.

Copyright (C) 2019 Stephan Kreutzer. This text is licensed under the GNU Affero General Public License 3 + any later version and/or the Creative Commons Attribution-ShareAlike 4.0 International.

Glossary

Definitions

Microsoft Word
Word
A restrictively licensed software application in the tradition of earlier word processor packages. Its main purpose is to allow digital editing of short corporate or personal letters for print. The result looks “perfect” in the sense that no indications of error correction end up on the paper, as if the typist had written the entire text in one perfect go or the sender could afford the luxury of having pages re-typed until no errors remained, suggesting that the recipient was considered important enough to deserve keeping a typist busy (which of course isn’t the case if a computer is used). Microsoft Word isn’t a writing tool, nor is it designed for writing books; it’s not for typesetting or Desktop Publishing.
World Wide Web
Web
WWW
A stack of protocols, standards and software which was initially designed as a system to access and navigate between separate, incompatible document repositories via a shared, common interface. Today, it’s mostly a programming framework for online applications like shops or games. The support for text and semantics is very poor. Because restrictively licensed operating systems lack an ecosystem of trusted software repositories to install programs from, the implementations + the standard continue to lack support for text capabilities, and operators of Web sites refuse to standardize functionality + libre-freely license their client/server code, in order to artificially create and maintain a Software-as-a-Service/“Cloud” lock-in dependency on their server instance. Untrusted/unchecked software scripts are sent by the server to automatically run on the client, so Web browsers need to be sandboxed, as such remote code execution could otherwise easily compromise the security of the client system. For that reason, the Web can’t make use of native capabilities available on the client and needs to be denied interoperability with system infrastructure outside of itself.
Desktop Publishing
DTP
Page layout for print done on a computer, but otherwise roughly the same approach Gutenberg used – setting type by hand. The main purpose of software applications in this category is to create flyers and magazines. It’s not for writing, nor for typesetting regular long-form text.
Portable Operating System Interface
POSIX
An interface standard for basic capabilities of a computer operating system. User-space third-party applications can be built on top of it and gain code portability in regard to all the other POSIX-compliant implementations.
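As a small illustration of that portability, Python’s standard os module wraps a number of POSIX calls; the following sketch runs unchanged on any POSIX-compliant system. It is only meant to show the idea – the standard itself targets the C interfaces:

    import os

    # pipe(), write(), read() and close() as specified by POSIX, reachable here
    # through Python's os module; the same code works across compliant systems.
    read_end, write_end = os.pipe()
    os.write(write_end, "a portable piece of text\n".encode("utf-8"))
    print(os.read(read_end, 1024).decode("utf-8"), end="")
    os.close(read_end)
    os.close(write_end)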
Open Systems Interconnection reference model
Open Systems Interconnection model
OSI reference model
OSI model
A conceptual framework that describes how the transmission of data between two endpoints in a network can be interpreted in terms of horizontal hierarchical layers which are isolated from each other, so a change of implementation/protocol on one of the layers doesn’t affect all the other parts, avoiding inter-dependencies and a vertical, monolithic architecture. The model reflects the observation that payload data gets encapsulated in a series of nested, cascading metadata containers before it is sent over the wire. At nodes along the way and on the receiving end, data of a lower level can be unpacked from its higher-level wrapper envelope – a little bit like Matryoshka dolls. Technically, this might simply mean that a component of a certain layer handles the metadata addressed to its own particular function and then dispatches/passes on what appears to be meaningless payload to the next component one level up or down, where the same procedure repeats all over again, this time on the remaining, extracted data, with the earlier packaging around it already removed. Bootstrapping, standardization, separation of concerns, hiding implementation details behind abstract interfaces: these are typical strategies for designing complex adaptive systems, and the OSI reference model serves as a good example of a theory that expresses some of the underlying universal principles.
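A toy sketch of the encapsulation idea, deliberately not tied to any real protocol stack – each layer only adds and removes its own envelope and treats everything inside it as opaque payload:

    def wrap(payload: bytes, label: bytes) -> bytes:
        # A layer on the sending side puts its own metadata envelope around the payload.
        return label + b"[" + payload + b"]"

    def unwrap(packet: bytes, label: bytes) -> bytes:
        # The corresponding layer on the receiving side removes only its own envelope.
        assert packet.startswith(label + b"[") and packet.endswith(b"]")
        return packet[len(label) + 1:-1]

    packet = wrap(wrap(wrap(b"hello", b"transport"), b"network"), b"link")
    print(packet)  # b'link[network[transport[hello]]]' - the Matryoshka doll

    for layer in (b"link", b"network", b"transport"):
        packet = unwrap(packet, layer)
    print(packet)  # b'hello' - the original payload, every wrapper removed

Replacing the implementation of one layer changes nothing for the others, as long as that layer’s envelope convention is kept.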
Data, Information, Knowledge, Wisdom
DIKW pyramid
DIKW hierarchy
DIKW model
DIKW
A conceptual framework that describes the hierarchy of the encoding of meaning in information systems. From the atomic data-handling primitives of the physical carrier medium up to the complexity of good and wise use (or, say, meaningful interpretation), the theoretical model suggests that the implementation on each of the layers can be changed without affecting all the other parts, because the internal information handling of each stage is contained and isolated from the other stages. Michael Polanyi presents a similar notion in his book “Personal Knowledge”.
bootstrapping
bootstrap
The method of launching higher stages of complexity from lower, more primitive stages. This is how a computer operating system boots itself up: from a single electrical impulse caused by the push of a button, to the BIOS firmware, continuing with the master boot record, starting the operating system and finally setting up user-space applications. Douglas Engelbart proposed a similar concept for exponential improvement in which the lower stages are used to create a much better, new stage, on which the process is repeated all over again.
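A minimal sketch of the general pattern, with made-up stage names rather than an actual boot chain: each stage is just capable enough to bring up the next, richer one.

    def firmware():
        print("firmware: initialize the hardware, locate the boot record")
        return boot_loader

    def boot_loader():
        print("boot loader: load the operating system kernel")
        return kernel

    def kernel():
        print("kernel: mount file systems, prepare user space")
        return applications

    def applications():
        print("user space: launch the applications")
        return None

    stage = firmware
    while stage is not None:
        stage = stage()  # every stage hands over to the one it bootstrapped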
Representational State Transfer
ReST
The rediscovery of the semantics in the Hypertext Transfer Protocol (HTTP). With the early Web mostly serving static pages and the later one targeting the development of custom server-centric online applications for the browser, the notion of small agents/clients for carrying out tasks, collecting information and interoperating with local + remote software wasn’t really supported; but with the Software-as-a-Service/“Cloud” lock-in dependency model (also fueled by the idea of custom, non-standardized, server-defined APIs – application programming interfaces), in contrast to simply publishing and retrieving semantic data, ReST principles have recently gained popularity again. With simple HTTP verbs (commands, actions), a client can instruct the server about what to do in a standardized way (a little bit like XML-RPC and SOAP). The hypermedia reply/response might inform the client about new request options that could be followed automatically or manually by the user. Furthermore, the semantics define URLs as arbitrary, unique, static IDs (not carrying any meaning nor reflecting/exposing the directory structure of the host operating system), so no custom client-side programming logic needs to know how to construct URLs that correspond to the schema expected by a particular server. Instead, a client is enabled to easily request hypermedia representations of a remote resource, make sense of them and adjust its internal application state accordingly. Long ago, the Turing test demonstrated that a recipient can never know whether the representation of a remote resource was generated by server-side code or a database, or whether a static page was sent – so why should a client base its operation on such assumptions and become dependent on a particular server implementation that may change at any time, when it would be much more reliable to exchange representations that contain standardized + semantic hypermedia instructions commonly understood by the receiving end, regardless of whether the structure, the surrounding data payload or the implementation changes?
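A minimal sketch of such a client in Python: it never constructs URLs on its own and only follows the request options announced in the representation it received. The JSON “links” convention and the entry point URL are assumptions for illustration only, not an existing hypermedia standard:

    import json
    from urllib.request import Request, urlopen

    def fetch(url: str, method: str = "GET") -> dict:
        # Issue a standardized HTTP verb against an opaque resource identifier.
        with urlopen(Request(url, method=method)) as response:
            return json.loads(response.read().decode("utf-8"))

    def follow(entry_point: str, wanted_relation: str) -> dict:
        state = fetch(entry_point)           # start from the one known entry point
        for link in state.get("links", []):  # request options announced by the server
            if link.get("rel") == wanted_relation:
                return fetch(link["href"], link.get("method", "GET"))
        return state                         # no such option offered right now

    # follow("https://example.org/api/", "next")  # hypothetical entry point

Because the client picks links by their relation and not by a hard-coded URL schema, the server can reorganize its resources without breaking it.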
Hypermedia as the Engine of Application State
HATEOAS
The concept, recognized by ReST, describing that the capabilities of a system could be driven by data rather than by the custom, non-standardized, incompatible programming of a particular application. “Hypermedia” refers to the notion that semantic instructions embedded in other payload data could operate corresponding controls and functions, maybe even as part of a larger infrastructure in which different components are plugged in to handle their specific tasks according to the standard they implement. Software developers who write parsers for domain-specific languages – especially interpreted ones designed to trigger actions in an application – are quite familiar with this type of mechanism. The Web, too, with the rendering engines in its browser applications, makes use of this approach. Nonetheless, it could well be that no hypermedia format exists to this day that would provide the semantics needed for ReST, and the Web with HTML doesn’t support ReST either. The general XML format convention doesn’t come with semantics of its own beyond primitives for introducing semantics and structure on top of it, but with XHTML, at least different namespaces could be mixed for hypermedia-aware clients – of which probably none exist yet that are based on the architecture of a standardized capability infrastructure. HATEOAS may have been formulated together with ReST and its semantics of HTTP (not necessarily related to the insufficient “hypermedia” semantics of the Web), but the universal principle behind it can be applied in other contexts just as well. There’s no reason why the concept should be limited to the model of a local client interacting with remote servers, or why it couldn’t be applied to service-oriented architectures in general, even if all servers/services are local, or just some of them while others are not.
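The principle can be sketched without any HTTP at all: semantic instructions embedded in the payload select which of the plugged-in components gets to run, so the application state is driven by the data. The instruction names below are made up for illustration:

    capabilities = {}

    def capability(name):
        # Components plug themselves into the infrastructure under a semantic name.
        def register(handler):
            capabilities[name] = handler
            return handler
        return register

    @capability("goto")
    def goto(argument, state):
        state["position"] = argument

    @capability("cite")
    def cite(argument, state):
        state.setdefault("citations", []).append(argument)

    def drive(hypermedia, state):
        # The embedded instructions, not custom application code, drive the state.
        for instruction, argument in hypermedia:
            handler = capabilities.get(instruction)
            if handler is not None:
                handler(argument, state)
        return state

    print(drive([("goto", "chapter-2"), ("cite", "doc-42")], {}))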
eXtensible HyperText Markup Language
XHTML
XML is a very simple format convention of primitives for structuring text semantically. HTML is a more specific format for defining the structure of a Web page. Since XML and HTML share the same origin in SGML, it makes a lot of sense to formulate the specific HTML Web page format in the general XML format convention, resulting in XHTML. Many tools and programming interfaces for XML are available which can therefore also read XHTML, so the Web could have become semantic and programmable. With regular HTML, a huge parsing engine is needed, especially because most published HTML is ill-formed and broken, leaving only bloated, sandboxed Web browser applications as the target for HTML documents. HTML and subsequently its XML variant XHTML lack support for decent text capabilities, but the almost abandoned XHTML would at least offer programmable, semantic access to Web pages, while HTML as the overwhelmingly popular publication format of the Web tends to be cluttered with visual elements for navigation and custom online application programming, without these being explicitly made recognizable as such for the computer. Under these circumstances, it’s very difficult and expensive to make any hypertext/hypermedia capabilities or small agents/clients work with HTML content published on and for the Web and its browsers.
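A short illustration of that programmable access: because XHTML is just the HTML vocabulary expressed in the XML format convention, a generic XML parser such as Python’s standard xml.etree gives semantic access to a page without any browser-sized parsing engine (the page content below is of course made up):

    import xml.etree.ElementTree as ElementTree

    XHTML_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>Future of Text</title></head>
      <body>
        <p>See <a href="library.xhtml">the Great Library</a>.</p>
      </body>
    </html>"""

    namespaces = {"xhtml": "http://www.w3.org/1999/xhtml"}
    root = ElementTree.fromstring(XHTML_PAGE)

    print(root.find("xhtml:head/xhtml:title", namespaces).text)  # the page title
    for link in root.iterfind(".//xhtml:a", namespaces):         # every hyperlink
        print(link.get("href"), link.text)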
Web browsers
Web browser
browsers
browser
The generic term describes a category of software applications for navigating text/hypertext. Its main function is to request/retrieve resources (local or remote) and to parse/interpret them for subsequent augmentation. Semantics of standardized meaning allow the browser to recognize special instructions within the payload data of a resource, so it gets a chance to apply custom settings when preparing the presentation for the user. Another important component, for variants with a graphical user interface, is the rendering engine that draws visual elements onto the screen, so the user can interact with them in various ways. A browser is not for writing. For navigating hypermedia, probably no real browser exists yet. In the future, standardized controls for the augmented reality of the Internet of Things might resemble some kind of “browser” as well. Typical browsers for the Web go far beyond these core requirements and come with their own affordances for multimedia playback, scripting/programming and a whole bunch of other features for interacting with online applications; these aren’t available as local capabilities in the context of the client, because Web browsers need to be sandboxed/isolated for security reasons.
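To illustrate the core loop of the category (and only that), here is a toy text browser for well-formed XHTML resources: it retrieves a resource, renders its text crudely, recognizes the hyperlinks and lets the user navigate. Everything else a real Web browser does is deliberately left out, and the starting page is hypothetical:

    import xml.etree.ElementTree as ElementTree
    from urllib.parse import urljoin
    from urllib.request import urlopen

    XHTML_NAMESPACE = "{http://www.w3.org/1999/xhtml}"

    def browse(url):
        while url:
            with urlopen(url) as response:               # retrieve the resource
                root = ElementTree.fromstring(response.read())
            print("".join(root.itertext()))              # crude textual rendering
            links = [urljoin(url, anchor.get("href"))    # recognized navigation options
                     for anchor in root.iter(XHTML_NAMESPACE + "a") if anchor.get("href")]
            for number, target in enumerate(links):
                print(number, target)
            choice = input("follow link number (empty to quit): ")
            url = links[int(choice)] if choice else None

    # browse("https://example.org/start.xhtml")  # hypothetical starting page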
Blender
A freely licensed software application for creating 3D models and animations.