Watching the Recordings of Engelbart’s Great 1968 Demo 50 Years Later

Today, exactly 50 years after Douglas Engelbart’s Great Demo, I’m watching the recordings in their entirety for the first time. Sure, I’ve seen parts of it many times and easily gained an intuitive understanding of what capabilities NLS offered and how it performed some of them, but I almost have to apologize to people in the hypertext community that I never watched the recordings in their entirety. There are some reasons for that: I think others can’t imagine how difficult and time-consuming it is for me (keep in mind that it demands active, focused, prime/quality time in front of the computer plus an Internet connection, which is usually reserved for productive work). For me, the 1:30 hours translate into many hours, and I find it very difficult to watch things happening on a screen produced by a system I don’t have and can’t have, so it’s impossible to arrive at some understanding/feel for how and why the things happen and work the way they do – and even figuring that out is outright irrelevant in the absence of the system, of course. It’s more like passively watching/experiencing/consuming an art installation that has little to do with any practical problem solving or tools. Even more so, the obscure hardware, programming languages and meta compilers don’t exist any more, so it’s like reading some old handbook on writing assembler code for one particular industrial machine that went extinct a long time ago, an exercise that can only have historical/archeological value.
On the other hand, it feels great to see some of my technical intuition confirmed; to watch the recording in a dark room with the typical black and white shining bright from the screen, as if it came from the projector at the San Francisco Civic Auditorium during the Fall Joint Computer Conference (although alone and with quite some sadness, in a dark and cold winter month – the legacy feel of the demo is very sterile anyway, not nearly as lively as it must have been in color and with an audience); to be able to do it on the exact same day 50 years later; to not care much about the confused claims, misunderstandings and future goals voiced during the anniversary festivities; and most importantly, to see confirmed that – as assumed – the text capabilities shouldn’t be too difficult to build. I find it truly surprising that most people from today’s hypertext “community” (?) I’ve encountered so far seem to be pretty unaware of or confused about the mechanics of NLS, despite it being a primary subject of study and the 1968 demo recording expected watching. Anyway, what follows are a few notes from watching the recording.

1/3, 0:02 Note that the lines above a character mean that it is uppercase. Output seems to be upper case only (but then, how did they get the line above the character?), which of course leads to a lot of things ending up being written case-insensitively (lower case), because if you don’t see it anyway, why type it? The printed ARPANET Network Information Center Journal, on the other hand, is case-sensitive.

1/3, 5:18 FILE OWNER ENGELBART, CREATED 12/7/68. That’s probably his user account created on/for the demo environment (longer display name and short initials DCE seem to be part of the account information).

1/3, 5:34 How does the system know that the copy operation should create a second statement, and not just append the text of the first statement to the end of the first statement, leaving the result to be one statement? Either he did it explicitly by using a statement copy command instead of a text copy command, or a statement is identified by a blank line (with an implicit blank line at the end of the text), so the system has no trouble identifying statements for outline collapse and automatic numbering etc.
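If a statement really is delimited by a blank line (with an implicit blank line at the end of the text), identifying statements becomes a trivial text operation. A minimal sketch of that interpretation – purely my guess, not how NLS actually did it – could look like this:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a text into "statements", treating one or more blank lines as
// the separator, with an implicit blank line at the end of the text.
std::vector<std::string> splitStatements(const std::string& text)
{
    std::vector<std::string> statements;
    std::istringstream input(text);
    std::string line;
    std::string current;
    while (std::getline(input, line))
    {
        if (line.empty())
        {
            // Blank line ends the current statement, if any.
            if (!current.empty())
            {
                statements.push_back(current);
                current.clear();
            }
        }
        else
        {
            if (!current.empty())
                current += '\n';
            current += line;
        }
    }
    // Implicit blank line at the end of the text.
    if (!current.empty())
        statements.push_back(current);
    return statements;
}
```

With such a rule in place, the system has no trouble telling statements apart for outline collapse and automatic numbering, regardless of whether a copy command operates on text or on whole statements.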

1/3, 6:45 Done manually, all in text. Up to this point, it seems like NLS has no knowledge of the “statement” (I’m pretty sure that it is just a synonym of theirs for “sentence”/“paragraph”, indicated by the outline collapse that didn’t only include the start of the “statement”/sentence, but also the first few words of the paragraph, as many of them as the terminal size can hold on a single line), so formatting isn’t done as a semantic ViewSpec here, with the system recognizing an (outline) title up to the colon.

1/3, 8:05 I don’t understand how he accidentally deleted everything – did he select all of the text, or what?

1/3, 8:16 So there’s no versioning except for saving manually. Maybe saved states are versioned, that’s how the NIC Journal did it, but no change tracking or undo on the character level, as it seems. Maybe they added it later; this is the early 1968 implementation.

1/3, 9:45 The command is INSERT BRANCH, maybe it went to the end of the list. At 1/3, 9:58, the command changes to INSERT CHARACTER and “PRODUCE” goes to where his cursor was before moving “ASPIRIN” down, or to a position set by a new, extra click (maybe done to insert the new “branch” (=“statement”/“paragraph”/collapsed outline title as the “list item”) right there). Just guessing. Update: At 1/3, 10:43, it becomes clear that the first INSERT BRANCH of “PRODUCE” went under the statement “SOUP” as a sub-statement. At 1/3, 11:18 it becomes clear that this is the default way the INSERT BRANCH operation/command works.

1/3, 11:09 What happens if we run out of letters in the alphabet to number the second level?
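The recording doesn’t show what NLS does beyond “Z” on the second level. One common scheme (known today from spreadsheet column names) continues with “AA”, “AB” and so on, i.e. bijective base-26 – a small sketch of that scheme, purely as an assumption of one way it could be handled, not as NLS’s documented behavior:

```cpp
#include <string>

// Bijective base-26 labeling: 1 -> "A", 26 -> "Z", 27 -> "AA",
// 28 -> "AB", 52 -> "AZ", 53 -> "BA", and so on without ever
// running out of labels.
std::string alphaLabel(int n)
{
    std::string label;
    while (n > 0)
    {
        n -= 1; // shift to 0-based for this digit
        label.insert(label.begin(), static_cast<char>('A' + n % 26));
        n /= 26;
    }
    return label;
}
```

The same function covers the first 26 items and everything after them, so the numbering scheme never needs a special case.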

1/3, 13:10 That means that the numbers/names can never be changed once they have been initially assigned, can they? Not so much because they’re not in the data of the target and are therefore calculated by the system, but because a merely textual reference to a particular number doesn’t know that it is a number other than by the user invoking a jump command on it, therefore it can’t be found/updated if the number in the target changes. It would also be bad for remembering them. Update: at 1/3, 16:58, after re-arranging the list, 2A4 doesn’t exist any more.

1/3, 14:38 Note that the syntax for encoding names was indicated at 1/3, 5:17 with “NAME DELIMITERS ARE ‘[’ AND ‘)’”, probably in this awkward combination to allow regular [] and (). Seems to be a per-file setting that probably could be overwritten if [) was already used by something else in the text, so for jumping to names/labels, the delimiters don’t matter, but the author(s) need to be aware of them when writing.

1/3, 20:49 So it must be a pixel/raster screen/display (in contrast to text terminals/screens of dedicated word processor machines), because there was the shopping graphic, the mouse cursor and different font sizes. Text is reflowable.

1/3, 22:15 So a “link” is basically a “name”/“label” + ViewSpec settings/instructions to be applied onto the target.

1/3, 28:32 One has to wonder if the text is still editable, ViewSpecable, a file, where the lines belong, whether they’re absolutely/relatively positioned, how one would draw a line, and what happens if the text changes and causes problems for the positioning of the lines.

1/3, 28:48 Note how the lines at “COMPOSE, STUDY, MODIFY” change while moving to the next view (same at 1/3, 29:18), it’s probably similar to a “presentation slide” in today’s terms (LibreOffice Impress) or a static image, with clickable links in it (HyperCard, XHTML Image Maps, etc.).

1/3, 32:03 It’s funny how their mice look a little bit like the brick attached to the pen ;-).

2/3, 13:18 The picture didn’t scale the text, did it? On the other hand, Jeff Rulifson said that the “labels” are also “statement names” (probably manually inserted into the picture, or are they jump link “names”/“labels”, or extracted live from the “statement”/=paragraph/outline title?).

2/3, 22:14 So the code is organized and written exactly in the same way the user interacts with it, instead of both separated from each other and making the code and inner workings inaccessible to the user and the developers/programmers/implementers as well. Imagine you’re clicking through a GUI and a separate code editor window would narrow down or bring up what parts of code get executed, instead of stepping through things in a disconnected debugger with breakpoints etc. as we do today.

2/3, 26:23 Too bad that I split text and definition areas vertically and not horizontally in my glossary capability prototype :-(. A horizontal splitter solves the issue with left-to-right, top-to-bottom reading: when reading from left to right, the eye expects the text to flow horizontally, and a vertical splitter in the center is just confusing because it breaks this direction from the expected movement, raising the question whether the top, the left or the right side should be read first. With a horizontal splitter, left-to-right is never interrupted and top-to-bottom isn’t either. On the other hand, vertical splitting is better for alignment, for parallel matching texts (because the lines of use and definition could be aligned, whereas with a horizontal splitter, the eye has to match and jump between separated lines all the time). Could be a ViewSpec option to define what’s preferred. Another option is inline expansion (or popups/overlays, if you’re not annoyed by them).

3/3, 8:12 They’re doing it in analog, aren’t they? The resizing might be an indicator. Or does the computer actually stream video footage data to a raster/pixel canvas? Sure, Engelbart might see Bill on his screen, but that’s probably because of the 3/3, 7:47 “hardware-wise available feature” in terms of camera overlay on the display. Mouse pointers and the NLS screen might be shared via the computer, but as both look at the same shared screen, the question is whether Bill sees himself from the relay of Doug’s display, or whether he sees Doug in exchange, or empty space (as I would suspect). The fact that NLS needs to make space in terms of windowing for the camera overlay is another indicator that NLS isn’t aware of the camera feed and doesn’t reflow the text automatically based on the canvas window size (and there’s no indication that the resizing of the camera overlay is done by a command or something; that’s probably analog controls of the camera/display, very similar to how they overlayed Doug’s face, hands and mouse feeds earlier).

3/3, 14:16 OK, now they’re drawing free-form, or is the text snapping to a grid? Is it still a text file, and what would be the order of the text nodes? Based on pixel position or order of creation? What ViewSpecs can be executed on such an image or the text within it? The same ViewSpec commands, addition of “names”/“labels”, applying the Content Analyzer, or different operations? It might be just a free-form drawing canvas with no or little relation to the usual NLS text capabilities (but making text that’s probably overlayed/rendered above the drawing a link is clearly possible, as demonstrated with the shopping list map at 1/3, 16:08, so the text is there and positioned as an entity/node and not rastered into static pixels, and the lines/drawings are probably stored as a rasterized/pixel static image, or retain their attributes/existence as nodes/entities, similar to SVG/path/scene/object rendering in contrast to framebuffer-based techniques). This distinction, or the potential lack of it, between sole text controls/interaction and the concerns of visual drawing is quite important to understand the nature/design of a system and the capabilities that are supported or prevented by it.

3/3, 16:41 So a “catalog” is just the collapsed outline list of everything, or did they create/curate/maintain that manually? “Keywords” seem to be the xFiles indirect pointers (we’re learning about indirect retrieval still) that can make collections/lists regardless of the actual full catalog list in the file, structure or order (“keywords” might be a virtual list in comparison to the “catalog” as the actual list in a file).

3/3, 18:00 Ah, that’s how they do it. So the “names”/labels at the start are the already known link targets (anchors in XHTML), and then they just add their own, individual keyword names to the “statement”/paragraph, so they can filter/search for all “statements”/paragraphs that were marked with the same keyword. The question is whether that’s on a per-user basis (but what if they want to share their keyword assignments?) or whether they have to be globally unique to avoid conflicts. Seems like they’re ordinary “names”/labels anyway. “Keywords” seem to be an operation/command to add/assign “names”/labels quickly by simply clicking, without the need to write them out every time (and maybe without changing the file for all the other users, but just adding the personal, user-based keyword to the file metadata (xFile structure), or every user can see all the keywords by the others and can then decide to add his own or accept the existing ones, as demonstrated by Bill accepting the official “name”/label/keyword as present at the beginning of the “statement”/paragraph).

3/3, 19:16 Looks like by clicking, they draw a circle over the currently selected node, or replace the clicked character with an ‘O’.

3/3, 19:18 Being able to weight keywords probably makes them different from ordinary, unweighted, unweightable “names”/labels.

So after having watched the entire recording, it’s very apparent once again that the demonstrated capabilities are very, very simple. Most of them are artificially constructed by the users themselves, by using just a few simple, fundamental commands and ViewSpec settings, which in clever combination allow the useful structuring and navigation of texts. In fact, almost all of it seems to be made up of text. It’s very telling that the first example and introduction to the whole system is a copy operation with the help of markers, a long forgotten and abandoned technique from the earlier time of dedicated word processor machines. In general, without fancy graphics because of the lack of computer power, most innovation had to go into text, so the demo shows interactions that are based around concepts like interpreting a click on a character to select the entire word or adjacent whitespace, until interrupted by a different type of character. None of that is coincidence – take Ted Nelson’s JOT for example. The separate command line that lists the input so far, narrows down the options and informs the user about what’s currently going on, as some sort of interactive/live documentation, helps to not confuse/conflict writing activities with system interaction/navigation/operation.

WORDPRESS AND THE WEB/BROWSERS/SERVERS IN GENERAL ARE SUCH A SHIT, AT 3:50AM I AGAIN LOST TEXT I’VE TYPED INTO IT, maybe just because of what’s new/different in the 5.0 update or the new stupid behavior of the visual vs. “code” editor (very likely it was the autop “feature” again, because a closing </p> was missing as I would type that later and WordPress tried to be clever by closing or removing the paragraph automatically, removing it together with my text, or it was the so-called “autosave” feature that only knows about intervals of a minute and not about window.sessionStorage – just because it’s a totally confused joke of an “editor” to begin with), and it was of course a total mistake of mine to ever entrust any writing to it to begin with, so I shouldn’t post anything to this instance again and should replace it completely. This post had a nice end and conclusion about why there can’t be any excuse for not having decent text capabilities 50 years later, despite nobody wanting them any more for various reasons + a lot of other more exciting/lucrative opportunities presenting themselves. Its loss demands tools to be built which make sure that this can never happen again.

#thedemoat50

My only two contributions to Doug@50/#thedemoat50: glossary prototype (wait until it’s loaded: a completion message will appear after ~1-2 minutes, then click the light-blue italic words) and tracking changes on the character level while writing + visualizing them. Not too difficult to imagine both capabilities to be combined/integrated, what category of problems may arise and which sort of solutions would be required. More on the background of the glossary capability (there are also some blog posts: 1, 2 and 3) and the versioning capability. Try them yourself: glossary capability (might not work locally because of the stupid Same-Origin-Policy restriction in browsers) and change tracking text editor + history visualization (requires Java 1.6 or higher).

It’s pretty obvious that these are my personal results/solutions and not the result of a collaborative group effort. I just wasted a full year trying to find a single individual in the contemporary hypertext community who would be interested in discussing, designing, building or using a capability infrastructure that powers a hypertext system architecture, likely because of incompatible differences in approach, paradigms, capacity, perspectives and goals.

Personal Hypertext Report #11

Looks like C++ doesn’t come with built-in streams that can be navigated in reverse direction, as the default principle is to optimize for speed, and if data is read in forward direction and optimized by reading larger blocks + buffering them, std::istream::unget() may fail to proceed before the start of the current buffer block. Changing direction back and forth might result in throwing away an entire buffer block and having to read in another chunk, which defies attempts to optimize for speed, especially if those operations occur at buffer limits and there’s no smartness built in to deal with dynamic buffer sizes. I have to either verify that std::istream::unget() can always go back to the beginning of the data source (which is unlikely, because it may be possible with some stream implementations and fail with others, for example data that arrived over the network), or come up with my own stream interface and implementation for a file source, which likely may not be too optimized in terms of reading block chunks. I could also limit the stream type to file streams, but I would want to avoid that if possible, so data can come from other places as well, as long as they’re not exclusively forward-directional. Introducing a new abstract stream class might be worth the effort for the „Reverse Streaming API for XML“: when porting to Java, Java’s streams might not have this limitation, and it can’t be worse than JavaScript with no notion of streams whatsoever (as encountered with the JsStAX port).
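To make the idea more concrete, here is a minimal sketch of what such an own stream interface for a file source might look like, seeking instead of relying on a read buffer so that stepping back is guaranteed to work all the way to the beginning of the data. The class and method names are made up for illustration, and the sketch deliberately trades the speed of block buffering for correctness of reverse navigation:

```cpp
#include <fstream>
#include <string>

// Bidirectional character source over a seekable file. Unlike
// std::istream::unget(), goBack() works all the way back to the
// start of the data, because it seeks on the underlying file
// instead of depending on what's left in a read buffer.
class BidirectionalFileSource
{
public:
    explicit BidirectionalFileSource(const std::string& path)
      : stream(path, std::ios::binary)
    {
    }

    bool isOpen() const { return stream.is_open(); }

    // Reads the next character in forward direction; returns false at EOF.
    bool readNext(char& character)
    {
        return static_cast<bool>(stream.get(character));
    }

    // Steps one character back; returns false at the start of the file.
    bool goBack()
    {
        stream.clear(); // reset a possible EOF/fail state before seeking
        std::streampos position = stream.tellg();
        if (position <= std::streampos(0))
            return false;
        stream.seekg(position - std::streamoff(1));
        return true;
    }

private:
    std::ifstream stream;
};
```

An abstract base class with these two operations could then be implemented by file sources as above, while exclusively forward-directional sources (like network data) would simply not qualify for the reverse API.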

Another cheap solution would be to use the recently updated change_instructions_executor_1 that generates a separate file for every change instruction, and if I also added a separate file that exports the instruction(s), I could navigate back and forth between those files as specified by the change_instructions_executor_1 result information file. But this would require such files to be copied locally (not to rely on any change_instructions_executor_1 output that might be subject to change or removal), and the need to keep that many files around for a longer period of time isn’t particularly better than keeping an exclusive lock on a change instruction file because the stream is currently reading it. In general, this option would make use of other existing tools of the system, which is a favorable property, but then we’re in a GUI tool and not necessarily in a workflow anyway, and the change_instructions_executor_1 could also still change in major ways (not that this would be a problem, but something to consider).

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International. See the history of how this text developed.

What’s the “Symbol” Tool?

Symbols are abstractions that are intended to act as (mental) substitutions for the real thing. As atomic entities in information encoding and communication, they can’t be manipulated without diverting the reference from the thing they were substituting for to something else. If it’s about fiddling with the reference, it’s usually cheaper and easier to just create another symbol instead of trying to manipulate existing ones. The value/usefulness of a symbol is defined by its function as a disambiguation, in contrast to all the other symbols or useless noise.

There are many symbol conventions and media. Signs, sounds, words are just a few of them. Individual, abstract characters/letters are symbols themselves, but in dealing with symbols, we rarely read on the character level – we read on the word level to identify a word’s symbolic meaning. Text manipulation on the character level, as opposed to text manipulation at the word level, is rarely about manipulating symbols, because changing individual characters most of the time switches to a different word symbol and doesn’t do anything to the character as a symbol itself. Characters are the atomic encoding format/convention to construct textual symbols. Characters are atomic in the information encoding scope, but the symbol scope is one level above mere encoding concerns. The atomic entity for textually encoding a symbol is a word. As we can’t really manipulate a word symbol, and as text is a series of word symbols, what we do most of the time is manipulate the composition of word symbols within a text series.

We don’t care about the letter ‘s’ in “insight” as a symbol, because the symbols ‘s’ and “insight” are different from each other. We rarely use individual characters for their own symbolic meaning, but as building blocks to construct words via character combination. Such word symbols then can be proper, better symbols than what the limited alphabet provides. Now, if word symbols are atomic, how to manipulate them? If we start to change characters, we likely create new words/symbols, or switch to totally different symbols like “insignia” or “insider”. Changing characters in a word symbol switches to a different symbol and manipulates the text, but doesn’t change the original word symbol “insight” – and how could we?

As we have established that “insight” is a symbol, what can we do with it, or how to manipulate it? There’s the option to re-define it or fiddle with its meaning, which can be considered a bad, confusing action or a very creative activity as well, depending on context. The “insight” symbol doesn’t reference a tangible physical object of course, but an abstract concept, which isn’t a big difference – it’s just giving names/identifiers to whatever we might want to talk about, as a shorthand or “link”/“address” to the real thing. The actual meaning of the symbol has a defined scope (which can be more vague or more strict), which includes a sudden realization or gained understanding about something non-obvious; the long and deep study of something that leads to better, more correct understanding than what others learn from a quick look at the surface; looking into oneself as the practice of self-reflection or -analysis; that’s what our language standard says, what the dictionaries as lookup tables for resolving and dereferencing word symbols say. But then I could start to call my company “insight”, we could agree to use the term/symbol to mean the exact opposite as some kind of secret code or in ironic context, I could “abuse” the term/symbol by using it to describe/name the event that a physical object comes into view/sight of an observer (as in “the ship came into insight distance”), or similar. Notice that the symbol itself hasn’t changed and hasn’t been manipulated; I instead manipulated what the symbol is pointing to, or the scope of meaning, what it can and can’t point to. Symbol manipulation in terms of changing and overloading its meaning is somewhat dangerous because it becomes less useful if we do it too much.

What is symbol manipulation then? If I come up with the word “outsight” to refer to a situation in which insight never can be obtained; sudden, surprising findings about something while I was looking for something else; looking from inside outwards; general dumbness or whatever else (similarities in meaning scope are just because I followed a similar character construction rule that allows the deduction of a negated meaning, but the actual referenced concepts/meanings are different and distinct, they may or may not even be opposed, and I could have picked a different selection of meanings or a different combination of characters to refer to some or all of the mentioned concepts), it barely affects the original “insight” symbol and its meanings, only by mere accident/coincidence. One could claim that this is a symbol manipulation example because I relied on the original symbol to construct/derive the new one, so there is a relation, but I could make the point that the symbol itself is rather arbitrary. It’s perfectly fine to come up with new words that don’t have any resemblance to existing words/symbols (although it’s considered bad design) and define their meaning or meaning scope. I could just define that “anpecatrm” refers to the activity of looking out of the window (to specify the scope: specifically and only used when there is a window of an implied house, not to be used when looking out of the window/windshield of a car).

How else could symbols be manipulated? We could consider the usual manipulations of typography, typesetting, rendering, visualization, but if “insight” in red has a distinctly different meaning than in green, changing the color changes what meaning is referenced; the two symbols stay separate from each other and their color can’t be manipulated interchangeably. Such operations can be a way to trigger/hint at different connotations however, to indicate a slight difference in meaning scope, but please note that we are only able to do so after leaving the encoding convention of plain text and entering the entirely different encoding conventions (another dimension) of pictorial visualization.

If you’re an electrical engineer and encounter computers with their binary information encoding, the realization can be (see Turing) that the bit patterns are arbitrary symbols that can represent other symbols like numbers (most prominent back in the day), text, images, abstract concepts and whatever else, and just as we manipulate binary and numeric symbols, we can as well manipulate text, image, audio symbols (if we can find reasonable methods to do so, that is). For binary and numbers, arithmetic is a useful manipulation method (in contrast to useless manipulations like picking a random bit or digit of a large number and setting all other bits/digits to that very bit/digit). What is it for text? Converting upper-case characters to lower-case? Making a word/symbol italic (but what would that change – do we enter pictorial/visual symbolism, and would it still remain the same symbol)? I have some trouble listing useful methods that manipulate pure word symbols. It may be much easier to list useful symbol manipulation methods for numbers, audio, images, but those too change the symbol so it refers to something else (most dramatically with numbers). Whatever we do to symbols themselves, we usually have to follow pretty narrow constraints in order to preserve them as useful and correct.

So what is it that we really care about? It could be moving symbols around, combining, separating and rearranging them, “enacting” them (to attach effects to symbols and trigger them), and indeed augment their use (“writing” them or picking them from a list of symbols, insert them into other contexts as, for example, formal constructs, or whatever else). Those activities rarely change the symbols themselves as they’re supposed to retain the reference/meaning.

How would we manipulate language, if that’s similar enough to symbol manipulation, if not equivalent/synonymous? Or are (word, visual or other) symbols atomic entities and “language” the rules of where to put them? Is it about us changing vocabulary and/or grammar? Potentially to some extent, but it’s more about manipulating particular symbol sequences in compliance with the established rules. A text, for example, is encountered as a large collection of symbols, composed in a specific language (in which our knowledge is encoded). Language/vocabulary have been in place for a long time now and can’t be changed easily, because their modification requires everybody to agree on the new standard, so that the meaning and the rules for dereferencing become established.

Another consideration: There is no practical obstacle whatever now to a world that exclusively operates on/with audio symbols. Noises and language received a great deal of standardization for their use in writing, reading and print serialization, but with audio interfaces and serialization, would we still hold on to the complex rules of written language composition that target the eye for visual consumption? I can easily imagine that much more efficient symbols and languages could be developed and adopted for acoustic information encoding and communication.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International. See the history of how this text developed.

Personal Hypertext Report #10

With the change tracking text editor completed in its first stage, I can imagine that a lot of people can’t make much use of the XML output it produces. In order to extend it to a full writing system, I’m currently looking into programming a „change instruction navigator“, which is planned to have a rich editor control with the additions highlighted in green and the deletions highlighted in red. Two buttons at the bottom should allow navigating backwards and forwards in history. There could be an option to jump to a specific instruction, and another (optional) button to select a specific version. On calling the program, one could immediately jump to a specific change instruction.

I think I’ll keep a stream object on the file, which will lock it, and realized that the Java StAX API doesn’t allow moving backwards, so I’m looking into developing „Reverse StAX“, and to make things easier, I’ll try to start a C++ reference implementation and later port it to Java, based on my existing CppStAX code. This will delay work on the navigator, but I’m not willing to keep all the instructions in memory, so I hope that it is worthwhile to invest in more powerful XML tooling.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International. See the history of how this text developed.

Personal Hypertext Report #9

Finally, I’ve managed to get the „change_tracking_text_editor_1“ capability working reliably enough for beta testing and prepared a downloadable package: hypertext-systems.org/downloads.php. Java 1.6 or higher is required. A description of the tool/capability can be found in this video.

From here, plenty of very interesting options to extend it present themselves, but I find it important to point out that in my opinion, tracking the development of a text is fundamental for a hypertext system and for serious writing on a computer. Without it, versioning and revision can only be done retrospectively with heuristic diffs as after-the-fact analysis, which can be wrong and lacks information like the order of changes or changes that later got overwritten again. With the text’s history recorded, every individual character as the most atomic element can be addressed with a higher resolution than with indifferent, agnostic diff/patch blocks.
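To illustrate the difference to after-the-fact diffing, here is a minimal sketch of recording every edit as a change instruction at the moment it happens – the structure and names are illustrative, not the actual change_tracking_text_editor_1 format – so that the order of changes and changes that later got overwritten remain recoverable:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One recorded edit: what happened, where, and which text was affected.
struct ChangeInstruction
{
    enum Type { Insert, Delete } type;
    std::size_t position;
    std::string text;
};

// A text that records its own history instead of relying on diffs.
class TrackedText
{
public:
    void insert(std::size_t position, const std::string& text)
    {
        content.insert(position, text);
        history.push_back({ChangeInstruction::Insert, position, text});
    }

    void remove(std::size_t position, std::size_t length)
    {
        // Store the removed text, so the instruction stays reversible.
        std::string removed = content.substr(position, length);
        content.erase(position, length);
        history.push_back({ChangeInstruction::Delete, position, removed});
    }

    // Replays the first instructionCount instructions to reconstruct
    // any earlier version, including states a diff could never recover.
    std::string versionAt(std::size_t instructionCount) const
    {
        std::string result;
        for (std::size_t i = 0; i < instructionCount && i < history.size(); ++i)
        {
            const ChangeInstruction& instruction = history[i];
            if (instruction.type == ChangeInstruction::Insert)
                result.insert(instruction.position, instruction.text);
            else
                result.erase(instruction.position, instruction.text.size());
        }
        return result;
    }

    const std::string& current() const { return content; }

private:
    std::string content;
    std::vector<ChangeInstruction> history;
};
```

Because each instruction carries the affected text, replaying the log forward (or undoing it backward) addresses every individual character, without the guesswork of matching diff/patch blocks.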

For a hypertext system, one has to ask where the texts it’s supposed to manage come from. If they’re from old sources, only compiled, merged, consolidated versions without a record of the history of their writing may be available, but for new texts I write myself today, I don’t want to imitate the old constraints imposed by physical production and instead make full use of the potential of digital. With writing covered for now (although very primitively initially), I can continue with tools/capabilities for resource management, publishing and reading, to eventually arrive at an integrated system for more convenience than using the capabilities individually.

Besides the prototype of a hyperglossary capability (video) and WordPress post retriever (with subsequent conversion to different target formats), the „Change Tracking Text Editor“ is the only other contribution I was able to prepare for the 50th anniversary of Douglas Engelbart’s Great Demo while loosely participating in the Doug@50 effort during the year 2018.

Related books I’ve read while working on the editor: „Track Changes“ by Matthew Kirschenbaum and „The Work of Revision“ by Hannah Sullivan.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International. See the history of how this text developed.

Personal Hypertext Report #8

Stephan Kreutzer is becoming increasingly aware that he’s prevented from working on/with other texts than his own. xml_dtd_entity_resolver_1 was developed to expand named DTD entities coming from XHTML.

I want to study and curate everything that gets submitted in the FTI context, but that’s a slow and tedious process – in part because I don’t have the tools that help with such activity. So if I kept discussing every relevant detail, it would eat away at the already very limited time to actually build those tools. Furthermore, it seems that we’re not creating a single common system/infrastructure together (as NLS was), but continue to work on our different, isolated tool components. That’s perfectly fine, but depending on focus, the approach has to adjust accordingly. At the same time, participants submit a load of contributions just as I do, but using the already existing human/technical systems as a bad starting point for bootstrapping their improvement leads to huge problems that prevent me from doing useful work with the material. Hearing that Christopher Gutteridge suggested the strategy (at 21:26) of “just doing it” to demonstrate what the future of text could look like and to solve our own problems with text work along the way, I think I should just work on my own publications and system, with the extended mandate of pretending that sources similar to the jrnl (context) would be something of practical interest, not just for theoretical hypertext studies and mere passive consumption. This strategy being what I would do on my own as well sends me back to square one, where I left off a year ago in the hope of finding people in the hypertext community who want to bootstrap a capability infrastructure (yes, I know, a vague term inviting many different interpretations). Considering this approach with the increased difficulty resulting from the use of WordPress and web/browser components immediately paralyzed me for a whole week, but it’s not the first time I’ve encountered the overwhelming “curse of complexity”, so I have my methods to deal with the phenomenon and progress can be made. Solving the many problems one after another isn’t too difficult after all; it’s just a lot, and that takes its time.

I also grew increasingly suspicious about whether the Global Challenges Collaboration wants to do something Engelbartian, despite the theme constantly recurring. I was listening through their archive of recorded conversations, and those reveal that the group is far from being interested in TimeBrowser capabilities as a supporting function, or in solving complex, urgent world problems, or in collaborating in ways that didn’t emerge from their conversations. I’m unable to engage, as their open web channels (not Zoom, not Facebook) are abandoned, my submissions go to waste and their material remains largely unavailable. Even their more practical action group is no exception – just think about what one can ever hope to actually do when considering what follows from 53:12 of the recent “GCC Sunday Unblocking” or 2:15:10 of “GCC open space Thursday”.

So by now, I’ve written a few texts of my own and don’t need material created by others any more, and although they’re not dialogue, I can pretend they are, especially by faking a second voice on the jrnl that’s compatible and collaborating, in contrast to other non-cooperating entities/sources on the network.

As a first step, I added the xml_dtd_entity_resolver_1 capability/component/tool that’s able to resolve DTD named entities based on either local DTD resolving (local catalogue) or a simple replacement dictionary. XML files that make use of DTDs (part of the XML spec) can be converted into their expanded equivalent for further pure XML processing, without the need to rely on DTDs any more in cases where such a state is favorable. The particular application is the wordpress_retriever_1 workflow, so it can expand the named DTD entities coming from XHTML. A potential hypertext system won’t internally rely on XHTML and DTDs (nowadays only encountered as built-ins in browsers anyway) and needs to convert into formats that might not be aware of DTDs, XHTML or XML at all. As XML and DTDs are different formats, it doesn’t make a lot of sense to mix them, be it in their own specification or in XHTML, because it gets in the way of bootstrapping semantics, which is why I need to get around their conflation.
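The replacement-dictionary variant can be sketched as follows (a simplified illustration, not the actual xml_dtd_entity_resolver_1 code); it maps XHTML’s named entities to the characters they stand for, while leaving XML’s own predefined entities like &amp; untouched:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NamedEntityExpander {
    // A tiny replacement dictionary; a real catalogue would cover all
    // entities declared by the XHTML DTDs. XML's own predefined entities
    // (&amp; &lt; &gt; &quot; &apos;) are deliberately not listed here,
    // as every XML parser already understands them.
    private static final Map<String, String> DICTIONARY = new LinkedHashMap<String, String>();
    static {
        DICTIONARY.put("&nbsp;", "\u00A0");  // no-break space
        DICTIONARY.put("&auml;", "\u00E4");  // a-umlaut
        DICTIONARY.put("&copy;", "\u00A9");  // copyright sign
    }

    // Expands the known named entities so the result is pure XML that
    // no longer depends on any DTD declarations.
    public static String expand(String input) {
        String result = input;
        for (Map.Entry<String, String> entry : DICTIONARY.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

For example, expanding “&lt;p&gt;&amp;nbsp;&lt;/p&gt;” yields a paragraph containing a literal no-break space character, which any DTD-unaware XML tool can then process.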

Still, as far as the jrnl is concerned, I might deliberately not expand named DTD entities in order to have the XML parser fail, so I can learn about the entities that are actually used, because the only ones in use right now are &nbsp;s in empty <p>s, and we really should look into where they come from as well as get rid of them. One can legitimately defend the use of DTD entities, but not in this case. I’m pretty sure that these empty paragraphs are used to add vertical space, which is a quite confused practice in itself, and I guess they’re created by pressing the Enter key in the WordPress editor several times, with the &nbsp; there to prevent WordPress from automatically removing the otherwise empty paragraph (or alternatively, that’s how WordPress implements several consecutive Enter linebreaks in lack of any decent idea what to do with them). In other words, this renders another piece of WordPress rather useless, namely the editor, so I wonder what’s left. A decent editor should not allow manual linebreaks.

To make my progress a little bit more usable, I added a starter script to the packager and compiled the prepared download package. I also updated the site to the extent that it at least reflects some of the results via the videos about them, for lack of better ways to present them. In part, it’s supposed to prepare for the Frankfurt Book Fair in case I want to refer somebody to the effort.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Personal Hypertext Report #7

I’m an implementer, tool builder, technologist; but that shall not preclude me from working on the human system side as well. Sometimes some of our tools will process data that happen to be human expressions. Today, whatever capability a tool might provide, and even if it is the most amazing ever imaginable, there can’t be any hope of meaningful improvement on the tool or human system side in such cases, as copyright law can easily render it completely void. There’s much to write about the damage print-era copyright law does to the digital world, but even the hypertext community seems not to care at all about the human system side in that respect. It’s no coincidence that we have no other choice than starting from scratch with our poor attempts to build another new hypertext system, as most of the earlier work is unavailable, unusable or lost for legal reasons, and I’m not aware of a single theoretical work that wasn’t published under the traditional publishing paradigm. Sure, the general availability of digital computers and networks is a fairly new development (we remember a time in our lives when there weren’t any), so the early computer/network pioneers didn’t have to deal with this issue, as the technology they built has no inherent recognition of or support for copyright law, which was designed to regulate the publishing industry. Even with our recent attempts, I have no legal permission to use anything published on jrnl.global except my own stuff. There’s absolutely no point in developing collaborative hypertext capabilities like a shared journal if we lack the most basic human-system-side policies that make it shared. There’s absolutely no point in curating work if it can disintegrate or disappear at any time. I don’t need the submissions to an institute to hold a monologue. What follows is a practical description of how copyright prevents collaboration.

For my Personal Hypertext Report #6, I made a screenshot image that contains portions of Frode Hegland’s “jrnl launch meeting announcement (for Friday 24th of August 2018)” text. As the report primarily covers actual issues of the existing posts on jrnl.global, it made sense to not only publish/distribute it via my own dedicated machine, but also to submit it to the FTI jrnl under Frode’s control. Sure, such practice is totally stupid because we don’t have a capability yet for both copies to share the same canonical identity, but what else do you do as long as nobody is working on such capabilities yet? So for the screenshot on the jrnl under Frode’s control, there is no legal problem, because in the process of submitting, I gave permission to Frode to distribute my contribution to the screenshot, while Frode already has the permission to distribute his own contribution to it, of course. Now, I also want to distribute the screenshot including Frode’s contributions to it myself, be it via the dedicated machine(s) under my own control or in print, for which I need Frode’s permission, and that’s not only for me personally. Furthermore, I want to invite my visitors/readers/users to distribute my part of the work as well (for backup/posterity reasons), to create derived works, to distribute the derived works and to use all of that for whatever purpose (the four essential freedoms of libre licensing). As the screenshot can’t be separated into Frode’s contribution and mine without disintegrating or destroying it, it’s up to Frode to consent or object to my plan of giving everybody blanket legal permission to do their own computing, reception, curation and publishing in freedom, sovereignty and independence, as the technology easily enables them to – so I intend to remove the legal prevention and risk that stand in the way of realizing the full potential of what digital can offer. Therefore, a non-transferable permit limited to me personally wouldn’t do.

I could argue that the screenshot is a new work/interpretation with its own new copyright held by me as the original author. If so, I would license it under the GNU Affero General Public License 3 + any later version (without having really investigated what it means for works that aren’t software) and also the Creative Commons Attribution-ShareAlike 4.0 International (although it doesn’t have a trusted path of upgradability). Why do I pick these two licenses? Simply by looking around for which ones are most protective of users’ rights/freedoms and what features they offer in service of that. These licenses are not perfect, but the best we can do for now without violating the law, as we can’t change it nor afford to ignore it.

But did I obtain rights myself – is there enough that satisfies the threshold required to become copyrightable? It could be the composition; I also drew the rectangles. But then, the output of my converter isn’t copyrightable, as it was generated by a machine and not a human. Making an image of it in a particular way/composition might render it copyrightable again, but I didn’t change the program output, so it’s probably not enough creative expression of my own. Can I obtain rights similar to reproduction photography, which makes a photo of an old Public Domain painting in the museum and gets copyright on the basis of “creative arrangement” ending 70 years after the death of the photographer? In contrast to photography, I can’t creatively influence the capturing process of the screenshot function as standardized by the operating system. On the other hand, I positioned the two windows carefully and also cut the image to a certain detail subframe, something that reproduction photography might not be allowed to do, as it’s supposed to reproduce the original truthfully.

What if the screenshot turned out to be a combined, collaborative work? That’s usually the case if several authors work together in creating a shared result for which the individual contributions are hard or impossible to determine afterwards. Now, Frode and I didn’t work together in arranging the screenshot, but Frode’s contribution in the form of the text is clearly identifiable, so did we both earn copyright for the new, combined work? What about the UI elements that are visible from the text editor and WordPress – are their UI layout design portions (as used) covered by copyright, and would I need to ask for permission to publish the screenshot? If so, we all would have to agree on licensing in order not to block each other from using the result ourselves. If we fail to agree on licenses, or one of us changes his mind at any point in time and fails to convince the others, and it makes no sense or is impossible to break up the combined work into its constituting parts, the effort of creating it would go to waste, which is why collaborative projects need to decide on licensing in advance to avoid legal entrapment.

The screenshot could also be seen as a derivative work, and creating one requires permission from the original author of the work the derivative is based on. While it’s pretty obvious that Frode’s text isn’t the main subject of the screenshot, Frode could claim that the image is a derivative because I would not have been able to produce it without his text as the basis. Choosing a different text or coming up with one myself might sound like a feasible alternative, but the aspects I want to focus the viewer’s attention on are particular constructs as found in this specific instance. Demonstrating them in any other context would make them far less relevant in their function as a practical statement. So if it is a derivative, I didn’t ask Frode for permission to create it.

Frode suggested that I should handle it as a citation, maybe because Frode works on documents and academic publishing a lot, so he’s the expert when it comes to citations. Quite frankly, I’m not convinced that my use of his text portion in the screenshot is actually a citation, and I wonder how Frode seems not to be aware of the citational deficits. For one, I’m not referring to any specific statements of the text. The red boxes highlight sporadic XHTML features that might be created by a program, so if I’m citing, I might not be citing Frode. Frode’s text just happens to be around the potential citation and is captured as the broader context, while not being aligned and being cut off, so the point really isn’t to cite the text at all. Second: I didn’t copy the cited text authentically (except if the claim is that there are many citations, cited down to character level and even splitting characters visually), as a citation would require. I also don’t indicate the source of the portions, and won’t for the entire screenshot, because the source for the latter is myself; it doesn’t exist in this particular composition anywhere else. Or is the idea that the image can only be found on his jrnl blog server instance? Well then, I submitted it there, so if I had uploaded the image to my server (ignoring copyright implications), the result would have been me citing myself, without involvement of Frode other than being the storage host for the citing publication. If it were a citation, I and everybody else would be able to use my image composition under my copyright terms, including Frode’s cited portions, as long as the copied citation material isn’t changed and the reference to the source is kept intact, without asking Frode for permission in advance. Not having to ask Frode for permission for every occurrence of redistribution would be a nice thing, but being prohibited from modifying the incorporated copy of the cited original wouldn’t.
Do we look at Wikipedia articles as works built by each contributor citing all earlier citations/contributions down to the very first author who created the initial article text?

With a citation, Frode’s text portion as “cited” by me doesn’t need his permission to be cited, nor does his text fall under my copyright license just because I cited it. So if someone extracts and reproduces Frode’s text from my image, that text is solely under Frode’s copyright provisions, but if he does something with my work as a whole including Frode’s citations (derived works, for example), it would fall entirely under my copyright provisions. Does Frode want to resort to citations in order to retain his copyright in case somebody extracts his text from my image – to not give his text into the commons of the libre licensing community, as his original article publication is still under “all rights reserved”? Does he care about the function of citations as references that serve as promotion and “impact factor” (“relevance”, page rank and SEO for academia)? Does he care about personality rights/moral rights? If I only knew, I could adjust accordingly.

It’s important to understand that copyright notices as a consequence of licensing don’t serve as a mechanism to make explicit who referenced what, as citations do for critical review and discussing the cited separate original. Instead, they’re about explicitly declaring the legal status of a combined or non-combined work. Copyright isn’t designed to make things open and useful; it is designed to do the exact opposite: to restrict usage, mandated by law as the default. Citations in their legal function limit copyright a little bit, so useful reception isn’t prevented entirely. The legislator deemed the most primitive “collaboration” of re-inventing the wheel over and over again to be sufficient. This was imposed by the slow, static production and dissemination of printed books, where no-one can interact with the original author via the physically fixed medium, so the suggested way is to prepare and publish a response that includes citations and improvement suggestions in another book, for the original author/publisher to hopefully discover in order to cite them again in a third publication. This methodology is the maximum of what citations grant us legally for “collaboration”, despite the advent of computers and networking.

If there’s some disagreement that eventually leads to a legal dispute, a judge would need to decide what kind of work the screenshot might be and either split it into individual, autonomous parts or treat it as an indivisible whole. Depending on the outcome, we would face different scenarios of who the rightsholder would be: Frode alone, me alone, both of us, or none of us if the screenshot should turn out not to be copyrightable. I would learn whether I infringed on Frode’s copyright or not, whether I (and subsequently others) would be permitted to make use of the creation or not. In fact, pre-digital copyright utterly lacks ways to find out who is a copyright holder or to establish who owns what if only specific parts of a combined work should be used, which in itself is reason enough to explain why copyright is broken beyond repair. Transclusion doesn’t help at all, because the transcluded original source can disappear at any time, and in the case of the screenshot image, there’s no way to technologically construct it in the exact composition of elements as on display. Creating ViewSpec capabilities that enable such results is not only very difficult; they would be of almost no general use and therefore a great waste of time. Neither transclusions nor ViewSpecs would solve any of the legal problems; they would only distract from them by creating the impression that combined works are possible without too much care about the legal status of the elements they’re composed from.

The sad reality of copyright in the digital era is that no matter how good the technical work or content is – if the human system of accompanying legal provisions turns out to be deficient, everything is destined to go right into the trash, pretty much like most of software development and text publishing is done up to this day, producing into obscurity and irrelevance.

To be completed…

What’s the “Topic” Tool?

I think most of us can easily agree that any structure is better than no structure, and flexible structures are better than just a single, fixed one (I would love to explore contrary views on this; otherwise I’ll assume this as the working hypothesis). There are obviously costs associated with the application of structures to what otherwise is unstructured, context-less, meaningless chaos, but that’s also the only way we know to gain benefits from what we have or are looking at.

I didn’t think about topics as a structuring tool/mechanism – in terms of a particularly useful structuring tool, which is the whole point of introducing such structure at all (except for natural structures that exist as a matter of fact but might not be too useful to us, as we don’t understand the nature and structure of many things). Categories, taxonomies and keywords/tags are useful structuring tools in my mind; topics, by contrast, I regarded as broad, overarching, general “brackets” that loosely include what might be relevant for a question or problem and exclude everything that doesn’t seem related. As a topic tends to be broad and not very specific/formal, aspects and related fields can easily be added to or removed from it, which makes it a not very useful structuring tool: the topic is a single structure with a flexible, versatile meaning/interpretation. One day, other, previously unrelated knowledge/facts/aspects/topics can turn out to be closely related to the topic, and the next day it is found that the connection was wrong, that they are in fact unrelated and only seemed related, so those things get removed again. Thus, the topic is more of a snapshot tool for declaring what the human knowledge workers think at a given time is related/included in a topic, as distinct from what is unrelated/excluded. It’s much more difficult to deny a piece the annotation/connection to a certain category, taxonomy or keyword/tag, in no small part because those are applied to small pieces/portions, while topics cover large collections of pieces, categories, taxonomies and keywords/tags; even if the latter are in conflict with each other, they can still be included in the same generalized topic as different perspectives on what’s relevant within the topic.
Sure, we know that in reality “everything is deeply intertwingled” and a problem of immense complexity, so the topic as a structuring tool doesn’t reflect reality; it is indeed just a tool. That’s why topics face resistance/opposition from people who think that separating disciplines, stereotypes etc. are a bad thing precisely because they’re tools that don’t reflect reality. It’s not that these people can suggest a more useful alternative (cybernetics exists, but doesn’t improve the usefulness that much); they demand that the limited usefulness a topic has be deconstructed further, maybe because they consider it a bad and misleading and dangerous thing to think or look at things on a broad, generalized scope – that it is an illusion that you can.

That’s my current view of what topics are; it’s certainly a different question whether/how we can improve topics, improve on topics, or improve our structuring tools, as well as the question whether our current tools/technology (properly or improperly understood and/or applied) are suitable (useful) enough for the increasingly complex problems at hand.

Just to note, before I forget: from computers we’ve learned that an answer to the latter question could be the network/“graph” – Ted Nelson’s crusade against hierarchical structures, which topics are despite being flexible, because they’re “on the top” with other things “included/grouped below/within them”.

Addition: Everything is a structure, and if we care enough, we can also make it a topic. I’m not sure if we can reasonably describe something that has no structure, or if things without structure can or do exist, but I’m curious how we could approach such a notion. Consciousness might be something that’s not a structure, and we could discuss whether consciousness requires host structure(s), but here we’re back again at the problem that we can’t properly talk about it, because the lack of structure makes it hard to prove its existence. It’s not that things that potentially exist or don’t exist can’t exist if we don’t find their structure; but in the absence of finding their structure or assigning a structure to them, one can easily claim that they exist as well as claim that they don’t, which may or may not have influence over their real existence. What’s certain is that we can’t easily talk about it, for that particular reason.

To avoid confusion about the “may or may not have influence over their real existence” statement: one can bring things into existence by simply claiming that they exist, or by introducing structure to something that was unstructured before (so it exists in or by or because of the structure), and we can debate whether they really exist, but they’re not less or more existent than we are. Whether they have a consciousness is a different question, but even the possibility that they could have consciousness can’t be easily dismissed, even for the things we otherwise would be most sure don’t exist and aren’t real. A prime example could be a fictional character in a book or movie: is he/she more or less real/existent than, let’s say, “Shakespeare”, or you and me?

By the mere act of talking about consciousness, we certainly made it a topic and gave it (some) structure, but does consciousness itself have a structure – can we even know if it exists? Surely it exists, because of us assigning/identifying a structure of what consciousness is or might be and what it isn’t and probably might not be, so it has at least one structure (ours, as a topic or several topics, at least). So we’re back at wondering whether things without structure can exist – again, not in terms of whether they actually, really exist, or exist actually/really because of us, or only virtually, or any of that, but existence as something we can learn and talk about, in opposition to things that may or may not exist but about which we can’t talk or gain any knowledge because of the lack of observable structure (including our own made-up structures to talk/think about things that didn’t exist for us before). So we can say that we don’t know about the existence of anything without structure – except unstructuredness itself, potentially, if it actually or virtually exists, but that might be the only unstructured thing we can ever talk/learn about.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Personal Hypertext Report #6

Stephan Kreutzer doesn’t want to work on a WordPress enhancement: the current posts on the default installation broke XML processing by a non-browser client because of a named DTD entity and bugs in WordPress’ wpautop() function – both unnecessary problems caused by the general web context, too expensive to fix and difficult to work with.

There’s great consensus that the first project of the FTI, the jrnl, should be a WordPress enhancement project. I’m happy with that as a good practical compromise for bootstrapping ourselves: it’s readily available, and many people know how to set it up and/or use it; it’s libre-freely licensed (not ideally though – GPLv2 makes no sense for software that is run over a network; the software should be licensed under the GNU Affero General Public License 3 + any later version, but they probably lack a way to update the license or can’t do it for social reasons); it tends to be an “open”, friendly actor of the open web; and it has a usage share of the total of all websites that makes it likely that I encounter such legacy web content I want to work with – I wouldn’t want to lose those entirely. For our group, we need something to capture our discussions, otherwise we would lose some of our own history. Doug Engelbart, at a time when he didn’t have NLS yet, printed his proposals on paper, which is good enough to communicate the idea and get the project going, as these documents can later be imported into the very system they describe; it’s not that we need to wait for the perfect tool in order to allow ourselves to talk about the design of such a perfect tool. The conversation about this topic is more important than the costs of doing it wrong. It’s part of bootstrapping, really, to improve on what’s already around, because if we were to re-invent everything, starting from digital electronics over networking infrastructure to software architecture and graphical rendering engines, we would never get anywhere in regard to our end goals, so the improvement process must happen from within – adjusting, enhancing, replacing piece by piece for an exponentially enabling overall progress.

What I personally plan to do, however, won’t contribute much to a WordPress enhancement, because WordPress, for all its benefits, also comes with many problems, as it happens to have been built according to the web model. The web wasn’t designed as a hypertext capability infrastructure, but as a document accessing scheme that developed into a fully blown programming framework for online software. It has never been much concerned about the needs of text, which becomes all too apparent from the otherwise perfectly reasonable things it cannot do. Changing that within the web paradigm would be unbelievably expensive for socio-political reasons and in the required money/time investment, so I just might play devil’s advocate for client components that aren’t browsers and don’t care much about what servers do for the web, by building my own little system that solves the problems I still have when it comes to reading, writing and publishing. That can either lead to wonderful interplay with WordPress enhancements or to great conflicts, depending on the technical specifics of implementation and design. In any case, it’s a worthwhile research endeavor that promises directly usable results in the larger context, no matter how all the other parts behave, being the independent, potentially non-cooperating entities on the network that they are.

The first thing I tried as a simple test was to automatically download the existing posts on jrnl.global with my wordpress_retriever_1 workflow. I encountered two minor issues that were easily solvable by curation without negative effects on what WordPress generates for the web: In the biography post about Teodora Petkova, there was a single, lone &nbsp; on the last line, and that broke my XML processing of the post data as retrieved via the JSON API. XML does not come with a lot of DTD entities except those to escape its own special characters, and is therefore unaware of HTML’s legacy DTD entity declarations. They’re usually not really needed, because characters should just be encoded in the file, with the encoding clearly stated in the XML declaration or processing instruction. I agree that it might sometimes be favorable to explicitly use escaping for special characters, especially the non-printable ones that can be hard to read and write in standard text editors, but then XML supports the encoding of Unicode characters in the form of &#xA0; (hexadecimal). My parsers generally still lack the ability to translate these Unicode escape codes to the actual character (with the additional difficulty of never breaking the encoding through the entire processing pipeline), but that needs to be added eventually anyway, with the XML tools probably already being fine.
The old HTML entities for encoding special characters are really a legacy remnant of DTDs, and while a DTD declaration at the beginning of an HTML file might arguably help with identifying the HTML version (especially a huge problem with “versionless”, “living” and therefore bad HTML5), they also cause a lot of trouble for processing XHTML with an XML processor (which is just great, but was abandoned by the web and browser people for questionable reasons): DTDs can include other DTDs, each of which can theoretically contain entity declarations, so a generic, vocabulary-agnostic, XHTML-unaware XML processor needs to load/read all involved DTDs in order to find out how entities need to be replaced for text output – which entity names are defined and which aren’t (the latter will cause the processor to fail). Now, the DTDs define the HTML vocabulary, a specialized format based on the generic XML format convention, which is published as a standard by the W3C and needs a unique identifier to disambiguate it from other standards and versions; otherwise the whole vocabulary would be ambiguous and of no help for processors to interpret its meaning correctly. So if the W3C needs an identifier, ideally a globally unique one (which implies a central name registry), what do you think they would use, especially back in the day? A URL, of course! Act two: imagine you write a generic XML parser library and encounter a DTD declaration. Your library is eventually also going to parse DTDs and allow DTD validation (for proper, but separate, HTML support), but you lack the time or knowledge to do a particularly good job in the first release of the implementation.
There might even be legal constraints, as the DTD files by the W3C come with some almost non-free restrictions, or the question why a generic XML parser library should ship with W3C DTDs if some of its users will never parse any XHTML at all. What happens a lot of the time is this: as the unique identifier looks like and indeed is a URL, and the referenced, needed DTDs are not locally available, these parsers send out an HTTP GET request for the initial DTD, which then references several other DTDs internally, and sometimes the received results (if any) don’t get saved (because why would you, and how, in a small, ad-hoc, generic XML parser?). There are probably plenty of web crawlers and bots and client implementations that try to obtain a resource from the web and then read it in for further processing with a normal XML tool, and in a distributed network, this causes quite some excessive traffic for the W3C server. So what did the W3C do in response? They throttle serving the response, of course! So imagine yourself being an unsuspecting user of such an XML library, trying to parse some nice XHTML, just to observe that your program execution stalls. Is it a bug, an endless loop? You might enter a debugging session but can’t find out a lot. Lucky you if you eventually go get yourself a coffee while leaving the parser running, instead of giving up on the plan entirely, only to be surprised on coming back that the XHTML input miraculously did get parsed somehow. That’s likely because the artificial delay at the W3C server has expired – to then serve the requested DTD(s)! Treating the identifier as resolvable instead of as an abstract one is the root cause of this problem, and delivering the DTDs encourages such behavior; additionally, the delay might leave the impression that XHTML/XML parsing is very, very slow. But what else can you do? Not serving the DTDs would leave HTML entities unresolvable, and the XML would fail to parse.
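The saner approach is to treat the PUBLIC/SYSTEM identifiers as lookup keys rather than as something to dereference over HTTP. A hypothetical helper (names are mine, not from any real catalog library) that pulls the identifiers out of a DOCTYPE declaration so they can be matched against a local catalog might look like this:

```python
import re

def doctype_ids(xml_text):
    # Extract the PUBLIC and SYSTEM identifiers from a DOCTYPE declaration
    # so they can be looked up in a local catalog instead of being fetched
    # from the W3C server via an HTTP GET. Hypothetical sketch; a real
    # implementation would use the parser's DOCTYPE event, not a regex.
    m = re.search(r'<!DOCTYPE\s+\S+\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"',
                  xml_text)
    return (m.group(1), m.group(2)) if m else None

xhtml = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
         '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
         '<html xmlns="http://www.w3.org/1999/xhtml"></html>')
ids = doctype_ids(xhtml)
```

With the identifiers in hand, a catalog can map them to local files, and no request ever leaves the machine.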
If parsers/processors don’t have the DTDs and can’t get them, all of XHTML would be unreadable, which would be a grim option for the whole XML/XHTML/semantic effort. But can XHTML demand special handling from all generic XML parsers out there? Of course not; the whole point of XHTML is to make it universally available, just like all the other XML vocabularies. Therefore, still having the DTD declaration and named entities in XHTML for backward compatibility with crappy HTML4 while not advancing to XHTML2 (HTML5 removed the DTD, and with that, XHTML5 doesn’t have one either; but as rubbish HTML5/XHTML5 is the version-less “living standard”, indistinguishable from a potentially upcoming HTML6/XHTML6 except by heuristic analysis, an HTML5 DTD would have been better than no version indication at all) is a historical tragedy that renders even the best, well-formed, valid portions of the web hard to use for some of our machinery, completely unnecessarily. For my tools, I install a catalog of the DTDs that I included in the download, so all DTD identifiers can be resolved to the local copy, but that’s not an option for my own primitive StAX parser in C++ or its sloppy JavaScript port (the deliberate lack of XML tools in browsers is unbelievably ironic and a big loss of potential). The XML parsers of the big programming frameworks usually also have a DTD parser on board and just need the W3C DTDs from somewhere, ideally delivered as a local resource, with me resolving the ID to the local file via the catalog. These catalogs could be adjusted to resolve not to the actual W3C DTD for mere parsing (we’re not doing DTD validation here, are we?), but to a pseudo DTD that contains nothing but the named entity declarations and omits all references to any of the other DTDs.
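The pseudo-DTD idea can be sketched without any catalog machinery at all: swap the external W3C DOCTYPE for an internal subset that declares nothing but the named entities. This is a Python stand-in, not my actual tooling; it leans on the stdlib’s html.entities table for the HTML 4 entity names:

```python
import re
import xml.etree.ElementTree as ET
from html.entities import entitydefs  # HTML 4 entity name -> character

# XML predefines these; the spec constrains how they may be redeclared,
# so the sketch simply leaves them out:
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def with_pseudo_dtd(xhtml):
    # Replace the external W3C DOCTYPE with an internal subset holding
    # nothing but named entity declarations, so a generic XML parser can
    # expand &nbsp; and friends without any network access or DTD files.
    body = re.sub(r'<!DOCTYPE[^>[]*>', '', xhtml, count=1)
    decls = "".join(f'<!ENTITY {name} "&#{ord(char)};">'
                    for name, char in entitydefs.items()
                    if name not in PREDEFINED)
    return f'<!DOCTYPE html [{decls}]>{body}'

doc = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
       '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
       '<p>no&nbsp;break</p>')
root = ET.fromstring(with_pseudo_dtd(doc))
```

After the swap, the same parser that choked on &nbsp; earlier expands it to the no-break space character without a single HTTP request.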
For the custom, small StAX parser, I think I’ll add a mechanism that instructs the parser to first read an XML file that lists named entities and their intended textual replacement, so it can continue with the real input file and recognize all entities that were configured this way in advance. And all of this needs to happen in preparation to handle a single &nbsp; that doesn’t have any effect on the visual appearance anyway. Pragmatically, I just removed it from the blog post for now; I think it won’t be missed.
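A minimal sketch of that mechanism, in Python rather than the parser’s actual C++, with a config format I’m making up here for illustration:

```python
import re
import xml.etree.ElementTree as ET

def load_entity_table(config_xml):
    # Hypothetical config format, assumed for this sketch:
    #   <entities><entity name="nbsp" text="&#xA0;"/></entities>
    root = ET.fromstring(config_xml)
    return {e.get("name"): e.get("text") for e in root.iter("entity")}

def expand_entities(xml_text, table):
    # Naive pre-pass: replace configured named entities with their textual
    # replacement before the real parse; unknown names are left alone so
    # the parser can still reject them. A real StAX parser would do this
    # during tokenization, not over the raw input, which here would also
    # (wrongly) touch CDATA sections and comments.
    return re.sub(r'&([A-Za-z][A-Za-z0-9]*);',
                  lambda m: table.get(m.group(1), m.group(0)),
                  xml_text)

table = load_entity_table(
    '<entities><entity name="nbsp" text="&#xA0;"/></entities>')
expanded = expand_entities('<p>no&nbsp;break</p>', table)
```

The entity table itself being XML keeps the whole configuration readable by the same tooling it configures.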

Something else broke the parsing of the jrnl launch meeting announcement (for Friday, 24th of August 2018) post: welcome to the wpautop() function of WordPress! “autop” stands for “auto paragraph”: it replaces two consecutive line breaks with XHTML <p> tags. In the process, for XML/XHTML well-formedness as the very minimum of standards compliance, it has to find the beginning of the previous paragraph and the end of the paragraph that follows. It also must respect other encapsulating tags, as tags form a hierarchical structure and aren’t allowed to overlap. For some reason, autop is notoriously error-prone. Have a look at the issue tracker: there’s #103, #488, #2259, #2691, #2813, #2833, #3007, #3362, #3621, #3669, #3833, #3935, #4857, #7511, #9744, #14674, #27350, #27733, #28763, #33466, #33834, #38656, #39377, #40135, #40603, #42748, #43387. The fact that regex is involved explains part of this mess. WordPress recognizes that plain text line breaks are harmful when they’re typed to convey semantic meaning; as the application manages output for the web in (X)HTML, whitespace like line breaks doesn’t encode meaning there – it usually gets entirely ignored in the visual rendering and instead serves to format the source code for better readability. If the plain text editor of WordPress is used, autop should not be applied to the text at all, because it can be expected that the user will directly input XHTML. The visual editor should create a new <p> pair with the cursor right within it – any subsequent enter key press in an empty paragraph should have no effect (but then people will start to enter spaces into the empty paragraphs to force their way to vertical spacing for layout/typesetting, so a whitespace check needs to be performed). An actual plain text editor variant should XML-escape all angle brackets, as mixing different formats and encodings is always a bad idea.
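The core idea of autop fits in a few lines; here is a deliberately minimal Python sketch of it (not the real wpautop(), which is PHP and much larger), including the whitespace check mentioned above:

```python
import re

def autop(text):
    # Minimal sketch of the wpautop() idea: blank lines separate
    # paragraphs, each of which gets wrapped in a <p>...</p> pair.
    # Whitespace-only paragraphs are dropped, so stray spaces cannot
    # force vertical spacing. The real wpautop() additionally has to
    # respect block-level tags and the tag hierarchy, which is exactly
    # where its regexes become error-prone.
    paragraphs = [p.strip() for p in re.split(r'\n\s*\n', text)
                  if p.strip()]
    return "\n".join(f"<p>{p}</p>" for p in paragraphs)
```

Everything beyond this simple case – paragraphs that already contain block-level markup, tags spanning blank lines – is where the two dozen tickets above come from.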
Here we’re already losing WordPress as a decent writing and publishing tool, and that’s with the out-of-the-box default configuration, with no plugins or themes installed yet to interfere with the content even more. The divs with the ApplePlainTextBody class seem to be of other origin, and are also pretty useless, as the CSS class doesn’t seem to exist and doesn’t help as a mechanism to derive semantic meaning. There was even an empty one of those divs, so I just removed all of them, which also caused autop to no longer fail on what otherwise was perfectly valid XHTML.

“Enhancement” can mean different things. “WordPress enhancement” means using WordPress as the basis for the enhancement, but then “WordPress” can be looked at from different perspectives. The web perspective isn’t one I personally consider much, because I precisely want to get rid of the limitations associated with it, plus it would be too expensive for me to invest in fixing all the problems that exist with the current implementations/paradigm against the resistance of those who actively favor them or simply don’t care.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.