Doug@50: Status Update 2018-03-15

Management Summary: Problems with local, client-side XHTML (CORS, local file I/O) prevent me from making progress with the independent glossary implementation for our Journal; the question is whether the web stack needs to be abandoned entirely in favor of a switch back to native application programming.

You may or may not know that I tried to build an independent client for our Journal Glossary, which retrieves all blog posts, some of which are glossary entries and some regular blog posts, and applies the former onto the latter. It’s not intended as a final “solution”/“product”, but as a way to start experimentation/prototyping in a bootstrap fashion, to learn about the problems and improve the solutions. It is on GitHub (green button “Clone or download” on the right, “Download ZIP”, extract, double-click on index.html and wait). The glossary is output at the very bottom, and the glossary terms used within the text are highlighted in yellow once loading has completed. The visual representation is deliberately a little awkward; instead of yellow highlighting, there could just as well be a popup/popunder or a link, or the glossary could provide a list of links to the blog posts that contain a term, etc. That’s where we can play around with it to get a small demo working quickly.
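To make the core of that client a bit more concrete, here is a rough sketch of the glossary pass in JavaScript. It assumes the posts have already been retrieved (for example via the WordPress JSON API) and classified; the field names, the classification check and the highlighting markup are illustrative placeholders, not the actual implementation:

    // Hedged sketch: apply glossary entries onto regular blog posts by
    // highlighting every occurrence of a glossary term in yellow.
    function applyGlossary(posts) {
        // "isGlossaryEntry", "term" and "html" are placeholder field names.
        const glossary = posts.filter(post => post.isGlossaryEntry);
        const articles = posts.filter(post => !post.isGlossaryEntry);

        for (const article of articles) {
            for (const entry of glossary) {
                // Wrap whole-word matches of the term in a highlight span.
                const pattern = new RegExp('\\b' + entry.term + '\\b', 'g');
                article.html = article.html.replace(
                    pattern,
                    '<span style="background-color: yellow;">$&</span>'
                );
            }
        }
        return { glossary, articles };
    }

In a real client the replacement would of course have to avoid matching inside tags or attributes, which is one of the reasons a proper parsing pass is preferable to naive string replacement.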

Yesterday I started another one of those local, independent clients, this time for adding a new term to our Journal glossary, as captured in the screenshot. Several glossary term input fields can be added via the [+] button (for synonyms), as well as several definition fields for those terms. Sure, it’s neither beautiful nor final, but it lets me enter the data conveniently enough for testing. Then I tried to write the XML-RPC API call to WordPress which would send the data to the server, only to find that CORS blocks such calls. CORS is the browser’s enforcement of the same-origin policy for non-trivial requests (especially those that send data) to a server other than the one the page was retrieved from: it blocks them unless the target server explicitly allows them, so that injected, rogue JavaScript code cannot send confidential user data to servers the site/client isn’t supposed to talk to. As my file is local and no server is ever involved in serving the page or sending a CORS header that would allow talking to the Journal, I don’t know how to continue to make something for the #booksummit or how to go about the Doug@50 demo technology-wise.

CORS can be circumvented either by starting the browser without this feature (which makes it vulnerable in all other use cases where we do want CORS, and requires some dirty tricks or user cooperation via operating-system-specific scripting to start the browser in that mode), or by installing and starting a local server that does nothing but read the local files from the hard disk and send them to the browser without any interference except for adding the CORS header. We might need the latter anyway, because browsers don’t have a way to access the local hard drive, something my local client implementations would want to do just like any other native application. All of this is based on my assumption that the functions should be local, without a remote server operated by somebody else providing such functionality, so the user can retain privacy and independence.

I could try to promote and work on introducing local file I/O and a way to relax CORS (or grant permission explicitly) in web standards and browsers for cases where the file in the browser comes from the hard drive, so the web stack would become a decent framework for “native” application programming. But people have already tried that and failed, and there are complex security implications, as most of the stack is designed for online use and we probably don’t want a new kind of scam like “download this HTML and double-click it, whoops: all of your passwords are gone”. On the other hand, it’s no different from other native applications that get installed, so on a decent operating system like libre-free GNU/Linux (or BSD or whatever libre-free system), none of that would be an issue, because such local applications implemented with the web stack could be installed from online software repositories where packages are checked and signed, so no scamming or malware would be encountered at all. Another option would be to bundle those XHTML+JS pages/packages with a native application of my own or one of the existing ones, which act as a runtime environment and extend the browser with functions for local file access (and hopefully CORS management); and if a native programming language has a well-maintained browser control/component that provides a bridge for JavaScript calls into the host language, that could work too.
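For the local-server option, one possible variant (a sketch only, and it proxies the API calls through localhost rather than relying on the Journal to send CORS headers) would be a small Node.js script; the host name below is a placeholder, not our actual Journal address:

    // Hedged sketch: serve the local client files and relay XML-RPC calls to
    // the Journal through localhost, so the browser never makes a
    // cross-origin request and CORS doesn't get in the way.
    const http = require('http');
    const https = require('https');
    const fs = require('fs');
    const path = require('path');

    const JOURNAL_HOST = 'journal.example.org'; // placeholder host name

    http.createServer((req, res) => {
        if (req.url === '/xmlrpc.php') {
            // Forward the XML-RPC request body to the remote WordPress endpoint.
            const proxyReq = https.request(
                { host: JOURNAL_HOST, path: '/xmlrpc.php', method: req.method,
                  headers: { 'Content-Type': 'text/xml' } },
                proxyRes => {
                    res.writeHead(proxyRes.statusCode, proxyRes.headers);
                    proxyRes.pipe(res);
                }
            );
            req.pipe(proxyReq);
        } else {
            // Serve index.html and the other local files, so the page gets a
            // proper http:// origin instead of file://.
            const file = path.join(__dirname, req.url === '/' ? 'index.html' : req.url);
            fs.readFile(file, (err, data) => {
                if (err) { res.writeHead(404); res.end('Not found'); return; }
                res.writeHead(200);
                res.end(data);
            });
        }
    }).listen(8080, () => console.log('Client available at http://localhost:8080/'));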
If I don’t find a solution, I will need to abandon the web stack entirely and continue with my native efforts, losing all of the decent rendering/visualization the web stack offers as well as the potential support of the web dev community. So for the OHS, as there can be many implementations (of parts/components) in many different programming languages and stacks, I wonder in terms of OHS Core and the general OHS infrastructure conversation whether the web stack can play any role in it. With it, we can get a few quick wins and people are very familiar with it, but the constraints may render some things impossible.

My Journey Through Text

At first, I did some experimentation with browser game programming (not casual games, but with server-side persistence), attempting to build/generate “worlds” while avoiding the need to hand-design everything in a time-consuming process. One result was a world editor that served as an image composer (using GDLib for PHP) and a primitive image map manipulator, at a time when the HTML5 canvas didn’t exist yet.

Later, I wanted to improve my note-taking in printed German Bible translations; in particular, I wanted to produce my own interleaved editions. Soon I learned that digital Public Domain German Bible texts are usually not true to their printed originals, so I had to start a digitization and proofreading effort (more) first. From a semantically annotated XML source, it was easy to generate a modern XHTML reproduction of the text and then PDF layouts via XSL:FO and LaTeX. I was looking into SILE and recently PoDoFo as PDF generator backends (the former accepts XML as input, the latter is a C++ API that still needs an XML frontend), but didn’t invest too much into supporting them yet. Finally I achieved the original goal of generating interleaved PDFs for printing, and thanks to the advent of print-on-demand, I’m now able to order hardcover thread-stitched books in a quantity as low as a single copy (not even to mention the magazine variant or the DIN A6 and DIN A4 variants out of my DIN A3 monochrome duplex laser printer).

One proofreader introduced me to EPUB, which of course made sense to add as an output format and eventually got me interested in e-publications, e-ink based devices and the publishing industry in general.

Somehow I discovered a wiki for collaboratively creating a new libre-freely licensed German Bible translation, using a parser that extracts OSIS from the online Wikitext of the MediaWiki software, and for a church congress event we hacked together a semi-automatic workflow that generated the PDF of the study version of the Gospel according to Mark. As I didn’t want to switch my existing tools to OSIS as the input format, and most of the time I didn’t even need the advanced OSIS features, I just internally converted OSIS to my Zefania-XML-based Haggai XML format and made a few adjustments to be able to produce the usual output formats XHTML, PDF and EPUB. Another project was the conversion from the “verse-per-line” format to Haggai XML, not too different from another similar CSV-to-XHTML-to-EPUB project.

In the e-book hype of those days, I failed to see why other publications should be produced in a different way than my Bible reproductions (based on concepts like workflow automation, digital-first, XML-first, single-source publishing, multi-channel publishing, etc.), as generating EPUBs and PDFs could easily be generalized, and later the entire workflow (shorter, silent video). I added a converter from ODT to XHTML, so OpenOffice/LibreOffice can be used as a writing tool as long as predefined styles are used to introduce WYSIWYM to the document, in the absence of a better editor. To be able to offer this as a service to self-publishers, I wrote a frontend in PHP that invoked the very same Java code via system calls on a vServer, only adding administrative functionality like user or publication project management (the latter should have become part of the Java package eventually). I even went to some book fairs and more obscure events of the e-book avant-garde, so I know a few people from those worlds and their mentality.

From such occasions, I picked up two of my major projects in that space. One is uploading EPUBs via XHTML to WordPress using the XML-RPC API (again, there’s an online version of it using the same Java code behind a PHP wrapper), which then wasn’t used in production, as the guy who needed it produced EPUBs the WYSIWYG way and naturally wanted this manual typesetting to be preserved in the blog post, while I cared about WYSIWYM instead. With that workflow already available, I got into contact with one of the guys behind several amazing projects, and as they went into the business of running an online e-book store, they received a lot of e-books from publishers along with ONIX metadata file(s), so the job was to import all of the ONIX metadata into WordPress and even update existing records. My attempt was never finished because the shop was shut down after some time, probably in part because I failed to support them well/soon enough, as I encountered several problems with the testing environment, WordPress and my not-so-hacky, not-so-smart, not-so-agile workflows. But even without completing this mechanism, I went beyond this particular use case and did some general ONIX work.

Smaller projects include the subversion of the placebo non-digital wishful thinking of a self-publishing site that disabled the download button without any technical effect, a GUI frontend for epubcheck, a failed attempt to enlist “e-book enthusiasts” for building a digital library, an importer in PHP (SAX) from Twine to Dembelo (later rewritten by the Dembelo lead developer in more modern PHP), a parser for a Markdown-like note-taking language to XHTML and LaTeX (my interest in writing parsers for domain-specific languages came from the Wikitext-to-OSIS parser, which I still haven’t found the time to revisit) and a Twitch video uploader using their API (but I guess it’s broken now because of their “Premiere” nonsense).

As I grew more frustrated with traditional publishers, self-publishers, e-book “pirates”, the average reader and the big enterprises who use digital to exploit those who don’t understand it properly, the arrival of refugees from Afghanistan, Iraq, Syria and Africa in Europe forced me to focus on far more serious things than our digital future. Only a tiny fraction of my time investment went into software development. All the other civic tech programmers lost interest after only 1/2 years, and I joined the game late, when the momentum was already gone. Most of my attempts to build software for helping to solve some of the issues are targeted at volunteers, for instance the ticket system, the petition system, the AutoMailer for mass mailings via PHPMailer, the asylum event system or the case management system, as it turned out to be incredibly difficult to get refugees themselves involved with anything that’s not the Facebook app or WhatsApp, be it the one-way message system or the downloader for the “Langsam Gesprochene Nachrichten” by Deutsche Welle via their RSS feed. Even the tools for the German volunteers were only sporadically used, except the AutoMailer, which was a success; it did its job according to plan.

Two other projects remain incomplete due to lack of time: one is the attempt to parse Wikitext from the large monthly dump of the Wiktionary in order to generate a list of Arabic nouns with their German translations plus their articles, and the other is an attempt to build an online voting system that ended up as a good exercise for learning REST concepts, as a fully functional system would require a lot more conceptual planning.

Entering the hypertext space, I experimented by developing a variant of Ted Nelson’s span selector in Java, a dysfunctional text editor that tracks all changes and either needs more work or porting to a better GUI library, a converter from Ted’s EDL format to XML, a downloader/retriever for EDLs in XML form, and a workflow that glues together the retrieval of such EDLs in XML form and the concatenation of text portions from the obtained resources in order to construct the base text of the document. Originally started as the XML frontend for PoDoFo, I completed an early first version of a StAX parser in C++, but was then able to quickly port it to JavaScript, which came in handy for handling the embedded XHTML of WordPress blog post content as provided via the JSON API (I didn’t want to use the DOM), contributing to an independent read-only client for our Doug@50 Journal and HyperGlossary.
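Just to make that last workflow more concrete, here is a hedged sketch of the concatenation step in JavaScript. It assumes the spans have already been extracted from the XML form of an EDL into simple objects; the field names are illustrative, not Ted’s actual format:

    // Hedged sketch: build the base text of a document by fetching each
    // source resource referenced by the EDL and cutting out the given span.
    async function resolveEdl(spans) {
        let baseText = '';
        for (const span of spans) {
            const response = await fetch(span.source);   // retrieve the resource
            const content = await response.text();
            // Concatenate the referenced character range (start + length).
            baseText += content.slice(span.start, span.start + span.length);
        }
        return baseText;
    }

    // Hypothetical usage:
    // resolveEdl([{ source: 'https://example.org/a.txt', start: 0, length: 120 }])
    //     .then(text => console.log(text));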

At the moment I’m working, on the side, on a new project to introduce semantic annotation (“linked data”) and tooling for it to Bible texts, using the specific example of marking people and places in the Public Domain German Luther 1912 Bible translation, as there was renewed interest as a result of the 500th anniversary of the Reformation. As it turned out, to the surprise of some (not me), no digital version of the Luther 1912 text is, to our knowledge, true to any printed original, so another analysis and proofreading effort was necessary. An existing source became usable after a format transformation from flat CSV to hierarchical XML, wrapping Strong numbers in XML tags instead of XML markers/milestones, converting that arbitrary, custom XML format derived from the CSV to Haggai XML, and textual correction.

Hypertext: DKR Scenario 29 Dec

This is an answer to Demo@50: Welcome Builder!, but please also consider the more practical interactive mapping of my project results to an OHS. Everything that follows is up for discussion, adjustment, extension, removal and implementation. A Dynamic Knowledge Repository, according to my current understanding, is a comprehensive collection of documents and other media supported by software to deal with it, created and curated by a community of people who are consumers, contributors, knowledge workers and stewards.

Can we make a 21st Century DKR happen? Yes, it’s not too difficult. Whether we can make it big is a totally different question, because besides the technological, legal and content-related solutions, there’s also the social/political aspect, timing and luck. On the other hand, I already count it as a victory if we manage to augment our own matters or if we improve on the current state of affairs – it’s about bootstrapping, and I’m unable to predict what people will use the tools for in the future. Additionally, I believe that DKRs do exist today, for example Wikipedia.

What infrastructures and components would have to be built? All of them, and many, many more. It’s not so important what we will end up with eventually, but how it is built and what policies are applied. Formats and protocols need to be open, software needs to be libre-licensed, online services need to be self-hostable, dependencies need to be avoided, offline usage and preservation need to be an option, and social mechanisms should prevent abuse and encourage constructive collaboration. Most of the potential of the digital revolution hasn’t been realized yet because strong powers entrap unsuspecting users in silos, so we need to build an alternative of interchangeability on every level, basically liberating ourselves and humanity from totally artificial limitations.

What aspect of the DKR do you feel should be built and how do you want to contribute, based on the list of “What is missing” and other aspects of Doug’s work? As said, all of it should be built, and a lot more. When it comes to me, I’m more of a plumber in the background who builds many small, intercompatible things to retain flexibility, in an attempt to avoid programming myself into a corner, unmanageable complexity and bloatware. I have trouble with UI/UX design, but am slowly trying to catch up. Therefore, I’m 100% in for the Journal; I don’t care too much about xFiles (as the underlying data structures should be abstracted away, will be a subject for optimization experts and only need to comply with the requirements given above); I would like to learn the Chorded Keyset but don’t know how to build one; I’m fine with any kind of linking but would rather have more of it than less; and I really like the notion of ViewSpecs, although I’m not sure whether they can work in some of the most complex situations. Regarding other aspects of Doug’s work, establishing useful augmented reality would be the killer, in my opinion, on the technical/infrastructural level. Furthermore, I would like to support collaboration and research in human-computer interaction, but would definitely go for the low-hanging fruit and advance from there.

If you are already working on something you feel is naturally part of a DKR, how do you feel it can interact with other components? I don’t know how my software can interact with other components; I’m not aware of many existing components or format specs, except Ted Nelson’s EDL format/concept and span selector, which aren’t libre-licensed and need improvement, so I “interact” with them by implementing libre-licensed alternatives that would be almost compatible if not for trademark law creeping into the format spec. Time Browser and Author sound interesting, but I haven’t looked into them yet. My components are intended to interact with other open/libre components to the fullest extent possible, even if only by means of data conversion.

Please write a walkthrough to help communicate our personal perspectives (featuring Joe). There is no single Joe, but a lot of different Joes, each with his own individual or pre-defined configuration. One of the Joes sits down at a computer, which allows him to continue reading/working where he left off the last time, or to switch to new publications in his field of interest, or to look things up with one of the agents that use different metrics, employing different methods to be most effective. He decides to continue with his current reading, where he adds a few comments, a new paragraph, annotations and links, fixes some errors, and marks semantic meanings, and by doing so he discovers other documents, meanings and relations. His lifestream records his journey, which he can share or keep for himself, but it’s not unlikely that an agent/policy will decide what kind of activity gets publicly shared, so Joe usually doesn’t bother to take manual action. If a “document” (it’s more a text composition from diverse sources, rendered by many layers of standard, public and custom ViewSpecs) is particularly interesting to him, he tears it apart in order to fit portions of it into his own collection/archive, which in effect constitutes another new document, public or private. Joe also does this frequently together with other people: family members, colleagues, friends and complete strangers. Because he’s involved in several communities, he uses different, very specific ViewSpecs tailored for particular tasks, but it can happen from time to time that he doesn’t have and can’t find a ViewSpec that suits his needs, and as he hasn’t yet learned how to build one himself, he either asks his neighbor or one of the many people who provide services for the public network infrastructure, its data and software. No matter whether Joe is awake or sleeping, agents/bots work for him, for others, for the content, for the network, to keep it clean, to answer questions, to do all sorts of operations, but in an intelligent, cooperative way so resources don’t get wasted unnecessarily.

As Joe is a little bit paranoid, he has the habit of printing his favorite pieces or producing other physical manifestations in his living room (sometimes he’s too lazy and just orders them cheaply from an on-demand service), so he doesn’t need to worry about losing anything in case “the lights go out”. Even if this should ever happen, most of the manifestations can be read in again automatically with some kind of 3D-printed, matrix code, standardized OCR, RFID or punch card technology. Speaking of the physical world, Joe still needs to do business in it of course, but wherever he goes, online repositories go with him (well, the truth is that he rarely enjoys connectivity because of bandwidth and legal constraints, not even to mention the long distances with no network access at all, so he’s forced to work with offline repos and auto-sync with the rest of the world later). Especially in cities, places, people, objects and time present themselves in an augmented way via his mobile device, so he can either learn about them or interact with them and make them do things, as they, too, are networked and compatible in a network of things. Quite frankly, a substantial portion of his “document work” is related to what he discusses and designs together with others in meetings or is linked to physical stuff, so he can overlay relevant data if he wants, with all the tools and capabilities available to him just as they would be at home in front of his computer. Now, those things are just a fraction of what Joe does on a typical day, but as the full account doesn’t fit on a computer screen and requires an Open Hypertext System to be properly presented, please continue reading here.

Hypertext: Mother of All Demos 50th anniversary

With less than one year left to prepare for the 50th anniversary of the Mother of All Demos, it’s about time to look at the projects that could potentially be presented on 2018-12-09. The event is an important milestone to determine whether we have made significant progress in realizing Doug’s vision over the last 50 years, and it seems like we’re a little bit late. Still, I’d like to address the big picture first before committing to blind actionism.

But what does it even mean to “realize Doug’s vision” in the context of 2018? I have to admit that I’ve never watched the entire demo in one go; it’s just very difficult to spend an hour of quality Internet time in front of my computer, passively watching things happen on a screen. I wondered why this is the case, and my impression is that the demo was made for end users, not system developers. Doug didn’t explain/discuss the technical details or what their results could mean conceptually for human-computer interaction, and as there’s no way for me to actually use their system today, that particular showcase is practically irrelevant for me. It feels more like an advertisement, and to some extent, it is. If I’m not mistaken, part of the goal of the demo was to attract further funding for the project, which is perfectly fine, as there was no fast/cheap way in those days to turn the system into a universally available product any time soon. So I see the great demo more as a PR stunt, which in fact served pretty well to introduce people to the notion of personal, networked, visual and augmented interaction with a computer. So how can we replicate a similar effect? Do we even have to?

The world is very different now. In 1968, computers (especially for personal use) were a new thing, so it was natural to explore this technology – how can it help with solving problems of ever-increasing complexity? Today, people have seen computers; we know what they do. If the goal is to contribute tools for dealing with the big problems, we might not deploy computers in the traditional sense any more. For instance, if quantum computing were around the corner, there would be the chance to hold another revolutionary demo. In the field of augmented reality, we should immediately start preparing one, and it wouldn’t even be difficult to do so. Would such endeavors still be true to Doug in spirit? Sure, they’re not about documents and knowledge, and the web would stay unfixed, so there could be some merit in replicating the image that Doug put in front of us. Keep in mind that it doesn’t have to be an either-or, but a decision for the anniversary event might shape what’s considered Engelbartian in the future: whether it is a preserving or a progressive movement, or both. Is the anniversary supposed to refresh the original vision, to promote a new/updated one (we could even mock one up), or to encourage long-term research and development?

If we end up with the goal to organize the world’s information and make it universally accessible and useful (in an open way and for real!), we have to realize that people don’t get excited about text manipulation any more, as they already have office suites and the web, that blue-sky funding isn’t available any more, and that our results will be less innovative, as the field isn’t new and there’s much more “competition” around. On the other hand, what Doug was able to do with a team of people working for several years on real funding, a single guy can now slowly approach without any financial backing, as we don’t have to start from scratch and can benefit from existing technology and methods.

What’s wrong with HyperScope? It’s really unfortunate that the W3C didn’t standardize an XSLT interface (even though browsers tend to come with an XSLT processor), but can’t we get a JavaScript implementation of it by now? How much work would it be to update the GUI framework library that was used for HyperScope?
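For what it’s worth, most browsers do expose a de facto (though never W3C-standardized) XSLTProcessor object; a minimal sketch of a client-side transformation, with placeholder file names, might look like this:

    // Hedged sketch: run an XSLT transformation in the browser via the
    // de facto XSLTProcessor object and append the result to the page.
    async function transform(xmlUrl, xslUrl) {
        const [xmlText, xslText] = await Promise.all([
            fetch(xmlUrl).then(response => response.text()),
            fetch(xslUrl).then(response => response.text()),
        ]);
        const parser = new DOMParser();
        const xmlDoc = parser.parseFromString(xmlText, 'application/xml');
        const xslDoc = parser.parseFromString(xslText, 'application/xml');

        const processor = new XSLTProcessor();
        processor.importStylesheet(xslDoc);
        const fragment = processor.transformToFragment(xmlDoc, document);
        document.body.appendChild(fragment);
    }

    // Hypothetical usage:
    // transform('document.xml', 'viewspec.xsl');

Whether this is enough for the stylesheets HyperScope actually uses is exactly the kind of question worth checking.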

Hypertext: Socratic Authoring

The Future Text Initiative published an introduction to the notion of “Socratic Authoring”, which is pretty straightforward in describing the general requirements for a modern writing and publishing system. Similar visions have surely been expressed before, but this proposal might be the most recent one.

In it, we find the demand to retain as much as possible of the richness the author experienced during composition. What does this mean? Is it referring to WYSIWYG, to the ability of the author to hide parts of the process, or to the realization that the reader might become a co-author and therefore needs an equivalent writing environment? It sounds like we don’t want to see works being produced electronically just to end up in a single, primitive, fixed form, so that re-digitization would be needed to enrich them again.

Then there’s the suggestion that semantic augmentation should take place when the document is on a server. I ask, why should this only be possible on a server? Why shouldn’t the user deserve full augmentation on the client side as well? Sure, augmentation could require some resources that are only available online, or include actions that need to be executed on a different machine, but even when disconnected or without using an external proxy (regardless of whether such an external online service is owned or third-party), the user shouldn’t face artificial limitations just because the client side is lacking an implementation.

To be continued…