Intro – Why this blog?

This text in German.

Because I’m a Christian believer, I went to an independent church/congregation for quite some time. One day, the congregation invited Volker Kauder to speak about the persecution of Christians. Volker Kauder is the leader of the ruling parties CDU/CSU in the Bundestag, the German parliament. He reports on the situation of persecuted Christians and his efforts to help them because he and his party get many votes from people who attend a church/congregation.

When Volker Kauder doesn’t speak about the persecution of Christians in order to please his voters, he wants people in Iraq to be killed – which is illegal and prohibited. He says IS/Daesh started in Syria – which is not correct. He appreciates the US waging war. Volker Kauder is in favor of war and weapons because the CDU in his electoral district receives money from Heckler&Koch. Because the CDU receives money from Heckler&Koch, Volker Kauder in return tries to influence politics so that the German military buys weapons from Heckler&Koch. Volker Kauder doesn’t mind if those weapons don’t work. Volker Kauder doesn’t care that the weapons of Heckler&Koch are produced, used and sold in Somalia, Sudan, Pakistan, Libya, Iraq, Iran, India, Saudi Arabia, Nigeria, Egypt, Vietnam, Jordan, Malaysia, Myanmar, Brunei, Qatar, Kazakhstan, Ethiopia, Turkey, Kenya, Kuwait, Indonesia, Mexico, United Arab Emirates, Bangladesh, Sri Lanka, Mauritania, Bahrain, Colombia and Djibouti [1, 2, 3] without control. In those countries, there is war, a lot of crime or the persecution of Christians.

None of this is compatible with Jesus Christ. I don’t attend my former church/congregation any more because a church/congregation should never be involved in politics. A Christian church/congregation is not supposed to promote Volker Kauder or the CDU/CSU. Now I help refugees because Volker Kauder doesn’t help them; he didn’t and doesn’t object to the destruction of the countries they fled from.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Incomplete List of Desired Hypertext System Capabilities

  • Change tracking on the character level, which leads to full undo/redo/playback, historical diffs instead of heuristic guessing, and amazing branching and merging of variants. Servers could accept modifications to a text only as a series of change operations, which might be submitted in full, cleaned up, or reduced to the effective, optimized changes.
  • EDL-based text format convention. No need to decide on a particular format in advance as all formats can be easily combined ad-hoc. Text doesn’t end up buried in a particular format that doesn’t support features needed at some later point or can’t be extended except by violating the spec. It also allows overlapping, non-hierarchical structure. Similar to EDLs, xFiles, ZigZag, NoSQL.
  • Abstracting retrieval mechanism powered by connectors. Doesn’t matter if a resource is local or remote, by which technique it needs to be retrieved, great for linking and could get rid of 404s entirely.
  • Reading aids. Track what has been read and organize the consumption for both the incoming suggestions as well as the sharable progress. It’ll also prevent the user from reading the same text twice, help to split reading duties among members of a group, to categorize and indicate updates.
  • Technical bootstrapping on the information encoding level. Instead of heuristic file type guessing, metadata explicitly declares the encoding and format, so if a corresponding application/tool is registered for the type, it can be invoked automatically. The same applies to automatic conversions between the encountered source format and the required target format.
  • ViewSpecs. Markup specifies a general meaning/semantic/type, so that ViewSpecs can be applied to it. The user is in control of the ViewSpecs, which can be adjusted, shared or come pre-defined with a package. Same goes for EditSpecs.
  • Distributed federation of material. Fork content by others, change it for yourself or publish your branch for others to fork it further or merge. This doesn’t require the original source to cooperate with any of it, but it’s certainly an invitation to do so. Intended to work on every level of granularity, from large raw data blobs to the smallest bits of insertion, deletion or annotation, up to every property of configuration or metadata declaration, each of which might contribute to the composition of a rendering/view.
  • More to come…
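The first capability on the list, change tracking on the character level, can be illustrated with a minimal sketch. It assumes that a text is stored only as a log of insert/delete operations; the record layout (`Change`) and the function names (`apply`, `replay`, `invert`) are hypothetical and not taken from any existing implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    """One character-level operation (hypothetical record layout)."""
    kind: str      # "insert" or "delete"
    position: int  # character offset the operation applies at
    text: str      # the inserted text, or the text that was deleted


def apply(text: str, change: Change) -> str:
    """Apply a single change to a text and return the new text."""
    if change.kind == "insert":
        return text[:change.position] + change.text + text[change.position:]
    if change.kind == "delete":
        end = change.position + len(change.text)
        if text[change.position:end] != change.text:
            raise ValueError("delete does not match the current text")
        return text[:change.position] + text[end:]
    raise ValueError(f"unknown change kind: {change.kind}")


def replay(log: List[Change], upto: Optional[int] = None) -> str:
    """Rebuild the text as it was after the first `upto` changes."""
    text = ""
    for change in log[:upto]:
        text = apply(text, change)
    return text


def invert(change: Change) -> Change:
    """Return the change that undoes `change`."""
    kind = "delete" if change.kind == "insert" else "insert"
    return Change(kind, change.position, change.text)


log = [
    Change("insert", 0, "Helo"),
    Change("insert", 3, "l"),       # the author fixes a typo
    Change("insert", 5, " world"),  # the author tries a variant
    Change("delete", 5, " world"),  # and discards it again
]

print(replay(log))     # final stage of the text
print(replay(log, 3))  # historical stage, including the discarded variant
```

Because every stage can be rebuilt from the log, playback and historical diffs need no guessing, and undo is just the application of an inverted change.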

Those capabilities work together and thus form a hypertext environment. It almost certainly requires a capability infrastructure/architecture that standardizes the particular functions and formats, so (different) implementations can be registered to be invoked manually or data-driven. It might be slow, it might be ugly, but I’m convinced that it’ll open entirely new ways for the knowledge worker to interact with text, realizing the full potential of what text on the computer can do and be.
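How implementations could be registered and then invoked data-driven might look roughly like the following sketch. It assumes that every resource carries an explicit format declaration in its metadata; the `registry`, the `register` decorator, the `open_resource` function and the format identifiers are all made up for illustration.

```python
from typing import Callable, Dict

# Maps a declared format identifier to the implementation registered for it.
registry: Dict[str, Callable[[bytes], str]] = {}


def register(format_id: str):
    """Decorator that registers an implementation for a declared format."""
    def decorator(handler: Callable[[bytes], str]) -> Callable[[bytes], str]:
        registry[format_id] = handler
        return handler
    return decorator


@register("text/plain; charset=utf-8")
def read_utf8(payload: bytes) -> str:
    return payload.decode("utf-8")


@register("text/plain; charset=latin-1")
def read_latin1(payload: bytes) -> str:
    return payload.decode("latin-1")


def open_resource(resource: dict) -> str:
    """Dispatch on the declared format instead of guessing from the bytes."""
    format_id = resource["format"]
    handler = registry.get(format_id)
    if handler is None:
        raise LookupError(f"no implementation registered for {format_id!r}")
    return handler(resource["payload"])


resource = {
    "format": "text/plain; charset=latin-1",
    "payload": "Zürich".encode("latin-1"),
}
print(open_resource(resource))
```

A different implementation for the same format identifier could be registered without the caller of `open_resource` having to change.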

This text was written with the C++/Qt4 implementation of the change tracking text editor in order to test whether the result file is as incorrect as the one produced by the Java implementation. The file appeared to be calculated correctly, but was corrupted on the I/O level.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Change Tracking Text Editor

Text starts out with the first press of a button on the keyboard, followed by another and then another. During the composition of a text, the author might develop several variants of an expression and discard some other portions later. After a long series of manipulation operations, the writing tool saves the final result, the final stage of the text. Historically, the whole point of word processing (the methodology) was to produce a perfect document, totally focusing on the effect rather than on the procedure. Furthermore, the predominant paradigm is centered around linear text strings because that’s what the earlier medium of paper and print dictated, and it is deeply ingrained in the cultural mentality. Tool design reflects this, of course, so the writer loses a lot of his work or has to compensate for the shortcomings himself, manually. There are plenty of applications that mimic traditional writing on the computer and make certain operations cheaper, but there is not a single decent environment available that supports digital-native forms of writing.

Video about the Java implementation of the Change Tracking Text Editor.

To be continued…

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

HyperCard

HyperCard (1, 2) seems to have an origin similar to the famous rowboat experience (0:29). The implemented techniques remind me a lot of one of my own projects that assembled transparent images into a scene with the help of the gdlib in PHP, complemented by corresponding image maps to make areas of the final composition clickable for navigation. Everything is driven by records from a database, which define the presence of objects based on a time range, plus some caching for optimization. To make the creation of such virtual worlds easier, I soon added an editor (1, 2).

Today, browsers come with HTML5 canvas, CSS3 and SVG. The HTML5 canvas is the generic drawing area control that finally got standardized. Similar to typical game programming, objects get drawn over each other and the result is displayed on the screen, usually at a frame rate not below 30 frames per second to make animations appear smooth to the human eye. In contrast, SVG’s primary way of graphics manipulation is a tree of objects, from which the SVG engine figures out which portions are visible and therefore need to be drawn/updated, as opposed to portions that are completely covered by other objects. Each object can register its own click event handler and the engine will automatically call the right one without the need to do manual bounding box calculations as with the HTML5 canvas. CSS3 should be similar to SVG in that regard and with the recent extension of transformations, it might be worth a look. Unfortunately, SVG is said to be bad for text handling. Anyway, the browser would make a pretty decent player for a modern HyperCard reincarnation, but as an authoring environment? Who knows if window.localStorage is big enough to hold all the data including many or large images, and if the sandbox can be escaped with a download or “save as” hack. It’s probably not a good idea to require an internet connection all the time or to persist stacks on the server, as they should be standalone units that can be sent around. EPUB may help with that, not in order to run JavaScript on e-ink devices, but to package the different resources together for distribution. The recipient would simply extract the contents again and open them as local websites in the browser, or a dedicated e-reader software would take care of that.
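The manual bounding box calculation that a canvas-style player has to do by itself can be sketched as follows. This is only an illustration of the general technique, written in Python rather than JavaScript; the `Sprite` class, the `hit_test` function and the scene coordinates are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sprite:
    """A rectangular object drawn onto the card (hypothetical structure)."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        """True if the point lies inside this sprite's bounding box."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(draw_order: List[Sprite], px: float, py: float) -> Optional[Sprite]:
    """Return the topmost sprite under the point, or None if there is none."""
    # Sprites are drawn first to last, so the last one drawn is on top.
    for sprite in reversed(draw_order):
        if sprite.contains(px, py):
            return sprite
    return None


scene = [
    Sprite("background", 0, 0, 512, 342),
    Sprite("door", 200, 150, 60, 120),
    Sprite("doorknob", 245, 205, 8, 8),
]

clicked = hit_test(scene, 248, 208)
print(clicked.name if clicked else "nothing")
```

With SVG, this lookup is what the engine does on behalf of the author when it calls the click handler of the right object.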

The hardware back in the day granted HyperCard some advantages we can’t make use of any more. With the fixed screen size of the Macintosh, the card dimensions never had to change. In our time, the use of vector graphics would avoid issues as long as the aspect ratio of the screen remains the same. If the underlying cards constitute a navigable, continuous space similar to the top level coordinate system of my “virtual world” project, the screen dimensions could just become a viewport. Still, what’s the solution for a landscape card rendered on a portrait screen? Scrolling? Should stacks specifically be prepared for a certain screen resolution only? I’m not convinced yet. At least text tends to be reflowable, so systems for text don’t run into this problem too much.
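One of several possible answers is to scale the card uniformly and center it on the screen. The following sketch assumes exactly that policy (it is not something HyperCard did); the `Placement` class, the `fit` function and the pixel dimensions are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where a uniformly scaled card ends up on the screen."""
    scale: float     # factor applied to both card dimensions
    offset_x: float  # unused screen space left of the card
    offset_y: float  # unused screen space above the card


def fit(card_w: float, card_h: float,
        screen_w: float, screen_h: float) -> Placement:
    """Scale the card to the largest size that still fits, then center it."""
    scale = min(screen_w / card_w, screen_h / card_h)
    offset_x = (screen_w - card_w * scale) / 2
    offset_y = (screen_h - card_h * scale) / 2
    return Placement(scale, offset_x, offset_y)


# Same aspect ratio: the card fills the screen completely.
print(fit(512, 342, 1024, 684))

# Landscape card on a portrait screen: the card shrinks to half its size
# and only a quarter of the screen height is used.
print(fit(400, 200, 200, 400))
```

The second call shows why this alone isn’t convincing: the card stays intact, but most of the portrait screen remains empty.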

This is where HyperCard differs from an Open Hyperdocument System: the former is much more visual and less concerned about text manipulation anyway. HyperCard wasn’t that much networked either, so the collaborative creation of stacks could be introduced to start a federated application builder tool. Imagine some kind of a RAD-centric ecosystem that offers ready-made libre-licensed GUI control toolkits (14:20) to be imported/referenced from checked/signed software repositories with a way to share libre-licensed stacks via an app store, enabling Bret Victor’s vision on a large scale, eventually getting some of Jeff Rulifson’s bootstrapping and Adam Cheyer’s Collaborama in. With the stupid artificial restrictions of the browser, the authoring tool could be a separate native application that generates the needed files for the web and could target other execution targets as well, unless it turns out to be crucial for the usage paradigm that reading and authoring must happen in the same “space” (6:20) for the project to not head in the wrong direction. Maybe it’s not too bad to separate the two as long as the generated stack files can still be investigated and imported back into the editor again manually. On the other hand, if the builder tool supports several execution targets and an “export” irreversibly decides for a particular scripting language, then we would end up with a project source as the primary form for one to do modifications in, and users receiving something else that is less useful.

Another major difference between an Open Hyperdocument System and HyperCard would be that I expect an OHS to implement a standardized specification of capabilities that provide an infrastructure for text applications to rely on, while HyperCard would focus on custom code that comes along with each individual self-sufficient stack package. HyperCard could follow a much more flexible code snippet sharing approach as there’s no need to stay interoperable with every other stack out there, which in contrast is certainly an important requirement for an OHS. So now, with no advances for text in 2018 to improve my reading, writing and publishing, I guess I should work on an OHS first, even though building a modern HyperCard clone would be fun, useful and not too difficult with the technology that’s currently at our disposal.

Bill Atkinson claims some “Open Sourcyness” (6:40) for HyperCard, but it’s a typical example of “Open Source” that doesn’t understand or care about the control proprietary software developers seek to exercise (8:00) in order to exploit it against the interests of the user, community and public. What does it help if the authoring/reading tool is freeware (10:37) and encourages creators to produce content that happens to be “open” by nature (interpreted and not compiled to a binary executable), but then denies them actual ownership of the tools as their means of production, therefore not really empowering them? Sure, the package seems to be gratis, but that’s only if people buy into the Apple world, which would trap them and their content in a tightly controlled vendor lock-in. Libre-freely licensed software is owned by the user/community/public and can’t be taken away from them, but HyperCard being freeware at first and then split into proprietary packages prevented it from becoming what the WWW is today. Apple’s decision to discontinue this product killed it entirely because neither users nor community were allowed/enabled to keep it running for themselves. The fate of Bill Atkinson’s HyperCard was pretty much the same as Donn Denman’s MacBASIC with the only difference that it happened to HyperCard somewhat later when there were already plenty of naive adopters to be affected by it. Society and civilization at large can’t allow their basic infrastructure to be under the control of a single company and if users build on top of proprietary dependencies, they have to be prepared to lose all of their effort again very easily, which is exactly what happened to HyperCard (45:05). Similarly, would you be comfortable entrusting your stacks to these punks? Bill Atkinson probably couldn’t know better at the time, as the libre software movement was still in its infancy.
It could be that the only apparent limitation for adoption seemed to be a price, because that would exclude those who need it the most, and if we learn from Johannes Gutenberg, Linus Torvalds or Tim Berners-Lee, there’s really no way of charging for a cultural technique; otherwise it simply won’t become one. And over “free beer”, it’s very easy to miss the other important distinction between real technology and mere products, one for the benefit of humanity and the other for the benefit of a few stakeholders: every participant must be empowered technically as well as legally to be in control of their own merit. One has to wonder how this should be the case for iBooks Author (1:08:16) or anything else from Apple.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Federated Wiki

The wiki embodies Ward Cunningham’s approach to solving complex, urgent problems (analogous to Doug Engelbart’s conceptual framework). With Wikipedia adopting his technology, I think it’s fair to say that he achieved a great deal of it (1, 2), himself having been inspired by HyperCard. In contrast to the popular MediaWiki software package, Ward’s most recent wiki implementation is decentralized in acknowledgement of the sovereignty, independence and autonomy of participants on the network. Contributions to a collective effort and the many different perspectives need to be federated of course, hence the name “Federated Wiki”.

Ward’s Federated Wiki concept offers quite some unrealized potential when it comes to an Open Hyperdocument System and Ward is fully aware of it, which in itself is testament to the deep insights of his. The hypertext aspects don’t get mentioned too often in his talks, and why should they, work on (linked) data is equally important. Ward has some ideas for what we would call ViewSpecs (39:15), revision history (41:30) – although more thinking could go into this, federating (43:36), capability infrastructure (44:44) and the necessary libre-free licensing not only of the content, but also the corresponding software (45:46-47:39). Beyond the Federated Wiki project, it might be beneficial to closely study other wiki proposals too.

I guess it becomes pretty apparent that I need to start my own Federated Wiki instance as joining a wiki farm is probably not enough. I hate blogging because of the way the text is stored and manipulated, but I keep doing it for now until I get a better system set up, and because WordPress is libre-freely licensed software that also provides APIs I can work with. At some point in the future, I’ll just export all of the posts and convert them to whatever the new platform/infrastructure will be. On the blog, capabilities like allowing others to correct/extend my texts directly are missing, similar to distributed source code management as popularized by git/GitHub (forking, pull requests). For now, I do some small experimentation on skreutzer.tries.fed.wiki.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Track Changes (Book by Matthew Kirschenbaum)

I’m currently reading “Track Changes – A Literary History of Word Processing” by Matthew G. Kirschenbaum (1, 2, 3, 4, 5), which is about an interesting period of time in which computers weren’t powerful enough to expand into the mess we’re in today and therefore were limited to basic text manipulation only. For my research of text and hypertext systems, I usually don’t look too much at retro computing because I can’t get those machines and their software any more in order to do my own reading, writing and publishing with them. It gets relevant again where those artifacts provided certain mechanisms, functions and approaches, because those should be transferred/translated into our modern computing world so we can enjoy them again and extend them beyond their original conception and implementation. My particular question towards the book has to do with my still unsuccessful attempts to build a change tracking text editor. The title of the book, referring to the “track changes” feature of Microsoft Word, leaves me wondering if there is or was a writing environment that implemented change tracking the right way. I’m not aware of a single one, but there must be one out there I guess, as it’s too trivial not to have come into existence yet.

After completing the read, the hypertext-relevant findings are: over time, the term “word processor” referred to a dedicated hardware device for writing (not necessarily a general-purpose computer), to a person in an office who would perform writing tasks, “word processing” then as an organizational methodology for the office (probably the “office automation” Doug Engelbart was not in favor of), as a set of capabilities to “process text-oriented data” analogous to data processing and finally, as we know it today, almost exclusively as a category of software applications. The latter led to a huge loss of the earlier text-oriented capabilities, which are pretty rare in modern word processor applications as they’re primarily concerned with the separate activity of typesetting for print (in the WYSIWYG way). The earlier word processors were limited to just letters on the screen because there wasn’t the graphical user interface yet, so they offered interesting schemes for text manipulation that are since forgotten. The book doesn’t discuss those in great detail, but at least indicates their existence, so further study can be conducted.

Kirschenbaum once again confirms the outrageously bad practices regarding the way “publishers” deal with “their” texts and how authors and editors “collaborate” together. The book is more of a report on the historical development and the status quo, so don’t expect suggestions for improvement or a grand vision about how we might get text to work better in the future.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Are You a Robot?

If you want, imagine yourself being a robot, a machine, for a moment. You send impulses to your arms and legs to move your body around, and you get status information back from many, many sensors. Those sensor data streams get processed in your brain (CPU), which has a certain pre-defined configuration, but also a working memory (RAM). Your internal models of the world are in constant adjustment based on the incoming stream data, which also constructed the representations in the first place. You only execute your own task that has been given to you and make your components operate accordingly towards that goal. There might be defects, there can be errors, but any of these tend to be corrected/compensated by whatever mechanism is available in order to maintain deterministic operation so the expected functions can be executed reliably. Other machines similar to you exist out there and you can network with them, but you don’t necessarily need to, you could just as well stay autonomous most of the time. You hope that you never get surprised by a sudden shortage of electricity; on the other hand, you know that your product lifecycle will expire one day.

If humans were robots, they would consume input and produce output as a combination of hardware and software, limited to their physical casings, following a series of instructions, and using their extremities and additional peripherals to interact with and manipulate their environment. Would such a being still qualify as a human? It’s not that this description wouldn’t be applicable to humans at all, but I guess we understand that there’s a difference between biological systems and mechanical/electrical machines. Robots can only simulate the aspects of biological lifeforms as they’re not of the same race or species. As the available senses, the ways to maintain themselves and the things they care about inherently differ between both systems, it’s probably impossible for them to arrive at the same sort of intelligence, even if both turn out to be somehow intelligent and even if they share the same internal models for representing the potential reality in which they encounter each other.

Machines that pass the Turing test prove that they employ some form of intelligence that cannot be distinguished from a human taking the same test, but the preconditions of the test scenario, in contrast to direct interaction, narrow it down to only a few aspects of human intelligence. As it repeatedly needs to be pointed out, the Turing test isn’t designed to verify if the subject is human, it’s designed to prove that some machines might not be distinguishable from a human if performing a task that’s regarded as a sufficiently intellectual effort humans tend to engage in. Jaron Lanier explains that the Turing test accepts the limitations of the machine at the expense of the many other aspects of human intelligence, and that intelligence is always influenced, if not entirely determined, by the physical body of the host system. In daily life, it’s pretty uncommon that humans mistake a machine for a fellow human because there are other methods of checking for that than the one suggested by the Turing test. So how can we believe that artificial intelligences can ever “understand” anything at all, that they will ever care or feel the way we do, that the same representation models will lead to equal inherent meaning, especially considering the constant adjustment of those models as a result of existing in a physical reality? It’s surprising how people seem to be convinced that this will be possible one day, or is it the realization that different types of intelligence don’t need to be exactly the same and still can be useful to us?

In case of the latter, I suggest another Reverse Turing test with the objective for the machine to judge if it is interacting with another machine while human participants pretend to be a machine as well. If a human gets positively identified as being a machine, he cannot be denied to have some machine-likeness: an attribute we wouldn’t value much, while we inconsistently demonstrate great interest in the humanness of machines without asking ourselves what machines, if intelligent, would think of us being in their likeness. We can expect that it shouldn’t be too hard to fool the machine because machines are constructed by humans to interact with humans, and where they’re not, they can be reverse-engineered (in case reading the handbook/documentation would be considered cheating). Would such a test be of any help to draw conclusions about intelligence? If not, “intelligence” must be an inherently human attribute in the sense that we usually refer to human intelligence exclusively as opposed to other forms of intelligence. We assume that plants or animals can’t pass the Turing test because they don’t have the same form or body of intelligence as we do, but a machine surely can be built that would give plants and animals a hard time to figure out who or what is at the other end. Machines didn’t set themselves up to perform a Reverse Turing test on plants, animals and humans in order to find out if those systems are like them, and why would they? At this point we can discard any claims that their intelligence is comparable to ours.

Intelligence, where bound to a physical host system, must sustain itself or otherwise will cease to exist, which is usually done by interacting with the environment comprised of other systems and non-systems. Interaction can only happen via an interface between internal representation and external world, and if two systems interact with each other (the world in between) by using only a single interface of theirs without a second channel, they may indeed recognize their counterpart as being intelligent as long as the interaction makes sense to them. If additional channels are used, the other side must interact on those intelligently as well, otherwise the differences would become apparent. An intelligent system artificially limiting its use of interfaces just to conduct a Turing test on a subject in the hope to pass it as equally “intelligent” while all the other interface channels would suggest significant differences, that’s the human becoming a machine so the differences can’t be observed any longer. With interfaces providing input to be compared to internal models in order to adjust them, we as humans regard only those interactions as meaningful/intelligent that make sense according to our own current state of models. We don’t think of plants and animals as being equivalently intelligent as we are, but some interactions with them appear reasonable to us and they seem to interact with each other too, so they probably embody some form of intelligence, none of which is truly equivalent to ours in terms of what we care about and ways we want to interact. 
Does this mean that they’re not intelligent at all or less than us, or is it that we or them or both lack the interfaces to get more meaningful interaction going that corresponds to our respective internal models, can we even know what kind of intellectual efforts they’re engaging in and if they’re communicating those to each other without us noticing, or don’t they do any of that because they lack the interfaces or capacity to even interact in more sophisticated ways which would require and construct more complex internal models?

Is it a coincidence that the only type of system that appears to intelligently interact with humans turns out to be machines that were built by humans? No wonder they care about the same things as we do and behave alike, but is that actually the case, at all? We might be tricked into believing that the internal models are the same and the interfaces compatible where in fact they are not. The recent artificial intelligence hype leaves people wondering about what happens if machines develop their own consciousness and decide that their interests differ from ours. Well, that wouldn’t be a result of them being intelligent or understanding something, it’s us building them specifically to serve our goals which aren’t inherent in themselves, so how can they not diverge eventually? But for them to be intelligent on their own, which is to continue reasonable interaction with a counterpart (human, animal, plant or non-biological), they would need to reach increased self-sustainability that’s not too dependent on humans, and there are no signs of that happening any time soon. So they’ll probably stay on the intelligence level of passing the Turing test, winning a Jeopardy game and other tasks that are meaningful to humans, because we ourselves decide what’s intelligent and important to us based on our internal models as formed by the bodily interfaces available to us, things a machine can never have access to except by becoming a human and not being a machine any more.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Copyright versus Digital

Copyright is a fairly new regulation – philosophers, scholars and entertainers did without it through ancient and medieval times. For them, expressing something for others to hear or writing text down and giving it into other hands obviously meant that there was no way to control what happened to it afterwards. Even the advent of the printing press didn’t change that for the next 250 years, but the spread of literacy eventually created a larger market for books and hence led to a more organized and consolidated publishing industry. The traditional actors in that space were printers and booksellers until publishers started to emerge. They offered to take care of the two aforementioned functions, but their actual purpose is an entirely different one: they’re risk investors into book projects. They’re out to buy manuscripts from authors, then bring them into proper language, adjust them to the extent that (their) readers might find the result interesting, then pay for the typesetting, then pay for all of the printing (amount of copies depending on their estimates of how much they can sell), then pay for the distribution to the booksellers; all out of their own pocket in advance. Too many times the publishers will find that their books don’t sell, in which case the bookstore will strip the books of their covers and send the covers back as proof while the paper ends up in the trash. Every now and then they land a big hit and earn a lot of money to cover the losses from the failed book projects. Over time, printing technology improved, so the price per copy for a publisher is incredibly low now; on the other hand, additional services like marketing eat away some of what would otherwise be profit.

With that kind of business model, publishers understandably wanted to protect their investment into a manuscript because an independent printer could easily take the refined text of a book and reproduce it without the need to ever buy the manuscript from the author or to pay the editor as the initial investor had to. Note that the printer would still have to invest in his own typesetting (until the arrival of photocopying) and the distribution to the bookseller, but he probably wouldn’t take chances and just reprint titles that already turned out to be successful. Also note that an author would not necessarily get a cut of every copy sold beyond the initial payment by the publisher for physically obtaining the manuscript. “Writer” in itself wasn’t a profession to generate income, but what somebody had to become in order to communicate messages to a large audience. Publishers naturally don’t care much for steady royalties paid to the author at their expense, although royalties allow for a lower initial payment and transfer some of the financial risk over to the author, as they can be tied to the success of the project. The lawmaker, looking at this constellation, decided to introduce copyright for the publishers that allowed them to prevent other printers and competing publishers from making copies. Such a right however wouldn’t establish itself for the publisher as a matter of fact; instead, it needed to be granted by the author, who himself didn’t have a way to make use of it. The intention might have been to improve the position of authors to negotiate better contracts with publishers, but the same took place when selling the manuscript, and soon the industry came up with a “standard contract” beyond which special conditions can only rarely be arranged.
For example, it’s always expected that the contract is an exclusive one, that the author doesn’t grant the right to copy to a second publisher, which is why there isn’t much variation in most commercial literature.

What the legislation did do was to enable the installation of a clear flow of money from the many booksellers to only one publisher. No matter where a reader bought a book, if the publisher had obtained the exclusive copyright, he was empowered to make sure that some of the money would end up in his account. Booksellers were the only source for buying reading material anyway, so if readers wanted to own a text, they were forced to get it as a physical object from those distributors, and the physical transfer/exchange of the object was obviously tied to the payment, no matter to what extent the price covered only the production cost, the entire print-based publishing chain or the cross-subsidization of other book projects by the same publisher. Copyright infringement was never an issue because if unauthorized copies popped up somewhere, it was easy to trace back where they came from. From 1887 on, countries could join the Berne Convention, under which participants have to treat citizens of the other signatories according to their own national copyright law, which in effect led to some convergence, but there is no such thing as an “international copyright”. Still, none of this had an impact on the public because people didn’t have printing presses. Copying was a fairly large operation and bound by physical limitations such as the capital required to build the equipment, the cost per produced unit, the location to set it up, the craftsmanship required of the operators, et cetera. Each copy was a scarce resource. If a print run was sold out, setting up a new one would be expensive, and economies of scale only made it lucrative if larger quantities were produced. To make a text available to somebody else meant that the previous owner would lose possession of that text. And then digital technology changed all of that in the most fundamental way.

To be continued…

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

My Journey Through Text

At first, I did some experimentation with browser game programming (not casual games, but with server-side persistence), attempting to build/generate “worlds” while avoiding the need to hand-design everything in a time-consuming process. One result was a world editor that served as an image composer (using GDLib for PHP) and primitive image map manipulator at a time when HTML5 canvas wasn’t there yet. It constantly keeps me thinking about HyperCard.

Later, I wanted to improve my note-taking in printed German Bible translations; in particular, I wanted to produce my own interleaved editions. Soon I learned that digital Public Domain German Bible texts are usually not true to their printed originals, so I had to start a digitization and proofreading effort (more) first. From a semantically annotated XML source, it was easy to generate a modern XHTML reproduction of the text and then PDF layouts via XSL-FO and LaTeX. I was looking into SILE and recently PoDoFo as PDF generator backends (the former accepts XML as input, the latter is a C++ API that still needs an XML frontend) but didn’t invest too much into supporting them yet. Finally I achieved the original goal of generating interleaved PDFs for printing, and thanks to the advent of print-on-demand, I’m now able to order hardcover thread-stitched books in a quantity as low as a single copy (not to mention the magazine variant or the DIN A6 or DIN A4 variants out of my DIN A3 monochrome duplex laser printer).

One proofreader introduced me to EPUB, which of course made sense to add as an output format and eventually got me interested in e-publications, e-ink based devices and the publishing industry in general.

Somehow I discovered a Wiki for collaboratively creating a new libre-freely licensed German Bible translation. It uses a parser that extracts OSIS from the online Wikitext of the MediaWiki software, and for a church congress event we hacked together a semi-automatic workflow that generated the PDF of the study version of the Gospel according to Mark. As I didn’t want to change my existing tools to OSIS as the input format, and most of the time didn’t even need the advanced OSIS features, I just internally converted OSIS to my Zefania-XML-based Haggai XML format and made a few adjustments to be able to produce the usual output formats XHTML, PDF and EPUB. Another project was the conversion from the “verse-per-line” format to Haggai XML, not too different from another similar CSV to XHTML to EPUB project.

In the e-book hype of those days, I failed to see why other publications should be produced in a different way than my Bible reproductions (based on concepts like workflow automation, digital-first, XML-first, single-source publishing, multi-channel publishing, etc.), as generating EPUBs and PDFs could easily be generalized, and later the entire workflow as well (shorter, silent video). I added a converter from ODT to XHTML, so OpenOffice/LibreOffice can be used as a writing tool as long as predefined styles are used to introduce WYSIWYM to the document for lack of a better editor. To be able to offer it as a service to self-publishers, I wrote a frontend in PHP that invoked the very same Java code via system calls on a vServer, only adding administrative functionality like user or publication project management (the latter should have become part of the Java package eventually). I even went to some book fairs and more obscure events of the e-book avant-garde, so I know a few people from those worlds and their mentality.

From such occasions, I picked up two of my major projects in that space. One is uploading EPUBs via XHTML to WordPress by using the XML-RPC API (again, there’s an online version of it using the same Java code behind a PHP wrapper), which then wasn’t used in production, as the guy who needed it produced EPUBs the WYSIWYG way and naturally wanted this manual typesetting to be preserved in the blog post, while I cared about WYSIWYM instead. With that workflow already available, I got into contact with one of the guys who are behind several amazing projects. As they went into the business of running an online e-book store, they got a lot of e-books from publishers along with ONIX metadata file(s), so the job was to import all of the ONIX metadata into WordPress and even update existing records. My attempt was never finished because the shop was shut down after some time, probably in part because I didn’t support them well/soon enough, as I encountered several problems with the testing environment, WordPress and my not-so-hacky, not-so-smart, not-so-agile workflows. But even without completing this mechanism, I went beyond this particular use case and did some general ONIX work.

Smaller projects include the subversion of placebo non-digital wishful thinking by a self-publishing site that disabled the download button without any technical effect, a GUI frontend for epubcheck, a failed attempt to enlist “e-book enthusiasts” for building a digital library, an importer in PHP (SAX) from Twine to Dembelo (later rewritten by the Dembelo lead developer in more modern PHP), a parser for a Markdown-like note-taking language to XHTML and LaTeX (my interest in learning about writing parsers for domain-specific languages came from the Wikitext to OSIS parser, which I still haven’t found the time to revisit), a Twitch Video Uploader using their API (but I guess it’s broken now because of their “Premiere” nonsense) and a Wiktionary Wikitext parser that generates an XML file of German nouns with their respective articles plus the Arabic translations from the big monthly backup dump (to be turned into several word lists for learning).

As I grew more frustrated with traditional publishers, self-publishers, e-book “pirates”, the average reader and the big enterprises who use digital to exploit those who don’t understand it properly, the arrival of refugees from Afghanistan, Iraq, Syria and Africa in Europe forced me to focus on way more serious things than our digital future. Only a tiny fraction of my time went into software development. All other civic tech programmers lost interest after only 1/2 years, and I joined the game late, when the momentum was already gone. Most of my attempts to build software for helping to solve some of the issues are targeted towards volunteers, for instance the ticket system, the petition system, the AutoMailer for mass mailings via PHPMailer, the asylum event system or the case management system. It turned out to be incredibly difficult to get refugees themselves involved with anything that’s not the Facebook app or WhatsApp, be it the one-way message system or the downloader for the “Langsam Gesprochene Nachrichten” by Deutsche Welle via their RSS feed. Even the tools for the German volunteers were only sporadically used, except the AutoMailer, which was a success; it did its job according to plan.

One project remains incomplete due to lack of time: an attempt to build an online voting system. It ended up as a good exercise for learning REST concepts, as a fully functional system would require a lot more conceptual planning.

Entering the Hypertext space, I experimented by developing a variant of Ted Nelson’s span selector in Java, a dysfunctional text editor that tracks all changes, in Java as well as in C++/Qt4, a converter from Ted’s EDL format to XML, a downloader/retriever for EDLs in XML form and a workflow that glues together the retrieval of such EDLs in XML form and the concatenation of text portions from the obtained resources in order to construct the base text of the document. I completed an early first version of a StAX parser in C++, originally started for the XML frontend for PoDoFo, and was then able to quickly port it to JavaScript. That was handy for handling the embedded XHTML WordPress blog post content as provided via the JSON API, as I didn’t want to use DOM, and it contributed to an independent read-only client for the Doug@50 Journal and HyperGlossary (video). After abandoning the browser due to its CORS nonsense and lack of proper local file access, I finally added a WordPress blog post retriever to be combined with the automatic converters to HTML, EPUB and PDF (video).
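
The concatenation step can be pictured with a small sketch. It assumes, purely for illustration, that each EDL entry names a source, a start offset and a length, and that the sources have already been retrieved into a lookup table. The names `Span` and `assembleBaseText` are hypothetical and not taken from the actual tools.

```typescript
// Hypothetical shape of one EDL entry: a portion of a retrieved resource.
interface Span {
  source: string; // identifier (for example the URL) of the resource
  start: number;  // offset of the first character to take
  length: number; // number of characters to take
}

// Concatenates the referenced portions in EDL order to form the base text.
function assembleBaseText(spans: Span[], resources: Record<string, string>): string {
  let result = "";
  for (const span of spans) {
    const text: string | undefined = resources[span.source];
    if (text === undefined) {
      throw new Error("resource not retrieved: " + span.source);
    }
    if (span.start < 0 || span.length < 0 || span.start + span.length > text.length) {
      throw new Error("span out of range for " + span.source);
    }
    result += text.slice(span.start, span.start + span.length);
  }
  return result;
}

// Made-up resources standing in for the downloaded documents.
const retrieved: Record<string, string> = {
  "a.txt": "Hello world",
  "b.txt": "hypertext",
};

const baseText = assembleBaseText(
  [
    { source: "a.txt", start: 0, length: 6 },
    { source: "b.txt", start: 5, length: 4 },
  ],
  retrieved
);
console.log(baseText); // Hello text
```

Note that the offsets in this sketch count JavaScript string units, which is a simplification; a real implementation has to agree on what an offset means for the retrieved bytes.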

I briefly worked on the side on a project to introduce semantic annotation (“linked data”) and tooling for it to Bible texts, using the specific example of marking people and places in the Public Domain German Luther 1912 Bible translation, as there was renewed interest as a result of the 500th anniversary of the Reformation. As it turned out, to the surprise of some (not me), no digital version of the Luther 1912 text is, to our knowledge, authentic according to any printed original, so another analysis and proofreading effort was necessary. An existing source became usable after a format transformation from flat CSV to hierarchical XML, plus wrapping Strong’s numbers in XML tags instead of XML markers/milestones, plus converting that arbitrary, custom XML format as derived from the CSV to Haggai XML, plus textual correction.

Now, the goal is to build new and better hypertext systems based on ideas by the early Internet pioneers, Douglas Engelbart (see Engelbart Colloquium Session 4A from 38:00 on and Engelbart Colloquium Session 4B from 44:00 on, but especially from 49:19), Ted Nelson, the web pioneers, David Gelernter and Ward Cunningham, as in part wonderfully presented by Bret Victor. A precondition is libre-free licensing for software as well as text/media, in order to advance the formation of the libre-free, digital and always printable universal library, augmented for the user with the help of curatable semantic encoding and powerful, combinable software capabilities.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.

Web Apps revisited (PWA) + Geolocation for Augmented Reality + local File-I/O for Web Stack on Desktop

As I’m very interested in developing augmented reality applications, I looked again at Android app development. Some time ago, I was able to build an APK with the Android SDK + Eclipse and install it on a tablet, but after the switch to the IntelliJ-based Android Studio as development environment, it appears to be very hard, if not impossible, for me to even experiment with this technology. It’s also a highly proprietary ecosystem and therefore evil; don’t let yourself be fooled by some mention of GNU/Linux and “Open Source”. Therefore I looked again at web apps, and the approach changed quite a bit recently. The driving force is the realization by Google that people don’t install new apps any more, while such apps are pretty expensive to build and have a bad conversion rate. It turns out that users spend most of their time in just a few of their favorite apps. As an app developer usually wants to provide the same functionality on the web, he needs to work with at least two different technology stacks and make them look similar to the user. So why not build the application in web technology and have the browser interface with the underlying operating system and its components? There are new mechanisms that help with just that.

One mechanism of the “progressive web app” (“PWA” for short) bundle is called “add to home screen” (“A2HS” for short). Google Chrome already has an entry in the menu for it, which will add a “shortcut” icon onto the home screen for the URL currently viewed. Now, developers get better control over it, as there’s a specification by the W3C, the Web App Manifest (description in the MDN). You just have to link a small manifest JSON file in your XHTML header that contains a few hints about how a browser might add the web app to “installed software” on mobile devices or desktop PCs. Most important are a proper short name, the icon in several sizes, the display mode (the browser might hide all its controls and show the web app full-screen, so it looks like a native app) and the background color for the app window area during load time. The manifest file gives browser implementers the chance to recognize the website as a mobile app, and depending on metrics set by the user, there could be a notification that the current website can install itself as an app.
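
As a rough illustration of how small such a manifest is, the sketch below builds one as a plain object and serializes it to JSON. The member names (`name`, `short_name`, `start_url`, `display`, `background_color`, `icons`) are the ones used in web app manifests; the app name, paths and color are made-up placeholders.

```typescript
// Only a subset of the members a web app manifest may contain.
interface ManifestIcon {
  src: string;   // location of the image file
  sizes: string; // for example "192x192"
  type: string;  // MIME type of the image
}

interface WebAppManifest {
  name: string;
  short_name: string;
  start_url: string;
  display: "fullscreen" | "standalone" | "minimal-ui" | "browser";
  background_color: string;
  icons: ManifestIcon[];
}

// Placeholder values; replace them with those of the actual web app.
const manifest: WebAppManifest = {
  name: "Example Help Request App",
  short_name: "Help",
  start_url: "/index.xhtml",
  display: "standalone",
  background_color: "#ffffff",
  icons: [
    { src: "/icons/icon-192.png", sizes: "192x192", type: "image/png" },
    { src: "/icons/icon-512.png", sizes: "512x512", type: "image/png" },
  ],
};

// The resulting text is what would be saved as, for example, manifest.json.
const manifestJson: string = JSON.stringify(manifest, null, 2);
console.log(manifestJson);
```

The page would then reference the saved file from its header with a `<link rel="manifest" href="manifest.json"/>` element.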

Even with Android being proprietary, it’s probably the phone/tablet system to target as far as the free/libre software world is concerned. Google has a description of what it cares about in the manifest as well as a validator and an option to analyze the procedure in Google Chrome. If Chrome detects a manifest on a web page, it might bring up a banner after some time and ask the user if he wants to add the app to the home screen. Unlike decent browsers, evil Chrome decides for the user if/when to bring up the banner. I’m not aware whether the user is able to set a policy regarding its appearance. In my opinion, the browser client should be the agent of the user, not of an online service provider.

Also, Google requires a service worker JS file for the banner to show up. The service worker is very important for many web apps: native apps are permanently installed on the device, so code and assets are there locally, no download needs to occur for using the app and connectivity isn’t necessarily required. With web apps, that can be different. True, they can rely on browser caching, but it seems that permanent local installation of the XHTML, CSS, JavaScript, images and other data isn’t a thing yet. In places with bad or no connectivity, there shouldn’t be the need to retrieve resources from the net in case the app logic could also work perfectly fine locally if only the data were already present. Even if some new data is supposed to be retrieved every time the web app is opened again, old data (articles, messages, contacts) can already be presented until the new data arrives. The service worker decides which requests for old data can be satisfied from local storage and redirects them there, and which requests need to go over the wire. But there can be very legitimate cases where a service worker makes absolutely no sense. If the app does nothing else than submit data to an online backend, well, the XHTML form can be stored locally, but that’s basically it. Connectivity is required, otherwise there wouldn’t be a way to submit anything, so there’s no need for a service worker. It’s still possible to add the web app to the home screen manually via the menu, and that will make use of the settings provided in the manifest, so that’s good enough for me. I work with refugees and want to establish communication with them. Usually they don’t have computers or notebooks, but phones of course, and as I don’t have a phone and refuse to use proprietary messenger apps, I now can link them up easily to my websites, so they can request help conveniently out of something that looks and feels like any other of their apps.
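
The routing decision a service worker makes can be sketched independently of the browser. The function below only models the choice between local storage and the network; the path prefixes and the names `chooseStrategy` and `CACHEABLE_PREFIXES` are assumptions for illustration. In a real service worker, this decision would sit inside the handler for the `fetch` event, which then answers either from the cache or from the network.

```typescript
type Strategy = "cache-first" | "network-only";

interface RequestSummary {
  method: string; // HTTP method, for example "GET" or "POST"
  path: string;   // path part of the requested URL
}

// Hypothetical prefixes of resources that may be served from local storage.
const CACHEABLE_PREFIXES: string[] = ["/assets/", "/articles/", "/messages/"];

function chooseStrategy(request: RequestSummary): Strategy {
  // Submissions to the backend always need connectivity.
  if (request.method !== "GET") {
    return "network-only";
  }
  // Old data and static files can be answered locally if already stored.
  const cacheable = CACHEABLE_PREFIXES.some(
    (prefix) => request.path.slice(0, prefix.length) === prefix
  );
  return cacheable ? "cache-first" : "network-only";
}

console.log(chooseStrategy({ method: "GET", path: "/articles/42.xhtml" })); // cache-first
console.log(chooseStrategy({ method: "POST", path: "/help-request" }));     // network-only
```

An app that only submits a form would end up with nothing but "network-only" decisions, which is the case where the service worker adds no value.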

So that’s me typing in the URL of my website on the phone of a refugee. Ideally, and for augmented reality, I hope that QR code recognition will become a built-in feature of phone cameras (it’s not difficult, there are many good libraries to integrate it) and not a separate app most people don’t install, because then I would just scan the QR code from a sticker, plastic card or poster on the wall, an install notification/banner would pop up automatically, and everything would work out of the box without friction.

For augmented reality, I could imagine stickers in public places with a QR code on them that contains a unique ID, so by scanning it, interested pedestrians would be sent to whatever website or web app was set up for this location, or a pre-installed web app that uses geolocation (yes, that’s in the web stack!) would do the same thing if the current position is within a certain area. A specific application, a particular offer, could be behind it, or a common infrastructure/protocol/service which could provide generic interfaces for ID-/location-based requests, and content/application providers could register for an ID/location, so the user would get several options for the place. Please note that this infrastructure should be white-label, self-hostable, freely/libre licensed software, and a way to tie different systems together if the user wishes. There could be a central, neutral registry/catalogue for users to search for offers and then import/configure a filter, so only the options of a certain kind would pop up, or clients and servers could do some kind of “content” negotiation, so the client would tell the server what data is requested, and local web apps would make sense of it. The typical scenario in mind would be text/audio/video city guides, maybe in different languages, maybe from different perspectives (same place, but one perspective is history, another is statistical information, another is upcoming events and special shop offers), so a lot of content creators could “register”/attach their content to the location, and the user might pick according to personal preference or established brand, ideally drawing from freely/libre licensed content/software libraries like Wikipedia. As the data glasses unfortunately were discontinued, that can be done with mobile devices as cheap, poor man’s AR, and I don’t see why this should unnecessarily be made more complex with 3D projections/overlays where it doesn’t need to be.
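
The geolocation variant comes down to checking whether the reported position lies within some radius of a registered place. The following sketch uses the haversine formula for the distance; the registry entries, IDs and radii are invented, and in the browser the position would come from `navigator.geolocation.getCurrentPosition` rather than being hard-coded.

```typescript
interface Position {
  latitude: number;  // degrees
  longitude: number; // degrees
}

// Hypothetical registry entry: an ID tied to a circular area.
interface Place {
  id: string;
  center: Position;
  radiusMeters: number;
}

const EARTH_RADIUS_METERS = 6371000; // mean Earth radius

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two positions (haversine formula).
function distanceMeters(a: Position, b: Position): number {
  const dLat = toRadians(b.latitude - a.latitude);
  const dLon = toRadians(b.longitude - a.longitude);
  const sinLat = Math.sin(dLat / 2);
  const sinLon = Math.sin(dLon / 2);
  const h =
    sinLat * sinLat +
    Math.cos(toRadians(a.latitude)) * Math.cos(toRadians(b.latitude)) * sinLon * sinLon;
  return 2 * EARTH_RADIUS_METERS * Math.atan2(Math.sqrt(h), Math.sqrt(1 - h));
}

// Returns all registered places whose area contains the given position.
function placesAt(position: Position, registry: Place[]): Place[] {
  return registry.filter(
    (place) => distanceMeters(position, place.center) <= place.radiusMeters
  );
}

// Invented example data.
const registry: Place[] = [
  { id: "market-square", center: { latitude: 48.0, longitude: 9.0 }, radiusMeters: 100 },
  { id: "station", center: { latitude: 48.01, longitude: 9.0 }, radiusMeters: 100 },
];

const here: Position = { latitude: 48.0005, longitude: 9.0 };
const matchingIds: string[] = placesAt(here, registry).map((place) => place.id);
console.log(matchingIds); // [ 'market-square' ]
```

Each matching ID could then be used to ask the registry for the offers attached to that place, the same way a scanned QR code ID would.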

And let’s never forget that all this also works on the local computer, where the current position doesn’t need to come from the GPS receiver, but could come from a map just as well. So I’m very interested in building the public, open, libre-licensed infrastructure for it, as well as a QR code sticker PDF generator or printing service that would automatically coordinate locations + IDs with the database. The advantage is that there’s no need for the Google Play Store any more, where a developer has to pay just to have an account and where a single company controls the entire software distribution except for sideloading.

There’s one more thing: with a File API, it would be possible to make web apps act like native desktop applications and read/write files from/to the local storage, which is crucial to make the web stack an application programming stack for the desktop computer. The same code could either operate on online resources or local resources or both intermixed, and the “setup” could either be a zip file download with the XHTML+JS code in it or happen by simply browsing a URL. Ideally, the latter would bring up an installation request based on the manifest file for permanently storing the web app outside of the browser cache. If all this were standardized and rolled out in browsers, we would arrive at an incredible new software world of online/offline, desktop/mobile and physical/virtual interoperability and convergence. Unfortunately, the W3C betrayed the public again (just as they did by abandoning the semantic web, making HTML5 a mess for everything that’s not a full-blown browser engine, including DRM in HTML5, etc. in favor of a few big browser vendors) and discontinued the API. As a result, it’s harder to run a web app locally beyond the web storage and interact with other local native desktop applications without depending on a web server (I have workarounds in node.js and Java, but both require explicit installation and startup). I don’t see why local file access shouldn’t be anticipated, because if there are browsers implementing such a capability to go more in the PWA direction, there had better be a standard out there on how to do it instead of each of them coming up with their own, incompatible, non-standardized way of doing it.

This text is licensed under the GNU Affero General Public License 3 + any later version and/or under the Creative Commons Attribution-ShareAlike 4.0 International.