Doug@50: Status Update 2018-03-15

Management Summary: Problems with local, client-side XHTML (CORS, local file I/O) prevent me from making progress with the independent glossary implementation for our Journal. The question is whether the web stack needs to be abandoned entirely in favor of a switch back to native application programming.

You may or may not know that I tried to build an independent client for our Journal Glossary, which retrieves all blog posts, some of which are glossary entries and some regular blog posts, and applies the former onto the latter. It’s not intended as a final “solution”/“product”, but as a start for experimentation/prototyping in a bootstrap fashion, to learn about the problems and improve solutions. It is on GitHub (green button “Clone or download” on the right, “Download ZIP”, extract, double-click on index.html and wait). The glossary is output at the very bottom, and the glossary terms used within the text get highlighted in yellow once loading is completed. The visual representation is deliberately a little bit awkward; instead of yellow highlighting, there could just as well be a popup/popunder or a link, or the glossary could provide a list of links to the blog posts that contain a term, etc. That’s where we can play around with it to get a small demo working quickly.
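A minimal sketch of the “apply the glossary entries onto the regular posts” step, written in Python for brevity rather than in the JavaScript of the actual client. The function name `highlight_terms` and the use of `<mark>` for the yellow highlighting are assumptions made for illustration, not the code from the repository, and the sketch works on plain text only (it is not aware of markup inside a post).

```python
import re


def highlight_terms(text, terms):
    """Wrap every occurrence of a glossary term in <mark>...</mark>.

    `terms` is a list of glossary terms (including synonyms).
    Matching is case-insensitive and limited to whole words.
    """
    terms = [t for t in terms if t]
    if not terms:
        return text
    # Longest terms first, so multi-word terms win over shorter ones
    # that start at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "<mark>" + m.group(0) + "</mark>", text)
```

The same list of matches could just as well feed a popup, a link or a per-term list of posts instead of the highlighting.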

Yesterday I started another one of those local, independent clients, this one for adding a new term to our Journal glossary, as captured in the screenshot. Several glossary term input fields can be added via the [+] button (for synonyms), as well as several definition fields for those terms. Sure, it’s neither beautiful nor final, but it’s a way for me to enter the data conveniently enough for testing. Then I tried to write the XML-RPC API call to WordPress that would send the data to the server, but found that CORS is blocking such calls. CORS is a feature in the browser that prevents non-trivial calls (especially those that send data) to a server other than the one the website was retrieved from, so that injected, rogue JavaScript code can’t send confidential user data to servers other than the one the site/client is supposed to talk to. As my file is local and no server is ever involved in providing the site or in sending a CORS header that would allow it to talk to the Journal, I don’t know how to continue to make something for the #booksummit or how to go about the Doug@50 demo technology-wise. CORS can be circumvented either by starting the browser without this feature (which would make it vulnerable in all other use cases where we want CORS, and requires some dirty tricks or user cooperation via operating-system-specific scripting to start the browser in that mode) or by installing and starting a local server that does nothing but read the local file from the hard disk and send it to the browser without any interference except for adding the CORS header. We might need the latter anyway because browsers don’t have a way to access the local hard drive, something my local client implementations would want to do just like any other native application. All of this is based on my assumption that the functions should be local, without a remote server operated by somebody else providing such functionality, so the user can retain privacy and independence.
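A minimal sketch of such a local server, using only the Python standard library. It serves files from a directory unchanged and adds an `Access-Control-Allow-Origin` header to every response. The names `CORSRequestHandler` and `start_local_server` are made up for this illustration. Note the limits of the sketch: the header only governs who may read the responses of this local server; calls from the page to the Journal would still depend on what the Journal’s server allows, unless the local server also relayed those calls, which is not shown here.

```python
import functools
import http.server
import threading


class CORSRequestHandler(http.server.SimpleHTTPRequestHandler):
    """Serves local files as-is, plus one additional CORS header."""

    def end_headers(self):
        # Assumption for the sketch: allow any origin. A real setup
        # would probably want to be more restrictive.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def start_local_server(directory=".", port=0):
    """Start the server in a background thread and return it.

    Port 0 lets the operating system pick a free port; the chosen
    port is available as server.server_address[1].
    """
    handler = functools.partial(CORSRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

With the server running, index.html would be opened via `http://127.0.0.1:<port>/index.html` instead of by double-clicking the file.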
I could try to promote and work on introducing local file I/O and either no CORS or explicitly granted permissions into web standards and browsers for cases where the file in the browser comes from the hard drive, so that the web stack would become a decent framework for “native” application programming. But people have already tried that and failed, and there are complex security implications, as most of the stuff is designed for online use and we probably don’t want new scams like “download this HTML and double-click it, whoops: all of your passwords are gone”. On the other hand, it’s no different from other native applications that get installed, so on a decent operating system like the libre-free GNU/Linux (or BSD or whatever libre-free) none of that would be an issue, because such local applications implemented with the web stack could be installed from online software repositories where stuff is checked and signed, so no scamming or malware would be encountered at all. Another option would be to bundle those XHTML+JS pages/packages with a native application of my own or one of the existing ones, which act as a runtime environment and extend the browser with functions for local file access (and hopefully CORS management). In case native programming languages have a well-maintained browser control/component that provides a bridge for JavaScript calls into the host language, that could work too. If I don’t find a solution, I would need to abandon the web stack entirely and continue with my native efforts, losing all of the decent rendering/visualization the web stack offers as well as the potential support of the web dev community. So for the OHS, as there can be many implementations (of parts/components) in many different programming languages and stacks, I wonder, in terms of OHS Core and the general OHS infrastructure conversation, whether the web stack can play any role in it.
With it, we can get a few quick wins and people are very familiar with it, but the constraints may render some things impossible.

My Journey Through Text

At first, I did some experimentation with browsergame programming (not casual games, but with server-side persistence), attempting to build/generate “worlds” while avoiding the need to hand-design everything in a time-consuming process. One result was a world editor that served as an image composer (using GDLib for PHP) and primitive image map manipulator at a time when HTML5 canvas wasn’t there yet.

Later, I wanted to improve my note-taking in printed German Bible translations; in particular, I wanted to produce my own interleaved editions. Soon I learned that digital Public Domain German Bible texts are usually not true to their printed originals, so I had to start a digitization and proofreading effort (more) first. From a semantically annotated XML source, it was easy to generate a modern XHTML reproduction of the text and then PDF layouts via XSL-FO and LaTeX. I was looking into SILE and recently PoDoFo as PDF generator backends (the former accepts XML as input, the latter is a C++ API that still needs an XML frontend) but didn’t invest too much into supporting them yet. Finally I achieved the original goal of generating interleaved PDFs for printing, and thanks to the advent of print-on-demand, I’m now able to order hardcover thread-stitched books in a quantity as low as a single copy (not even to mention the magazine variant or the DIN A6 or DIN A4 variants out of my DIN A3 monochrome duplex laser printer).

One proofreader introduced me to EPUB, which of course made sense to add as an output format and eventually got me interested in e-publications, e-ink based devices and the publishing industry in general.

Somehow I discovered a Wiki for creating a new libre-freely licensed German Bible translation collaboratively by using a parser that extracts OSIS from the online Wikitext of the MediaWiki software, and for a church congress event we hacked together a semi-automatic workflow that generated the PDF of the study version of the Gospel according to Mark. As I didn’t want to change my existing tools to OSIS as input format and most of the time I didn’t even need the advanced OSIS features, I just internally converted OSIS to my Zefania-XML-based Haggai XML format and made a few adjustments for being able to produce the usual output formats XHTML, PDF and EPUB. Another project was the conversion from the “verse-per-line” format to Haggai XML, not too different from another similar CSV to XHTML to EPUB project.

In the e-book hype of those days, I failed to see why other publications should be produced in a different way than my Bible reproductions (based on concepts like workflow automation, digital-first, XML-first, single-source publishing, multi-channel publishing, etc.), as generating EPUBs and PDFs could easily be generalized, and later the entire workflow (shorter, silent video). I added a converter from ODT to XHTML, so OpenOffice/LibreOffice can be used as a writing tool as long as predefined styles are used to introduce WYSIWYM to the document, for lack of a better editor. To be able to offer it as a service to self-publishers, I wrote a frontend in PHP that invoked the very same Java code via system calls on a vServer, only adding administrative functionality like user or publication project management (the latter should have become part of the Java package eventually). I even went to some book fairs and more obscure events of the e-book avant-garde, so I know a few people from those worlds and their mentality.

From such occasions, I picked up two of my major projects in that space. One is uploading EPUBs via XHTML to WordPress using the XML-RPC API (again, there’s an online version of it using the same Java code behind a PHP wrapper), which then wasn’t used in production because the guy who needed it produced EPUBs the WYSIWYG way and naturally wanted this manual typesetting to be preserved in the blog post, while I cared about WYSIWYM instead. With that workflow already available, I got into contact with one of the guys who are behind several amazing projects, and as they went into the business of running an online e-book store, they got a lot of e-books from publishers along with ONIX metadata file(s), so the job was to import all of the ONIX metadata into WordPress and even update existing records. My attempt was never finished because the shop was shut down after some time, probably in part because I didn’t support them well/soon enough, as I encountered several problems with the testing environment, WordPress and my not-so-hacky, not-so-smart, not-so-agile workflows. But even without completing this mechanism, I went beyond this particular use case and did some general ONIX work.

Smaller projects include the subversion of placebo non-digital wishful thinking by a self-publishing site that disabled the download button without any technical effect, a GUI frontend for epubcheck, a failed attempt to enlist “e-book enthusiasts” for building a digital library, an importer in PHP (SAX) from Twine to Dembelo (later rewritten by the Dembelo lead developer in more modern PHP), a parser that converts a Markdown-like note-taking language to XHTML and LaTeX (my interest in learning about writing parsers for domain-specific languages came from the Wikitext to OSIS parser, which I still haven’t found the time to revisit) and a Twitch Video Uploader using their API (but I guess it’s broken now because of their “Premiere” nonsense).

As I grew more frustrated about traditional publishers, self-publishers, e-book “pirates”, the average reader and the big enterprises who use digital to exploit those who don’t understand it properly, the arrival of refugees from Afghanistan, Iraq, Syria and Africa in Europe forced me to focus on way more serious things than our digital future. Only a tiny fraction of my time went into software development. All other civic tech programmers lost interest after only one or two years, and I joined the game late, when the momentum was already gone. Most of my attempts to build software to help solve some of the issues are targeted towards volunteers, for instance the ticket system, the petition system, the AutoMailer for mass mailings via PHPMailer, the asylum event system or the case management system, as it turned out to be incredibly difficult to get refugees themselves involved with anything that’s not the Facebook app or WhatsApp, be it the one-way message system or the downloader for the “Langsam Gesprochene Nachrichten” by Deutsche Welle via their RSS feed. Even those for the German volunteers were only sporadically used, except the AutoMailer, which was a success; it did its job according to plan.

Two other projects remain incomplete due to lack of time. One is the attempt to parse Wikitext from the large monthly dump of the Wiktionary in order to generate a list of Arabic nouns with their German translations plus their articles, and the other is an attempt to build an online voting system that ended up as a good exercise for learning REST concepts, as a fully functional system would require a lot more conceptual planning.

Entering the Hypertext space, I experimented by developing a variant of Ted Nelson’s span selector in Java, a dysfunctional text editor that tracks all changes and either needs more work or porting to a better GUI library, a converter from Ted’s EDL format to XML, a downloader/retriever for EDLs in XML form and a workflow that glues together the retrieval of such EDLs in XML form and the concatenation of text portions from the obtained resources in order to construct the base text of the document. Originally started for the XML frontend for PoDoFo, I completed an early first version of a StAX parser in C++, and then was able to quickly port it to JavaScript, which was handy for handling the embedded XHTML WordPress blog post content as provided via the JSON API, as I didn’t want to use DOM. This contributed to an independent read-only client for our Doug@50 Journal and HyperGlossary.
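The concatenation step of that workflow can be sketched roughly as follows. The XML layout used here (`<edl>` with `<span resource="…" start="…" length="…"/>` children) is a hypothetical stand-in invented for this example; it is neither Ted Nelson’s EDL format nor necessarily the XML my converter produces. The `fetch` parameter is likewise an assumption: any function that returns the text of a resource for a given identifier.

```python
import xml.etree.ElementTree as ET


def build_base_text(edl_xml, fetch):
    """Concatenate the text portions an EDL refers to.

    edl_xml: EDL in the hypothetical XML form described above.
    fetch:   callable that maps a resource identifier to its full text
             (for example a downloader with a local cache).
    """
    root = ET.fromstring(edl_xml)
    parts = []
    for span in root.iter("span"):
        source = fetch(span.get("resource"))
        start = int(span.get("start"))
        length = int(span.get("length"))
        # Each span contributes one slice of its source text.
        parts.append(source[start:start + length])
    return "".join(parts)
```

Keeping retrieval behind `fetch` means the same function works with online resources, local files or test data.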

At the moment I’m working on the side on a new project to introduce semantic annotation (“linked data”) and tooling for it to Bible texts, using the specific example of marking people and places in the Public Domain German Luther 1912 Bible translation, as there was renewed interest as a result of the 500th anniversary of the Reformation. As it turned out, to the surprise of some (not me), no digital version of the Luther 1912 text is authentic according to any printed original to our knowledge, so another analysis and proofreading effort was necessary. An existing source became usable after a format transformation from flat CSV to hierarchical XML, plus wrapping Strong numbers in XML tags instead of XML markers/milestones, plus converting that arbitrary, custom XML format as derived from the CSV to Haggai XML, plus textual correction.
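The first of those steps, from flat CSV to hierarchical XML, can be sketched like this. The column names (`book`, `chapter`, `verse`, `text`) and the element names are assumptions made up for the example; they are not the layout of the actual source and not Haggai XML.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text):
    """Turn one-verse-per-row CSV into nested book/chapter/verse XML."""
    root = ET.Element("bible")
    books = {}      # book name -> <book> element
    chapters = {}   # (book name, chapter number) -> <chapter> element
    for row in csv.DictReader(io.StringIO(csv_text)):
        book_key = row["book"]
        if book_key not in books:
            books[book_key] = ET.SubElement(root, "book", name=book_key)
        chapter_key = (book_key, row["chapter"])
        if chapter_key not in chapters:
            chapters[chapter_key] = ET.SubElement(
                books[book_key], "chapter", number=row["chapter"]
            )
        verse = ET.SubElement(
            chapters[chapter_key], "verse", number=row["verse"]
        )
        verse.text = row["text"]
    return root
```

The result can be serialized with `ET.tostring(root, encoding="unicode")` and handed to the next conversion step.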


Sometimes the JavaScript Object Notation, JSON for short, is promoted as an alternative to the Extensible Markup Language, XML for short. On the format level, I fail to see a difference other than not repeating the name at the end tag.

    <?xml version="1.0" encoding="UTF-8"?>
    <text>words</text>

is pretty equivalent to

    "text": "words"

JSON saves a few bytes per tag, but memory/bandwidth can’t be the concern of JSON as it keeps commas, colons and the outermost braces. For both JSON and XML, it’s just text after all, and we generally don’t care too much about its size as it is often optimized with gzip compression by the server, the multiplexing in HTTP/2 or some methods of minification as long as they don’t change tag names (removing whitespace indentation that’s only there for human readability would be the typical candidate). Is JSON concerned about ease of manual manipulation? Well, that doesn’t matter, as people aren’t supposed to fiddle with it directly anyway; it’s just the lack of decent tools that forces us to. Is JSON about the tools? To some extent – JSON is for JavaScript. That’s particularly bad as JavaScript initially was designed to power DHTML. Web developers saw it as an improvement that they don’t have to declare types, and now there’s TypeScript. Web developers saw it as an improvement that they don’t have to declare classes, and now classes were introduced by ECMAScript 6/2015. Those extensions are good for modern JavaScript to become a better tool for building serious applications and not just stupid animations, but JavaScript still remains tied to the browser and the limitations that come from the security sandbox, as native file I/O didn’t gain traction. But even if we accept JavaScript as an equal citizen in the application languages world, there’s not a lot of support for JSON in the other languages. Just think about it: Java Enterprise Edition has a JSON library (let’s write server backends in Java for all those websites written in JavaScript), but the Standard Edition doesn’t (why should clients written in Java talk to servers with JSON as they’re not websites and therefore no JavaScript is involved?), which leads to projects like JSON-java. It’s wheel-reinventing for a capability that already exists for the server side, wasting valuable lifetime to compensate for web deficiencies.
For XML, on the other hand, there’s good support in almost any programming language.
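To illustrate the claim that the two are equivalent on the format level, here is a minimal sketch in Python, whose standard library happens to ship parsers for both. It assumes an XML element `<text>words</text>` as the counterpart of the JSON pair `"text": "words"` (wrapped in the outermost braces JSON requires); the function names are made up for the example.

```python
import json
import xml.etree.ElementTree as ET


def read_xml(document: bytes) -> dict:
    """Read a single XML element into a one-entry dictionary."""
    root = ET.fromstring(document)
    return {root.tag: root.text}


def read_json(document: str) -> dict:
    """Read the equivalent JSON object."""
    return json.loads(document)


xml_doc = b'<?xml version="1.0" encoding="UTF-8"?>\n<text>words</text>'
json_doc = '{"text": "words"}'

# Both documents carry the same name/value pair.
same = read_xml(xml_doc) == read_json(json_doc)
```

The only thing the XML variant adds on the wire is the repeated name in the end tag.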

All this waste of valuable lifetime for the sole reason of saving a few bytes of memory/bandwidth? More likely it is a mere historical coincidence. As XHTML is based on XML, browsers are XML processors anyway. Just look at what happens if you open a random XML file in your browser: it will most likely render a representation that’s different from the plain-text equivalent of the XML file in the most basic text editor without any syntax highlighting. Even more interesting is the result if the XML file contains a reference to an XSLT stylesheet, because it might end up being applied, so the “browser” is more or less expected to be an XSLT processor too. There’s the most interesting XSLTProcessor interface, which unfortunately isn’t standardized, but look at the browser compatibility list. In the browser war days, when there was only Netscape/Mozilla and Microsoft, the red “no support” by the popular Microsoft Internet Explorer might have killed the XSLT-based ViewSpecs of HyperScope, but nobody cares about Microsoft browsers anymore (did you know that Internet Explorer is based on Mosaic code Microsoft licensed from NCSA after Andreessen and his bunch left to found Netscape?).

JSON is fine for data transfer if the developer controls both endpoints, but then the web guys found out that it lacks “out-of-band” metadata like XML attributes and semantic descriptors/identifiers like XML namespaces, so now there’s JSON-LD, mimicking the mentioned XML features. Wait, LD, Linked Data, isn’t that the new name for the abandoned notion of a semantic web? Didn’t the web guys together with the browser vendors kill that effort, and just now realize that it is actually needed, rebooting it with their own JavaScript stuff slowly and with years of unnecessary delay, plus doing it wrong? It’s not difficult to predict that one day they’ll find out that they also need JSON Web Services Description Language, JSON Schema Definition, JSON Stylesheet Language Transformations and JSON Path Language, but with a completely new syntax because they can. That means throwing away all the XML technology that already exists and reinventing it in the exact same way (don’t be surprised, XML concepts tend to make some sense) but in different packaging. JSON has a big chance, however, that the web people denied XML: as soon as they recognize that they’ve made browsers into parsers of almost any arbitrary markup trash, be it by writing invalid, non-well-formed HTML or by the W3C’s new efforts to deliberately break the XML-ness, websites could be written in the JSON Hypertext Markup Language. For some curious reason, the JSON deserializer demands well-formedness, which turns out to be important, just as XML always did and always got, except on the web where it isn’t XHTML. Furthermore, as there’s no support for XML Cascading Stylesheets, it would be of equal help to have a JSON version of CSS.

See, it doesn’t really matter if it is XML or JSON as they’re basically the same format-wise, except that XML is way more advanced and JSON still too primitive. It would be incredibly cool to arrive at a “programmable” web where no big, bloated browser as interpreter and runtime environment is needed, but small clients/agents could consume and act on the semantic markup, a real “people’s web” infrastructure and data collection, fully accessible to the public without the need for centralized, lock-in Internet company services. Wonder who tries to prevent that? Isn’t it a dangerous vertical integration if those who offer web services also own the browsers and influence standardization consortiums too, in order to make sure that better digital technology won’t disrupt their current sources of income? Most modern web developers weren’t old enough to deeply learn about digital, software and networking; they grew up with “social” networks and apps already presented to them in a particular way. They’re easily fooled into hyping technological stagnation (let’s see what WebAssembly will end up as), while the smart developers conspire with the big companies who understand digital perfectly well to exploit unsuspecting markets, politicians and society with great success. Regardless of our dystopian future, let’s never forget that the centralized “cloud” isn’t everything, that personal computing is all about the independent individual, that computer liberation still needs to go on.

So let’s imagine a parallel universe in which some day somebody decided to add XML object serialization to JavaScript. It’s probably almost a trivial task if a more efficient way to parse/represent XML than DOM is already available, let’s say a JavaScript implementation of StAX for example, or SAX if an asynchronous (in terms of node.js and async/await) push instead of pull method would be needed. That would offer the JavaScript developer a “new” native way to work with XML as if it were a JSON object (which it will be and will actually be represented as), as other dynamically typed languages have enjoyed such a feature for quite some time now (that would be PHP’s SimpleXML). But why even bother? Is serialization really a thing that demands its own non-free license, pretty much like the well-known “CSV CRLF linebreak license” or the famous “SQL plus operator license”? No, for me, JSON is fine if I shovel non-public dumb data between two websites under my control. For everything else, I’ll just convert JSON to XML as the universal format for text-oriented data and then work with the latter. JavaScript is only a small fraction of application programming and I’ll certainly not abandon decades of improvement for seriously broken web stuff. It’s not that JSON is “bad” or something, it’s just not very helpful for the things I want and need to do.
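Such a JSON to XML conversion can be sketched in a few lines. This is an illustration under assumptions, not a tool of mine: the default root element name `root` and the element name `item` for list entries are arbitrary choices, and the sketch assumes that all JSON keys happen to be valid XML element names.

```python
import json
import xml.etree.ElementTree as ET


def to_element(name, value):
    """Recursively map a parsed JSON value onto an XML element."""
    element = ET.Element(name)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(to_element(key, child))
    elif isinstance(value, list):
        # JSON arrays have no names, so each entry becomes an <item>.
        for child in value:
            element.append(to_element("item", child))
    elif value is not None:
        # Scalars are written as Python renders them; booleans would
        # need their own mapping in anything beyond a sketch.
        element.text = str(value)
    return element


def json_to_xml(document, root_name="root"):
    """Convert a JSON document (string) into an XML string."""
    parsed = json.loads(document)
    return ET.tostring(to_element(root_name, parsed), encoding="unicode")
```

From there on, the usual XML tooling (XPath, XSLT, schema validation) applies to data that arrived as JSON.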

Hypertext: DKR Scenario 29 Dec

This is an answer to Demo@50: Welcome Builder!, but please also consider the more practical interactive mapping of my project results to an OHS. Everything that follows is up for discussion, adjustment, extension, removal and implementation. A Dynamic Knowledge Repository, according to my current understanding, is a comprehensive collection of documents and other media supported by software to deal with it, created and curated by a community of people who are consumers, contributors, knowledge workers and stewards.

Can we make a 21st Century DKR happen? Yes, it’s not too difficult. It’s a totally different question if we can make it big, because besides the technological, legal and content-related solutions, there’s also the social/political aspect, timing and luck. On the other hand, I already count it as a victory if we manage to augment our own matters or if we improve on the current state of affairs – it’s about bootstrapping, and I’m unable to predict what people will use the tools for in the future. Additionally, I believe that DKRs do exist today, for example the Wikipedia.

What infrastructures and components would have to be built? All of them and many, many more. It’s not so important what we will end up with eventually, but how it is built and what policies are applied. Formats and protocols need to be open, software needs to be libre-licensed, online services need to be self-hostable, dependencies need to be avoided, offline usage and preservation need to be an option, social mechanisms should prevent abuse and encourage constructive collaboration. Most of the potential of the digital revolution hasn’t been realized yet because strong powers entrap unsuspecting users in silos, where we need to build an alternative of interchangeability on every level, basically liberating ourselves and humanity from totally artificial limitations.

What aspect of the DKR do you feel should be built and how do you want to contribute, based on the list of “What is missing” and other aspects of Doug’s work? As said, all of it should be built, and a lot more. As for me, I’m more of a plumber in the background who builds many small intercompatible things to retain flexibility, in an attempt to avoid programming myself into a corner, unmanageable complexity and bloatware. I have trouble with UI/UX design, but slowly try to catch up. Therefore, I’m 100% in for the Journal, don’t care too much about xFiles (as the underlying data structures should be abstracted away, will be a subject for optimization experts and only need to comply with the requirements given above), would like to learn the Chorded Keyset but don’t know how to build one, am fine with any kind of linking but would rather have more of it than less, and really like the notion of ViewSpecs although I’m not sure if they can work in some of the most complex situations. Regarding other aspects of Doug’s work, establishing useful augmented reality would be the killer in my opinion on the technical/infrastructural level. Furthermore, I would like to support collaboration and research in human-computer interaction, but definitely would go for the low-hanging fruit and advance from there.

If you are already working on something you feel is naturally part of a DKR, how do you feel it can interact with other components? I don’t know how my software can interact with other components, as I’m not aware of a lot of other existing components or format specs, except Ted Nelson’s EDL format/concept and span selector, which aren’t libre-licensed and need improvement. Therefore I “interact” with them by implementing libre-licensed alternatives that are almost compatible, if not for trademark law creeping into the format spec. Time Browser and Author sound interesting, but I haven’t looked into them yet. My components are intended to interact with other open/libre components to the fullest extent possible, even if only by means of data conversion.

Please write a walkthrough to help communicate our personal perspectives (featuring Joe). There is no single Joe, but a lot of different Joes, each with his own individual or pre-defined configuration. One of the Joes sits down at a computer, which allows him to continue reading/working where he left off the last time, or switch to new publications in his field of interest, or to look things up with one of the agents that use different metrics, employing different methods to be most effective. He decides to continue with his current reading, where he adds a few comments, a new paragraph, annotations, links, fixes some errors, marks semantic meanings, and by doing so he discovers other documents, meanings and relations. His lifestream records his journey, which he can share or keep for himself, but it’s not unlikely that an agent/policy will decide what kind of activity will be publicly shared, so Joe usually doesn’t bother to take manual action. If a “document” (it’s more a text composition from diverse sources, rendered by many layers of standard, public and custom ViewSpecs) is particularly interesting to him, he tears it apart in order to fit portions of it into his own collection/archive, which in effect constitutes another new document, public or private. Joe also does this frequently together with other people: family members, colleagues, friends and complete strangers. Because he’s involved in several communities, he uses different, very specific ViewSpecs tailored for particular tasks, but it can happen from time to time that he doesn’t have and can’t find a ViewSpec that suits his needs, and as he hasn’t learned yet how to build one himself, he either asks his neighbor or one of the many people who provide services for the public network infrastructure, its data and software.
No matter if Joe is awake or sleeping, agents/bots work for him, for others, for the content, for the network, to keep it clean, to answer questions, to do all sorts of operations, but in an intelligent way in cooperation so resources don’t get wasted unnecessarily.

As Joe is a little bit paranoid, he has the habit of printing his favorite pieces or producing other physical manifestations in his living room (sometimes he’s too lazy and just orders them cheaply from an on-demand service), so he doesn’t need to worry about losing anything in case “the lights go out”. Even if this should ever happen, most of the manifestations can be read in again automatically with some kind of 3D printed, matrix code, standardized OCR, RFID or punch card technology. Speaking of the physical world, Joe still needs to do business in it of course, but wherever he goes, online repositories go with him (well, the truth is that he rarely enjoys connectivity because of bandwidth and legal constraints, not even to mention the long distances with no network access at all, so he’s forced to work with offline repos and auto-sync with the rest of the world later). Especially when he is in cities, places, people, objects and time present themselves in an augmented way via his mobile device, so he can either learn about or interact with them, and make them do things, as they too are networked and compatible in a network of things. Quite frankly, a substantial portion of his “document work” is related to what he discusses and designs together with others in meetings or is linked to physical stuff, so he can overlay relevant data if he wants, with all the tools and capabilities available to him as he would have at home in front of his computer. Now, those things are just a fraction of what Joe does on a typical day, but as the full account doesn’t fit on a computer screen and requires an Open Hypertext System to be properly presented, please continue reading here.

Hypertext: Mother of All Demos 50th anniversary

With less than one year left to prepare for the 50th anniversary of the Mother of all Demos, it’s about time to look at the projects that could potentially be presented on 2018-12-09. The event is an important milestone to determine if we made significant progress in realizing Doug’s vision over the last 50 years, and it seems like we’re a little bit late. Still, I’d like to address the big picture first before committing to blind actionism.

But what does it even mean to “realize Doug’s vision” in the context of 2018? I have to admit that I’ve never watched the entire demo in one go; it’s just very difficult to spend an hour of quality Internet time in front of my computer, passively watching things that happen on a screen. I wondered why this is the case, and my impression is that the demo was made for end users, not system developers. Doug didn’t explain/discuss the technical details or what their results could mean conceptually for human-computer interaction, and as there’s no way for me to actually use their system today, that particular showcase is kind of practically irrelevant for me. It feels more like an advertisement, and to some extent, it is. If I’m not mistaken, part of the goal for the demo was to attract further funding for the project, which is perfectly fine, as there was no fast/cheap way to make the system a universally available product soon in those days. So I see the great demo more like a PR stunt, which in fact served pretty well to introduce people to the notion of personal, networked, visual and augmented interaction with a computer. So how can we replicate a similar effect? Do we even have to?

The world is very different now. In 1968, computers (especially for personal use) were a new thing, so it was natural to explore this technology – how can it help with solving problems of ever-increasing complexity? Today, people have seen computers, we know what they do. If the goal is to contribute tools for dealing with the big problems, we might not deploy computers in the traditional sense any more. For instance, if quantum computing were around the corner, there would be the chance to hold another revolutionary demo. In the field of augmented reality, we should immediately start preparing one and it wouldn’t even be difficult to do so. Would such endeavors still be true to Doug in spirit? Sure, they’re not about documents and knowledge, the web would stay unfixed, so there could be some merit in replicating the image that Doug put in front of us. Keep in mind that it doesn’t have to be an either-or, but a decision for the anniversary event might shape what’s considered to be Engelbartian in the future, whether it is a conservational or progressive movement, or both. Is the anniversary supposed to refresh the original vision, to promote a new/updated one (we could even mock one), or encourage long-term research and development?

If we end up with the goal to organize the world’s information and make it universally accessible and useful (in an open way and for real!), we have to realize that people don’t get excited about text manipulation any more as they already have Office suites and the web, that blue-sky spending isn’t available any more, and that our results will be less innovative as the field isn’t new and there’s much more “competition” around. On the other hand, what Doug was able to do with a team of people working several years while spending money, a single guy can now approach slowly without any financial backing, as we don’t have to start from scratch and can benefit from existing technology and methods.

What’s wrong with HyperScope? It’s really unfortunate that the W3C didn’t standardize an XSLT interface (even though browsers tend to come with an XSLT processor), but can’t we get a JavaScript implementation of it by now? How much work would it be to update the GUI framework library that was used for HyperScope?

Hypertext: Socratic Authoring

The Future Text Initiative published an introduction to the notion of “Socratic Authoring”, which is pretty straightforward in describing the general requirements for a modern writing and publishing system. For sure similar visions have already been expressed before, but this proposal might be the most recent one.

In it, we find the demand to retain as much as possible from the richness the author experienced during composition. What does this mean? Is it referring to WYSIWYG, the ability of the author to hide parts of the process, the realization that the reader might become a co-author and therefore needs an equivalent writing environment? It sounds like we don’t want to see works being produced electronically just to end up in a single, primitive, fixed form, so re-digitalization would be needed to enrich it again.

Then there’s the suggestion that semantic augmentation should take place when the document is on a server. I ask, why should this only be possible on a server? Why shouldn’t the user deserve full augmentation on the client side also? Sure, augmentation could require some resources that are only available online or include actions that need to be executed on a different machine, but even when disconnected or without using an external proxy (regardless of whether such an external online service is owned or third party), the user shouldn’t face artificial limitations just because the client side is lacking implementation.

To be continued…

Web Apps revisited (PWA) + Geolocation for Augmented Reality + local File-I/O for Web Stack on Desktop

As I’m very interested in developing augmented reality applications, I looked again at Android app development. Some time ago, I was able to build an APK with the Android SDK + Eclipse and install it on a tablet, but after the switch to IntelliJ-based Android Studio as development environment, it appears to be very hard, if not impossible for me to even experiment with this technology. It’s also a highly proprietary ecosystem and therefore evil, don’t let yourself get fooled by some mention of GNU/Linux and “Open Source”. Therefore I looked again at web apps, and the approach changed quite a bit recently. Driving force is the realization by Google that people don’t install new apps any more while such apps are pretty expensive to build and have a bad conversion rate. It turns out that users spend most of their time in just a few of their most favorite apps. As an app developer usually wants to provide the same functionality on the web, he needs to work with at least two different technology stacks and make them look similar to the user. So why not build the application in web technology and have the browser interfacing with the underlying operating system and its components? There are new mechanisms that help with just that.

One mechanism of the “progressive web app” (“PWA” for short) bundle is called “add to home screen” (“A2HS” for short). Google Chrome already has an entry in the menu for it, which will add a “shortcut” icon onto the home screen for the URL currently viewed. Now, developers get better control over it as there’s a specification by the W3C (description in the MDN). You just have to link a small manifest JSON file in your XHTML header that contains a few hints about how a browser might add the web app to “installed software” on mobile devices or desktop PCs. Most important are a proper short name, the icon in several sizes, the display mode (the browser might hide all its controls and show the web app full-screen, so it looks like a native app) and the background color for the app window area during load time. The manifest file gives browser implementers the chance to recognize the website as a mobile app, and depending on metrics set by the user, there could be a notification that the current website can install itself as an app.
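
To make the manifest a bit more tangible, here is a minimal sketch that generates such a file with Python. The member names (`name`, `short_name`, `start_url`, `display`, `background_color`, `icons`) come from the Web App Manifest specification; the app name, the icon paths and the helper `build_manifest` are made up for illustration and are not taken from any of my actual sites.

```python
import json


def build_manifest(name, short_name, icon_sizes, background_color="#ffffff"):
    """Return a dict holding the manifest members discussed above.

    The icon file names below are hypothetical; adjust them to your layout.
    """
    return {
        "name": name,
        "short_name": short_name,
        "start_url": "./index.html",
        # "standalone" asks the browser to hide its own controls, so the
        # web app looks like a native one.
        "display": "standalone",
        # Shown in the app window area while the app is still loading.
        "background_color": background_color,
        "icons": [
            {
                "src": f"icons/icon-{size}.png",
                "sizes": f"{size}x{size}",
                "type": "image/png",
            }
            for size in icon_sizes
        ],
    }


manifest = build_manifest("Help Request", "Help", [48, 96, 192, 512])
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The output would be saved as, say, manifest.json and referenced from the XHTML header with a `link` element whose `rel` attribute is `manifest`.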

Even with Android being proprietary, it’s probably the phone/tablet system to target as far as the free/libre software world is concerned. Google has a description of what they care about in the manifest as well as a validator and an option to analyze the procedure in Google Chrome. If Chrome detects a manifest on a web page, it might bring up a banner after some time and ask the user if he wants to add the app to the home screen. Unlike decent browsers, evil Chrome decides for the user if/when to bring up the banner. I’m not aware of whether the user is able to set a policy regarding its appearance. In my opinion, the browser client should be the agent of the user, not of an online service provider.

Also, Google requires a service worker JS file for the banner to show up. The service worker is very important for many web apps: native apps are permanently installed on the device, so code and assets are there locally, no download needs to occur for using the app, and connectivity isn’t necessarily required. With web apps, that can be different. True, they can rely on browser caching, but permanent local installation of the XHTML, CSS, JavaScript, images and other data doesn’t seem to be a thing yet. In places with bad or no connectivity, there shouldn’t be the need to retrieve resources from the net if the app logic could work perfectly fine locally as long as the data is already present. Even if some new data is supposed to be retrieved every time the web app is opened again, old data (articles, messages, contacts) can already be presented until the new data arrives. The service worker decides which requests can be satisfied from local storage and redirects them there, and which requests need to go over the wire. But there can be very legitimate cases where a service worker makes absolutely no sense. If the app does nothing else than submit data to an online backend, well, the XHTML form can be stored locally, but that’s basically it. Connectivity is required, otherwise there wouldn’t be a way to submit anything, so there’s no need for a service worker. It’s still possible to add the web app to the home screen manually via the menu, and that will make use of the settings provided in the manifest, so that’s good enough for me. I work with refugees and want to establish communication with them. Usually they don’t have computers or notebooks, but phones of course, and as I don’t have a phone and refuse to use proprietary messenger apps, I now can link them up easily to my websites, so they can request help conveniently out of something that looks and feels like any other of their apps.
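
The decision the service worker makes can be sketched in a few lines. Please note that this is only the logic, written in Python as a stand-in: a real service worker is a JavaScript file that reacts to the browser’s `fetch` events and stores responses with the Cache API. The function `serve`, the dict used as cache and the fake network are all assumptions for the sake of the example.

```python
def serve(url, cache, fetch, online):
    """Answer a request from the local copy if there is one.

    Returns a (body, source) tuple, where source says where the body
    came from: "cache", "network" or "unavailable".
    """
    if url in cache:
        # Old data is presented right away, no connectivity needed.
        return cache[url], "cache"
    if online:
        body = fetch(url)
        cache[url] = body  # keep a copy for the next offline visit
        return body, "network"
    return None, "unavailable"


# Hypothetical resources standing in for the remote server.
fake_network = {"/app.js": "console.log('hi')"}
fetch = fake_network.__getitem__

cache = {}
first = serve("/app.js", cache, fetch, online=True)      # goes over the wire
second = serve("/app.js", cache, fetch, online=False)    # served locally
missing = serve("/news.json", cache, fetch, online=False)
print(first[1], second[1], missing[1])
```

This is the simplest “cache first” strategy; an app that wants fresh articles or messages would additionally re-fetch in the background while the old data is already on screen.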

So that’s me typing in the URL of my website on the phone of a refugee, but ideally and for augmented reality, I hope that QR code recognition will be a built-in feature for phone cameras (it’s not difficult, there are many good libraries to integrate it) and not a separate app most people don’t install, because then I would just scan the QR code from a sticker, plastic card or poster on the wall, an install notification/banner would pop up automatically, and everything would be working out of the box frictionless.

For augmented reality, I could imagine stickers in public places with a QR code on them that contains a unique ID, so by scanning it, interested pedestrians would be sent to whatever website or web app was set up for this location, or a pre-installed web app that uses geolocation (yes, that’s in the web stack!) would do the same thing if the current position is within a certain area. A specific application, a particular offer, could be behind it, or a common infrastructure/protocol/service which could provide generic interfaces for ID-/location-based requests, and content/application providers could register for an ID/location, so the user would get several options for the place. Please note that this infrastructure should be white-label, self-hostable, freely/libre licensed software, and a way to tie different systems together if the user wishes. There could be a central, neutral registry/catalogue for users to search for offers and then import/configure a filter, so only the options of a certain kind would pop up, or clients and servers could do some kind of “content” negotiation, so the client would tell the server what data is requested, and local web apps would make sense out of it. The typical scenario in mind would be text/audio/video city guides, maybe in different languages, maybe from different perspectives (same place, but one perspective is history, another is statistical information, another is upcoming events and special shop offers), so a lot of content creators could “register”/attach their content to the location, and the user might pick according to personal preference or established brand, ideally drawing from freely/libre licensed content/software libraries like Wikipedia. As the data glasses unfortunately were discontinued, that can be done with mobile devices as cheap, poor man’s AR, and I don’t see why this should unnecessarily be made more complex with 3D projections/overlays where it doesn’t need to be.
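
A rough sketch of the location lookup, again in Python and under a few assumptions: the registry is just a list of dicts, every offer is registered with a center coordinate and a radius in meters, and the distance is computed with the haversine formula on a spherical Earth. The field names, the coordinates and the function `offers_near` are invented for illustration; a real registry would of course sit behind a self-hosted service.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def offers_near(position, registry, kinds=None):
    """Titles of all offers whose area contains the position.

    kinds is the user's filter (for example {"history"}); None means
    that every kind of offer is wanted.
    """
    lat, lon = position
    hits = []
    for offer in registry:
        if kinds is not None and offer["kind"] not in kinds:
            continue
        if distance_m(lat, lon, offer["lat"], offer["lon"]) <= offer["radius_m"]:
            hits.append(offer["title"])
    return hits


# Hypothetical registrations by different content providers.
registry = [
    {"title": "History audio guide", "kind": "history",
     "lat": 48.7758, "lon": 9.1829, "radius_m": 150},
    {"title": "Shop offers", "kind": "offers",
     "lat": 48.7758, "lon": 9.1829, "radius_m": 150},
    {"title": "Far away guide", "kind": "history",
     "lat": 49.7758, "lon": 9.1829, "radius_m": 150},
]

here = (48.7760, 9.1830)
print(offers_near(here, registry))
print(offers_near(here, registry, kinds={"history"}))
```

The same lookup works with an ID from a QR code instead of a position; the filter by kind is the part that a “content” negotiation between client and server would replace.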

And let’s never forget that all this also works on the local computer where the current position doesn’t need to come from the GPS receiver, but could come from a map just as well. So I’m very interested in building the public, open, libre-licensed infrastructure for it, as well as a QR code sticker PDF generator or printing service that would automatically coordinate locations + IDs with the database. The advantages are that there’s no need for the Google Play Store any more, where a developer has to pay just to have an account there, and a single company would control the entire software distribution except for sideloading.

There’s one more thing: with a File API, it would be possible to make web apps act like native Desktop applications and read/write files from/to the local storage, which is crucial to make the web stack an application programming stack for the desktop computer. The same code could either operate on online resources or local resources or both intermixed, and the “setup” could either be a zip file download with the XHTML+JS code in it or simply browsing a URL. Ideally, the latter would bring up an installation request based on the manifest file for permanently storing the web app outside of the browser cache. If all this were standardized and rolled out in browsers, we would arrive at an incredible new software world of online/offline, desktop/mobile and physical/virtual interoperability and convergence. Unfortunately, the W3C betrayed the public again (just as they did by abandoning the semantic web, making HTML5 a mess for everything that’s not a full-blown browser engine, including DRM in HTML5, etc. in favor of a few big browser vendors) and discontinued the API. As a result, it’s harder to run a web app locally beyond the web storage and interact with other local native Desktop applications without dependence on a web server (I have workarounds in node.js and Java, but both require explicit installation and startup). I don’t see why local file access shouldn’t be anticipated, because if there are browsers implementing such a capability to go more into the PWA direction, there had better be a standard out there on how to do it instead of each of them coming up with their own, incompatible, non-standardized way of doing it.

Video Platform Crisis

Five years ago, I experimented with screen capturing under OpenSUSE with a Java Applet recording software that randomly crashed in this setup but still left me with what was recorded up to that point, so I uploaded those resulting videos to YouTube. Back then, each video was required to have no more than 15 minutes playtime and there wasn’t a way to request the unlocking of longer or even “unlimited” playtime. YouTube decided in an opaque fashion who was eligible for uploading longer videos, so people came up with wild theories about what could be required. I wasn’t affected much as most of my videos were shorter (partly due to the crash behavior described above) and after some time this limitation miraculously was removed from my channel. Later, YouTube dropped that policy completely.

Next, the first major YouTube design update, at least as far as I remember, “happened”. It removed the ability to customize the channel’s background in order to make the site more mobile friendly and only left the header banner to fiddle with. People were pretty upset as there was quite some effort put into those backgrounds, and from today’s perspective, it’s difficult to comprehend why a responsive design should not be able to support some kind of dynamic background. The old layout allowed channel owners to convey context, charm and personality, but users eventually came to terms with the new, cold look. My channel wasn’t affected much as I didn’t use a custom channel background, but in principle I didn’t like the downgrade as a user of the platform.

Then there was a long period of mediocrity. Good and innovative features were added like the online video editor, livestreaming and the option to submit and therefore work together on closed captions/translations (although channel owners are allowed to disable it and it’s not enabled by default), as well as features which are a required minimum like the removal of the character limit on comments or links in comments. Other good features were removed for bogus reasons, like video answers and the chronological and tree-hierarchical display of comments, while other features are still missing like the ability to search for comments, export comments or create ad-hoc playlists. The forced Google+ integration as a vain attempt to imitate Facebook was annoying, but also easy to ignore.

In 2015, video platform disaster struck. A big YouTuber complained about all the negativity in his comment section as people were spamming it with insults etc., so Google took action against potential haters. The “solution” was to automatically flag suspicious comments and not display them to any other visitor except the potential spammer himself while logged in, so he wouldn’t notice that he is writing his comments into the void and would presumably continue such activity. After some time with nobody seeing his messages and therefore not responding, he eventually would get demotivated and stop. The problem with this approach is that the algorithm never worked. My comments were repeatedly “ghost banned” as I usually write long, thoughtful commentary, adjusting the text in short succession in order to correct typographic or spelling errors or to enhance it a little more, or because I added a link after the initial save. If I invest valuable lifetime to state my opinion and cannot be sure that my texts actually get published, there’s no reason to continue doing so, especially as there’s the risk of never getting replies from people I would be interested in talking to. For a publishing platform, such conduct, tricking people into believing that their contributions matter, is a no-go and not acceptable. This is why I abandoned YouTube and have only (ab)used it for SEO since then.

So Vimeo became the new home for my videos. It’s much cleaner and had two nice features I liked very much: they allowed videos to be marked as being licensed under CC BY-SA (instead of only CC BY as on YouTube), even though this information is well hidden from the viewer. The other feature is that they provided a download button (although channel owners can disable it). Being a software developer, I don’t believe marketing claims like streaming a video being different from downloading it. Technically it’s the exact same process, and while a stream replay might be cancelled earlier and therefore may consume less bandwidth, a full download allows offline replay and could save much more bandwidth after all. Just think about all the music videos that get streamed over and over again for no other reason than people not having a local copy of them for an offline playlist. For me, the download button is relevant because I want my freely/libre licensed videos to be shared everywhere and archived by multiple parties. The computer has to retrieve the data anyway in order to play it, so it doesn’t matter if this happens via an explicit full download or automatically in the background by the browser, internally saving the received data to disk. Vimeo recently decided to remove this convenience feature for all viewers if the channel owner doesn’t pay a subscription fee. As I regard a download as a view, Vimeo as a publishing platform downgraded its publishing, and that’s another unacceptable no-go.

Furthermore, I guess that because I complained about the downgrade to Vimeo support, they looked into my channel and suspected it to be a commercial one. Technically I registered a company at the beginning of the year and yes, there was one video linking to a shop seemingly advertising a product, but the shop is not an active one and I never sold a single item there or anywhere else. While I’m not necessarily a non-profit, it’s an attempt to build a social enterprise in its early, non-operational stages. I’m fine with paying for the hosting, but I expect the price to be tied to the actual service and not to the arbitrary removal of software features, especially if extortionary practices are used against me and my viewers. Additionally, Vimeo took my original video source files hostage and deleted all of them without providing a way to object to their conclusion. They deleted my entire account including all comments and follows, a big-time fail in publishing. That’s why I abandoned Vimeo.

I briefly tried Twitch, but uploaded videos don’t get a lot of attention there as the main focus of the site remains on live-streaming. Joining made sense because I’m planning to stream more, but then I discovered that they run ads for the German military (Bundeswehr) before videos and streams, something I’m totally opposed to as a Christian believer and conscientious objector by both conviction and approval. This is especially the case after I developed anabaptist views. I don’t mind if Twitch promotes violence and destruction, which is none of my business and therefore easy to refute, but I never want to contribute to the endorsement of the military or their recruiting, so it’s basically the YouTube adpocalypse the other way around.

After that, I moved to Vidme. I liked that they attempted to educate and familiarize viewers with the concept of paying for the production if there’s a demand, but I wondered if Vidme would be able to pull it off. The site had a “verification” process with the goal to determine if an uploader is a “content creator”, which is strange because they demanded rather arbitrary metrics: one needed 50 subscribers but was hardly able to attract any as there were limits on what could be uploaded if one wasn’t “verified” as a creator yet. In my opinion, they put these requirements into place to restrict the amount and size of uploads to YouTubers with a huge audience who switched to Vidme in response to the adpocalypse and to deter uploaders with a small audience and a huge content collection. The latter only cost money for hosting, the former are supposed to earn the site operator some income. I already suspected that such a policy was a pretty serious business need for them, unless their endgame was to be bought up by you know whom. Now, I have videos that are small in size as they’re just screencasts and yet have more than 30 minutes playtime, and without the status of being verified, I was prevented from uploading those, while the very existence of those videos proves that I am actually a content creator. I applied for “verification” but they declined the request, so obviously my presence was not appreciated on their site, and I set all my videos to private. Soon thereafter, they announced that they’re shutting down.

A few notes on other, minor video platforms: there’s Dailymotion, but they don’t have users on the site and lack search engine visibility. My impression is that they’re not in the game of building communities around user-generated content but around traditional TV productions (remember Clipfish, Sevenload, MyVideo?). Also, the translation of their site into German needs some work. Discoverability is poor as results are polluted by spam video posts advertising PDF downloads and nobody flags them as abuse, as they’re still lacking the option to do so. Then there are their hidden requirements, which aren’t stated on the site: a maximum of 2 GB per video, a maximum of 60 minutes per video, 96 videos per day across all your accounts and a maximum of 2 hours total video duration per day across all your accounts. There’s Veoh, but their website is terrible and they’re still relying on Adobe Flash instead of HTML5. There’s Myspace accepting video uploads, but they’re not positioned and perceived as a video site. There are new efforts like, SPKOUT and D.Tube that try to promote decentralized peer-to-peer hosting. I’m unable to register on D.Tube because I don’t have a mobile phone, so I can’t receive their text message that’s supposed to prevent spam bots from creating an account. SetStorm (see my English or German introduction video for the site) is worth a look, but not very popular. At the moment, I upload most of my videos to SetStorm manually as it doesn’t provide an API for automatic upload yet. Unfortunately, developing the Twitch Video Uploader (a fairly technical component automating the uploads via their v5 API) was a waste of time, but still, I could imagine extending the effort into a fully-fledged video management suite, supporting video upload not only to SetStorm, but many other places as well.

Which leads to new conceptual thinking about online video hosting. Learning from my earlier mistakes and experiences, I don’t see why the actual video files need to be tied to a specific player and surrounding website any more, leading to demands and dependence on whoever is operating the service. In a better world, the video data could be stored anywhere, be it on your own rented webspace, a gratis/paid file hosting provider like RapidShare, Dropbox or Megaupload, peer-to-peer/blockchain hosting, traditional video websites or on storage offers by institutions or groups curating certain types of materials. The software to retrieve, play, extend and accompany the video stream would be little more than an overlay, pulling the video data from many different sources of which some might even be external ones and not under the control of the website operator. Such a solution could feel like what’s relatively common in the enterprise world where companies deliver product presentations, training courses and corporate communication via SaaS partners or self-hosting. A standardized, open protocol could report new publications to all kinds of independent directories, search engines and data projects (or have them pulling the data after registering the instance automatically or manually), some offering general and good discoverability, others serving niches and special purposes, all of them highly customizable as you could start your own on top of common white-label video hosting infrastructure. You could imagine it as the MediaWiki (without stupid Wikitext as unparsable data model) or WordPress software packages embedding video and optionally making it the center of an article or blog post, but in an even more interoperable way and with more software applications on top of them.
The goal would be to integrate video hosting and playback into modern hypertext for everyone as promoted by the Heralds of Resource Sharing, Douglas Engelbart (from 37:20), Ted Nelson and David Gelernter. With the advent of information theory, XML, the semantic web and ReST, it’s about time to improve and liberate publishing in networked environments.

Update: Sebastian Brosch works on PicVid, which focuses on picture hosting at the moment, but considers video hosting as a potential second target. One downside is that the project is licensed under the GPLv3 where it should be AGPLv3 + any later. Online software doesn’t lead to the distribution of the software, only to the transmission of the generated output, so the user wouldn’t be in control of his own computing. The AGPL, in contrast to the GPL, requires software that’s accessible over a network to provide a way to obtain the source code. Now the user gets the option to either use the online service as provided by the trusted or untrusted operator, or to set up the system on his own computer and use it there, or to task a trusted entity with the execution.

Update: Jeff Becker works on sometimes. Seems to be related to

Update: I just found a community of video creators who produce short clips that speak a language of their own. Their uploads are way off-mainstream, often silly, highly incestuous (them talking about themselves, “drama”), sometimes trolling, meme-related, trashy, usually quite creative and always deliberately amateurish, which in part contributes to their charm. There are plenty of sites that cater to this audience, most notably VidLii. VidLii requires videos to be not longer than 15 minutes and not larger than 1 Gigabyte in size, and it won’t take more than 8 uploads per day. With those limitations, VidLii hardly can be considered an alternative to “everything goes” video hosters, but as far as I’m concerned, I would just develop an automated uploader that splits the source material into parts of 15 minutes playtime, appends the current part number to the title and doesn’t send more than 8 parts in one go, especially as the individual pieces could be connected together in a playlist. It’s no coincidence that sites like VidLii recreate the early YouTube layout from around 2008, not only for nostalgia reasons, but also because in the absence of movie studio productions and advertising money, community engagement was more important than the passive consumption by the masses. The old YouTube felt way more personal, at least in retrospect. CraftingLord21 reports on this community and the piece about Doppler gives some more insights.
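
The planning part of such an uploader is simple enough to sketch here. The following Python snippet only computes which parts would go out on which day; the actual cutting (for instance with a tool like ffmpeg) and the upload itself are left out, as I don’t know of an upload API for VidLii. The function name `plan_uploads`, the title format and the defaults are my assumptions, derived from the limits mentioned above.

```python
import math


def plan_uploads(title, duration_s, part_s=15 * 60, per_day=8):
    """Split a video of duration_s seconds into parts of at most part_s
    seconds and group the parts into daily batches of at most per_day.

    Returns a list of days; each day is a list of
    (part title, start second, end second) tuples.
    """
    parts = math.ceil(duration_s / part_s)
    days = []
    for i in range(parts):
        start = i * part_s
        end = min(start + part_s, duration_s)
        entry = (f"{title} (part {i + 1}/{parts})", start, end)
        if i % per_day == 0:
            days.append([])  # daily limit reached, continue tomorrow
        days[-1].append(entry)
    return days


# A screencast of 2.5 hours ends up as 10 parts spread over two days.
for day_number, batch in enumerate(plan_uploads("Screencast", 150 * 60), 1):
    for part_title, start, end in batch:
        print(day_number, part_title, start, end)
```

The part titles double as the natural order for a playlist that connects the pieces again.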

Here’s a list of similar sites: VidBitFuture, MetaJolt, Upload Stars, ClipMoon, loltube and Ittr23 Videos. But be warned: not all of them managed to avoid spam bots, while on the other hand they’re at least not filled with questionable content as for example SPKOUT and PewTube appear to be. Furthermore, keep in mind that those sites might go down at any time or were never serious to begin with (fake sites that make fun of the other ones), so a time investment of manually uploading or commenting there might just as well go down the drain as it might on the big hosting providers. And make sure that you use a new e-mail account and a randomly generated password if you should ever consider to try them out.

Update: It’s no surprise that people aren’t satisfied with general-purpose functionality around video hosting, so they come up with sites specialized for particular niche audiences. One of the many opportunities is education where you find TeacherTube (behind an adblocker-blocker) or Viduals (language switcher in the upper right corner). With the latter, users can crowdfund the production of a video that answers a certain question for them. Video creators could see the 80% of the pledged money they’ll receive as an incentive to put their effort into one of the requested projects for which there is actually proven demand instead of randomly guessing and competing with clickbait SEO. Eventually, experts in their field could make way more specific content than the broad presentations that try to address almost everybody. Think of StackOverflow plus video plus income as incentive to do it, as time is always very limited, but money indeed can buy some quality time. What about other niches? I already mentioned the “MediaWiki” around video, but what about journalism/debate, product manuals, mapping, to just list some of the most obvious?

Update: The number of non-western YouTube alternatives is increasing too, with some of them translating their interfaces into English. looks pretty decent, even if one cannot upload more than 15 videos in total, none longer than 60 minutes playtime, none larger than 1 Gigabyte in size.

Update: Similar to Vimeo and Vidme, tries to establish a different business model for online video, a different one than advertising. Sure, in the digital age, the attempt to maintain artificial scarcity with exclusive licensing in order to imitate print-era business models that don’t work any more doesn’t make a lot of sense, but as rumble accepts non-exclusive uploads, it’s at least gratis video hosting. Unfortunately I was unable to open the info box on the upload page in order to read the details of the non-exclusive agreement, so please make sure that you check it if you should consider uploading.

Project Ideas

Here are a few project ideas I would like to work on if I only had the time. As long as I’m not able to start them, you could interpret those suggestions as lazyweb requests, so you should feel invited to execute them, but please make sure that your results are freely/libre licensed, otherwise your work would be of no help.

  • A reminder web app with push notifications that works locally on the phone and only occasionally imports new events/appointments when an Internet connection is established. Many refugees fail to attend their appointments as they have difficulty keeping to a weekly schedule. The web part needs to be self-hostable.
  • A self-hostable messenger implemented as web app. It could also interface with e-mail and display it as if mails were normal messenger messages. Many refugees have smartphones, but without relying on telephony, Facebook or WhatsApp, it can be difficult to get in touch with them. Often they’re not used to e-mail, which is more common among German people, companies and government offices. The web app would also work on open WLAN.
  • A new TODO/task manager: when a new TODO/task is added, it appears at the very top of the list, so the user can “vote” its priority up or down in order to make entries more or less important. If the user finds time to work on one TODO/task, he simply needs to start with the topmost one and stick to its completion. The tool should support multiple TODO/task lists, so entries can be organized in different topics/subjects.
  • A “progressive web app” app store that signs/verifies the submissions, catalogues and distributes them via “add to homescreen”, URL bookmarking, full download or other techniques. The idea is to create something similar to the distro repositories known in the free/libre software world and f-droid, demanding that those apps come with their code to avoid the SaaSS loophole. Features like user reviews or obligatory/voluntary payment to the developers could be considered, as well as crawling the web to discover existing web apps.
  • E-Mail to EPUB. E-Mail is .eml files in IMF format. I already have a few components and am currently trying to write a parser for IMF or find a no-/low-dependency existing one. In the future, that project could evolve into a semantic e-mail client (better than putting broken HTML into the message body) that retrieves and sends data to the server.
  • In Germany, the “Störerhaftung” finally got abandoned, so Augmented Reality and the Internet of Things get a late chance here too. Before, if I shared my flatrate-paid Internet access with neighbors and passersby, the law would have considered me a contributor/facilitator if a user committed an illegal action over my wire (basically, as investigators can never find out who’s at the end of a line, they cheaply just wanted to punish whoever happened to rent/own the line at that time, or demand that users of the network be identified). In cities, there are a lot of private WLAN access points connected to the Internet, and almost all of them are locked down because of the liability, leaving Germany with pretty bad connectivity. With such legal risks removed, I can assume more and more gratis Internet access outside in public, so I would like to develop an app where people can join a group and, via GPS, the app would display in which direction and at what distance the members of the group are located. That way, if I’m in an unfamiliar city together with friends, family or colleagues, it’s easier not to lose each other without the need for phone calls or looking at an online/offline map. Such a solution needs to be self-hostable to mitigate privacy concerns, and enabling geolocation always needs to be optional. Deniability could come from lack of connectivity or the claim of staying inside a building, or the general decision of not using the app, joining groups but not enabling geolocation, or faking the location via a modified client. The app is not trying to solve a social problem; it’s supposed to provide the capability if all members of the group agree to use it. The “app” is intended for AR glasses, alternatively the smartphone/tablet, alternatively a notebook.
  • Develop a library, self-hosted server, generic client and a directory of services for Augmented Reality that allows content to be delivered depending on the location of the user. That could be text (special offers by department stores or an event schedule), audio (listening to a history guide while walking through the city, maybe even offered in many languages) or just leaving notes on places for yourself or members of the family or group (which groceries to buy where, where the door key is hidden, long lists of inventory locations).
  • In my browser, I have lots and lots of unstructured bookmarks. If I switch to one of my other machines, I can’t access them. It would be nice to have a synchronization feature which could be extended to check if the links are still live or dead. In case of the latter, the tool could attempt to change the bookmark to the historical version of the site as archived by the Wayback Machine of the Internet Archive, referencing the most recent snapshot relative to the time of the bookmarking, hopefully avoiding the pro-active archival of what’s being bookmarked. On the other hand, a bookmark could also be an (optional?) straight save-to-disk, while the URL + other metadata (timestamp) is maintained for future reference. The tool could do additional things like providing an auto-complete tagging mechanism, a simple comment function for describing why this bookmark was set or a “private/personal” category/flag for links which shouldn’t be exported to a public link collection.
  • An XML interface for PoDoFo. PoDoFo is a new library to create PDF files with C++. With an XML interface, PoDoFo would become accessible for other programming languages and would avoid the need to hardcode PoDoFo document creation in C++, so PoDoFo could be used as another PDF generator backend in automated workflows. The XML reading could be provided by the dependency-less StAX library in C++ that was developed for this purpose.
  • Subscribe to video playlists. Many video sites have the concept of a “channel”, and as they fail to provide their users a way to manage several channels at once (combining messages and analytics from different channels into a single interface), uploaders tend to organize their videos in playlists, usually by topic. As humans are pretty diverse creatures and do many different things at the same time, I might not be interested in everything a channel/uploader is promoting. I would rather subscribe to a particular playlist/topic than to the channel as a whole. This notion could of course be generalized: why not subscribe to certain dates, locations, languages, categories, tags, whatever?
  • Ad-hoc video playlists. Video sites allow their users to create playlists, but only if they have an account and are logged in. Sometimes you just want to create a playlist of videos that are not necessarily your own without being logged in, for instance to play some of your favorite music in random order. If such external playlists should not be public, access could be made more difficult by requiring a password first or by only being available under a non-guessable URL. Embedding the videos shouldn’t be too hard, but continuing with the next probably requires some JavaScript fiddling. Should support many video sites as sources via embedding.
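
As a rough illustration of the group-location idea above (displaying in which direction and at what distance another group member is), here is a minimal sketch in Python. It uses the well-known haversine formula and the initial great-circle bearing on a spherical Earth model. The function name `distance_and_bearing` and the constant `EARTH_RADIUS_M` are made up for this example, and a real app would obtain the coordinates from the device’s GPS rather than from hard-coded values.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Return (distance in metres, initial bearing in degrees from north)
    from point 1 to point 2, both given in decimal degrees.

    Hypothetical helper for illustration only.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)

    # Haversine formula for the great-circle distance.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

    # Initial bearing, normalized to the range [0, 360).
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

    return distance, bearing


if __name__ == "__main__":
    # Example coordinates are arbitrary: one degree of longitude along the equator.
    d, b = distance_and_bearing(0.0, 0.0, 0.0, 1.0)
    print(f"{d:.0f} m at {b:.0f} degrees")
```

The client would only need the other members’ last reported coordinates to compute this locally, which fits the requirement that the server part stays simple and self-hostable.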

If you’re not going to implement those ideas yourself, please at least let me know what you think of them or if you have suggestions or know software that already exists. If you want or need such tools too, you could try to motivate me to do a particular project first. Maybe we could even work together in one way or another!

Attack on the Privacy of Refugees and All Their Contacts

Well, our squeaky-clean and at the same time sadly dirt-poor federal government unfortunately, unfortunately has no money whatsoever to spare for refugees, which will have to be written about in more detail on another occasion. But when it comes to sticking it to the refugees, funds are suddenly available in abundance: 3.2 million euros + 300,000 euros annually + training costs, all for card readers for the purpose of violating the privacy of refugees and of everyone who is or has been in contact with them.

The reasoning behind it is of course that, for refugees who cannot present original documents or deliberately do not want to present them, the authorities want to establish the person’s identity by reading out their mobile phones and smartphones. But what is the office going to do if the phone with all the data on it was recently lost and the new one does not (yet?) contain any such information, or if someone simply has no phone? Besides the impertinence of demanding that people strip digitally naked, the procedure is also methodological nonsense, since it is by no means guaranteed that the owners of the devices entered their real name when creating their accounts or that it can be reliably determined from the data. Moreover, the measure violates not only the privacy of the refugees, who must have an inalienable natural right that the state may not rummage through their political views, religious convictions, family and business relationships etc., but also that of everybody who has been in electronic contact with the refugee. Besides relatives, friends and colleagues from the country of origin, the privacy of German citizens will inevitably be trampled on by this measure as well.

One may also ask what must have gotten into Thomas de Maizière that he even considers this nonsense: it is the insinuation that many refugees deliberately do not hand over the identity documents of their country of origin because they would otherwise fear a negative decision on their asylum application and being forced to leave. That Thomas still hasn’t understood that a country from which the Foreign Office would have to organize a German citizen’s return trip from a holiday gives good reason to fear deportation to it, let that pass for now, but there are in fact also people whose documents got lost in the course of their flight or were taken away, stolen or destroyed. Then there are the classic stateless persons, for whom no country feels responsible. That these people are placed under the same general suspicion as others who deliberately withhold their documents does not exactly testify to the rule of law. The suggestion then is that those affected should simply go to their embassy to have an identification document issued there, but if a country doesn’t even know that the person is no longer within its own borders, that is of course not a particularly clever idea for refugee journalists/activists/deserters. If the countries in question do not issue a document because they don’t have the person registered in their computers, are for their part just as unable to establish the identity beyond doubt, or simply refuse because they don’t want the asylum seeker back, then the data read from the phone remain without any evidential value too, because they are not confirmed by an official body and can be just as right or wrong as the information the person provided of their own accord.

So we can either get rid of these readers again right away and save a lot of money, or else see Thomas and his un-Christian clique [removed from their offices].