Web Apps revisited (PWA) + Geolocation for Augmented Reality + local File-I/O for Web Stack on Desktop

As I’m very interested in developing augmented reality applications, I looked again at Android app development. Some time ago, I was able to build an APK with the Android SDK + Eclipse and install it on a tablet, but after the switch to IntelliJ-based Android Studio as development environment, it appears to be very hard, if not impossible for me to even experiment with this technology. It’s also a highly proprietary ecosystem and therefore evil, don’t let yourself get fooled by some mention of GNU/Linux and “Open Source”. Therefore I looked again at web apps, and the approach changed quite a bit recently. Driving force is the realization by Google that people don’t install new apps any more while such apps are pretty expensive to build and have a bad conversion rate. It turns out that users spend most of their time in just a few of their most favorite apps. As an app developer usually wants to provide the same functionality on the web, he needs to work with at least two different technology stacks and make them look similar to the user. So why not build the application in web technology and have the browser interfacing with the underlying operating system and its components? There are new mechanisms that help with just that.

One mechanism of the “progressive web app” (“PWA” for short) bundle is called “app to home screen” (“A2HS” for short). Google Chrome already has an entry in the menu for it, which will add a “shortcut” icon onto the homescreen for the URL currently viewed. Now, developers get better control over it as there’s a technical recommendation by the W3C (description in the MDN). You just have to link a small manifest JSON file in your XHTML header that contains a few hints about how a browser might add the web app to “installed software” on mobile devices or desktop PCs. Most important is a proper short name, the icon in several sizes, the display mode (the browser might hide all its controls and show the web app full-screen, so it looks like a native app) and the background color for the app window area during load time. The manifest file gives browser implementers the chance to recognize the website as mobile app, and depending on metrics set by the user, there could be a notification that the current website can install itself as an app.

Even with Android being proprietary, it’s probably the phone/tablet system to target for as far as the free/libre software world is concerned. They have a description about what they care about in the manifest as well as a validator and a option to analyze the procedure in Google Chrome. If Chrome should detect a manifest on a web page, it might bring up a banner after some time and ask the user if he wants to add the app to the home screen. Unlike decent browsers, evil Chrome decides for the user if/when to bring up the banner. I’m not aware if the user is able to set policy regarding its appearance. In my opinion, the browser client should be the agent of the user, not for an online service provider.

Also, Google requires a service worker JS file for the banner to show up. The service worker is very important for many web apps: native apps are permanently installed on the device, so code and assets are there locally and no download needs to occur for using the app, no connectivity required necessarily. With web apps, that can be different. True, they can rely on browser caching, but as it seems as permanent local installation of the XHTML, CSS, JavaScript, images and other data isn’t a thing yet, in places without or bad connectivity, there shouldn’t be the need to retrieve resources from the net in case the app logic could also work perfectly fine locally if only the data were already present. Even if some new data is supposed to be retrieved every time the web app is opened again, old data (articles, messages, contacts) can already be presented until the new data arrives. The service worker decides which requests for old data can be statisfied from local storage and redirect to there, and which requests need to go over the wire. But there can be very legitimate cases where a service worker makes absolutely no sense. If the app does nothing else than submitting data to an online backend, well, the XHTML form can be stored locally, but that’s basically it. Connectivity is required, otherwise there wouldn’t be a way to submit anything, so there’s no need for a service worker. It’s still possible to add the web app to the home screen manually via the menu, and that will make use of the settings provided in the manifest, so that’s good enough for me. I work with refugees and want to establish communication with them. Usually they don’t have computers or notebooks, but phones of course, and as I don’t have a phone and refuse to use proprietary messenger apps, I now can link them up easily to my websites, so they can request help conveniently out of something that looks and feels like any other of their apps.

So that’s me typing in the URL of my website on the phone of a refugee, but ideally and for augmented reality, I hope that QR code recognition will be a built-in feature for phone cameras (it’s not difficult, there are many good libraries to integrate it) and not a separate app most people don’t install, because then I would just scan the QR code from a sticker, plastic card or poster on the wall, an install notification/banner would pop up automatically, and everything would be working out of the box frictionless.

For augmented reality, I could imagine stickers on public places with a QR code on them that contain a unique ID, so by scanning it, interested pedestrians would be sent to whatever website or web app was set up for this location, or a pre-installed web app that uses geolocation (yes, that’s in the web stack!) would do the same thing if the current position is within a certain area. A specific application, a particular offer, could be behind it, or a common infrastructure/protocol/service which could provide generic interfaces for ID-/location-based requests, and content/application providers could register for an ID/location, so the user would get several options for the place. Please note that this infrastructure should be white-label, self-hostable, freely/libre licensed software, and a way to tie different systems together if the user wishes. There could be a central, neutral registry/catalogue for users to search for offers and then import/configure a filter, so only the options of a certain kind would pop up, or clients and servers could do some kind of “content” negotiation, so the client would tell the server what data is requested, and local web apps would make sense out of it. The typical scenario in mind would be text/audio/video city guides, maybe in different languages, maybe from different perspectives (same place, but one perspective is history, another is statistical information, another is upcoming events and special shop offers), so a lot of content creators could “register”/attach their content to the location, and the user might pick according to personal preference or established brand, ideally drawing from freely/libre licensed content/software libraries like the Wikipedia. As the data glasses unfortunately were discontinued, that can be done with mobile devices as cheap, poor-mans AR, and I don’t see why this should unnecessarily be made more complex with 3D projections/overlays where it doesn’t need to be.

And let’s never forget that all this also works on the local computer where the current position doesn’t need to come from the GPS receiver, but could come from a map just as well. So I’m very interested in building the public, open, libre-licensed infrastructure for it, as well as a QR code sticker PDF generator or printing service that would automatically coordinate locations + IDs with the database. The advantages are that there’s no need for the Google Play Store any more, where a developer has to pay just to have an account there, and a single company would control the entire software distribution except for sideloading.

There’s one more thing: with a File API, it would be possible to make web apps act like native Desktop applications and read/write files from/to the local storage, which is crucial to make the web stack an application programming stack for the desktop computer. The same code could either operate on online resources or local resources or both intermixed, and the “setup” could either be a zip file download with the XHTML+JS code in it or by simply browsing an URL. Ideally, the latter would bring up an installation request based on the manifest file for permanently storing the web app outside of the browser cache. If all this would be standardized and rolled out in browsers, we would arrive at an incredible new software world of online/offline, desktop/mobile and physical/virtual interoperability and convergence. Unfortunately, the W3C betrayed the public again (just as they did by abandoning the semantic web, making HTML5 a mess for everything that’s not a full-blown browser engine, including DRM in HTML5, etc. in favor of a few big browser vendors) and discontinued the API. It’s harder to run a web app locally beyond the web storage and interact with other local native Desktop applications without dependence on a web server (I have workarounds in node.js and Java, but both require explicit installation and startup). I don’t see why local file access shouldn’t be anticipated, because if there are browsers implementing such a capability to go more into the PWA direction, there should better be a standard out there on how to do it instead of each of them coming up with their own, incompatible, non-standardized way of doing it.