For many people, the words “Augmented Reality” (AR) are synonymous with games like Pokémon GO or Harry Potter: Wizards Unite. For developers, this extends to mobile toolkits and engines like Unity, Vuforia, ARCore, and ARKit. Even we, though, tend to overlook one of the simplest AR platforms: the browser.

In a recent project, I worked to create a browser-based AR Progressive Web App (PWA). As excited as I was to dive in, I had zero experience developing AR. It meant working from the ground up, which gave me an opportunity to learn a ton about browser-based AR tools. Unfortunately, documentation for some of these tools is woefully lacking, and tutorials with up-to-date and accurate information are also hard to come by. I spent a lot of time researching and experimenting to get the desired outcome. To (hopefully) spare you some time, I’ve included below what I learned along the way, beginning with a quick review of the available tools.

AR.JS

The most popular tool you’ll find in this regard is AR.js, a marker-based solution. Its claim-to-fame is that you can create an AR app in 10 lines of HTML and, for the most part, this is true. Here’s their basic example:

<!doctype HTML>
<html>
	<script src="https://aframe.io/releases/0.9.1/aframe.min.js"></script> 
	<script src="https://cdn.rawgit.com/jeromeetienne/AR.js/1.7.1/aframe/build/aframe-ar.js"></script>
	<body style='margin : 0px; overflow: hidden;'>
	<a-scene embedded arjs>
		<a-marker preset="hiro">
			<a-box position='0 0.5 0' material='color: yellow;'></a-box> 
		</a-marker> 
		<a-entity camera></a-entity> 
	</a-scene> 
	</body> 
</html>

Run that code, point a Hiro marker at the camera, and you’ll see a nice, happy yellow cube. The nice part about AR.js is that the authors have done all the heavy lifting for you. It relies on three popular libraries: THREE.js, A-Frame, and JSARToolKit5, a JavaScript version of the standard artoolkit used by many mobile devs. For all its merit, there are a few downsides to note:

  1. First, you’ll have to set up an SSL certificate for localhost if you don’t already have one. Most modern browsers require a secure (HTTPS) connection before they allow access to your device’s camera. This took me a good bit to get configured, but I found this tutorial helpful. The biggest trouble I had was ensuring the certificate worked with the Ionic app I was creating, so I ended up creating a separate server using Node.js and Express instead of relying on Ionic’s dev server. *Note: At the time of this writing, Ionic SSL is unreliable and, unfortunately, did not work for me. Once the Ionic team gets this functionality up and running, it will save you tons of setup time.

  2. Second, AR.js is quickly becoming outdated. Its owner has moved on to bigger and brighter prospects (which is awesome!), but that makes maintaining the AR.js repo difficult. This means you are likely to run into several issues with the codebase, like deprecated JavaScript syntax or outdated dependencies. I faced this problem several times. Since the owner seems to have packed up and left AR.js, these issues are largely going unresolved, so be prepared to spend a decent chunk of time on trial-and-error code writing.

There is, however, a bright side for the future of AR.js. A devoted group of users is currently working toward transferring ownership of the repo to a GitHub organization, enabling further contribution from active devs. If this effort succeeds, I expect AR.js will be the best AR resource for browser-based apps thanks to its simplicity. Until then, devs might find better solutions with the dependencies AR.js relies on.

JSARTOOLKIT5 & THREE.js

Turning to AR.js’s building blocks is what I chose to do. If you’re looking to get the job done without learning tons of new material, you’re probably better off sticking with AR.js and resolving the problems as they arise. Otherwise, I recommend building something from scratch as a way to not only learn more about how browser-based AR functions, but also discover more ways to customize the content.

I was able to get our app going with two of the three dependencies mentioned above: JSARToolKit5 and THREE.js. The former is a marker-based toolkit that makes the AR magic happen, and THREE.js is a 3D library that helps you set up the scene of objects and models. I used Ionic Framework (v4) coupled with Angular 7, so my work was in TypeScript. The below example will also be TypeScript, mostly because JSARToolKit5’s repo already has some great examples using plain old JavaScript.

Setup

The easiest way to get started is to use npm/yarn/your module manager of choice to install. For example:

npm i three
npm i jsartoolkit5

*Note: Make sure you include the number 5 at the end. There is an older, deprecated version of JSARToolKit as well.

Then, import these into your project like so:

import { WebGLRenderer, Color, Mesh, MeshNormalMaterial, BoxGeometry } from 'three';
import { ARController, ARThreeScene, artoolkit } from 'jsartoolkit5';
import { Platform } from '@ionic/angular';

The above is everything you need to create a simple scene with a Hiro marker and 3D box object. First, we’ll need to access the device’s camera and obtain the viewport parameters. In the component’s constructor:

window.URL = window.URL || (<any>window).webkitURL;
(<any>window).MediaDevices = (<any>window).MediaDevices || navigator.getUserMedia;
this.width = platform.width();
this.height = platform.height();

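Here, we’re making sure window.URL and window.MediaDevices are defined, falling back to the prefixed webkitURL and the legacy navigator.getUserMedia on older browsers. The camera itself isn’t requested yet; that happens once the view initializes. We’re also setting the component’s width and height properties to match those of the device.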
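As a rough sketch of this kind of capability check, the helper below reports which camera API, if any, a browser exposes. The names cameraSupport, NavigatorLike, and CameraSupport are hypothetical and not part of JSARToolKit5 or Ionic; the navigator object is passed in as a parameter so the function can also be exercised outside a browser.

```typescript
// Hypothetical helper, not part of JSARToolKit5 or Ionic.
// Minimal structural type covering only the fields we inspect.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown }; // modern, promise-based API
  getUserMedia?: unknown;                    // legacy, callback-based API
  webkitGetUserMedia?: unknown;              // legacy, prefixed API
}

type CameraSupport = 'modern' | 'legacy' | 'none';

function cameraSupport(nav: NavigatorLike): CameraSupport {
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    return 'modern';
  }
  if (
    typeof nav.getUserMedia === 'function' ||
    typeof nav.webkitGetUserMedia === 'function'
  ) {
    return 'legacy';
  }
  return 'none';
}
```

In a component, you might call cameraSupport(navigator) before starting the AR scene and show a fallback message when it returns 'none'.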

Initialize the Three Scene

Now comes the fun part! Once the view initializes, we’ll get the device’s parameters and set up the THREE Scene:

ngAfterViewInit() {
	let viewportWidth = this.width;
	let viewportHeight = this.height;

	if ('MediaDevices' in window) {
		// Request the camera and build a THREE scene bound to its video feed.
		ARController.getUserMediaThreeScene({
			maxARVideoSize: 640,
			cameraParam: 'camera_para.dat',
			onSuccess: (arScene: ARThreeScene, arController) => {
				// Detect pattern markers such as Hiro.
				arController.setPatternDetectionMode(artoolkit.AR_TEMPLATE_MATCHING_COLOR);

				let renderer = this.createWebGLRenderer(viewportWidth, viewportHeight);
				document.body.appendChild(renderer.domElement);

				let cube = this.createCube();
				this.trackMarker(arScene, arController, cube);

				// Detect markers and redraw on every animation frame.
				let tick = () => {
					arScene.process();
					arScene.renderOn(renderer);
					requestAnimationFrame(tick);
				};
				tick();
			}
		});
	}
}

A THREE Scene object includes several properties that define the structure and functions of the scene. The most important thing to note, though, is the ARController. This is a JSARToolKit5 object that contains most of the functions and properties we will use to build our scene, including some handy THREE helper functions. One of these, getUserMediaThreeScene(), is used above and takes the following parameters:

{
    onSuccess: function(ARThreeScene, ARController, ARCameraParam),
    onError: function(error),
    cameraParam: url,
    maxARVideoSize: number,
    width: number | {min: number, ideal: number, max: number},
    height: number | {min: number, ideal: number, max: number},
    facingMode: 'environment' | 'user' | 'left' | 'right' | {exact: 'environment' | ... }
}

These parameters are all optional apart from the three (maxARVideoSize, cameraParam, and onSuccess) used above. maxARVideoSize does precisely what it sounds like, setting the maximum video size of the AR scene in pixels. cameraParam is a URL to a .dat file that contains the information needed to set up the device camera parameters. You can create a custom one, or use the .dat file that ships with AR.js, which is an easy setup for most cameras.
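To make the option list easier to work with in TypeScript, here is a minimal sketch of typings and a small config builder. The names SizeConstraint, FacingMode, ThreeSceneConfig, and rearCameraConfig are hypothetical and are not shipped by jsartoolkit5; the callback arguments are deliberately left untyped, the default of 640 simply mirrors the example above, and facingMode: 'environment' is an assumption that you want the rear-facing camera.

```typescript
// Hypothetical typings that mirror the option list above.
type SizeConstraint = number | { min?: number; ideal?: number; max?: number };
type FacingMode = 'environment' | 'user' | 'left' | 'right';

interface ThreeSceneConfig {
  onSuccess: (...args: any[]) => void; // arguments left untyped in this sketch
  onError?: (error: unknown) => void;
  cameraParam: string;                 // URL of the camera .dat file
  maxARVideoSize: number;
  width?: SizeConstraint;
  height?: SizeConstraint;
  facingMode?: FacingMode | { exact: FacingMode };
}

// Hypothetical helper: fills in defaults, then applies any overrides.
// cameraParam and onSuccess are always taken from the explicit arguments.
function rearCameraConfig(
  cameraParam: string,
  onSuccess: ThreeSceneConfig['onSuccess'],
  overrides: Partial<ThreeSceneConfig> = {}
): ThreeSceneConfig {
  return {
    maxARVideoSize: 640,
    facingMode: 'environment',
    onError: (error) => console.error('Camera setup failed:', error),
    ...overrides,
    cameraParam,
    onSuccess,
  };
}
```

The result could then be passed along as ARController.getUserMediaThreeScene(rearCameraConfig('camera_para.dat', onSuccess)). Supplying an onError callback is worth the extra line, since a denied camera permission otherwise fails silently.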

Once our device camera is fully set up, the onSuccess() function lets us set the type of pattern detection on our arController, create our renderer based on the viewport width and height, and create a simple cube. The pattern detection mode used here is for pattern markers (like the Hiro marker), but you can also use barcode markers. Check out the artoolkit docs for more info on marker specifications. Finally, we define tick() to process and render our Scene on each animation frame, so the feed always reflects which markers are currently present.
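The tick() loop above runs for as long as the page is open. If you want to be able to stop it, for example when the component is destroyed, one option is a small wrapper like the sketch below. startLoop and Scheduler are hypothetical names, not part of JSARToolKit5 or THREE.js; the scheduler is passed in so that requestAnimationFrame can be swapped out where it is not available.

```typescript
// Hypothetical helper: runs `step` once per scheduled frame until the
// returned stop function is called.
type Scheduler = (callback: () => void) => void;

function startLoop(step: () => void, schedule: Scheduler): () => void {
  let running = true;
  const tick = () => {
    if (!running) {
      return; // stop() was called; do not process or reschedule
    }
    step();
    schedule(tick);
  };
  tick();
  return () => {
    running = false;
  };
}

// Possible usage inside onSuccess (browser only):
//   const stop = startLoop(
//     () => { arScene.process(); arScene.renderOn(renderer); },
//     (cb) => requestAnimationFrame(cb)
//   );
//   // later, e.g. in ngOnDestroy(): stop();
```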

Define Your Marker

Part of the detection and rendering process requires tracking which markers are associated with which 3D objects. We will do that with a method called trackMarker() introduced in the above code:

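private trackMarker(arScene: ARThreeScene, arController, object: Mesh) {
	// Load the pattern file; the callback receives the marker's unique ID.
	arController.loadMarker('hiro.patt', (markerUID) => {
		let marker = arController.createThreeMarker(markerUID);
		// Attach the object to the marker so it follows the marker's pose.
		marker.add(object);
		arScene.scene.add(marker);
	});
}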

This method takes three parameters: our Scene, our controller, and the object that will be displayed on top of the marker. We load the marker onto our controller using the .patt file for the marker and provide a callback that receives a unique marker ID. In this callback, we create the marker from that ID, add our 3D object to it, then add the marker to our Scene.

*Note: The .patt file for the hiro marker was created using this tool developed by the AR.js authors. Simply upload the image, and the generator does the rest.
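If you need more than one marker, the same two calls can be repeated per pattern file. The sketch below assumes only the loadMarker() and createThreeMarker() calls shown in trackMarker(); the interfaces MarkerRoot, MarkerController, and SceneLike and the helper trackMarkers are hypothetical stand-ins that keep the example self-contained. In the app you would pass the real arController, arScene.scene, and THREE objects instead.

```typescript
// Structural stand-ins covering only the calls used in trackMarker().
interface MarkerRoot<T> {
  add(object: T): void;
}
interface MarkerController<T> {
  loadMarker(url: string, onLoaded: (markerUID: number) => void): void;
  createThreeMarker(markerUID: number): MarkerRoot<T>;
}
interface SceneLike<T> {
  add(marker: MarkerRoot<T>): void;
}

// Hypothetical helper: registers one object per pattern file.
function trackMarkers<T>(
  scene: SceneLike<T>,
  controller: MarkerController<T>,
  objectsByPattern: Record<string, T>
): void {
  Object.keys(objectsByPattern).forEach((patternUrl) => {
    const object = objectsByPattern[patternUrl];
    controller.loadMarker(patternUrl, (markerUID) => {
      const marker = controller.createThreeMarker(markerUID);
      marker.add(object);  // the object follows this marker
      scene.add(marker);   // the marker becomes part of the scene
    });
  });
}
```

A call might look like trackMarkers(arScene.scene, arController, { 'hiro.patt': cube, 'other.patt': sphere }), where other.patt and sphere are placeholders for a second pattern file and object of your own.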

Create Your 3D Object

The next step will be creating the Mesh that we attached to the marker in the previous method. In 3D terms, a Mesh refers to the 3D object and the material or textures that colorize the object. We’ll use THREE.js’s Mesh class to create a cube for our Hiro marker.

private createCube() {
	let cube = new Mesh(
		new BoxGeometry(1, 1, 1),
		new MeshNormalMaterial()
	);
	// Move the cube forward so it is visible.
	cube.position.z = 0.5;
	return cube;
}

Mesh() takes two parameters: the geometry (the shape of the object) and the material. We’re using more THREE.js classes here to create our cube. BoxGeometry() takes the width, height, and depth, respectively, in scene units, while MeshNormalMaterial() maps normal vectors to RGB colors. There are numerous geometries and materials available, all of which are covered in THREE.js’s documentation.

The last thing we’ll do is position the cube so we can see it. The default position for 3D objects is (0, 0, 0), which is also the default position of the camera. Thus, if we leave the position alone, we won’t be able to see our object. You can either move the camera back, which would look something like arCamera.position.z = -0.5, or move the object forward as we did above.

Render!

Finally, we need to take everything we’ve created and render it on the screen. To do this, we’ll write a createWebGLRenderer() method that wraps THREE.js’s WebGLRenderer class:

private createWebGLRenderer(width: number, height: number) {
	let renderer = new WebGLRenderer({
		antialias: true,
		alpha: true
	});

	// A clear alpha of 0 keeps the canvas background transparent.
	renderer.setClearColor(new Color('lightgrey'), 0);
	renderer.setSize(width, height);
	// Center the canvas in the viewport.
	renderer.domElement.style.position = 'absolute';
	renderer.domElement.style.top = '50%';
	renderer.domElement.style.left = '50%';
	renderer.domElement.style.transform = 'translate(-50%, -50%)';
	return renderer;
}

When creating the renderer, we set antialias: true to smooth out any jagged lines and edges that might appear. This is especially useful if your Scene will include any moving objects. The alpha: true property gives the canvas an alpha (transparency) buffer, which, combined with the clear alpha of 0 passed to setClearColor(), keeps the canvas background transparent rather than filling it with an opaque color. Check out this helpful SO post for more details.

The last thing we need to do is set some styling on the renderer’s domElement property and ensure the width and height are correct for the device viewport. Then, bam! You’ve got yourself an AR Scene! To test it out, run your server and point the Hiro marker at your device’s camera. You should see a box appear just over the marker, and move along with the marker.
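The renderer above is simply sized to the viewport. If the camera feed and the viewport have different aspect ratios, one possible approach is to scale the canvas so it fits inside the viewport without stretching. The sketch below shows that calculation only; Size and fitInside are hypothetical names, and the 640x480 feed size in the usage note is an assumption rather than something reported by the library.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: returns the largest size with the same aspect ratio
// as `source` that fits entirely inside `viewport` ("contain" scaling).
function fitInside(source: Size, viewport: Size): Size {
  const scale = Math.min(
    viewport.width / source.width,
    viewport.height / source.height
  );
  return {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
}
```

For example, with an assumed 640x480 feed you could compute const size = fitInside({ width: 640, height: 480 }, { width: this.width, height: this.height }) and pass size.width and size.height to createWebGLRenderer(). The centering styles already set on the canvas then keep it in the middle of the screen.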

Conclusion

Developing browser-based AR has given me a new perspective on programming, especially in terms of how libraries are written. As a front-end developer, I’ve rarely had to delve deeply into the build code of the libraries and frameworks I use to troubleshoot problems, but I found myself doing this on a daily basis when using AR.js. For all the problems I faced with it, I learned a lot about how AR works in the browser, and much of what I learned while fixing those problems proved useful when I decided to simply build with JSARToolKit5 and THREE.js.

There’s hope for the future of browser-based AR, and if the dedication of devs to tools like AR.js proves anything, it is that toolsets, libraries, and frameworks will only continue to improve, making the AR world ever more useful and exciting.