With the release of Apple’s ARKit a little while ago, I thought it would be an interesting chance to compare how Apple’s new tech stands up against other augmented reality solutions out there. Most popular solutions are oriented around target recognition, whereas ARKit is more about spatial awareness, but nevertheless comparisons can be drawn with regards to overall stability and extended (offscreen) tracking. Having worked with Vuforia professionally for some time now, as well as dabbling with ARToolKit and Kudan, it was very interesting to note the differences.
The short of it is that ARKit is fantastic – it is remarkably stable in comparison to the jitter encountered using Vuforia, and coupling this with its planar-surface detection and ambient lighting analysis, it adds up to a package that makes it extremely easy to get into AR development and eases the process of making augmentations look “grounded” in the real world (which is a common problem).
It doesn’t, however, come without its drawbacks. I find that the constant plane detection and adjustment that happens in the background can be problematic, as it ends up moving your placed items around the scene. This is generally a problem when first starting up, while the app is still mapping the environment; there may be ways to avoid it, but I have not yet had the chance to investigate. The other stand-out concern is to do with the ARKit Unity plugin: units in the scene are mapped 1:1 with metres in the real world. This is fine for larger scenes and objects, but as soon as you scale down items in the Unity scene (for example, a city on a table) the coordinates become stupendously small and hard to manage, physics can break down at such a small scale, and particle effects can become problematic.
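One way to soften the 1:1 unit mapping (a sketch of my own, not anything the plugin provides) is to author content at a comfortable unit scale and shrink a single root transform at placement time, rather than scaling every object individually. The class and field names below are illustrative. Note that this keeps authored local coordinates manageable, but physics still simulates at world scale, so it is not a complete fix for that side of the problem.

```csharp
using UnityEngine;

// Hypothetical workaround: author the scene at a sane scale
// (e.g. 1 unit = 1 metre within the planetary system), then shrink
// only this root so children keep their authored local positions.
public class ScaledContentRoot : MonoBehaviour
{
    [Range(0.001f, 1f)]
    public float worldScale = 0.05f; // 1 authored unit appears as 5 cm in the room

    void Start()
    {
        // Only the root shrinks; local coordinates under it stay readable.
        transform.localScale = Vector3.one * worldScale;
    }
}
```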
A Small Project
As a technical experiment I created a small scene made up of a system of planets and moons, and added some spaceship traffic for good measure. I even added a small cityscape on the planet that spawned the ships, set the moons up to orbit their parent bodies, and added a “floating” effect to the planets to make them look less static. The models were made from Unity primitives and still use default shaders, but it was enough to illustrate the point of the experiment.
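The orbit and “floating” behaviours can be sketched in a few lines of Unity C#; this is an illustrative reconstruction, not the project’s actual code, and all names and parameter values are assumptions.

```csharp
using UnityEngine;

// Sketch of the orbiting moons and the gentle "floating" effect
// described above. Attach to a moon or planet; parameters are illustrative.
public class OrbitAndFloat : MonoBehaviour
{
    public Transform parentBody;      // the body this object orbits (may be null)
    public float orbitSpeed = 20f;    // degrees per second around the parent
    public float bobAmplitude = 0.1f; // vertical drift, in scene units
    public float bobFrequency = 0.5f; // bob cycles per second

    float baseY;

    void Start()
    {
        baseY = transform.localPosition.y;
    }

    void Update()
    {
        // Circle the parent body around the world up axis.
        if (parentBody != null)
            transform.RotateAround(parentBody.position, Vector3.up,
                                   orbitSpeed * Time.deltaTime);

        // Sinusoidal bob so the body feels less static.
        Vector3 p = transform.localPosition;
        p.y = baseY + Mathf.Sin(Time.time * bobFrequency * 2f * Mathf.PI) * bobAmplitude;
        transform.localPosition = p;
    }
}
```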
Relying on ARKit’s behaviour of detecting and creating planar surfaces, I added a tap-to-place feature that simply ray-casts from the camera and takes the first intersection with a detected plane, placing the planetary system at the point of intersection. There is also a slider on the left-hand side of the screen that scales the scene objects at runtime.
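The tap-to-place logic amounts to a screen-point ray-cast against the detected planes. The sketch below assumes the plugin’s generated plane GameObjects carry colliders on a dedicated layer and uses a plain Unity physics ray-cast as a stand-in; the real plugin also offers native ARKit hit-testing, and all names here are illustrative.

```csharp
using UnityEngine;

// Hedged sketch of tap-to-place: ray-cast from the camera through the
// touch point and drop the content root on the first plane collider hit.
public class TapToPlace : MonoBehaviour
{
    public Transform planetarySystem; // root of the placed scene content
    public LayerMask planeLayer;      // layer the generated plane colliders live on

    void Update()
    {
        if (Input.touchCount == 0 || Input.GetTouch(0).phase != TouchPhase.Began)
            return;

        // Ray from the device camera through the tapped screen position.
        Ray ray = Camera.main.ScreenPointToRay(Input.GetTouch(0).position);

        // Only consider detected-plane colliders, out to a sensible range.
        RaycastHit hit;
        if (Physics.Raycast(ray, out hit, 30f, planeLayer))
            planetarySystem.position = hit.point; // place at the intersection
    }
}
```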
As with many AR solutions, the scene camera is governed by the plugin; in this case it is meant to mimic the movement of the device, adjusting according to the device’s position and orientation so that you view the scene correctly and can move around it naturally. This is in contrast to target-based solutions, where the recognition of an image target determines the position/orientation of the camera in the scene (or the reverse, where the camera is fixed at the world centre and objects in the scene move around it correspondingly). Because the position is determined entirely by the target, these solutions suffer from “popping” (sudden positional/rotational jumps) when the target is lost or not clearly visible.
Vuforia’s extended tracking is, by comparison, an iffy attempt at keeping track of targets when they are offscreen. Although sometimes successful, the targets will often be lost, or will shift in position/rotation while not visible.
Kudan’s SLAM feature (which also relies on spatial awareness rather than target recognition) does a better job of tracking the device’s movements, but still cannot hold a candle to ARKit’s ability to keep track of positions when points of interest are offscreen. This is thanks to Apple’s extremely intelligent combination of camera-image and sensor analysis, which accurately predicts the device’s movements in real time (called Visual Inertial Odometry, or in layman’s terms: black magic).
I am thoroughly impressed by what Apple has achieved here. It may be a bold statement, but in the right circumstances (good lighting, enough points of interest for the camera to track, and an already-mapped room) I feel like ARKit comes close to Microsoft’s HoloLens in terms of stability. Of course it can’t quite keep up with HoloLens – moving away from an area and returning will inevitably cause objects in that area to drift – simply because ARKit is not building a true model of the environment to use as a reference (as HoloLens does). However, in a small space even brash, fast movements are accurately tracked with nearly no lag or jitter. It is worth noting that ARKit can struggle in open spaces and performs best in small, indoor areas where there are more feature points to track.
Of course Google’s ARCore has subsequently joined the party, and it would be extremely interesting to compare apples and oranges in the future. Unfortunately only Google Pixel devices and the Samsung S8 are currently supported, so for the time being I won’t be testing the competition.