Interview with Wikitude: new SDK & future of AR


Hi everybody,

let's get back to down-to-earth AR for a bit. There are a couple of good toolkits out there to use with today's consumer devices. Not everyone has AR glasses at his or her disposal or is willing to put them on during a fair or at work. One well-established player for mobile (but also smartglasses) is Wikitude. They just released their new version today. For the SDK, you can read the full changelog and spec info on their blog here.

But I took the chance to let Andy Gstoll explain to me directly how they plan to impact the AR space with their new release. Andy Gstoll has been pioneering the mobile augmented reality space with Wikitude since 2010 and is Evangelist & Advisor of the company today. He is also the founder of Mixed Reality I/O. So, we talked about the SDK and AR in general. Let's jump right in after their release video:

augmented.org: Hi Andy, thanks for taking the time to talk about your new release and AR! Always a pleasure.

Andy: Same here, thanks for having me, Toby!

Congratulations on the new release of the Wikitude SDK. I had the chance to see it prior to release and know the specs, but could you briefly summarize: what do you think are the key technical breakthroughs with version 6 – for the developers and, through them, also for the end users?

The Wikitude SDK 6 is our very first SDK product enabling a mobile device to do what we as humans do countless times per day with the highest precision: seeing in 3D. This means understanding the dimensions and depth properties of the physical environment around us in real time. After providing GEO AR technology since 2010 and 2D recognition and tracking since 2012, moving into the third dimension with our 3D instant tracking technology is a breakthrough for us and of course our developer customer base. In a short while it will also be a breakthrough for consumers, once those SDK 6 powered apps get out there.

I’ve seen the Salzburg castle demo where you walk through the city and the U.F.O. floats above the river Salzach. How do you glue the position of an object to the real world? Would two different users – coming from different directions – see the U.F.O. in the very same geo spot with the same orientation, i.e. the augmented object faces in the same direction in the real world for both?

The “glue” is our 3D instant tracking technology, which is based on an algorithm called SLAM in combination with some Wikitude secret sauce. Our 3D instant tracking is built to work in any “unknown” space, so the demo that you have seen would work anywhere and is not bound to Salzburg’s city center. However, positioning content based on a geo location, for example like Pokémon Go, is very easy to implement. Our GEO AR SDK would probably be best suited for that scenario instead, or perhaps a combination of the two.

Could you elaborate a bit on the new feature instant tracking and what it might be able to enable?

The obvious use cases are of course the visualisation of products in space. This could be furniture, or appliances like a refrigerator or a washing machine. But it could also be animated 3D characters that would appear in front of you to perhaps tell you something or be part of a gaming concept. The technology also has great potential in the architecture industry: it can, for example, enable you to place a building on a piece of land. For enterprise, this could mean that you can visualise a piece of machinery in your new factory to demonstrate and test it in a greater context. But I am sure there will be apps built by our large developer community that even we were not able to think of.

The use cases you are describing are all good generic AR examples. As I understand it, instant tracking kicks in if you have no prior knowledge of your real space and no markers placed. But this could make exact and repeatable positioning impossible. If you, for example, need to overlay virtual parts on a machine, you would still need a known reference to begin with, right? Like in the video when the man examines the pipes and starts off at the top with a marker. How will instant tracking help out?

Thanks for bringing this up. We have to differentiate between slightly different use cases here and the different types of 3D tracking solutions suitable for each. You are right, the 3D instant tracking is always most suitable when used in unknown spaces, rooms and environments. When actual recognition is required, for example of a hydraulic pump model xyz, you would use our 3D object recognition technology, which we introduced at Augmented World Expo in Berlin last October, mostly focussing on IoT use cases. As for the man examining the pipes, this is yet another technology available through our new SDK 6, called “extended tracking”. After scanning a unique marker of your own creation and choice – which you can see in the video at the top left – the man examines the pipes without having to keep the marker in the field of view of the tablet, giving him the freedom to examine the entire wall of pipes.

(Note from augmented.org: This video shows their instant tracking. You can read more about their IoT approach here.)

We just had the examples of architecture or machinery, so let's speak of more use cases: the press release specifically states indoor and outdoor usage. Let’s say I build my university campus navigation that needs to bring me to the right building (outdoors) and then to the right meeting room in the dark basement (indoors). Is switching between tracking methods seamless, and can they be used at the same time? How do I use it?

This first generation of our 3D instant tracking is optimised for smaller spaces. What you are describing, I think, would involve the mapping of very complex areas and structures such as the pathways of a university campus. To be honest, we have not fully tested this use case yet. What I can tell you is that it performs quite well in both light and darker environments. It cannot be completely dark, of course, as it is a vision-based technology.

So, let's talk a bit more about your tracking technology. Your team says it has improved the recognition quality heavily, especially in rough conditions. Do you think there is still room for more, or did we reach the limit with the sensors of today’s handheld devices? Do you plan to support Google’s Tango or a similar technology in the near future to go beyond?

To answer the first part of your question, which refers to our 2D tracking: yes, there is always room for improvement. However, our 2D tracking is a very mature product, since we have been working on and improving it since 2012. I think it is not too self-confident if I claim that it is the best 2D tracking in the market today. With regard to Google Tango support, we currently do not plan to support this niche platform. As you know, there is only one consumer Tango device out there today, the Lenovo Phab 2 Pro, available in the US and a few other countries, hence the market share is less than 1% today. With ASUS and other OEMs there will be more coming this year, but it will be quite some time until we have a significant base of market penetration making it worthwhile for developers to build on top of this platform. As long as this is the case, Wikitude will be focussing on the iPhone and Android powered smartphones out there by the billions today.

Everyday-AR in everyone’s pocket is still not there on a broad scale. If you count Pokémon, it had a short rise in 2016, but it’s still a niche for consumers. Do you agree? What do you think will push AR out there?

I agree to the extent that AR is still a niche for many people out there, but we are in the middle of changing exactly this as the three important puzzle pieces are coming together: hardware, software and content. Pokémon Go was of course a great example of the potential of consumer AR, but we will need more killer apps like this.

What do you think is missing?

The main challenge from a technology point of view is to recognise and track the three-dimensional world around us all the time, flawlessly and without any limitations. Wikitude 3D instant tracking technology is a big step forward, but there are still many challenges to be solved, which will keep us and our competitors busy for some time.

Looking at competitors and partners… hardware players that are more industry-focussed are building their HMDs successfully for their clients, like Epson or DAQRI. Others who are also looking at consumers are preparing their launches of AR software and hardware – be it Microsoft with Holographic or Meta. Do you think AR glasses will bring the breakthrough?

Whether it’s standard mono-cam smartphones, Tango and “Tango-like” devices or HMDs as you mentioned above – all of them will have their place in the ecosystem. However, I do believe that devices like Hololens, ODG’s R9 and Magic Leap’s “magic device” will change everything in the mid to long term, when they become small and ergonomic enough to be worn by end consumers. The main advantage of these devices is of course that you do not need to touch any displays with your hands and that they are always there in front of you, with the potential to give you very context-rich information and content as and when you need it.

Will you be on these platforms?

Wikitude is already on these kinds of devices. We have created customised and optimised SDK versions with our partners at ODG, Epson and Vuzix, which are available for purchase on our website now.

In the very beginning, I saw Wikitude only as a nice GPS-tag viewfinder. Today we are at version 6 and it has become a complete AR SDK. What will we be able to see in the near and far future with it? Could you give us a glimpse?

As indicated above, Wikitude is fully committed to augmenting the world around us. As the world as we know it comes in three dimensions, Wikitude will continue to push the envelope to provide the best 3D technology solutions, enabling developers to recognise and track objects, rooms, spaces and more. Different algorithms are needed for different scenarios today. We will not stop working until these different computer vision approaches can be merged into one, which is the ultimate goal.

That brings me to my next question – when do you think we will reach the positive point of no return where everybody makes use of AR naturally?

This will be the case when the real and virtual worlds become indistinguishable from a technological and daily experience point of view.

All right. So be it indistinguishable or obviously augmented – what do you think is the biggest opportunity for the world with AR technology?

My favourite use case of AR is teleportation. I have been living and working in many distant parts of the world over the last 20 years. When AR technology can render a high quality 3D model of my family members right next to me and merge it with my immediate physical environment, even though they are a thousand miles away, I think it would make me and the millions of other people travelling out there all the time very happy. If you are interested in reading a bit more about this topic, you may want to check out my recently published article on TechCrunch.

Great! Thanks for your answers.

Andy: My pleasure!

So, that’s it. I can certainly relate to the teleportation concept, which I long for very much, too. Currently I'm trying to get by with AltspaceVR and other solutions. But a proper holographic buddy or family at my desk would be best. Well, it seems like Wikitude is following their path well to enhance AR tech even further for mobile and currently available HUD glasses. I will surely check back to see what others make of the new SDK features. If you want to read Andy’s TechCrunch article, it’s here. So, as always, stay tuned for more AR soon to come!

– Toby.