Augmenting Your Senses: Audio AR

Dear readers,

We are getting close to the end of 2017. It has been a great year for AR and VR, with a lot of new SDKs emerging and hardware popping up. There have been great demos for the Microsoft Hololens, and Apple and Google started a big native AR SDK push with ARKit and ARCore, unfortunately leaving Tango as an unloved orphan by the roadside. MR devices just shipped, great lightfield demos gave a glimpse into the future, and some others are still waiting for their metaverse-ready glasses… It has been a great year with you all!

Thanks for following and for sharing your thoughts with me at conferences, in talks and via the PMs that reach me every week and every day. I’d love to write more here, but time does not permit it. But make sure to follow me on twitter, facebook and google+ for daily updates and an exchange of thoughts! I always try to get back to everyone asap. Thanks for a great AR’ed year, crowd! I will of course continue to write here in 2018!

Today, I’d love to take a step back from AR as we know it and try to document some thoughts and hopefully give some inspiration – on non-visual AR.

Introduction – Let’s augment our surroundings!

It’s no secret… AR can be more than visuals. After all, we perceive reality through various senses, not only through images. Hearing, touch and smell in particular give very valuable additional input. For some people, those are the only inputs available! Yet when we think of AR, everybody thinks only of real-time visual effects. But why? Let’s briefly look at the other senses today.

I will not talk about smell-o-vision or other crazy stuff from the (failed) past. Audio is the most obvious next sense to include. But simple tactile feedback could also be considered. Smartwatches already give navigational information by tapping twice for a left turn and three times for a right turn, etc. You augment your routine non-visually. Audio from (digital) tape is typically only consumed as entertainment and as passive media. Ever since I was a child I have been a fan of The Three Investigators and other audio plays. Why not take those dreamy and immersive audio adventures to the next level?

Idea of Audio AR

Focussing on audio, we see different classic techniques, from mono and simple stereo to dummy-head stereo recording for a greater experience. If you add more sensors to the active user and calculate the audio signal in real time, you can reach 360° spatial audio – with rotation only (3 DOF) or with full movement of your body within the sound landscape (6 DOF). What could we use this freedom for?
A 360° audio adventure could give you more immersion – but we would need to ask: why should I rotate on my couch? Full-freedom audio augmentation sounds weird at first when you think of it on a small scale (for example: your HTC Vive lighthouse tracking space at home), but at building scale or city scale it feels more natural: a museum audio guide is a well-known form of location-based audio AR. An audio city guide using GPS is a logical next step towards an outdoor non-visual AR experience.
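
To make the 3 DOF vs. 6 DOF idea a bit more tangible, here is a minimal sketch of how a sound source could be placed for a listener in a flat 2D world. Everything in it is my own assumption for illustration: the `Listener` class, the `stereo_gains` function, the simple inverse-distance falloff and the constant-power panning. A real engine would use proper HRTF filtering and would also tell front from back, which this toy version cannot.

```python
import math
from dataclasses import dataclass


@dataclass
class Listener:
    # Hypothetical listener pose in a flat local coordinate system.
    x: float        # metres east
    y: float        # metres north
    yaw_deg: float  # head direction, 0 = facing north, clockwise positive


def stereo_gains(listener, source_xy, ref_distance=1.0):
    """Return (left, right) gains for a point source at source_xy."""
    dx = source_xy[0] - listener.x
    dy = source_xy[1] - listener.y
    distance = math.hypot(dx, dy)

    # Direction to the source in the world, clockwise from north.
    bearing = math.degrees(math.atan2(dx, dy))
    # Direction relative to the head, wrapped to [-180, 180).
    azimuth = (bearing - listener.yaw_deg + 180.0) % 360.0 - 180.0

    # Constant-power pan: -1 is hard left, +1 is hard right.
    # Simplification: sources behind the head fold onto the front.
    pan = math.sin(math.radians(azimuth))
    theta = (pan + 1.0) * math.pi / 4.0

    # Assumed inverse-distance falloff, clamped inside ref_distance.
    attenuation = ref_distance / max(distance, ref_distance)
    return attenuation * math.cos(theta), attenuation * math.sin(theta)
```

With 3 DOF only `yaw_deg` changes, so a source just moves between your ears as you turn. With 6 DOF the `x` and `y` values change as well, so walking towards a source also makes it louder.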

But GPS-based audio is only coarse. What more could we do with all the mobile sensors we carry around? The gyro and compass add orientation and allow further fine-tuning of AR audio experiences. So, what is there today?
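
As a rough sketch of what combining GPS and compass could look like: from two GPS fixes you can compute how far away a sound source is and in which direction it lies, and the compass heading then tells you whether it should sound from your left or your right. The function names below are hypothetical, and the math is the standard haversine and initial-bearing formulas on a spherical Earth, which is accurate enough for a city walk.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Distance in metres and initial bearing in degrees (0 = north,
    clockwise) from point 1 (the listener) to point 2 (the source)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)

    # Haversine formula for the great-circle distance.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing from listener to source.
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    bearing = math.degrees(math.atan2(y, x)) % 360.0
    return distance, bearing


def relative_direction(bearing_deg, compass_heading_deg):
    """Angle of the source relative to where the user faces, in
    [-180, 180): negative means to the left, positive to the right."""
    return (bearing_deg - compass_heading_deg + 180.0) % 360.0 - 180.0
```

The relative angle could then drive the panning of the sound, while the distance drives its volume. Keep in mind that consumer GPS can easily be off by several metres, so this only makes sense for sources that are not too close.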

Existing AR Audio Demos

Many games and plays are heavily visual. Computer games, mobile games and board games typically address the eyes first… with some other senses added, audio being the first. Some games exist, though, that put audio first. The classic children’s toy “Simon” is one example: you need to memorize played notes and repeat their order by pushing buttons. The melody gets longer every time and the game gets harder.

Jumping from the 80s to today, VR offers some audio-focussed games like Audioshield or Audio Arena. But those still depend on the visuals.

Others explore audio on mobiles more deeply, like Björk did with Biophilia. Different immersive binaural audio stories exist, like Owlfield or the nice A Blind Legend, which tells a full story without visuals, but with interactivity. “Dark Echo” and “Scary Echo” are other fun games I tried that rely on audio only. Some use the gyro/compass information to allow a 360° placement of sound sources that change when you move, e.g. the Spanish “Blind Tales”. A great example for iOS is BlindSide. It really takes time to get used to the audio-only concept in all of these games, though. You have to relax and work through the tutorial without rushing, and still it can be really, really hard to navigate and not give up. But with all these approaches you are not really influencing or augmenting your real-life situation – it’s an immersive, but virtual audio story or game.

We want location- or movement-bound audio AR! Examples of GPS-based audio guides are “Echoes” and the audio guide platform SonicMaps. But the best augmented audio experience I’ve tried so far that really integrates your physical self and location remains “Zombies, Run!”. You listen to a zombie audio play that supports your work-out and lets you run faster when the Zs are approaching or when you are falling behind. The system uses your GPS only for speed and distance tracking. So, nothing special happens when you pass by a specific building… They will mention specific places like the old hospital or the woods in the storyline, but here you have to use your imagination. It does not connect to your actual surroundings. But it’s fun, well told and produced with quality. Unfortunately, I’m not too much into running, oh well.

Technical Challenges

As we have seen, there are some experiences that rely on gyroscopes, compass and/or GPS. But if we want more realistic immersion (turning our head), we would need to stick the phone to our head while wearing headphones (like in VR). If we want full spatial movement on a personal scale, we could rely on Tango or ARKit, ARCore, etc. to track our position… but with the same problem. We would need to wear our phone. A smarter way could be glasses (as always for good AR): we would have sensors attached to our head, hands-free! One Hololens demo used the depth sensing to help blind people “see” by giving stereo audio cues about distances. The system works! If we talk about phones, feedback could also be given through vibration alerts when hitting an invisible wall, etc.!

If we stick with smartphones for a while, we could still auto-trigger location-bound experiences through NFC tags, beacons, WiFi signal checks or by scanning QR codes manually (e.g. the museum setup for audio guides). Trigger/tracking solved for now. But how to interact? If not through a touchscreen, it could be speech recognition, whistling or shaking the phone when thinking of direct device manipulation. Otherwise actions could be triggered through our movement, speed, rotation and position. Maybe I need to shake my head or nod to answer AR audio experiences (wearing AR glasses)? Maybe I can trigger a storyline action through my heart rate, tracked by my smartwatch?
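
One practical detail when auto-triggering by location: position estimates jitter, so a naive “am I within 10 metres?” check would start and stop the same audio clip over and over at the edge of a zone. A common trick is hysteresis, i.e. a smaller radius to enter a zone and a larger one to leave it. The sketch below is purely illustrative; the `Zone` class, the `update_zones` function and the radii are my own assumptions, and the position could just as well come from a beacon distance estimate as from GPS.

```python
import math
from dataclasses import dataclass


@dataclass
class Zone:
    # Hypothetical circular trigger zone in local metres.
    name: str
    x: float
    y: float
    enter_radius: float  # must get this close to trigger
    exit_radius: float   # must get this far away to reset (larger)
    inside: bool = False


def update_zones(zones, x, y):
    """Feed in the latest position and get a list of
    ("enter" | "exit", zone_name) events for state changes only."""
    events = []
    for zone in zones:
        d = math.hypot(x - zone.x, y - zone.y)
        if not zone.inside and d <= zone.enter_radius:
            zone.inside = True
            events.append(("enter", zone.name))
        elif zone.inside and d > zone.exit_radius:
            zone.inside = False
            events.append(("exit", zone.name))
    return events
```

An “enter” event could start a story chapter, and an “exit” event could fade it out. Between the two radii nothing happens, which is exactly the point.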

It’s kind of complicated. Regarding the software tech, there are solutions out there. The Audio Engineering Society (AES) offers an E-Library entry on “Augmented Reality Audio for Location-Based Games” that deals with tying audio files to fixed physical locations. Engines like DearVR help us do the real-time calculations for placing audio in space so that it reacts to our position and rotation.

What would we need to consider when creating such an app?

There sure are some technical limitations. But there are more constraints to consider for AR audio experiences coming from practicalities. How about the user’s situation? Is he or she fully dedicated or distracted? On the go or in company? Is there outside noise to distract? How much time can the experience last? How to deal with interruptions? It could be a simple pause button that works for less complex experiences. But how much interaction is even needed? How to interact? Hands-free? How to actually integrate the real world context?

Do people long for AR audio? Maybe we are all heavily conditioned towards visuals and don’t leave space for audio. Wearing AR glasses outside or hunting GPS-fixed invisible monsters might make you look silly once again (after playing visual Pokémon). Moving outside through your day can make it harder to immerse yourself in an audio story – if it wants you to disconnect from your environment. Therefore, it needs to have something to do with your surroundings – or at least your movements! Otherwise it would be better to make a “stay-at-home” audio experience without augmenting anything. A city guide or an audio treasure hunt through, e.g., the Rome or London of ancient times could work better. Practical AR information for our day could be triggered automatically via distance and direction of view – but would we accept it? We can easily close our eyes or change our gaze. We cannot close our ears or avoid noises without running away. We are more sensitive to noise and would rather avoid additional AR noise, I’d guess. But a reliable in-ear prompter that tells us the forgotten names of others or guides our way through unknown territory or tasks would work fine – if sensory input reaches the device.

Closing Thoughts

Audio-based AR and other non-visual inputs like the tactile feedback from your smartwatch have a huge advantage: they don’t need your visual attention and can happen “secretly”. Navigation hints or other useful input can be given on the go. If integrated wisely, it could not only entertain but help. People are gamifying their lives more and more – Zombies, Run! is a great example of this, though not really connected to your space. Society is so used to paying attention to visuals and visuals only. Audio stories and games could help us relax more, dream more, push our imagination once again and step into fantasies that we can fill with our own visuals. It’s definitely fun, though the technical challenges seem hard to overcome for all scenarios before we have slim AR glasses that integrate it seamlessly.

So, maybe these days around Christmas and the cold winter season (well, if applicable to you) could be a great time to try out something new and dive into audio-only augmentations and stories to dream. Give your ears some mindfulness and awareness for a change. Enjoy!

Do you have another great audio AR example? Would you be interested in creating one? Please let me know in the comments or through a PM!

