Spatial Audio: Audio Technology's Overachiever Mode
Image source: Adobe
Audio preference has always been considered a personal experience: what sounds good to one person may not to another. With the arrival of Personalized Spatial Audio in Apple's iOS 16, however, opinions have multiplied across many channels, along with misinformation and false alarms. This article briefly reviews the current state and characteristics of spatial audio technology.
The quest for better sound reproduction seems endless. From the Victrola phonograph to modern surround sound, listeners have always looked to technology to improve the sound delivered to their ears. That quest has taken a new turn with spatial audio, which promises a more immersive listening experience than ever before.
Apple's Spatial Audio is not the only spatial audio technology; companies such as Sony and Denon are also at the forefront of the field and offer commercial products. This article, however, discusses only the general concept of spatial audio and Apple's Personalized Spatial Audio.
01
Personalized audio accounts for the physiological and anatomical factors behind how each body works; every person is unique. The distance between the ears, their position on the head, and the shape and angle of each ear all affect the listening experience. For Personalized Spatial Audio, Apple uses the 3D TrueDepth® camera in iPhones running iOS 16 to scan the user's head in three dimensions.
The iPhone scans three times: the left side of the head, the right side of the head, and the front of the face (not the inner ear canal, as some claim). A profile unique to the individual is generated and stored for use by the playback engine. Some worry that these profile files could be obtained and used by advanced facial recognition systems; Apple says the files are encrypted on the device and are not used for surveillance or facial recognition applications.
The TrueDepth scan parameters create an acoustic model that the audio rendering engine uses to optimize the real-time audio stream received by the user's ears.
But wait, it seems there’s more to it than that.
When sound enters the human inner ear, the ear resonates and responds with faint sounds of its own. These sounds originate in the cochlea and can be detected and measured. Called otoacoustic emissions (OAEs), they are noticeably louder at frequencies to which the listener is more sensitive. Many headphone makers build sensitive microphones into their earbuds to detect OAEs, and by sweeping across frequencies the spatial audio system can map the hearing frequency response of each of the user's ears.
The system then uses each ear's frequency characteristics to tailor the audio, exploiting the full spectrum by compensating for frequencies to which the user is less sensitive. The result is a dynamically adjusting equalizer that, combined with the specific angles of the earbud drivers, optimizes spectral power across frequencies so the listener perceives the full frequency range of the audio stream.
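As a rough illustration of what such per-ear compensation could look like, the sketch below boosts bands the listener is less sensitive to and applies the resulting gains with a simple FFT-based equalizer. The band list, sensitivity values, and the `apply_ear_eq` helper are assumptions made for this example; they are not Apple's implementation.

```python
import numpy as np

# Hypothetical per-ear sensitivity measurements (dB relative to a reference),
# e.g. derived from an otoacoustic-emission frequency sweep. Values are made up.
BANDS_HZ = np.array([125, 250, 500, 1000, 2000, 4000, 8000, 16000])
LEFT_SENSITIVITY_DB = np.array([0, -1, -2, 0, -4, -6, -3, -8])
RIGHT_SENSITIVITY_DB = np.array([0, 0, -1, -1, -2, -5, -4, -7])

def compensation_gains(sensitivity_db, max_boost_db=6.0):
    """Boost bands the listener is less sensitive to, capped to avoid distortion."""
    return np.clip(-sensitivity_db, 0.0, max_boost_db)

def apply_ear_eq(block, sample_rate, gains_db):
    """Apply per-band gains to one audio block with a simple FFT-based equalizer."""
    spectrum = np.fft.rfft(block)
    freqs = np.fft.rfftfreq(len(block), d=1.0 / sample_rate)
    # Interpolate band gains over the FFT bins and scale the spectrum.
    gain_linear = 10 ** (np.interp(freqs, BANDS_HZ, gains_db) / 20.0)
    return np.fft.irfft(spectrum * gain_linear, n=len(block))

# Example: equalize one second of noise separately for each ear.
fs = 48_000
mono = np.random.randn(fs)
left = apply_ear_eq(mono, fs, compensation_gains(LEFT_SENSITIVITY_DB))
right = apply_ear_eq(mono, fs, compensation_gains(RIGHT_SENSITIVITY_DB))
```

In practice the gains would be computed once from the hearing profile and then applied continuously to the real-time stream, but the block-by-block structure above conveys the idea.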
02
Spatial audio feels a bit like a bubble of sound around your head. The soundtrack takes on a new character: sounds no longer sit only at left, right, front center, and back center, as they do with directional speakers. Instead, the sound sources appear to surround the head, getting louder and brighter as you turn your head toward the "line of sound" of each source (much like a line of sight). To achieve this, the soundtrack must be encoded with data about all of the sources in the sound sphere and their relative levels and distances, as sketched below.
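To make the "sphere of sources" idea concrete, here is a minimal sketch of object-based audio metadata: each source carries a direction, distance, and level, and the renderer attenuates it with distance. The `SoundSource` structure and the inverse-distance law used here are illustrative assumptions, not a description of any particular codec.

```python
from dataclasses import dataclass
import math

@dataclass
class SoundSource:
    """One audio object in the sphere around the listener (illustrative metadata)."""
    name: str
    azimuth_deg: float    # 0 = straight ahead, positive = to the listener's right
    elevation_deg: float  # 0 = ear level, positive = above
    distance_m: float     # distance from the centre of the head
    level_db: float       # level of the source itself

def rendered_level_db(source: SoundSource, reference_m: float = 1.0) -> float:
    """Inverse-distance attenuation (6 dB per doubling) relative to a reference distance."""
    return source.level_db - 20.0 * math.log10(max(source.distance_m, 0.1) / reference_m)

# Example "sphere" of sources around the head.
scene = [
    SoundSource("dialogue", azimuth_deg=0, elevation_deg=0, distance_m=2.0, level_db=-12.0),
    SoundSource("helicopter", azimuth_deg=130, elevation_deg=45, distance_m=8.0, level_db=-6.0),
]
for s in scene:
    print(f"{s.name}: {rendered_level_db(s):.1f} dB at the listener")
```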
03
Spatial audio processing can be used for drama audio, film audio, gaming audio, and health and fitness applications. Arguably, the most popular application right now is gaming — especially virtual reality (VR) gaming.
VR headsets use advanced and efficient head tracking to ensure that audio and video stay in sync. Without fast and accurate head tracking, VR can quickly make people feel nauseous; if the scene does not track your head movement in real time, your brain struggles to reconcile what it sees with what it feels.
Therefore, VR headsets anchor the spatial audio engine so that when your head turns, sound sources in that direction become more prominent. Other sounds also shift position depending on the head's orientation and the speed of the turn.
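A hedged sketch of what this "anchoring" might look like in code: the renderer compares the listener's facing direction with each source's direction and boosts sources near the line of sight while attenuating those behind. The cosine weighting and the gain range are arbitrary choices for illustration only.

```python
import math

def facing_gain_db(head_yaw_deg: float, source_azimuth_deg: float,
                   max_boost_db: float = 3.0, max_cut_db: float = -9.0) -> float:
    """Gain applied to a source as the head turns toward or away from it.

    head_yaw_deg and source_azimuth_deg are measured in the same world frame,
    so their difference is the angle between the line of sight and the source.
    """
    diff = math.radians(head_yaw_deg - source_azimuth_deg)
    # cos(diff) is 1 when the listener faces the source and -1 when facing away.
    alignment = (math.cos(diff) + 1.0) / 2.0          # 0 .. 1
    return max_cut_db + alignment * (max_boost_db - max_cut_db)

# A source fixed 90 degrees to the listener's right:
for yaw in (0, 45, 90, 180):
    print(f"head yaw {yaw:3d} deg -> {facing_gain_db(yaw, 90):+.1f} dB")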
But home theater spatial audio systems can't do this kind of anchoring. For example, if you're sitting on the couch watching a movie, spatial audio might provide a reasonable approximation of surround sound as long as you're looking at the center screen. But as you turn your head, the system has a harder time making the sounds to the sides more prominent. Machine vision cameras and artificial intelligence might help the system recognize when you turn your head, but the technology isn't there yet.
The accelerometers and gyroscopes used in some hearables can perform head tracking, but this relative head tracking is not a perfect solution: it is not nearly as fast or as accurate as the absolute head tracking a VR headset provides.
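A common way to fuse those two sensors is a complementary filter: integrate the gyroscope for fast response and slowly correct its drift with the accelerometer's gravity estimate. The pitch-only sketch below is a generic illustration of relative head tracking, not any vendor's implementation, and the sample values are invented.

```python
import math

def complementary_pitch(gyro_rate_dps, accel_xyz, dt, prev_pitch_deg, alpha=0.98):
    """Fuse a gyro pitch rate (deg/s) with an accelerometer gravity estimate.

    alpha close to 1 trusts the fast-but-drifting gyro; the remainder slowly
    pulls the estimate toward the accelerometer's absolute (but noisy) angle.
    """
    ax, ay, az = accel_xyz
    accel_pitch_deg = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    gyro_pitch_deg = prev_pitch_deg + gyro_rate_dps * dt
    return alpha * gyro_pitch_deg + (1.0 - alpha) * accel_pitch_deg

# Example: 100 Hz samples while the head tilts slowly downward.
pitch = 0.0
for _ in range(100):
    pitch = complementary_pitch(gyro_rate_dps=-20.0, accel_xyz=(0.05, 0.0, 0.98),
                                dt=0.01, prev_pitch_deg=pitch)
print(f"estimated pitch after 1 s: {pitch:.1f} deg")
```

Because the orientation is only ever known relative to where the head started, the estimate drifts over time, which is why this approach lags the absolute tracking described above.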
In any case, gaming included, fast response times and low latency are required so that when listeners move their heads, the audio responds by making sources directly in front of them louder and sources off to the sides quieter or silent.
A possible solution for home and theater use is for everyone to wear an immersive VR headset (Figure 1). This is only feasible if the audio engine can deliver a tailored audio stream to everyone at the same time; it is also a more expensive solution, and it greatly diminishes the social experience of watching a movie together.
Figure 1: Using an immersive VR headset in a theater environment enables the spatial audio processor engine to take advantage of the high-speed and precise accelerometers and gyroscopes in the headset. (Image source: Marija/stock.adobe.com)
04
Although spatial audio is a purely digital technology, it can still run into problems with earbuds. Smaller drivers limit bass response, which is why reproducing deep bass typically calls for larger speakers, such as woofers and subwoofers.
Bass relies on moving a lot of air, so small emitters cannot match larger ones. Phased arrays have shown that properly spaced small emitters can reinforce the low and middle frequencies of the spectrum, but this is difficult to achieve with earbuds.
Larger headphones generally use larger drivers that deliver better bass response. But headphones require different audio processing to reproduce the frequency spectrum, especially for surround effects. Headphones use 360-degree head-related transfer function (HRTF) filters, which shape the audio so that, after interacting with the ear, sounds appear to come from different locations and at different levels.
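To illustrate how HRTF filtering places a mono source at a virtual position on headphones, the sketch below convolves the signal with a left-ear and a right-ear impulse response. The impulse responses here are crude placeholders standing in for measured data; a real renderer would look up HRIRs from a measured HRTF dataset for the desired azimuth and elevation.

```python
import numpy as np

def binauralize(mono: np.ndarray, hrir_left: np.ndarray, hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with per-ear head-related impulse responses (HRIRs)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)   # (samples, 2) stereo output

# Placeholder HRIRs standing in for a measured pair at roughly 45 degrees to the right:
# the right ear receives the sound slightly earlier and louder than the left ear.
fs = 48_000
hrir_right = np.zeros(64); hrir_right[2] = 1.0   # ~0.04 ms delay, full level
hrir_left = np.zeros(64);  hrir_left[30] = 0.6   # ~0.6 ms later, attenuated

mono = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s, 440 Hz test tone
stereo = binauralize(mono, hrir_left, hrir_right)
print(stereo.shape)   # (48063, 2)
```

Real HRIRs also encode the spectral coloring introduced by the head, torso, and outer ear, which is what makes the virtual placement convincing; the simple delay-and-attenuation pair above only captures the grossest cue.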
Regardless, 360-degree audio technology has many potential applications beyond gaming, theater, and health and wellness. White noise and pink noise machines already help people fall asleep, relax, and de-stress. Future applications could add biometric sensors to spatial audio to record which frequencies and patterns help an individual relax, lower blood pressure, and fall asleep. Tracking alpha brain waves could close the feedback loop, further enhancing the state of relaxation.
The technology could also serve as an assistive listening aid for people who are deaf or hard of hearing, although this use is unproven so far. Musicians could use spatial audio to get an ideal in-ear mix: stage volume and the venue mix are always different, and while sound engineers can adjust the venue mix, spatial audio can help performers hear themselves better on stage.
Spatial audio is a mixed bag at the moment; some people like it, some don't. It's a technology everyone has to try for themselves before making up their own minds. After all, many users are already dissatisfied with the high prices and complicated setups of surround sound systems, not to mention that background sounds sometimes drown out the dialogue.
05
Spatial audio technology analyzes the structure and physiological characteristics of the human body, uses unique per-listener audio profiles, and combines them with advanced audio rendering to deliver a more personalized and immersive listening experience. Its applications span gaming, theater, health, fitness, and beyond, and it may ultimately change the way we experience audio altogether.