Meta has released a demo showing its new Audio To Expression feature, introduced in Quest v71, in action.
Audio To Expression is an on-device AI model that takes microphone audio as its only input and generates realistic facial muscle movements, simulating facial expressions without any face tracking hardware.
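To make the data flow concrete, here is a minimal sketch of the idea in C++: a short frame of microphone audio goes in, and a set of normalized blendshape weights comes out, ready to be applied to an avatar each frame. The `Expression` enum, the `InferExpressionFromAudio` function, and the audio-energy heuristic inside it are hypothetical stand-ins for illustration only, not Meta's actual SDK or the model itself.

```cpp
// Illustrative sketch only: the names and the energy-based heuristic below are
// hypothetical placeholders for the on-device model, not Meta's real interface.
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// A small subset of expression channels of the kind the feature is described as
// driving (lips plus cheeks, eyelids, and brows). Real blendshape sets are larger.
enum class Expression { JawDrop, LipPucker, CheekRaise, EyelidClose, BrowRaise, Count };

using ExpressionWeights = std::array<float, static_cast<std::size_t>(Expression::Count)>;

// Hypothetical per-frame inference call: one short mono PCM audio frame in,
// one set of normalized weights (0..1) out. A trivial loudness-to-jaw heuristic
// stands in for the neural model here, purely to show the shape of the data.
ExpressionWeights InferExpressionFromAudio(const std::vector<float>& pcmFrame) {
    float energy = 0.0f;
    for (float s : pcmFrame) energy += s * s;
    energy = std::sqrt(energy / static_cast<float>(std::max<std::size_t>(pcmFrame.size(), 1)));

    ExpressionWeights w{};  // all channels default to 0 (neutral face)
    w[static_cast<std::size_t>(Expression::JawDrop)] = std::min(energy * 4.0f, 1.0f);
    return w;
}

int main() {
    // Fake 10 ms of 48 kHz microphone audio (a quiet sine burst) to exercise the flow.
    std::vector<float> frame(480);
    for (std::size_t i = 0; i < frame.size(); ++i)
        frame[i] = 0.2f * std::sin(0.05f * static_cast<float>(i));

    ExpressionWeights weights = InferExpressionFromAudio(frame);
    std::printf("jaw drop weight: %.2f\n", weights[static_cast<std::size_t>(Expression::JawDrop)]);
}
```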
The Oculus Lipsync SDK, which dates back to 2015, drives lip movement only, while Meta's newer Audio To Expression goes further by also producing facial expressions, including cheek, eyelid, and eyebrow movements. Strikingly, Meta says Audio To Expression uses significantly fewer computational resources than Oculus Lipsync.
Audio To Expression supports Quest 2, Quest 3, and Quest 3S. Quest Pro, meanwhile, has built-in face tracking sensors, which let developers reflect the wearer's actual facial expressions rather than approximating them from audio.
The clip shows a striking difference between Audio To Expression and the legacy Oculus Lipsync SDK when both are fed the same audio input.
Audio To Expression enables more expressive avatars from regular players in social VR and multiplayer games, and it also gives developers a low-cost way to create believable NPC faces from dialogue audio, which is particularly valuable for smaller studios and indie developers who can't afford facial capture tools.
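As a rough illustration of the NPC use case, the sketch below applies one frame of audio-derived weights to a character's morph targets with simple smoothing. `SkinnedMesh`, `SetMorphTargetWeight`, the morph target names, and the weight values are all made-up placeholders for whatever a given engine actually exposes.

```cpp
// Hypothetical usage sketch: driving an NPC's morph targets from per-frame,
// audio-derived weights during dialogue playback. Not any specific engine's API.
#include <array>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <utility>

struct SkinnedMesh {
    std::unordered_map<std::string, float> morphTargets;  // engine-specific in practice
    void SetMorphTargetWeight(const std::string& name, float w) { morphTargets[name] = w; }
};

// Apply one frame of weights with exponential smoothing so the NPC's face
// does not jitter between audio frames.
void ApplyExpressionFrame(SkinnedMesh& face,
                          const std::array<std::pair<const char*, float>, 3>& frameWeights,
                          float smoothing = 0.3f) {
    for (const auto& [name, target] : frameWeights) {
        float current = face.morphTargets[name];  // defaults to 0 if not yet set
        face.SetMorphTargetWeight(name, current + smoothing * (target - current));
    }
}

int main() {
    SkinnedMesh npcFace;
    // Weights for one audio frame of the NPC's dialogue line (values are invented).
    std::array<std::pair<const char*, float>, 3> frame = {{
        {"jawDrop", 0.6f}, {"cheekRaise", 0.1f}, {"browRaise", 0.2f}}};
    ApplyExpressionFrame(npcFace, frame);
    std::printf("jawDrop after smoothing: %.2f\n", npcFace.morphTargets["jawDrop"]);
}
```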
Meta's own Meta Avatars don't yet support Audio To Expression; they still use the Oculus Lipsync SDK, combined with simulated eye movement, whereby the developer tags each object in the scene by its degree of visual salience, and occasional blinking, going beyond lip syncing alone.
Full Audio To Expression documentation for developers is available at [link], covering Unity, Unreal, and custom engines.