Reading your lips is among the many things Apple’s Siri can’t do. But a newly surfaced Apple patent shows that the company is actively exploring what a proprietary lip-reading system would look like.
The patent application was filed in January of this year and describes a system that determines whether motion data matches a word or phrase. The figures specifically mention Siri and simple voice commands like “Hey Siri,” “go to” or “next song,” and describe how recognition of these inputs could be improved by an algorithm that analyzes the movements of users’ mouths.
As first noted by AppleInsider, the patent explains that voice-recognition systems like Siri have obvious problems. Sounds can be distorted by background noise, and sensors that constantly listen for people’s voices consume a lot of battery and processing power. The proposed system does not necessarily use a device’s camera. Instead, the voice recognition software would use one of the phone’s motion sensors to record movements of the mouth, neck or head and determine whether any of them could indicate human speech.
As Apple states in its patent, these sensors could be an accelerometer or a gyroscope, which it says are much less likely than a microphone to be disrupted by unwanted input. The technology does not appear to be limited to phones: the patent includes a vague reference to how such motion sensing could be integrated into AirPods or even “smart glasses” that would then send the data to a user’s iPhone. According to the document, these devices could detect subtle facial muscle movements, vibrations or head movements. Although Apple’s smart glasses ambitions were shelved years ago, the company’s Vision Pro headset could be another notable candidate in this area.
Apple would likely need to develop a machine learning model to realize this technology, and the patent mentions a “first language model” that would need to be trained on sample datasets.
Of course, Apple has filed numerous patents, some more far-fetched than others, and the vast majority of them never make it into final products.