A team of researchers from five American universities has developed EarSpy, a side-channel attack for eavesdropping on Android devices. It can recognize the gender and identity of the caller and partially recover the contents of the conversation.
The eavesdropping is carried out with motion sensors, which can capture the reverberations produced by the speakers of mobile devices.
The EarSpy attack was presented by experts from Texas A&M University, New Jersey Institute of Technology, Temple University, the University of Dayton, and Rutgers University. They note that similar side-channel attacks have been studied before, but a few years ago the ear speakers of smartphones were considered too weak to generate enough vibration for eavesdropping.
Modern smartphones use more powerful stereo speakers than earlier models, which provide better sound quality and stronger vibrations. Modern devices also carry more sensitive motion sensors and gyroscopes, capable of registering even the faintest reverberations from the speakers.
The illustration below shows the difference: the output of the 2016 OnePlus 3T speaker is barely visible on the spectrogram, while the stereo speakers of the 2019 OnePlus 7T leave a far clearer trace, from which significantly more data can be extracted.
Left to right: OnePlus 3T, OnePlus 7T, OnePlus 7T
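To make the comparison more concrete, here is a minimal Python sketch of how a spectrogram can be computed from accelerometer readings and how a stronger speaker vibration shows up in it. It is an illustration, not the researchers' tooling: the 400 Hz sampling rate, the 100 Hz test tone, the two amplitudes and all function names are assumptions chosen for the example.

```python
import cmath
import math


def spectrogram(samples, window, hop):
    """Return one row of DFT magnitudes (bins 0..window//2) per frame."""
    rows = []
    for start in range(0, len(samples) - window + 1, hop):
        frame = samples[start:start + window]
        mean = sum(frame) / window  # removes the constant gravity offset
        # A Hann taper reduces leakage between neighbouring frequency bins.
        tapered = [
            (x - mean) * (0.5 - 0.5 * math.cos(2 * math.pi * i / (window - 1)))
            for i, x in enumerate(frame)
        ]
        row = []
        for k in range(window // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * i / window)
                for i, x in enumerate(tapered)
            )
            row.append(abs(acc))
        rows.append(row)
    return rows


def vibration_trace(amplitude, tone_hz=100.0, rate_hz=400.0, n=400):
    """Simulated z-axis readings: gravity plus a speaker-induced tone (assumed values)."""
    return [
        9.81 + amplitude * math.sin(2 * math.pi * tone_hz * t / rate_hz)
        for t in range(n)
    ]


def total_energy(rows):
    """Sum of squared magnitudes over the whole spectrogram."""
    return sum(m * m for row in rows for m in row)


# Hypothetical "weak" and "strong" speakers, differing only in vibration amplitude.
weak = spectrogram(vibration_trace(0.001), window=100, hop=50)
strong = spectrogram(vibration_trace(0.02), window=100, hop=50)
print("energy ratio strong/weak:", total_energy(strong) / total_energy(weak))
```

Because the transform is linear, a speaker that shakes the sensor 20 times harder leaves 400 times more energy in the spectrogram, which is the kind of gap the illustration shows between the older and newer phones.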
In their experiments, the researchers used OnePlus 7T and OnePlus 9 devices and played various sets of pre-recorded sounds through the devices' speakers. They collected accelerometer readings during a simulated call with a third-party application, Physics Toolbox Sensor Suite, and then transferred the data to MATLAB for analysis.
The machine learning algorithm was trained on readily available datasets for speech recognition, caller identification, and gender recognition. The test results varied depending on the dataset and device used, but overall the researchers’ experiments gave promising results and showed that such eavesdropping is possible.
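The general idea of training a classifier on sensor data can be sketched in a few lines of Python. This is a toy example and not the researchers' model: it uses synthetic traces, a single feature (the dominant frequency) and a nearest-centroid rule, and the class names, pitch ranges, 500 Hz sampling rate and noise level are all assumptions made for illustration.

```python
import cmath
import math
import random

RATE_HZ = 500.0  # assumed accelerometer sampling rate
N = 200          # samples per trace (0.4 s at the assumed rate)


def synth_trace(pitch_hz, rng, amplitude=0.02, noise=0.002):
    """Synthetic vibration: one tone plus Gaussian sensor noise."""
    return [
        amplitude * math.sin(2 * math.pi * pitch_hz * t / RATE_HZ)
        + rng.gauss(0.0, noise)
        for t in range(N)
    ]


def dominant_frequency(trace):
    """Feature extraction: frequency (Hz) of the strongest DFT bin."""
    best_k, best_mag = 1, 0.0
    for k in range(1, N // 2):
        mag = abs(sum(
            x * cmath.exp(-2j * math.pi * k * t / N)
            for t, x in enumerate(trace)
        ))
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * RATE_HZ / N


def train_centroids(labelled):
    """Average the feature per label: the whole 'model' is one number per class."""
    sums = {}
    for label, feat in labelled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + feat, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def classify(centroids, feat):
    """Pick the label whose centroid is closest to the feature."""
    return min(centroids, key=lambda label: abs(centroids[label] - feat))


def run_demo(seed=0):
    rng = random.Random(seed)
    # Hypothetical, well-separated pitch ranges for two speaker classes.
    ranges = {"low_pitch": (100.0, 130.0), "high_pitch": (190.0, 220.0)}

    def sample(label):
        lo, hi = ranges[label]
        return dominant_frequency(synth_trace(rng.uniform(lo, hi), rng))

    train = [(label, sample(label)) for label in ranges for _ in range(8)]
    centroids = train_centroids(train)
    test = [(label, sample(label)) for label in ranges for _ in range(4)]
    correct = sum(classify(centroids, feat) == label for label, feat in test)
    return centroids, correct / len(test)


centroids, accuracy = run_demo()
print("centroids:", centroids, "accuracy:", accuracy)
```

Real recordings are far noisier and the classes overlap, which is why the accuracy reported for an actual attack depends so heavily on the dataset and the device.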
For example, on the OnePlus 7T the accuracy of gender recognition ranged from 77.7% to 98.7%, caller identification from 63.0% to 91.2%, and speech recognition from 51.8% to 56.4%.
On the OnePlus 9, gender recognition accuracy peaked at 88.7%, caller identification fell to an average of 73.6%, and speech recognition ranged from 33.3% to 41.6%.
The researchers acknowledge that the volume users choose for their speakers can significantly reduce the effectiveness of the EarSpy attack. At low volume, eavesdropping may not be possible at all.
In addition, the reverberations are significantly affected by the placement of the device’s hardware components and the tightness of its assembly, while user movement and vibrations from the environment reduce the accuracy of the recovered data.
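A rough way to see why volume matters is to look at the signal-to-noise ratio at the sensor. The Python sketch below assumes, purely for illustration, that the vibration amplitude scales linearly with the volume setting and that the sensor noise floor stays fixed; the numbers are hypothetical and not taken from the study.

```python
import math


def snr_db(signal_amplitude, noise_rms):
    """SNR in decibels for a sine-shaped vibration over a fixed noise floor."""
    signal_rms = signal_amplitude / math.sqrt(2)  # RMS value of a sine wave
    return 20 * math.log10(signal_rms / noise_rms)


NOISE_RMS = 0.002       # assumed sensor noise floor (m/s^2)
FULL_AMPLITUDE = 0.02   # assumed vibration amplitude at full volume (m/s^2)

for volume in (1.0, 0.5, 0.1):
    print(f"volume {volume:.0%}: {snr_db(FULL_AMPLITUDE * volume, NOISE_RMS):.1f} dB")
```

Under these assumptions every tenfold drop in amplitude costs 20 dB, so at low volume the speech-related vibration can sink below the sensor noise.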
OnePlus 7T device
An earlier study used a proof-of-concept application called Spearphone, which also abused access to the accelerometer and analyzed the reverberations that occur during telephone conversations.
However, that attack relied on the loudspeaker in speakerphone mode, which is why its accuracy reached 99% for gender and caller identification and 80% for speech recognition.
The authors of EarSpy conclude that phone manufacturers should ensure a stable sound pressure level during telephone conversations and place the sensors in the case so that internal vibrations do not affect them, or affect them as little as possible.