A new study has found that deidentifying wearable data may not be sufficient to protect individuals’ privacy. With advances in machine learning, seemingly innocuous data can be used to infer sensitive information about individuals, including medical diagnoses, mental health, personality traits, and emotions. This information can then be used to reidentify individuals, revealing not only the originally collected data but also inferences made about them.
Despite this risk, regulation often lags behind real-world reidentification events and their consequences. The study, published in The Lancet Digital Health, provides an overview of the reidentification risks posed by wearable devices, whose data are often not regarded as identifiable. The authors explore open questions surrounding reidentification through a systematic review of the available literature. They also raise concerns about the risks to individuals’ privacy and highlight the need for more research to determine which types of wearable data enable reidentification and at what resolution such data can be shared while still protecting privacy.
The researchers followed PRISMA guidelines and registered the review on PROSPERO. They searched databases and journals covering biometric technologies, with no restriction on start date. They excluded technologies that present clear privacy risks, such as GPS-based tracking and widely used biometrics like iris scans. They also excluded studies that involved animals, relied on theoretical models, used video or cameras, or employed impractical form factors. Two independent reviewers screened titles and abstracts, and a third reviewer resolved any disagreements. The researchers then extracted data from the included studies, including sensing-modality-specific characteristics, checked the extracted data for discrepancies, and created graphs using R (v4.0.2) with ggplot2.
The systematic review found that individuals can be reidentified from the biometric signals that wearable devices collect. The search of peer-reviewed literature yielded 72 studies that met the eligibility criteria. The researchers rated 89% of these studies as high quality, 11% as moderate quality, and none as low quality. The studies covered 20 unique sensing modalities, with electroencephalogram (EEG), inertial measurement unit (IMU), and electrocardiogram (ECG) being the three most common. Notably, some studies used less common biosignals, which indicates that privacy considerations matter even for emerging sensing technologies. Of the 72 studies, 65 reported biometric identification performance, most often as the correct identification rate (CIR). The studies showed that little data is required for reidentification: as little as 30 seconds of typing data achieved a CIR of 99.2% in a 34-person participant pool.
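To make the CIR figure concrete, the following minimal Python sketch computes it under a common definition: the fraction of probe samples whose predicted identity matches the true identity. That definition is an assumption here, since the reviewed studies may each define the metric slightly differently. The function names and the toy identifiers are hypothetical and do not come from the study. The sketch also computes the rate expected from random guessing, which gives a baseline for reading a result such as 99.2% in a 34-person pool.

```python
def correct_identification_rate(true_ids, predicted_ids):
    """Fraction of probe samples whose predicted identity matches the true one.

    Assumed definition of CIR; individual studies may define it differently.
    """
    if len(true_ids) != len(predicted_ids):
        raise ValueError("true_ids and predicted_ids must have the same length")
    if not true_ids:
        raise ValueError("at least one probe sample is required")
    correct = sum(t == p for t, p in zip(true_ids, predicted_ids))
    return correct / len(true_ids)


def chance_level(pool_size):
    """CIR expected from uniform random guessing among pool_size people."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return 1 / pool_size


# Hypothetical toy data: eight probe samples from four participants.
true_ids = ["p01", "p02", "p03", "p04", "p01", "p02", "p03", "p04"]
predicted_ids = ["p01", "p02", "p03", "p01", "p01", "p02", "p04", "p04"]

cir = correct_identification_rate(true_ids, predicted_ids)
print(f"CIR: {cir:.1%}")                                      # CIR: 75.0%
print(f"Chance level, 34 people: {chance_level(34):.1%}")     # Chance level, 34 people: 2.9%
```

Under this assumed definition, random guessing in a 34-person pool would be correct about 2.9% of the time, so a CIR of 99.2% is far above chance. The baseline shrinks as the pool grows, so a rate measured on a small group does not automatically carry over to a larger population.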
The researchers also analyzed where on the body the wearable devices were positioned and found that most were worn on the wrist, head, or chest. For each sensing modality, they examined the biometric identification performance of the studies with the most participants and found high CIRs, ranging from 87% to 100%.
Although many of the reviewed studies reported high correct identification rates, the small group sizes in some of them could limit generalizability. Nevertheless, the authors emphasized the importance of privacy-preserving methods that allow open science to flourish as more health data becomes available. They suggested supporting privacy-conscious data-sharing platforms rather than blocking the sharing of biometric data. They argued that careful consideration of how data should be shared is crucial, given that the risk of not sharing data could be greater than the risk of reidentification.