New definitions of privacy for mobile users & apps
In the past months, you might have noticed that I have repeatedly addressed the pressing issue of the inadequacy of law enforcement policies with regard to privacy laws, human rights laws and the disclosure of users’ sensitive data.
I am adding one more contribution to this hot topic, this time concerning the re-identification of so-called anonymized datasets that states and corporations use and share with third parties.
This communication aims to highlight some ethical considerations arising from the disclosure of users’ personal online communications and the mass surveillance of users’ mobility patterns under the veil of anonymization and statistical generalization (such as that offered by big tech policies) in EU technology policies.
By tracking location data and user queries, and by disclosing users’ preferences, daily routines and mobility patterns collected via internet services and apps to unidentified audiences of foreign agents and governments, these companies might contravene the right to be let alone, free from interference or intrusion.
Extensive surveillance, enabled by weak encryption or insufficient anonymization of users’ mobility patterns collected for law enforcement and predictive policing, which model the trajectories of individual routines and private cellphones without the users’ awareness, makes data easier to access by internal and external hackers for political interference, cybercrime and industrial espionage.
A better ethics-by-design framework is needed, together with a redefinition of privacy for political and technology policies.
By merging metadata such as behavioral, personalized and mobility data with public databases such as census and voter data, law enforcement agencies collect users’ traces as digital footprints and gain access to citizens’ most private matters.
Giant companies have already indicated that search warrants from governments are constantly increasing, rising from 1896 in 2012 to 6900 in 2018, an increase of 264%.
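The percentage quoted above can be checked with a short calculation. This is only an arithmetic sketch using the two warrant counts given in the text; the function name is hypothetical.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, expressed as a percentage."""
    return (new - old) / old * 100


# Warrant counts as reported in the text: 1896 (2012) and 6900 (2018).
warrant_increase = percent_increase(1896, 6900)
print(f"{warrant_increase:.1f}%")
```

Rounded to the nearest whole number, the result matches the 264% figure.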
Big tech companies claim that anonymization and statistical generalization protect data from theft and abuse.
In fact, anonymization is a broken promise (Ohm, 2010). Privacy and security research shows that large datasets can be de-anonymized and that individuals and public figures can be re-identified in supposedly anonymized datasets (Sweeney, 2000, 2002; Narayanan & Shmatikov, 2008; Golle & Partridge, 2009; Zang & Bolot, 2011; de Montjoye et al., 2013).
In the absence of explicit consent, apps might put citizens at higher risk of exposure to data breaches because of excessive faith in anonymization. Cross-database correlations and predictive models used to infer the information people are not willing to share can cause deep harm by exposing data to unintended audiences at a global scale.
Consequently, privacy rights might be endangered because companies do not sufficiently verify the extent to which personal data might be tracked and diverted, whether on the basis of existing information-sharing arrangements or because of a lack of agreed policy procedures for data collection and the sharing of IP addresses.
In this communication, we address the mistake of relying heavily on data alone and on anonymization as a mode of data regulation.
As definitions of privacy evolve over time, we propose to build new definitions of privacy as well as an ethics-by-design framework that better integrates cultural contexts and systematic ethical guidelines into AI processes.
The purpose of this short paper is to challenge the existing, inadequate definitions of privacy used by law enforcement and EU data protection regulators, and to question further the limits of privacy frameworks that authorize third-party access to users’ cellphones and private networks.
The ethics of disclosure respond to the need to question how merging publicly accessible datasets with very large datasets and predictive models may allow the blanks of undisclosed data to be filled in without enough evaluation and verification, which might negatively affect privacy, security and civic rights.
With regard to current EU and global policies for data protection and law enforcement, we should not pass laws that cause greater harm and prejudice than an absence of procedure would, by failing to keep data policies independent from external influence as specified in Art. 16(2) TFEU and Art. 8(3) of the EU Charter of Fundamental Rights.
Diverging conceptions of privacy
Because metadata such as transactional data and social interactions, unlike the contents of emails and private communications, are not sufficiently protected by law, they can be accessed at any time by agencies under secretive obligations to disclose personal data on suspicion of terrorism or criminal activities, suspicions that may be imposed by abusive tagging.
With the micro-targeting of metadata, location data, individual preferences and online transactions, security advocates argue that by tracking users’ timetables and preferred locations they can narrow a search down to specific persons and their networks.
However, police departments have obtained from Google data collected on smartphones or any other devices that had been within the area of a crime scene at specific times, including devices belonging to random persons. Google has also provided information and data belonging to individuals who were not even connected to the crime scene, leading to one person being wrongly jailed due to an error in phone-call source location.
The right to be left alone is a vital right, protected in the United States by the Fourth Amendment: “The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.”
The need for privacy, understood as the right to withdraw from public affairs, should not be set aside. Doing so leaves behind overexposed communities who are prejudiced by the criminalisation of their lifestyles and cultural backgrounds and who lack the means and resources to address the issue in court, and it risks criminalizing digital behaviours on the basis of subjective moral judgements.
However, the FATF and the EU data protection authorities consider that anonymity and pseudonymity might put society at higher risk of cyberattacks and cybercrime, and thus authorize government stakeholders and national agencies to share confidential information and personal details with industrial or business partners.
The FATF recommendations, for instance, recognize the authority to analyze a wide range of data points together with real-time location data in order to link personal accounts to related accounts and to provide reports about suspected criminals that are useful for analysis or for best practices (FATF
Cross-collaborations between the EU authorities and foreign governments might then fail to comply with the requirement for DPAs to be independent of any political, governmental or other influence.
Extensive data surveillance through de-anonymization
Dr. Sweeney, a professor of computer science, found that a person can be uniquely re-identified in a very large dataset with as few as three identifiers (gender, ZIP code and birthdate) because that combination is unique for most people (Ohm, 2010; Sweeney, 2000). Using 1990 census data, she estimated that 87% of the US population is unique on these three attributes; a later re-analysis based on 2000 census data put the figure at 63%. This finding motivated the k-anonymity model (Sweeney, 2002) that is commonly used for privacy protection, including by Google and other tech companies.
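To make the uniqueness argument concrete, the following minimal sketch measures how many records in a table are unique on a chosen set of quasi-identifiers, and the k that the table reaches in the k-anonymity sense (the size of its smallest group). The records and function names are hypothetical and invented for illustration; they are not taken from Sweeney's data or from any company's implementation.

```python
from collections import Counter

# Hypothetical toy records: (gender, zip_code, birthdate). Not real data.
RECORDS = [
    ("F", "02138", "1960-07-31"),
    ("M", "02138", "1945-07-31"),
    ("F", "02139", "1960-07-31"),
    ("M", "02139", "1971-01-15"),
    ("M", "02139", "1971-01-15"),
    ("F", "02140", "1988-03-02"),
]


def group_sizes(records, columns):
    """Count how many records share each combination of the chosen columns."""
    return Counter(tuple(r[c] for c in columns) for r in records)


def unique_fraction(records, columns):
    """Fraction of records whose combination of columns appears exactly once."""
    sizes = group_sizes(records, columns)
    unique = sum(1 for r in records if sizes[tuple(r[c] for c in columns)] == 1)
    return unique / len(records)


def k_of(records, columns):
    """Size of the smallest group: the table is k-anonymous for this k."""
    return min(group_sizes(records, columns).values())


# Gender alone hides everyone in a group of three ...
print(unique_fraction(RECORDS, (0,)), k_of(RECORDS, (0,)))
# ... but adding ZIP code and birthdate singles out most records.
print(unique_fraction(RECORDS, (0, 1, 2)), k_of(RECORDS, (0, 1, 2)))
```

In this toy table, four of the six records become unique once all three attributes are combined, which is the same effect, at a much smaller scale, as the census figures reported above.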
Sweeney also succeeded in re-identifying the governor of Massachusetts from past health insurance records and publicly available voter data, by recombining gender, ZIP code and birthdate (Sweeney, 2000; 2002; de Montjoye et al., 2013).
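A linkage of this kind can be sketched as a simple join between a dataset stripped of names and a public list that still carries them. The example below is a hedged illustration only: the records, the field names (`zip_code`, `diagnosis`, and so on) and the `link` function are all hypothetical, and the choice of ZIP code as the location attribute is an assumption made for the sketch.

```python
from collections import defaultdict

# Assumed quasi-identifiers shared by both datasets (hypothetical field names).
QUASI_IDENTIFIERS = ("gender", "zip_code", "birthdate")

# Hypothetical "anonymized" health records: names removed, demographics kept.
health_records = [
    {"gender": "M", "zip_code": "02138", "birthdate": "1945-07-31", "diagnosis": "condition A"},
    {"gender": "F", "zip_code": "02139", "birthdate": "1960-07-31", "diagnosis": "condition B"},
    {"gender": "M", "zip_code": "02139", "birthdate": "1971-01-15", "diagnosis": "condition C"},
]

# Hypothetical public voter list: names plus the same demographics.
voter_list = [
    {"name": "Voter 1", "gender": "M", "zip_code": "02138", "birthdate": "1945-07-31"},
    {"name": "Voter 2", "gender": "F", "zip_code": "02139", "birthdate": "1960-07-31"},
    {"name": "Voter 3", "gender": "M", "zip_code": "02139", "birthdate": "1971-01-15"},
    {"name": "Voter 4", "gender": "M", "zip_code": "02139", "birthdate": "1971-01-15"},
]


def link(anonymized, public, keys=QUASI_IDENTIFIERS):
    """Return (name, diagnosis) pairs where exactly one public record matches."""
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person)
    matches = []
    for row in anonymized:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # an unambiguous match is a re-identification
            matches.append((candidates[0]["name"], row["diagnosis"]))
    return matches


reidentified = link(health_records, voter_list)
print(reidentified)
```

Only the third health record stays ambiguous here, because two voters share its attributes; every record that is unique on the quasi-identifiers is re-identified without any name ever appearing in the health data.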
With the spread of intrusive surveillance technologies that track users’ location and mobility patterns (Zang & Bolot, 2011), the location and duration of every call made from a mobile phone, as well as the civil identities of both parties involved, can be accessed, dramatically reducing users’ control and ownership over their data.
Qualitative inquiry into cultural backgrounds is needed in addition to computational analysis, as we observed that concerns about where and in what context people share and disclose data, in private or public settings, differ according to backgrounds, cultures and situations (Debaveye, 2012a, 2012b).
Public data and private data shared under password in private forums, IRC chats or through private social networks and messaging apps influence people’s patterns of self-determination and exposure (Debaveye, 2012a, 2012b, 2015).
As technology is so deeply intertwined with, interdependent on and indistinguishable from our social forms (Debaveye, 2012b), technology and media bias might then produce distortion and a lack of awareness (Debaveye, 2012b).
The datafication of sensitive data, which replaces people’s identities and personal stories with numerical identifiers, creates a derealization that separates the real from its metaphorical remediation through mathematical models, facilitating data manipulation and bias (Debaveye, 2012b).
As Ohm argues: “These advances should trigger a sea change in the law because nearly every information privacy law or regulation grants a get-out-of-jail-free card to those who anonymize their data.” (Ohm, 2010:4)
Treating data regulation as a trade market causes processes of false moralization between preachers and believers (Ohm, 2010:16). Utility should be reduced in order to grant society more privacy and to find a better balance between privacy and utility (Ohm, 2010).
We add that without contextual inquiry into personal settings, an ethical commitment to reaching standards in data collection, and in-person investigations, the promises that AI has to offer might not be fulfilled in the future. Our ethics-by-design framework might help to reduce this uncertainty.
References
Blumberg, A. & Eckersley, P. On locational privacy and how to avoid losing it forever. E.F.F. (2009).
de Montjoye, Y.-A., Hidalgo, C. A., Verleysen, M. & Blondel, V. D. Unique in the Crowd: The privacy bounds of human mobility. Scientific Reports 3, 1376 (2013).
FATF Report, From Money mules to chain hopping, 2018.
Golle, P. & Partridge, K. On the anonymity of home/work location pairs. Pervasive Computing 390–
Narayanan, A. & Shmatikov, V. Robust de-anonymization of large sparse datasets. IEEE Trans. Secur. Priv. 8, 111–125 (2008).
Ohm, P. Broken promises of privacy: Responding to the surprising failure of anonymization. UCLA L. Rev. 57, 1701 (2010).
Sweeney, L. Uniqueness of simple demographics in the
Sweeney, L. k-anonymity: a model for protecting privacy. Int. J. Uncertainty Fuzziness and Knowledge-Based Systems 10, 557–570 (2002).
Zang, H. & Bolot, J. Anonymization of location data does not work: A large-scale measurement study. Proc. Int. Conf. on Mobile computing and networking 17, 145–156 (2011).
Communication (unpublished), submitted for the Data Science Conference, Bern 2019 (15th of February 2019).