New definitions of privacy for mobile users & apps
Over the past months, you might have noticed that I have repeatedly addressed the pressing issue of inadequate law enforcement policies on privacy, human rights and the disclosure of sensitive data.
Here I add one more contribution on this hot topic: the de-anonymization of so-called anonymized datasets that states and corporations use and share with third parties.
This communication aims to underline the ethical considerations that arise, for EU technology policies, from the disclosure of users’ personal communications and the datasurveillance of users’ mobility patterns under the veil of anonymization and statistical generalization.
By tracking location data and user queries, by disclosing users’ preferences and daily routines through mobility patterns collected via internet services and apps, and by sharing data indiscriminately with unidentified audiences, governments might contravene the right to be let alone, free from interference or intrusion.
The extensive surveillance enabled by poorly encrypted mobility patterns, collected for law enforcement and for predictive policing that models the trajectories of individual homes and private cellphones, makes data easier to access for internal and external hackers seeking political interference, cybercrime or industrial espionage.
A better ethics-by-design framework is needed, as well as a redefinition of privacy for technology policies.
By merging metadata such as behavioral, personalized and mobility data with public databases such as census and voter records, law enforcement agencies collect users’ traces as digital footprints and gain access to citizens’ most private matters.
Large technology companies have already reported that government search warrants are constantly increasing, from 1,896 in 2012 to 6,900 in 2018, which represents an increase of 264%.
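The 264% figure can be checked with a short calculation. The sketch below is only an arithmetic check on the two counts quoted above; the function name `percent_increase` is a hypothetical helper introduced for illustration.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative growth from `old` to `new`, expressed in percent."""
    return (new - old) / old * 100


# Search warrant counts as quoted in the text.
warrants_2012 = 1896
warrants_2018 = 6900

increase = percent_increase(warrants_2012, warrants_2018)
print(f"{increase:.1f}%")  # 263.9%, which rounds to the 264% quoted above
```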
Big tech companies claim that anonymization and statistical generalization protect data from theft and abuse.
In fact, anonymization is a broken promise (Ohm, 2010). Privacy and security research shows that large datasets can be de-anonymized and that individuals and public figures can be re-identified in anonymized datasets (Sweeney, 2000, 2002; Narayanan & Shmatikov, 2008; Golle & Partridge, 2009; Zang & Bolot, 2011; de Montjoye et al., 2013).
In the absence of explicit consent, apps might put citizens at higher risk of exposure to breaches because of too much faith in anonymization. Cross-database correlations and self-predictive models used to infer the undisclosed data that people are unwilling to share can cause the deepest harm when data is shared with unintended audiences at a global scale.
Consequently, privacy rights might be endangered when there is insufficient verification of the extent to which personal data might be tracked and diverted, whether on the basis of existing information-sharing arrangements or because of a lack of agreement on policies and procedures for the collection and sharing of IP addresses.
In this communication, we address the mistake of relying heavily on data alone and on anonymization as modes of data regulation.
As definitions of privacy evolve over time, we propose to build new definitions of privacy as well as an ethics-by-design framework that better integrates cultural contexts and systematic ethical guidelines into AI processes.
The purpose of this short paper is to challenge the existing and inadequate definitions of privacy used for law enforcement and EU data protection, and to question further the limits of privacy frameworks that authorize third parties to access users’ cellphones and private networks.
The ethics of disclosure responds to the need to question how merging publicly accessible datasets with megadatasets through self-predictive models may allow the blanks of undisclosed data to be filled in without enough evaluation and verification, which might negatively affect privacy, security and civic rights.
With regard to current EU global policies for data protection and law enforcement, we should not pass laws that cause greater harm and prejudice than an absence of procedure by failing to keep data policies independent from external influence, as specified in Art. 16(2) and 8(3) of the EU Charter.
Diverging conceptions of privacy
Because metadata such as transactional records and social interactions, unlike the contents of email and private communications, are not sufficiently protected by law, agencies can access them at any time under secret obligations to disclose personal data on suspicion of terrorism or criminal activity.
With the micro-targeting of megadata, location data, individual preferences and online transactions, security advocates argue that by tracking users’ schedules and preferred locations they can narrow down the search to specific persons and their networks.
But police departments have obtained from Google data collected from smartphones or any other devices that had been within the area of a crime scene at specific times, including those of random persons. Google has also provided information and data belonging to individuals who were not even connected to the crime itself, leading to one person being wrongly imprisoned because of an error in locating the source of a phone call.
As the Fourth Amendment to the US Constitution specifies, the right to be left alone is a vital right: “The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.”
The need for privacy as the right to withdraw from public affairs should not be set aside. Doing so would leave behind overexposed communities that are harmed by the criminalization of their lifestyles or cultural backgrounds, or that lack the means and resources to address the issue, and it risks criminalizing digital behaviours on the basis of subjective moral judgements.
However, the FAFT and the EU data protection authorities consider that anonymity and pseudonymity might put society at higher risk of cyberattacks and cybercrime, and they authorize government stakeholders and national agencies to share confidential information and personal details with industrial or business partners.
The FAFT recommendations, for instance, recognize the authority to analyze a wide range of data points together with real-time location data in order to link personal accounts to related accounts and to produce reports on suspected criminals that are useful for analysis or for best practices (FAFT, 2018).
Cross-collaborations between EU authorities and foreign governments might then fail to comply with the requirement that data protection authorities (DPAs) be independent of any political, governmental or other influence.
Extensive datasurveillance through de-anonymization
Dr. Sweeney, a professor of computer science, found that the k-anonymity model commonly used for privacy protection (the one used by Google and other tech companies) can still allow a person to be uniquely re-identified in a very large dataset from as few as three or four identifiers, such as gender, ZIP code and birthdate (Ohm, 2010; Sweeney, 2002), because that combination is unique to most individuals. Using 1990 census data, she estimated that 87% of the US population could be uniquely identified in this way (Sweeney, 2000); an analysis of 2000 census data put the figure at 63%.
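The uniqueness of such attribute combinations can be illustrated on a toy table. The following is a minimal sketch, not Sweeney's method or data: the five records are invented, and the helper names `unique_fraction`, `smallest_group` and `generalize` are hypothetical. It measures the share of records whose combination of gender, ZIP code and birthdate appears only once, then shows how coarsening the attributes enlarges the smallest group of indistinguishable records (the k in k-anonymity).

```python
from collections import Counter

# Toy records, invented for illustration only: (gender, ZIP code, birthdate).
records = [
    ("F", "02138", "1965-03-12"),
    ("M", "02138", "1965-03-12"),
    ("F", "02139", "1971-07-30"),
    ("F", "02139", "1971-07-30"),
    ("M", "02140", "1979-11-02"),
]


def unique_fraction(rows):
    """Share of rows whose combination of attributes appears exactly once."""
    sizes = Counter(rows)
    return sum(1 for row in rows if sizes[row] == 1) / len(rows)


def smallest_group(rows):
    """The k of k-anonymity: size of the smallest group of identical rows."""
    return min(Counter(rows).values())


def generalize(row):
    """Coarsen a row: drop gender, keep 3 ZIP digits and the birth decade."""
    _gender, zip_code, birthdate = row
    return (zip_code[:3], birthdate[:3] + "0s")


coarse = [generalize(row) for row in records]

print(unique_fraction(records), smallest_group(records))  # 0.6 1
print(unique_fraction(coarse), smallest_group(coarse))    # 0.0 2
```

In this toy case, three of the five raw records are unique on the three attributes, while the generalized table hides every record in a group of at least two, at the cost of less precise data.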
Sweeney also succeeded in re-identifying the governor of Massachusetts by linking past health insurance records with publicly available voter data through the combination of gender, home address and birthdate (Sweeney, 2000; 2002; de Montjoye et al., 2013).
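The logic of such a linkage can be sketched in a few lines. The example below is a hedged illustration under stated assumptions, not a reconstruction of the original study: all records and names are fictitious, the ZIP code stands in for the home address, and `QUASI_IDENTIFIERS`, `key` and `link` are hypothetical names. A record is considered re-identified only when exactly one voter shares its attribute combination.

```python
from collections import defaultdict

# Attributes present in both datasets (ZIP code is an assumed stand-in
# for the home address mentioned in the text).
QUASI_IDENTIFIERS = ("gender", "zip", "birthdate")

# Fictitious "anonymized" medical records: names removed, attributes kept.
medical = [
    {"gender": "M", "zip": "02138", "birthdate": "1950-01-15", "diagnosis": "condition A"},
    {"gender": "F", "zip": "02139", "birthdate": "1971-07-30", "diagnosis": "condition B"},
    {"gender": "F", "zip": "02139", "birthdate": "1982-04-09", "diagnosis": "condition C"},
]

# Fictitious public voter roll: names together with the same attributes.
voters = [
    {"name": "Voter One", "gender": "M", "zip": "02138", "birthdate": "1950-01-15"},
    {"name": "Voter Two", "gender": "F", "zip": "02139", "birthdate": "1971-07-30"},
    {"name": "Voter Three", "gender": "F", "zip": "02139", "birthdate": "1971-07-30"},
    {"name": "Voter Four", "gender": "M", "zip": "02140", "birthdate": "1990-12-01"},
]


def key(record):
    """Tuple of quasi-identifier values used to join the two datasets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(medical_rows, voter_rows):
    """Return (name, diagnosis) pairs for records matching exactly one voter."""
    names_by_key = defaultdict(list)
    for voter in voter_rows:
        names_by_key[key(voter)].append(voter["name"])
    matches = []
    for row in medical_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:  # an unambiguous match re-identifies the record
            matches.append((names[0], row["diagnosis"]))
    return matches


print(link(medical, voters))  # [('Voter One', 'condition A')]
```

The second medical record matches two voters and therefore stays ambiguous, and the third matches none, which shows why uniqueness of the attribute combination is what makes the linkage succeed.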
With the spread of intrusive surveillance technologies that track users’ location and mobility patterns (Zang & Bolot, 2011), the location and duration of every call made from a cellphone, as well as the civil identities of both parties involved, can be accessed, dramatically reducing users’ control and ownership over their data.
Qualitative inquiry into cultural backgrounds is needed in addition to social network studies (SNS) and machine learning analysis, as we observed that concerns about where and in what context people share and disclose data, in private or public settings, differ according to backgrounds, cultures and situations (Debaveye, 2012a, 2012b).
Public data, and private data shared under password in private forums, on IRC chat or through private social networks and messaging, influence people’s patterns of self-determination and exposure (Debaveye, 2012a, 2012b, 2015).
As technology is so deeply intertwined with, interdependent on and indistinguishable from our social forms (Debaveye, 2012b), technology and media bias might then produce distortion and a lack of awareness (Debaveye, 2012b).
The datafication of sensitive data, which replaces people’s identities and personal stories with numerical identifiers, creates a derealization that separates the real from its metaphorical remediation via mathematical models, facilitating data manipulation and bias (Debaveye, 2012b).
As Ohm advocates: “These advances should trigger a sea change in the law because nearly every information privacy law or regulation grants a get-out-of-jail-free card to those who anonymize their data.” (Ohm, 2010:4)
Data regulation treated as a trade market causes processes of fake moralization between preachers and believers (Ohm, 2010:16). Utility should be reduced in order to grant society more privacy and to find a better balance between privacy and utility (Ohm, 2010).
We add that without contextual inquiry into personal settings, an ethical commitment to reaching standards in data collection, and in-person investigations, the promises that AI has to offer might not be fulfilled in the future; our ethics-by-design framework might help to reduce this uncertainty.
References
Blumberg, A. & Eckersley, P. On locational privacy and how to avoid losing it forever. E.F.F. (2009).
de Montjoye, Y.-A., Hidalgo, C. A., Verleysen, M. & Blondel, V. D. Unique in the crowd: The privacy bounds of human mobility. Scientific Reports 3, 1376 (2013).
FAFT Report, From Money mules to chain hopping, 2018
Golle, P. & Partridge, K. On the anonymity of home/work location pairs. Pervasive Computing 390–
Narayanan, A. & Shmatikov, V. Robust de-anonymization of large sparse datasets. IEEE Trans. Secur. Priv. 8, 111–125 (2008).
Ohm, P. Broken promises of privacy: Responding to the surprising failure of anonymization. UCLA L. Rev. 57, 1701 (2010).
Sweeney, L. Uniqueness of simple demographics in the
Sweeney, L. k-anonymity: a model for protecting privacy. Int. J. Uncertainty Fuzziness and Knowledge-Based Systems 10, 557–570 (2002).
Zang, H. & Bolot, J. Anonymization of location data does not work: A large-scale measurement study. Proc. Int. Conf. on Mobile computing and networking 17, 145–156 (2011).
Communication (unpublished), submitted for the Data Science Conference, Bern 2019 (15th of February 2019).