This page gathers articles, studies and reports on the risks and abuses that result from weak data protection.
Feel free to contribute by adding links to this pad or by directly editing this page.
Other documents are available in French
- 1 Studies on re-identification
- 1.1 Nature - Unique in the Crowd: The privacy bounds of human mobility (25 March 2013)
- 1.2 University of Cambridge - Digital records could expose intimate details and personality traits of millions (11 March 2013)
- 1.3 Wired - Liking curly fries on Facebook reveals your high IQ (12 March 2013)
- 1.4 ArsTechnica - 'Anonymized' data really isn’t—and here’s why not (08 September 2009)
- 2 Public positions
- 3 The data industry
- 3.1 CNN Money - What your zip code reveals about you (18 April 2013)
- 3.2 RadioFreeEurope/RadioLiberty - Interview: 'It's Pretty Much Impossible' To Protect Online Privacy (8 April 2013)
- 3.3 SydneyMorningHerald - Facebook 'erodes any idea of privacy' (8 April 2013)
- 3.4 MemeBurn - How much are you worth to Facebook? (4 April 2013)
- 3.5 Computerworld - Judge awards class action status in privacy lawsuit vs. comScore (4 April 2013)
- 3.6 GigaOm - Why the collision of big data and privacy will require a new realpolitik (25 March 2013)
- 3.7 WorldCrunch - European Academics Launch Petition To Protect Personal Data From "Huge Lobby" (13 March 2013)
- 4 Data breach
- 4.1 Information is beautiful - World's Biggest Data Breaches (continually updated)
- 4.2 Infosecurity - Ubuntu Forum Hacked; 1.8 Million Accounts Compromised (22 July 2013)
- 4.3 Wired - Cloud Computing Snafu Shares Private Data Between Users (4 February 2013)
- 4.4 Bits - Yahoo Breach Extends Beyond Yahoo to Gmail, Hotmail, AOL Users (12 July 2012)
- 4.5 The Huffington Post - Yahoo Confirms 450,000 Accounts Breached, Experts Warn Of Collateral Damage (12 July 2012)
- 4.6 Network World - eHarmony data breach lessons: Cracking hashed passwords can be too easy (6 July 2012)
- 4.7 The New York Times - Lax Security at LinkedIn Is Laid Bare (10 June 2012)
- 4.8 The Register - 35m Google Profiles dumped into private database (25 May 2011)
- 4.9 Wikipedia Article - Data breach: Major incidents
- 4.10 Wikipedia Fr - Piratage du PlayStation Network
- 5 Surveillance state and its willing helpers
- 5.1 The Wall Street Journal - French Privacy Agency Moves to Sanction Google (27 September 2013)
- 5.2 Computerworld - Google knows nearly every Wi-Fi password in the world (12 September 2013)
- 5.3 New York Times - Germany Fines Google Over Data Collection (22 April 2013)
- 5.4 Harvard Law Review - The Dangers of Surveillance (2012)
- 5.5 ZDNet - China's new data protection rules good step, but little bite (27 September 2013)
- 6 Effects of loss of privacy on employment and credit worthiness
- 7 General studies
- 7.1 The Boston Consulting Group - The Value of Our Digital Identity (20 November 2012)
- 7.2 The Cost of Reading Privacy Policies. I/S: A Journal of Law and Policy for the Information Society 2008 Privacy Year in Review issue. (with A. McDonald) http://lorrie.cranor.org/#publications Download: http://lorrie.cranor.org/pubs/readingPolicyCost-authorDraft.pdf
- 8 Other
Studies on re-identification
Nature - Unique in the Crowd: The privacy bounds of human mobility (25 March 2013)
A simply anonymized dataset does not contain name, home address, phone number or other obvious identifier. Yet, if individual's patterns are unique enough, outside information can be used to link the data back to an individual. [...] We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. In fact, in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals. [...]
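As a rough illustration of the uniqueness measure described above, the following Python sketch counts how many users in a toy dataset are singled out by a few of their own (antenna, hour) points. The traces, the `unicity` helper and the choice of taking each user's first p points in sorted order are assumptions made for illustration; the study itself worked on real carrier data and is not reproduced here.

```python
from typing import Dict, FrozenSet, Tuple

Point = Tuple[str, int]  # (antenna id, hour of observation), both hypothetical


def unicity(traces: Dict[str, FrozenSet[Point]], p: int) -> float:
    """Fraction of users whose first p points (sorted) match no other trace."""
    unique = 0
    for user, trace in traces.items():
        # Points that an outside observer is assumed to know about this user.
        known = set(sorted(trace)[:p])
        # Every trace in the dataset that contains all of the known points.
        matches = [u for u, t in traces.items() if known <= t]
        if matches == [user]:
            unique += 1
    return unique / len(traces)


# Toy traces, invented for illustration only.
toy_traces = {
    "u1": frozenset({("A", 8), ("B", 9), ("C", 18)}),
    "u2": frozenset({("A", 8), ("B", 9), ("D", 18)}),
    "u3": frozenset({("A", 8), ("E", 12), ("F", 20)}),
}

unicity_1 = unicity(toy_traces, 1)  # one known point
unicity_3 = unicity(toy_traces, 3)  # three known points
```

In this toy example no user is unique given one known point, while every user is unique given three, which mirrors the idea that a small number of points can be enough to single someone out.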
University of Cambridge - Digital records could expose intimate details and personality traits of millions (11 March 2013)
Research shows that intimate personal attributes can be predicted with high levels of accuracy from ‘traces’ left by seemingly innocuous digital behaviour, in this case Facebook Likes. In the study, researchers describe Facebook Likes as a “generic class” of digital record - similar to web search queries and browsing histories - and suggest that such techniques could be used to extract sensitive information for almost anyone regularly online.
Researchers created statistical models able to predict personal details using Facebook Likes alone. Models proved 88% accurate for determining male sexuality, 95% accurate distinguishing African-American from Caucasian American and 85% accurate differentiating Republican from Democrat. Christians and Muslims were correctly classified in 82% of cases, and good prediction accuracy was achieved for relationship status and substance abuse – between 65 and 73%. [...]
The researchers also tested for personality traits including intelligence, emotional stability, openness and extraversion. While such latent traits are far more difficult to gauge, the accuracy of the analysis was striking. Study of the openness trait – the spectrum of those who dislike change to those who welcome it – revealed that observation of Likes alone is roughly as informative as using an individual’s actual personality test score.
Wired - Liking curly fries on Facebook reveals your high IQ (12 March 2013)
What you Like on Facebook could reveal your race, age, IQ, sexuality and other personal data, even if you've set that information to "private". [...] The research shows that although you might choose not to share particular information about yourself it could still be inferred from traces left on social media, such as the TV shows you watch or the music you listen to or the spiders that are afraid of you. [...]
ArsTechnica - 'Anonymized' data really isn’t—and here’s why not (08 September 2009)
This article makes the point that "scrubbing" data of all information that could personally identify an individual works better in theory than in practice. The article states that "87 percent of all Americans could be uniquely identified using only three bits of information: ZIP code, birthdate, and sex." [...] The article refers to a paper by Ohm on "the surprising failure of anonymization."
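To make the ZIP code, birthdate and sex claim concrete, here is a minimal Python sketch that measures what share of records in a dataset have a combination of those three fields shared with nobody else. The records and the `unique_fraction` helper are invented for illustration and say nothing about real population figures.

```python
from collections import Counter

# Invented records: (ZIP code, birthdate, sex). No real people.
records = [
    ("02139", "1970-01-01", "F"),
    ("02139", "1970-01-01", "F"),
    ("02139", "1982-05-17", "M"),
    ("94110", "1982-05-17", "M"),
    ("94110", "1990-11-30", "F"),
]


def unique_fraction(rows):
    """Share of rows whose combination of fields occurs exactly once."""
    counts = Counter(rows)
    return sum(1 for row in rows if counts[row] == 1) / len(rows)


# All three fields together versus the ZIP code alone.
share_full = unique_fraction(records)
share_zip_only = unique_fraction([(r[0],) for r in records])
```

Here the ZIP code alone singles out nobody, but the three fields combined single out three of the five records, even though no name appears anywhere in the data.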
Public positions
More than 100 Leading European Academics are taking a position (February 2013)
A position paper that puts forward a series of arguments in favour of stronger protection for personal data. The document is available in several European languages. The arguments cover the effects on innovation and competitiveness, and the paper raises the question of how best to legislate on this topic, with reference to the legislation currently before the European institutions.
The data industry
CNN Money - What your zip code reveals about you (18 April 2013)
This article echoes points that a few other articles have already made about the existence of data brokers (it mentions Acxiom in particular) and how their industry works. It raises the question of what counts as 'personal' information and mentions the decision of a Massachusetts court that ZIP codes (postal codes) can or should be considered personal information. In Massachusetts and California, retailers can no longer request your ZIP code for promotional purposes. On the whole, the article gives a good overview of the issues around personal data and uses simple language to get the ideas across.
RadioFreeEurope/RadioLiberty - Interview: 'It's Pretty Much Impossible' To Protect Online Privacy (8 April 2013)
From online companies tracking users' digital footprints to the trend for more and more data to be stored on cloud servers, Internet privacy seems like a thing of the past -- if it ever existed at all. RFE/RL correspondent Deana Kjuka recently spoke about these issues with online security analyst Bruce Schneier, author of the book "Liars and Outliers: Enabling the Trust Society Needs to Survive." [...] "If [you use] Gmail, [then] Google has all of your e-mail. If your files are in Dropbox, if you are using Google Docs, [or] if your calendar is iCal, then Apple has your calendar. So it just makes it harder for us to protect our privacy because our data isn't in our hands anymore." "I don't know about the future, but my guess is that, yes. The big risks are not going to be the illegal risks. They are going to be the legal risks. It's going to be governments. It's going to be corporations. It's going to be those in power using the Internet to stay in power."
SydneyMorningHerald - Facebook 'erodes any idea of privacy' (8 April 2013)
Facebook Home for Android phones has been dubbed by technologists as the death of privacy and the start of a new wave of invasive tracking and advertising. [...] Prominent tech blogger Om Malik wrote that Home “erodes any idea of privacy”. “If you install this, then it is very likely that Facebook is going to be able to track your every move, and every little action,” said Malik. “This opens the possibility up for further gross erosions of privacy on unsuspecting users, all in the name of profits, under the guise of social connectivity,” he said. [...]
MemeBurn - How much are you worth to Facebook? (4 April 2013)
The argument that hundreds of millions of people give away their personal data on social networks with absolutely no interest in the commercial value of that information does not make sense. It is simply the case that they don’t have the slightest idea. [...] According to Spiekerman: “Even if privacy is an inalienable human right it would be good if people were enabled to manage their personal data as private property.” It’s not only about “monetizing”. The earth is, happily, not that flat. But materializing privacy might help us to overcome the huge issues we have when it comes to the privacy of internet users, and finally social networks and marketing will profit from more knowledge and more trust in the use of personal data.
Computerworld - Judge awards class action status in privacy lawsuit vs. comScore (4 April 2013)
A federal court in Chicago this week granted class action status to a lawsuit accusing comScore, one of the Internet's largest user tracking firms, of secretly collecting and selling Social Security numbers, credit card numbers, passwords and other personal data collected from consumer systems. [...] To collect data, comScore's software modifies computer firewall settings, redirects Internet traffic, and can be upgraded and controlled remotely, the complaint alleged. The suit challenged comScore's assertions that it filtered out personal information from data sold to third parties, and of intercepting data it had no business to access. [...]
GigaOm - Why the collision of big data and privacy will require a new realpolitik (25 March 2013)
People’s movements are highly predictable, researchers say, making it easy to identify most individuals from supposedly anonymized location datasets. As these datasets have valid uses, this is yet another reason why we need better regulation. [...] One of the explicit purposes of Unique in the Crowd was to raise awareness. As the authors put it: “these findings represent fundamental constraints to an individual’s privacy and have important implications for the design of frameworks and institutions dedicated to protect the privacy of individuals.” [...]
WorldCrunch - European Academics Launch Petition To Protect Personal Data From "Huge Lobby" (13 March 2013)
This week, more than 90 leading academics across Europe launched a petition to support the European Commission’s draft data protection regulation, reports the EU Observer. The online petition, entitled Data Protection in Europe, says “huge lobby groups are trying to massively influence the regulatory bodies.” The goal of the site is to make sure the European Commission’s law is in line with the latest technologies and that the protection of personal data is guaranteed. [...]
Data breach
Information is beautiful - World's Biggest Data Breaches (continually updated)
Go to the website for an updated version of this image.
Infosecurity - Ubuntu Forum Hacked; 1.8 Million Accounts Compromised (22 July 2013)
The article gives details of the July 2013 hack of the Ubuntu Linux forum that defaced the forum's website and stole users' account information.
Wired - Cloud Computing Snafu Shares Private Data Between Users (4 February 2013)
A low-cost competitor to giants such as RackSpace and Amazon, DigitalOcean sells cheap computing power to web developers who want to get their sites up and running for as little as $5 per month. But it turns out that some of those customers — those who were buying the $40 per month or $80 per month plans, for example — aren’t necessarily getting their data wiped when they cancel their service. And some of that data is viewable to other customers. Kenneth White stumbled across several gigabytes of someone else’s data when he was noodling around on DigitalOcean’s service last week. White, who is chief of biomedical informatics with Social and Scientific Systems, found e-mail addresses, web links, website code and even strings that look like usernames and passwords — things like 1234qwe and 1234567passwd. [...]
Bits - Yahoo Breach Extends Beyond Yahoo to Gmail, Hotmail, AOL Users (12 July 2012)
Yahoo confirmed Thursday that about 400,000 user names and passwords to Yahoo and other companies were stolen on Wednesday.
A group of hackers, known as the D33D Company, posted online the user names and passwords for what appeared to be 453,492 accounts belonging to Yahoo, and also Gmail, AOL, Hotmail, Comcast, MSN, SBC Global, Verizon, BellSouth and Live.com users. [...]
The hackers wrote a brief footnote to the data dump, which has since been taken offline: “We hope that the parties responsible for managing the security of this subdomain will take this as a wake-up call, and not as a threat.”
The Huffington Post - Yahoo Confirms 450,000 Accounts Breached, Experts Warn Of Collateral Damage (12 July 2012)
Security researchers warned Thursday that thousands of people could be vulnerable to hackers after Yahoo confirmed that about 450,000 usernames and passwords were stolen from one of the company's databases [...]
Yahoo Voices contributors signed up using a variety of accounts: about 140,000 Yahoo addresses, more than 100,000 Gmail addresses, more than 55,000 Hotmail addresses and more than 25,000 AOL addresses. [...]
A hacker group called D33D claimed responsibility for the disclosure of usernames and passwords belonging to Yahoo Voices' users.
"We hope that the parties responsible for managing the security of this subdomain will take this as a wake-up call, and not as a threat," the group said in a statement. [...]
Alex Horan, a senior product manager at CORE Security, criticized Yahoo for apparently storing usernames and passwords without encrypting them.
"The bigger problem is these passwords were sitting there in the clear," Horan said. He added that encrypting passwords was "Security 101."
"That’s mind-blowing that a company wouldn't do that," he said. [...]
Network World - eHarmony data breach lessons: Cracking hashed passwords can be too easy (6 July 2012)
Last month the dating site eHarmony suffered a data breach in which more than 1.5 million eHarmony password hashes were stolen and later dumped online by the hacker gang called Doomsday Preppers. The crypto-based "hashing" process is supposed to conceal stored passwords, but Trustwave's SpiderLabs division says eHarmony could have done this process a lot better because it only took 72 hours to crack about 80% of 1.5 million eHarmony hashed passwords that were dumped.
Cracking the dumped eHarmony passwords wasn't too hard, says Mike Kelly, security analyst at SpiderLabs, which used tools such as oclHashcat and John the Ripper. In fact, he says it was one of the "easiest" challenges he ever faced. There are many reasons why this is so, starting with the fact the cracked passwords may have been "hashed," but they weren't "salted," which he says "would drastically increase the time it would take to crack them." [...]
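The difference between hashing and salted hashing mentioned above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not a description of what eHarmony or any other site actually ran: the function names, the use of SHA-256 for the unsalted case, and the iteration count are all assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def unsalted_hash(password: str) -> str:
    """Plain digest: identical passwords always yield identical hashes."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: Optional[bytes] = None,
                iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 and a random per-user salt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt, stored next to the digest
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes,
           iterations: int = 100_000) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = salted_hash(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


# Two users who happen to choose the same (hypothetical) password.
same_unsalted = unsalted_hash("example-password") == unsalted_hash("example-password")
salt_a, digest_a = salted_hash("example-password")
salt_b, digest_b = salted_hash("example-password")
```

Without a salt, two users with the same password end up with the same stored hash, so one precomputed table or one cracked hash exposes all of them. With a random salt per user, each stored digest has to be attacked separately.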
The New York Times - Lax Security at LinkedIn Is Laid Bare (10 June 2012)
Last week, hackers breached the site and stole more than six million of its customers’ passwords, which had been only lightly encrypted. [...]
What has surprised customers and security experts alike is that a company that collects and profits from vast amounts of data had taken a bare-bones approach to protecting it. The breach highlights a disturbing truth about LinkedIn’s computer security: there isn’t much. Companies with customer data continue to gamble on their own computer security, even as the break-ins increase.
“If they had consulted with anyone that knows anything about password security, this would not have happened,” said Paul Kocher, president of Cryptography Research, a San Francisco computer security firm. [...]
LinkedIn does not have a chief security officer whose sole job it is to monitor for breaches. The company says David Henke, its senior vice president for operations, oversees security in addition to other roles, but Mr. Henke declined to speak for this article. [...]
The Register - 35m Google Profiles dumped into private database (25 May 2011)
To demonstrate that online information is trivial to mine, a PhD student from the University of Amsterdam dumped the names, email addresses and biographical information of 35 million Google profiles into a database in one month. This was an experiment to test how difficult it would be, and the answer was: not hard at all. The article goes into some technical detail explaining why and how Google profiles in particular were vulnerable.
Wikipedia Article - Data breach: Major incidents
Wikipedia Fr - Piratage du PlayStation Network
Surveillance state and its willing helpers
The Wall Street Journal - French Privacy Agency Moves to Sanction Google (27 September 2013)
An article about the legal procedure recently initiated against Google Inc. by the French data protection authority, the Commission Nationale de l'Informatique et des Libertés, or CNIL (http://www.cnil.fr/english/). The action focuses on the lack of transparency in Google's use of the data it collects. The article states that Google had changed its privacy policies in order to make its different products more compatible, and that this data is becoming more valuable as advertisers increasingly turn away from other forms of advertising (e.g. TV ads). The article also gives an overview of other European procedures against Google. See also the New York Times article of 22 April 2013 below for similar developments in Germany.
Computerworld - Google knows nearly every Wi-Fi password in the world (12 September 2013)
A short article reporting that, given the default setting on most of the millions of Android phones in the world, Google is now probably in possession of a great many of the world's Wi-Fi passwords. The default setting is to "back up your data", which means that quite a bit of the personal information stored on your phone, including Wi-Fi passwords, is backed up with Google. Given that the passwords are not well encrypted, and that Google is surely able to decrypt them, Google potentially has access to many millions of Wi-Fi passwords. The article also lists several other articles that deal with the same issue.
New York Times - Germany Fines Google Over Data Collection (22 April 2013)
An article that starts with the 145,000 euro fine imposed on Google by a German data protection regulator for illegally collecting personal data during its Street View recording. It then gives a quick overview of other countries' reactions to the disclosure that Street View collected personal data. The article then considers the approaches of other western European countries and the efforts of the European institutions to introduce European legislation on the issue which would, among other things, raise the fines that could be levied on organizations in future.
Harvard Law Review - The Dangers of Surveillance (2012)
"From the Fourth Amendment to George Orwell’s Nineteen Eighty-Four, and from the Electronic Communications Privacy Act to films like Minority Report and The Lives of Others, our law and literature are full of warnings about state scrutiny of our lives. These warnings are commonplace, but they are rarely very specific. Other than the vague threat of an Orwellian dystopia, as a society we don’t really know why surveillance is bad, and why we should be wary of it. To the extent the answer has something to do with “privacy,” we lack an understanding of what “privacy” means in this context, and why it matters. We’ve been able to live with this state of affairs largely because the threat of constant surveillance has been relegated to the realms of science fiction and failed totalitarian states."
ZDNet - China's new data protection rules good step, but little bite (27 September 2013)
Summary (quoted from website): "China has introduced rules to regulate the collection and use of personal data by its internet and telecoms operators. The rules have been a long time coming, but do they actually offer anything to users?"
The article does not say much beyond what the headline promises, but given the subject matter it is about as in-depth as coverage of personal data protection in China is likely to get.
Effects of loss of privacy on employment and credit worthiness
On Device Research - Facebook costing 16-34s jobs in tough economic climate (29 May 2013)
The index which covers 6000 16-34 year olds across six countries revealed some surprising results: If getting a job was not hard enough in this tough economic climate, one in ten young people have been rejected for a job because of their social media profile.
CNN - Facebook friends could change your credit score (27 August 2013)
An article on how financial companies use 'personal' data collected from Facebook, eBay and other sources, or even from how you fill in an online application form. Among other things, representatives of these companies argue that this data is reliable enough, especially when taken as a whole, to give a good enough idea of a person's creditworthiness.
General studies
The Boston Consulting Group - The Value of Our Digital Identity (20 November 2012)
The Value of Our Digital Identity, a new report by The Boston Consulting Group in the Liberty Global Policy Series, takes a unique approach to understanding this new phenomenon. It quantifies, for the first time, the current and potential economic value of digital identity. It also explores—through research involving more than 3,000 individuals—the value that consumers place on their personal information and how they make decisions about whether or not to share it. Building on these findings, the report presents a new paradigm for unlocking the full value of digital identity in a sustainable, consumer-centered way. [...]
The report shows that the value created through digital identity can indeed be massive: €1 trillion in Europe by 2020 [...]
Individuals with a higher than average awareness of how their data are used require 26 percent more benefit in return for sharing their data. Meanwhile, consumers who are able to manage their privacy are up to 52 percent more willing to share information than those who aren’t. [...]
Given proper privacy controls and sufficient benefits, the survey found, most consumers are willing to share their personal data. To ensure that the flow of personal information continues, organizations therefore need to make the benefits clear to consumers. They also need to embrace responsibility, transparency, and user control. [...]
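I/S: A Journal of Law and Policy for the Information Society - The Cost of Reading Privacy Policies (2008)
According to this study, it would take each American 244 hours per year to read privacy policies (cf. page 18). That is about 40 minutes every day.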
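The "about 40 minutes every day" figure follows directly from the 244 hours per year. A quick check in Python, assuming the reading time is spread evenly over a 365-day year (an assumption, not something the study is quoted as saying here):

```python
HOURS_PER_YEAR = 244  # figure quoted above
DAYS_PER_YEAR = 365   # assumption: reading spread evenly over a non-leap year

# Convert hours per year into minutes per day.
minutes_per_day = HOURS_PER_YEAR * 60 / DAYS_PER_YEAR
```

The result is a little over 40 minutes per day, consistent with the rounded figure above.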
Other
EuropaQuotidiano - Facebook, i Big Data e la fine della privacy
Euobserver - Academics line up to defend EU data protection law