Privacy implications of Local Storage in web browsers

Privacy professionals often have a hard time keeping track of technology and how it affects privacy. This post is meant to help explain the technology of local/web storage.

With the ability to track users across domains, cookies have earned a bad reputation in the privacy community. This became particularly acute with the passing of the EU Cookie Law, which requires affirmative consent when local storage on a user’s computer is used in a way that is not “strictly necessary for the delivery of a service requested by the user.” In other words, if you’re using it to complete a shopping cart at an online store, you need not get consent. If you’re using it to track the user for advertising purposes, then you do need consent.

Originally part of the HTML5 standard, web storage was split into its own specification. For more history on the topic, see this article. Web storage is meant to be accessed locally (by JavaScript) and can store up to 5 MB per domain, compared to cookies, which hold a maximum of roughly 4 KB of data each. Cookies are natively accessible by the server; the purpose of a cookie is to be read by server-side scripts. Web storage is not automatically sent to the server, but its contents can be transmitted by JavaScript.
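A minimal sketch of the difference (the key names and endpoint below are invented for illustration): a cookie rides along with every HTTP request to its domain, while web storage stays on the client unless a script explicitly sends it somewhere.

```javascript
// Cookie: set on the client, automatically transmitted back to the server
// with each request to this domain.
document.cookie = "session_id=abc123; path=/; max-age=3600";

// Web storage: roughly 5 MB per origin, readable only by scripts in that origin.
localStorage.setItem("cart", JSON.stringify({ items: ["book", "pen"] }));
const cart = JSON.parse(localStorage.getItem("cart"));

// Nothing in localStorage reaches the server unless code does something like:
// fetch("/sync", { method: "POST", body: localStorage.getItem("cart") });
```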

CONS

The con here is that, as a privacy professional, you should be aware of what your developers are doing with web/local storage. Simply asking your developers if they are using cookies may elicit a negative response when they are in fact using an alternative technology that isn’t cookies. If the use of local storage later comes to light, returning to your developers may yield the response, “Well, you asked about cookies, not local storage!” There are also proposals for a locally accessible browser database, but as of this writing none is an Internet standard (see Mozilla Firefox’s IndexedDB for an example).

Web storage is not necessarily privacy invasive, but two things need to be addressed. First, whether the local data is transmitted back to the server, or used in a way whose results are transmitted back to the server. Second, whether the data stored in local storage is accessible to third parties and represents a risk of exposure to the user. As of this writing, I’m not sure whether third-party JavaScript running through a first-party domain has the ability to access local storage or whether it is restricted by a content security policy. The other risk is that a local user can access local storage through the browser’s JavaScript console. Ideally, data stored on the client should be encrypted.
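To illustrate the exposure in question (the key name and script URL below are hypothetical): a script included directly into a first-party page, as opposed to one isolated in a cross-origin iframe, runs in that page’s origin and shares its storage, and anyone at the keyboard can read the same entries from the developer console.

```javascript
// First-party code stores a profile fragment (hypothetical key and value).
localStorage.setItem("user_profile", JSON.stringify({ email: "jane@example.com" }));

// Any other script running in this page -- for example one pulled in with
//   <script src="https://third-party.example/widget.js"></script>
// executes in the same origin and can read the very same entry:
const exposed = localStorage.getItem("user_profile");

// So can a person at the machine, typed straight into the DevTools console:
//   localStorage.getItem("user_profile")
```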

PROS

Local storage also has the potential to increase privacy. Decentralization is a key technique for architecting for privacy, and having access to 5 MB of local storage allows enough room to keep most, if not all, client data on the client. Instead of developing rich customer profiles for personalization on the server, keeping this data on the client reduces the risk to the user because the server becomes less of a target. Of course, care must be taken to deal with multi-tenancy (more than one person using the same end client) and the problem of people accessing the data of other local users; this may be especially difficult for systems accessed from shared machines, such as those used by library patrons.
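Here is a sketch of that decentralized pattern, with hypothetical key names and helper functions: the personalization profile never leaves the browser, entries are namespaced per signed-in user, and a session on a shared machine is wiped at logout.

```javascript
// Hypothetical helpers: keep personalization data on the client, keyed per user,
// so the server never holds -- and never becomes a target for -- the full profile.
function profileKey(userId) {
  return `profile:${userId}`;            // namespace entries per local user
}

function savePreferences(userId, prefs) {
  localStorage.setItem(profileKey(userId), JSON.stringify(prefs));
}

function loadPreferences(userId) {
  const raw = localStorage.getItem(profileKey(userId));
  return raw ? JSON.parse(raw) : {};
}

// On shared machines (library kiosks, family computers), clear the current
// user's entries at logout so the next person cannot read them.
function clearOnLogout(userId) {
  localStorage.removeItem(profileKey(userId));
}

// Example usage:
savePreferences("user-42", { theme: "dark", genres: ["mystery", "history"] });
console.log(loadPreferences("user-42"));
clearOnLogout("user-42");
```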

Thoughts on the term “privacy enhancing technologies”

For the last two years I’ve been lamenting the lack of standardization around the term “privacy enhancing technologies.” In fact, as I see it, the term has been bastardized to mean whatever the speaker wants it to mean. Shortened to the moniker PETs, the term is used both in the privacy professional’s community and in the academic realm of cryptographic research. Newer incarnations, “privacy enabling technologies” and “privacy enhancing techniques,” do not even make the cut on Google’s Ngram service, which charts the frequency of terms in books (see chart below).

In 2008, the UK Information Commissioner’s Office recognized the definitional problem in a paper on Privacy by Design and PETs:

There is no widely accepted definition for the term Privacy Enhancing Technologies (PETs) although most encapsulate similar principles; a PET is something that:

1. reduces or eliminates the risk of contravening privacy principles and legislation.
2. minimises the amount of data held about individuals.
3. empowers individuals to retain control of information about themselves at all times.

To illustrate this, the UK Information Commissioner’s Office defines PETs as:
“… any technology that exists to protect or enhance an individual’s privacy, including facilitating individuals’ access to their rights under the Data Protection Act 1998”.

The definition given by the European Commission is similar but also includes the concept of using PETs at the design stage of new systems:
“The use of PETs can help to design information and communication systems and services in a way that minimises the collection and use of personal data and facilitates compliance with data protection rules. The use of PETs should result in making breaches of certain data protection rules more difficult and / or helping to detect them.”

The problem with such definitions is that they are broadly written, and thus broadly interpreted, and can be used to claim adherence to privacy protection when, in fact, there is none. This also leads to the perverse treatment of privacy protection as synonymous with data protection, which it is not. Privacy is as much about the risks of aggregation, intrusion, interference with freedom of association, and other invasions of the personal space that we designate and distinguish from the social space in which we exist as part of a larger society.

I see the Privacy by Design (PbD) camp dancing around this. Ann Cavoukian, the Ontario Information and Privacy Commissioner and chief PbD champion, has promoted PETs for years, and this evangelism is evident in the PbD foundational principle of full functionality. However, even she has allowed the term to be applied loosely to make it more palatable to her audience. PbD and PETs thus become buzzwords attached to an effort as a marketing ploy to give the appearance of doing the right thing, while often resulting in minimal enhancement of privacy.

I thus suggest the following definition, and the one I use in my own vernacular: a privacy enhancing technology is “a technology whose sole purpose is to enhance privacy.” Firewalls, which I too often see laypersons refer to as PETs, can enhance privacy, but their purpose is not necessarily to do so. A firewall is a security technology, protecting confidentiality and also securing the integrity and availability of the systems it protects. Data loss prevention, similarly, can actually be quite privacy invasive, though it may enhance the privacy of data on occasion. Its primary purpose, however, is to protect against the loss of corporate intellectual property (be it personal information of customers or not), not to enhance privacy.

Technologies that would qualify include Mixmaster remailer networks (whose sole purpose is to obscure the sender and receiver identities of email) and zero-knowledge proofs and related secure multi-party computation (which allow parties to compute a public function over private data without revealing anything other than the function’s output).
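To make the latter idea concrete, here is a toy illustration, not a real protocol and with demo-grade randomness only (the salary figures are invented): three parties learn the sum of their private salaries without any one of them revealing an individual input.

```javascript
// Toy additive secret sharing: each party splits its value into random shares
// modulo a large prime and hands a different share to each peer; only share
// sums are published, yet they reconstruct the joint total.
const PRIME = 2147483647n; // a large prime modulus (arbitrary choice for the demo)

function makeShares(secret, parties) {
  // Split `secret` into `parties` random shares that sum to it mod PRIME.
  const shares = [];
  let remainder = BigInt(secret) % PRIME;
  for (let i = 0; i < parties - 1; i++) {
    const r = BigInt(Math.floor(Math.random() * 1e9)) % PRIME; // not cryptographic randomness
    shares.push(r);
    remainder = (remainder - r + PRIME) % PRIME;
  }
  shares.push(remainder);
  return shares;
}

const salaries = [61000, 72000, 58000];                 // each party's private input
const shareMatrix = salaries.map(s => makeShares(s, 3));

// Party j only ever sees column j of the matrix -- random-looking numbers --
// yet the sum of all shares equals the sum of all salaries.
const columnSums = [0, 1, 2].map(j =>
  shareMatrix.reduce((acc, row) => (acc + row[j]) % PRIME, 0n)
);
const total = columnSums.reduce((acc, c) => (acc + c) % PRIME, 0n);

console.log(`Joint total: ${total}`);                   // 191000, with no salary revealed
```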

Some technologies may be privacy enhancing in application even though they weren’t created for the purpose of enhancing privacy. My purpose here is not to split hairs over the definition, per se, but to expose the dilution of the term to the point where it becomes doublespeak.


Problem closure in privacy

While reading this article earlier today, I came upon the term ‘problem closure,’ which sociologists have defined to mean “the situation when a specific definition of a problem is used to frame subsequent study of the problem’s causes and consequences in ways that preclude alternative conceptualizations of the problem.” This is something I see time and time again in discussions of privacy.

Perhaps the most prominent example is the refrain “I have nothing to hide” in response to concerns about surveillance. The answer defines the problem, suggesting that the only issue with surveillance is for those engaged in nefarious acts who need to conceal those acts. It precludes other issues of surveillance. This has been addressed at length by Law Professor Daniel Solove and his discussion of the topic can be found here.

Turning to practical applications of privacy, I find that many organizations suffer from problem closure in their business models and system implementations. This makes it difficult to add privacy controls when the starting point is a system that is antithetical to privacy or precludes it. One of the many reasons for this is that companies view their raison d’être as their particular solution rather than the problem they originally set out to solve. This is best illustrated by example. Ask many up-and-coming firms in the online advertising space what they do, and they will tell you they are in the business of ‘targeted behavioral marketing.’ Really, they are in the business of effective online advertising, but they’ve defined their company by the one solution they currently offer. This attitude is not only bad for privacy; it is bad for business. The ability to adapt to changing customer needs and market conditions is the hallmark of a strong enterprise. Those stuck in an unadaptable business model have already sealed their eventual fate. This is especially true in industries driven by technology.

Are you in the business of providing gas powered automobiles or are you in the business of providing transportation solutions?

Are you in the business of printing newspapers or providing news?

Are you in the telephone business or the communications business?

Waste Management is a good example of a company that adapted to changing social mores about waste. Originally a trash hauling and dumping company, it has readily adapted its business model to include recycling.

When trying to escape the “problem closure” problem, organizations need to define themselves not by the solution they are currently implementing but by the problem they are solving for their customers. Once they do that, they can open their eyes to potential solutions that solve the customer’s problem AND have privacy as a core feature.

This problem is most prevalent, IMHO, in smaller companies that have bet everything on a particular solution they’ve invented. They don’t have the luxury of an existing customer base and the ability to explore alternative solutions.

It is a problem I deal with often in trying to convince companies that privacy must be built in. You can’t build the solution and then come back and worry about privacy. It has to be part of the solution building process.