Privacy implications of Local Storage in web browsers

Privacy professionals often have a hard time keeping track of technology and how it affects privacy. This post is meant to help explain the technology of local/web storage.

With their ability to track users across domains, cookies have earned a bad reputation in the privacy community. This became particularly acute with the passing of the EU Cookie Law. In particular, the law requires affirmative consent when local storage on a user’s computer is used in a way that is not “strictly necessary for the delivery of a service requested by the user.” In other words, if you’re using it to complete a shopping cart at an online store, you need not get consent. If you’re using it to track the user for advertising purposes, then you must get consent.

Originally part of the HTML5 standard, web storage was split into its own specification. For more history on the topic, see this article. Web storage is meant to be accessed locally (by JavaScript) and can store up to 5MB per domain, compared to cookies, which store a maximum of about 4KB of data. Cookies are natively accessible by the server; the purpose of a cookie is to be accessed by server-side scripts. Web storage is not immediately accessible by the server, but it can be made so through JavaScript.


The concern here is that, as a privacy professional, you should be aware of what your developers are doing with web/local storage. Simply asking your developers if they are using cookies may elicit a negative response when they are using an alternative technology that isn’t cookies. Later revelations, and a return trip to your developers, may result in the response “Well, you asked about cookies, not local storage!” There are also proposals for a locally accessible browser database, but as of this writing this is not an internet standard (see Mozilla Firefox’s IndexedDB for an example).

Web storage is not necessarily privacy invasive, but two things need to be addressed. First, whether that local data is transmitted back to the server, or used in such a way that results implied by it are transmitted back to the server. Second, whether the data stored in local storage is accessible to third parties and represents a risk of exposure to the user. As of this writing, I’m not sure whether third-party JavaScript running through a first-party domain has the ability to access local storage or whether it is restricted by a content security policy. The other risk is that a local user can access local storage through a JavaScript console. Ideally, data on the client should be encrypted.


Local storage also has the potential to increase privacy. Decentralization is a key technique for architecting for privacy, and having access to 5MB of local storage allows enough room to keep most, if not all, client data on the client. Instead of developing rich customer profiles for personalization on the server, keeping this data on the client reduces the risks to the user because the server becomes less of a target. Of course, care must be taken to deal with multi-tenancy (more than one person on an end client), which may be especially difficult on shared machines, such as those used by library patrons, where one person could access the data of other local users.

Thoughts on the term “privacy enhancing technologies”

For the last two years I’ve been lamenting the lack of standardization around the term “privacy enhancing technologies.” In fact, as I see it, the term has been bastardized to mean whatever the speaker wants it to mean. Shortened to the moniker PETs, the term is used both in the privacy professional’s community and in the academic realm of cryptographic research. Newer incarnations, “privacy enabling technologies” and “privacy enhancing techniques,” do not even make the cut on Google’s Ngram service, which charts occurrences of terms in books (see chart below).

In 2008, the British Information Commissioner’s Office recognized the definitional problem in a paper on Privacy by Design and PETs:

There is no widely accepted definition for the term Privacy Enhancing Technologies (PETs) although most encapsulate similar principles; a PET is something that:

1. reduces or eliminates the risk of contravening privacy principles and legislation.
2. minimises the amount of data held about individuals.
3. empowers individuals to retain control of information about themselves at all times.

To illustrate this, the UK Information Commissioner’s Office defines PETs as:
“… any technology that exists to protect or enhance an individual’s privacy, including facilitating individuals’ access to their rights under the Data Protection Act 1998”.

The definition given by the European Commission is similar but also includes the concept of using PETs at the design stage of new systems:
“The use of PETs can help to design information and communication systems and services in a way that minimises the collection and use of personal data and facilitates compliance with data protection rules. The use of PETs should result in making breaches of certain data protection rules more difficult and / or helping to detect them.”

The problem with such definitions is that they are broadly written, and thus broadly interpreted, and can be used to claim adherence to protecting privacy when in fact one is doing no such thing. This also leads to the perverse treatment of privacy protection as synonymous with data protection, which it is not. Privacy is as much about the risks of aggregation, intrusion, interference with freedom of association, and other invasions of the personal space that we designate and distinguish from the social space where we exist in a larger society.

I see the Privacy by Design (PbD) camp dancing around this. Ann Cavoukian, the Ontario Information and Privacy Commissioner and chief PbD champion, has promoted PETs for years, and this evangelism is evident in the PbD foundational principle of full functionality. However, even she has allowed the term to be applied loosely to make it more palatable to her audience. PbD and PETs thus become buzzwords to attach to an effort in a marketing ploy to give the appearance of doing the right thing, but often result in minimal enhancement of privacy.

I thus suggest the following definition, one which I use in my own vernacular: a privacy enhancing technology is “a technology whose sole purpose is to enhance privacy.” Firewalls, something I too often see referred to as PETs by laypersons, can enhance privacy, but their purpose is not necessarily to do so. A firewall is a security technology, protecting confidentiality and also securing the integrity and availability of the systems it protects. Data loss prevention, similarly, can actually be very privacy invasive, though it could enhance the privacy of data on some occasions. However, its primary purpose is to protect against loss of corporate intellectual property (be it personal information of customers or not), not to enhance privacy.

Technologies which would qualify include Mixmaster networks (whose sole purpose is to obscure the sender and receiver identity of email) and zero-knowledge proofs and related secure multi-party computations (which allow parties to calculate public functions on private data without revealing anything other than the results of the public function).

Some technologies may be privacy enhancing in application but the technology wasn’t created for the purpose of enhancing privacy. My purpose here is not to split hairs on the definition, per se. My purpose is to expose the dilution of the term to where it becomes doublespeak.


Problem closure in privacy

While reading this article earlier today, I came upon the term ‘problem closure,’ which has been defined by sociologists to mean “the situation when a specific definition of a problem is used to frame subsequent study of the problem’s causes and consequences in ways that preclude alternative conceptualizations of the problem.” This is something that I see time and time again in discussions on privacy.

Perhaps the most prominent example is the refrain “I have nothing to hide” in response to concerns about surveillance. The answer defines the problem, suggesting that the only issue with surveillance is for those engaged in nefarious acts who need to conceal those acts. It precludes other issues of surveillance. This has been addressed at length by Law Professor Daniel Solove and his discussion of the topic can be found here.

Turning to practical applications of privacy, I find that many organizations suffer from problem closure in their business models and system implementations. This makes it difficult to add in privacy controls when the starting point is a system that is antithetical to privacy or precludes it. One of the many reasons for this is that companies view their raison d’être as their particular solution, not solving the problem they originally set out to solve. This can best be illustrated by example. Ask many up-and-coming firms in the online advertising space and they will tell you they are in the business of ‘targeted behavioral marketing.’ Really, they are in the business of effective online advertising, but they’ve defined their company by the one solution they currently offer. This attitude is not only bad for privacy, it is bad for the business. The ability to adapt to changing customer needs and market conditions is the hallmark of a strong enterprise. Those that are stuck in an unadaptable business model have already sealed their eventual fate. This is especially true in industries driven by technology.

Are you in the business of providing gas powered automobiles or are you in the business of providing transportation solutions?

Are you in the business of printing newspapers or providing news?

Are you in the telephone business or the communications business?

Waste Management is a good example of a company which adapted to changing social mores about waste. Originally a trash hauling and dumping company, it has readily adapted its business model to recycling.

When trying to escape the “problem closure” problem, organizations need to look not at the solution they are currently implementing to define them but the problem they are solving for their customers. Once they do that, they can open up their eyes to potential solutions that solve the problem for their customers AND have privacy as a core feature.

This problem is most prevalent, IMHO, in smaller companies who have bet their socks on a particular solution they’ve invented. They don’t have the luxury of having an existing customer base and the ability to explore alternative solutions.

It is a problem I deal with often in trying to convince companies that privacy must be built in. You can’t build the solution and then come back and worry about privacy. It has to be part of the solution building process.



Cave diving and blindness by policy

Policy is a control by which an organization attempts to mitigate risk by ensuring that the organization acts in a way that reduces the likelihood or impact of an event. However, things can and do go awry, because policies are developed with certain facts in mind that may not hold during the actual application of that policy. Let me give a few examples, one that has no privacy implications and another that does.

I went diving at Ginnie Springs in Florida yesterday, as I have been doing since 1987, when my dive instructor took us there for our open water checkout dives. I dive on an NACD Intro to Cave certification which was issued in 1990. I’ve been diving on that card (at Ginnie Springs) since then, in other words for 23 years. At the time, the NACD had only 3 levels of certification (Cavern Diver, Intro to Cave, and Full Cave). Now the NACD has 4 levels (Cavern, Intro to Cave, Apprentice to Cave, and Full Cave). Interestingly enough, my Intro to Cave card has, in fine print which I never noticed, a one year expiration. Neither had anybody else noticed, until yesterday, when the cashier checking me in said my card had expired. This is no longer the case with current Intro to Cave certifications, though Apprentice to Cave does have a one year expiration. I do have a newer Intro to Cave card that does not have an expiration, but I did not have it on me. Regardless, I registered under my older Cavern certification.

After diving I went to have my air tanks filled. I’m diving two 50 cu ft tanks with a crossover bar in a double tank configuration. This is an uncommon configuration and unfortunately has given me nothing but grief. I asked the air fill attendant whether he was charging me for a single or a double air fill; after a bit of negotiation he agreed to charge only a single. I only get a single dive off of the set. It is equivalent to having a single 100 cu ft tank, which, though large, is doable. However, while it was filling he proceeded to debate me, saying that I shouldn’t be diving it and they shouldn’t be allowing it. You see, Ginnie has a policy that those without Full Cave certification are not allowed to use double cylinders. In fact, the training organizations do have a limitation for Intro to Cave divers: they must turn the dive after 1/6th of their air supply has been used if diving double tanks, as opposed to 1/3 normally. This is to prevent Intro to Cave divers from exceeding their training by going too deep or too far into the cave environment. Ginnie, because divers can’t be trusted not to exceed this limitation, restricts double tanks to Full Cave divers only. Fine, this is understandable. However, the air fill attendant said that I couldn’t dive my doubles because… well, they were doubles. Never mind that they were HALF the size of normal tanks; the policy said you couldn’t dive two tanks, and I had two tanks, therefore I shouldn’t be diving with them. Now, luckily I’ve never had a problem actually diving at Ginnie with these tanks, and I was done for the day, but this is where I’d like to point out that blind application of policy is stupid when it can’t be amended for a given factual situation. The justification for the policy is sound (we don’t want divers going beyond their training limitations), but the application of the policy in this case doesn’t serve that purpose.

I’d like to give another example. While in law school, I did not supply my SSN to the school, for privacy reasons. However, after two years I decided to receive financial aid and had to supply my SSN to the school. They required that I supply my original Social Security card or a tax return showing my SSN. Why? Well, I was changing my number and they needed verification. That policy makes sense when someone initially supplies one number and then supplies a new number: if you tell them one thing before you change it, they need some validation that the new number you’re supplying is the legitimate one. This wasn’t applicable in my case because I never supplied them a number in the first place. They have 10k+ students enrolling every year whose SSNs they accept without validation, but mine required documentation because I was “changing” it, only I wasn’t. Even escalating this to a vice president, she couldn’t differentiate between the policy, the justification for the policy, and its application in this scenario.

Blind application of policy is one thing, but just as bad is wholesale lying about following the policy. In the wake of the NSA revelations, it’s become quite apparent that the intelligence establishment is hell-bent on keeping Congress and the public in the dark about what they are doing. Laws (policies) are ineffective if you have no ability to ensure compliance. The cost of failing to comply has to exceed the benefit of non-compliance. If you tell an employee the policy is not to steal people’s identities, but the cost to that employee is losing an $8/hr job while the benefit is stealing hundreds of thousands of dollars, you essentially have policy by begging: begging people to do the right thing. You have to make it easy to comply with policy and hard not to comply.


Privacy is …..

I’m giving a speech to the Temple Terrace Rotary Club this coming week, and I’m sure they are expecting something fairly dry about the need to protect your privacy online and secure your passwords and such. However, I’m going to give them a little different perspective. I don’t usually write out my speeches, preferring to speak from an outline just so I make sure I touch on my major points, so what follows is an essay based on my outline.

A few years ago, when my girlfriend and I were living here in Temple Terrace, we used to frequent a pita shop over near MOSI (the Museum of Science and Industry). Sometimes we went together and sometimes we would go alone to pick up lunch or dinner. Often, especially when she went alone, the clerk would flirt with her. No big deal; she was an attractive young woman. Well, one day she received a Facebook friend request from him. That disconcerted her… and me. How did he get her name? We finally figured out that he must have gleaned it off her credit card and searched for her on Facebook. So… what was bothersome about this? Why did she feel violated by his contact? Why was this a bit “creepy?” The clerk had exceeded the social norms associated with customer/merchant relations. He contacted her outside of the normal bounds associated with that relationship. This isn’t to say it can’t happen. I’m sure plenty of relationships have begun because people meet in the places they work. But usually, they ask permission: the waitress leaves her phone number on the check; the customer gives the cute store clerk his business card. Those actions all give the other person the choice, the option of denying the expansion of the relationship.

Privacy is much more than just keeping secrets. My girlfriend gave him her name. She wasn’t trying to conceal it. It was his use outside of the merchant/customer relationship that was the privacy violation. Privacy does not require things be private. Privacy requires respect for context, respect for decision making.

Consider other relationships:

You tell your friends different things than you tell your boss.
You tell your spouse different things than you tell your children.
You tell your doctor different things than your waitress.
You tell your bartender everything!

If someone starts inquiring about things that aren’t normally appropriate for that relationship, we get queasy. We feel unnerved. If your doctor asks about your finances, or your children start asking about your sex life, those things aren’t part of the relationship. With some friends we may share our health issues, with some we don’t, but if we don’t announce our health concerns on Facebook, we probably don’t want our friends to do so either. Even though we didn’t explicitly say so, when they exceed that unwritten norm that we have in our society, the relationship is damaged.

A few years ago, a high school student received a mailer from Target advertising maternity clothes, baby carriages, and other things a would-be mom might like to have. Her angry father stormed to the local Target to complain to the manager, who was profusely apologetic. A few days later, it was the father who was apologizing. It turns out the young woman was pregnant, and Target knew before her father. They had used predictive analysis based on her purchasing patterns to identify her probable pregnancy and predict her due date. Target knew not because she had disclosed it but because her purchases implied it.

Some people respond to that story not with concern about Target’s behavior, but with the retort that children don’t have privacy from their parents. I respond that nothing could be further from the truth. As children grow up, privacy is essential to their growth as individuals. The ability to distinguish their thoughts and actions from those of their parents is what develops their personhood. Children develop into adults as they become individuals with their own thoughts and their own ability to control their interactions with the world. They develop a sense of self, a sense of freedom from their parents’ all-knowing watch. And kids do take steps to actively claim that privacy from their parents. In the past 5 years, as the wave of parents joined Facebook, kids began seeking alternative avenues of expression where they could be free and would not have to self-censor. They have moved to Tumblr and Twitter and other services not yet used by their parents or other adults in their lives. There was a fascinating article by a dad who, for a decade, monitored the online activities of his 3 daughters. He used key-loggers, screen capture software, and other technology to totally monitor them. He learned some incredible things, nothing bad, but introspective, and I quote: “The idea that even my virtual presence on Tumblr or Twitter might prevent them from being able to express themselves or interact with their friends (some of whom they have never met) in an authentic way made me feel like I was robbing them of one of the most powerful features of the social web.”

You often hear the retort that if you aren’t doing anything wrong, you have nothing to hide. The problem with the nothing-to-hide retort, addressed at length in Daniel Solove’s Nothing to Hide, is that it presumes that the only privacy violation is a lack of secrecy. Surveillance is not just about the harms of exposure; it is much broader. The problem is not Orwellian, it is Kafkaesque. In Franz Kafka’s The Trial, the protagonist is arrested by a secret court, with a secret dossier, under a secret investigation. He is not allowed to see the dossier or know how it was compiled or what it contains.

The FTC is grappling with how to deal with mortgage companies that don’t necessarily advertise their services. These companies use complex predictive algorithms to predict who won’t default on loans, but unlike traditional mortgage companies they aren’t beholden to the Fair Credit Reporting Act, because they don’t “reject” you; they just never market to you. What if you never have the opportunity to get a mortgage because of where you live, what products you buy, or whom you associate with?

Orbitz was found to pitch higher priced hotels to Mac users over PC users.

People on the no-fly list are not told why they were put there and appeals are limited to those who are mistakenly identified because they share a name with a suspected individual.

Who is in control of your life?

Let’s talk about control for a bit. Companies are increasingly putting out devices that they control, not you. This is done either at the behest of government or of industry. It started with copyright, with VHS devices and now DVDs having technical mechanisms to stop people from copying them. Sony, a few years ago, shipped millions of CDs with copy-protection software that, when a CD was placed in a computer, prevented the computer from copying it but also exposed the computer to the risk of viruses. Copy machines are restricted from copying US currency. Years ago I was forced by my phone provider to upgrade my phone because mine didn’t have GPS; the FBI was behind that initiative, called Enhanced 911, as a public safety measure so that when you called 911 they could identify your location. Yes, the FBI, the public safety organization. There is talk about preventing 3D printers from printing gun parts, but what about Mickey Mouse dolls or interchangeable anatomically correct Barbie parts? Apple doesn’t allow applications that show pornography in its App Store… or ones that identify drone strikes in Pakistan. Google’s Android market won’t allow ad-blocker software, preventing you from blocking the advertising that runs up your bandwidth bill.

In 2011, San Francisco’s BART shut down cell phone service in the subway system to block a planned protest.

I want to switch gears a bit and talk about risk. Risk is important when making decisions about privacy. Generally there are trade-offs. We share information because there is a benefit: we tell our doctors about our health issues to get the benefit of their professional advice. On a social scale, we allow search warrants to fight criminal activity. There is always a balancing of privacy against other interests, be they personal or societal. However, we are terrible at assessing risk. Risk is a function of the probability of an event occurring and the severity of the damage if it does.

We have cognitive (mental) biases that make us prefer to get things now and discount costs in the future. We behave irrationally. A few years ago, a study was done where people were given $10 gift cards in a mall and offered a $12 card in exchange for some personal information about them; about 50% kept the $10 card. Other participants were given $12 gift cards and offered the option of switching to $10 cards that were anonymous; only 10% switched. The economic results were essentially the same, but depending upon where they started, privacy was worth more to those who already had it than to those who could pay to acquire it.

On a societal level, privacy suffers from not being visceral enough, not vivid enough, not violent enough. Fear is a powerful motivator, and we tend to fear things that are extremely low probability: terrorist attacks, child abductions, random shooters. It is precisely because these things are so incredibly rare that they are amplified out of proportion. You have a higher chance of drowning than of dying in a terrorist incident. Children have a higher chance of drowning than of being abducted by a stranger. Yet we spend incredible resources trying to defend against such low probability events. What else do we lose in the process? What kind of society do you want?

Prior to a few hundred years ago, the notion of privacy was very limited, but so too was the notion of freedom. We lived in societies ruled by oligarchies, feudal lords, dictatorships. It is no coincidence that as we gained privacy, we also gained freedom. As we grew into a society of individuals with power over our lives and our bodies, we no longer served the state as serfs. Privacy is not about secrecy. Privacy is freedom.


The privacy of death.

I remember during law school, I was helping out at the Florida First Amendment Foundation and discussing privacy and public record issues with the Director. She was fairly adamant that death obviated any expectation of privacy: the person was dead, so how could a person without consciousness have an expectation? Her position was informed in part by the debate over the autopsy photos of Dale Earnhardt.

Fast forward to 2013, in which I’ve unfortunately had a few friends pass away recently. One friend’s death was particularly gruesome and was publicized in the local press. The other deaths were more mundane, and to this day I don’t know what they died of. One of those friends was elderly, and I suspect health issues, though I’m unsure. The other friend was my age. In Florida, a death certificate is a public record, easily obtained by paying the appropriate fee. However, only certain related persons, or those with an interest in the estate of the deceased, may receive a death certificate which lists the cause of death. I believe that this is fairly common practice throughout the US. I have witnessed that it is generally verboten to ask, or to tell if known, what one’s cause of death is, unless it is widely known (obvious illness, accident, murder, etc.). It appears that we have adopted a cultural norm against generalized disclosure of cause of death, and that norm has been codified in law as it relates to death certificates. This isn’t airtight (certain causes, as mentioned above, become widely known), and it doesn’t appear to be based on any preference of the deceased, though one could imagine the deceased leaving instructions to publicize their cause of death.


I’m not sure how or when this cultural norm developed, but I do find it interesting and would like to learn of others’ perspectives in differing cultures.

Privacy and Mesh Networks

I’ve been thinking a lot about mesh networking and the possibility of foiling NSA-style tapping by bypassing centralized networks in favor of localized networks. For those who don’t know, mesh networks are ad-hoc peer-to-peer networks, primarily wireless. The decentralized nature of the communications provides some level of privacy. Additional privacy comes with making the system anonymous. However, it seems that anonymity comes at a price in bandwidth. Here are some of my findings:

The standard model is circuit switched. In other words, each node maintains a topological map of the entire network, so that it knows who is connected to whom. This allows it to create a circuit through the network to the destination. That means each transmission takes t bandwidth, where t represents the size of the data. If s represents the shortest number of hops from source to destination, then the total network bandwidth used is B = s*t. In this model, no message storage is required, because each node forwards the data without needing to retain it. There is, however, some storage required for the network map, which grows with n^2, where n is the number of nodes.

Circuit: (Bandwidth = s*t, Storage = n*(m*n^2)), where m is the amount of data necessary to indicate a link or not (maybe a bit, maybe a byte if you store strength); each of the n nodes keeps an n-by-n map.

The upside is that each new node increases the bandwidth of the network. The downside of this is that an attacker could possibly follow the data as it gets transferred from node to node and identify sender and receiver, providing linkability and thus defeating anonymity.

Consider, alternatively a broadcast model. In this model, no topological map must be stored but every node gets a copy of the data.

Broadcast: (Bandwidth = n*t, Storage = n*t)

In this model, nobody can identify the destination of a message, which is very privacy preserving. However, the bandwidth costs are enormous. Now each new node added to the network actually adds an external cost to the other nodes, similar to a car being added to a highway. The storage costs also increase at rate n*t, because every node must keep a copy of the message (or at least a digest) for a period of time to prevent it from re-accepting the data from one of its neighbors.
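As a rough sanity check on the broadcast model, here is a minimal Python simulation of flooding on a hypothetical 20-node network; the topology and parameters are purely illustrative assumptions of mine, not a real mesh protocol.

```python
import random
from collections import deque

def flood(adj, source):
    """Broadcast by flooding: each node forwards to all neighbours,
    keeping a digest so it never re-accepts the same message."""
    seen = {source}          # digests held network-wide -> Storage ~ n*t
    transmissions = 0        # each send costs t bandwidth -> Bandwidth >= n*t
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            transmissions += 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen, transmissions

# Illustrative topology: a 20-node ring with a few random chords.
random.seed(1)
n = 20
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
for _ in range(10):
    a, b = random.sample(range(n), 2)
    adj[a].add(b)
    adj[b].add(a)

seen, tx = flood(adj, source=0)
print(len(seen), tx)   # every node ends up holding a copy; sends exceed n
```

Even in this best case every node must receive the data at least once, so the bandwidth is at least n*t; the duplicate sends counted above only make it worse.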

A third option is the random walk model, also called the hot potato. A node passes the data packet to another node, who passes it again, and so on. In this model no node keeps a copy of the data once it has passed it along, so the storage costs are 0. The bandwidth is at minimum s*t, because that’s the shortest circuit. BUT the packet could be passed along forever, so the bandwidth is potentially infinite. Needless to say, this is not good.

Random walk: (s*t < Bandwidth < ∞, Storage = 0)

What about a biased or intelligent random walk, a lukewarm potato? The network has the following rules. Each node asks its immediate neighbors, “Is this yours?” The appropriate responses are “yes, give it to me,” “no, but I’ll take it,” or “no, I’ve seen it.” If a neighbor says yes, the node gives it the data. If a neighbor says “no, I’ve seen it,” the node ignores that neighbor. The node then randomly selects one of the neighbors that hasn’t seen the data and sends the data to it to continue the process. If the node can’t find anybody to hand the data over to, it tells the node that it got the data from that it can’t pass it along, and that node starts over again. [Alternatively, it could force one of the nodes that has seen the data to take it again.] This method allows the data to snake its way through the network without repeating any nodes. Here are the bandwidth and storage boundaries.

Intelligent random walk: (s*t < Bandwidth < n*t, s*t < Storage < n*t)
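Here is a minimal Python sketch of this lukewarm potato; the graph, the backtracking rule, and the hop accounting are my own illustrative assumptions, not part of any standard. The walk hands the packet only to neighbors that haven’t seen it, jumps straight to the destination when it is adjacent, and backs up when stuck:

```python
import random

def lukewarm_potato(adj, source, dest, rng):
    """Biased random walk: never pass to a node that has already seen
    the packet; when stuck, hand it back and let the previous node retry.
    Returns (hops, nodes_storing_a_digest)."""
    seen = {source}        # every visited node keeps a digest
    path = [source]        # trail of nodes that could still hold the packet
    hops = 0
    while path:
        node = path[-1]
        if node == dest:
            return hops, len(seen)
        fresh = [nbr for nbr in adj[node] if nbr not in seen]
        if dest in adj[node]:
            nxt = dest                 # neighbour answers "yes, give it to me"
        elif fresh:
            nxt = rng.choice(fresh)    # "no, but I'll take it"
        else:
            path.pop()                 # all neighbours have seen it: back up
            hops += 1                  # handing it back also costs bandwidth
            continue
        seen.add(nxt)
        path.append(nxt)
        hops += 1
    return None  # destination unreachable

# Illustrative network: a 12-node ring with one chord (0-6).
n = 12
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
adj[0].add(6)
adj[6].add(0)

hops, stored = lukewarm_potato(adj, source=0, dest=5, rng=random.Random(7))
print(hops, stored)   # hops >= the 2-hop shortest path; stored <= n
```

On a connected graph the backtracking makes this behave like a depth-first search, so the packet always arrives eventually; the cost is that somewhere between s and n nodes end up holding a digest.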

So this doesn’t have bandwidth and storage as low as the circuit model, but it’s not as bad as broadcast for storage and not as bad as the random walk for bandwidth. However, anonymity is not perfect: an attacker who had access to the entire network could identify the recipient as the packet traces its way through the network.
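To make the comparison concrete, here is a back-of-the-envelope calculation in Python using the worst-case bounds discussed above. The numbers for n, s, t, and m are purely illustrative assumptions, not measurements, and the circuit map is treated as an n-by-n table held at every node:

```python
n = 1000        # nodes in the mesh (illustrative assumption)
s = 10          # shortest path, in hops
t = 4096        # message size, in bytes
m = 1           # bytes to record one link in the topology map

# (worst-case bandwidth, worst-case storage) for each model
models = {
    "circuit":                 (s * t,          n * m * n ** 2),
    "broadcast":               (n * t,          n * t),
    "random walk":             (float("inf"),   0),
    "intelligent random walk": (n * t,          n * t),
}

for name, (bandwidth, storage) in models.items():
    print(f"{name:>24}: bandwidth {bandwidth:>12} storage {storage:>12}")
```

At these numbers, circuit switching is by far the cheapest on the wire, but the network-wide map costs a gigabyte of storage; broadcast and the intelligent walk trade that for moving (and briefly storing) the message roughly n times.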


It appears, then, that anonymity costs in either bandwidth or storage. The question is which is more valuable to the network. There may be additional techniques to mitigate this, and I continue to investigate this area.



Algorithmic privacy versus personal privacy

In this blog post, Peter Kinnaird attempts to analogize NSA spying to the algorithmic review of our emails by Google. He notes that a majority of people accept such review as non-invasive and worth the benefits derived from free and useful email-as-a-service. I would like to point out several fallacies in his analysis.

  1. As a quick note, he says "I feel certain that if Google didn't have adequate social and technical safeguards in place, we would have heard of at least one case of a Google employee snooping or abusing their power." Here is the one case I'm familiar with: This doesn't mean there aren't others that Google quietly fired in order to keep out of the press. Government employee abuse of the information at their disposal is rampant and has huge historical precedent, whether sanctioned by higher-ups or performed by rogue individuals.
  2. The post fails to distinguish the voluntary nature of participation in Gmail from the involuntary participation in the state surveillance apparatus. Mutuality is the cornerstone of privacy expectations. Without voluntariness, mutuality cannot exist.
  3. The post fails to consider the risks involved in revealing information to Google versus the government. If I reveal information to Google, I might get mislabeled and have inappropriate ads sent to me. If I reveal information to the government, I might get mislabeled and jailed or murdered.
  4. The post mentions the public awareness of Google’s practice but fails to contrast that with the secret nature of the NSA program. Overt versus covert makes a world of difference in privacy. We don’t even know what we don’t know about NSA spying.
  5. The post fails to consider other, less privacy-invasive means of achieving the same results, i.e. national security. Any privacy analysis of a system must first rule out less invasive means of achieving the same goals.

There are a host of non-privacy related issues having to do with NSA spying, such as international relations and the loss of world wide confidence in buying American information services, that also need to be considered. Frankly a world in which I am spied on, personally or algorithmically, is not one in which I wish to live.


Suggested reading: 1984, The Trial

Balancing privacy and societal benefit

One common retort to claims of privacy infringement from government and industry is that it prevents or impairs a societal benefit. The most oft-cited example is the alleged dichotomy of security versus privacy. Another incarnation is the debate over black box recorders in consumer vehicles and whether public policy should favor detailed tracking to reduce auto fatalities over the lost privacy of drivers. However, not every public policy debate has settled in favor of reducing social harm to the detriment of privacy.

Consider sexually transmitted diseases. Health information is one of the special classes of personal information that the US has deemed fit to grace with one of its sectoral laws, HIPAA. Public policy in the United States favors strong protection for health data. However, HIPAA only covers health care providers and their business associates. What about private individuals who come into knowledge of health information about another person? May they disclose a person's health condition to the public at large? Generally, the answer is yes, as long as it is truthful. However, this is not always the case. Many states have a privacy tort for public disclosure of embarrassing private facts. The elements of such a tort are:

  • There must be public disclosure of the facts
  • The facts must be private facts, not public ones
  • The matter made public must be one which would be offensive and objectionable to a reasonable person of ordinary sensibilities.

In some cases, the revelation must not be newsworthy or a matter of public interest. One’s condition of having an STD would seem like the prototypical private fact that one would seek to keep from disclosure. But from a public policy perspective, there is an argument to be made that in order to prevent the spread of the communicable diseases, infection with an STD should be public knowledge to prevent potential partners from becoming infected. However, to my knowledge, no US state requires mandatory public disclosure (to the public at large). They may require reporting to a government agency by health providers or require disclosure of status by an individual to a potential sexual partner. Failure to do so may be criminally or civilly punishable. The decision to reveal one’s status publicly remains in the control of the individual, even when the law requires revelation to those at risk (an STD infected individual’s sex partners).  This is by no means a closed debate.

Privacy is often pitted against societal benefits, and the debate is framed as an individual's right to privacy versus a particular social good. Privacy rarely comes out ahead because the societal benefits of privacy are hard to quantify in terms of lives, money, or some other enumerable figure that can be directly compared. But protection of privacy does have societal benefits. Selective disclosure allows people to build trusting relationships. Financial privacy prevents targeting and theft, making society more productive. Anonymity is critical to free speech, lest speakers be judged for controversial or unpopular thoughts. A lack of privacy impedes risk taking and chills the activities of people. Ultimately, privacy is about decision making and the autonomy of the individual to make those decisions for themselves. Without that autonomy, one does not have a free society and all the benefits that liberty brings.
We cannot allow ourselves to fall into the trap of assuming that any social good which implicates privacy interests outweighs the privacy harms. If we're going to take an economic approach, we have to find a way to quantify privacy harms as social costs.


Please note that this blog post is not meant to be a treatise on the intricacies of privacy law as it relates to health data, in general, or sexually transmitted diseases, in particular. For some additional information see





Thoughts on DNT

Yesterday I attended the Atlanta IAPP KnowledgeNet on DNT. The panelists were Peter Swire (@peterpswire) and Brooks Dobbs. Peter is co-chair of the W3C working group on DNT and is trying to find consensus amongst nearly 100 participants in the process. Until recently, at least one point of contention was whether DNT stood for Do Not Track or Do Not Target. However, the dispute isn't over the acronym so much as over the meaning: is it to not send people targeted ads, or to not track them as they surf the internet, compiling dossiers or at least gauging the correlated interests of segments of the population? I suspect some of the confusion stems from the historical provenance of the DNT initiative, which grew out of the wildly successful Do Not Call registry. While there are several distinctions, one that must be clearly understood is the underlying privacy harm. In Do Not Call, no one is decrying marketing firms' dossier building; rather, it is the intrusion into the personal space of the called party that is deemed the privacy harm. Contrast this with web surfing and the ubiquity of advertising on free websites. The intrusion into the personal space of the user is not the privacy harm at issue. Rather, it is the desire by firms to amalgamate information about people to create more targeted or more effective advertising (thus increasing the cost effectiveness of the advertising). Before we can begin designing a solution, it's important to identify the actual harm you're trying to prevent.
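For concreteness, the mechanism itself is simple: DNT is a single HTTP request header, and the entire debate is over what a compliant server must do on receiving it. A minimal sketch of reading the header, following the W3C Tracking Preference Expression semantics (the function name is my own):

```python
def wants_do_not_track(headers):
    """Read the DNT request header. 'DNT: 1' means the user opts out of
    tracking, 'DNT: 0' means the user grants consent, and an absent
    header means no preference was expressed."""
    value = headers.get("DNT")
    if value == "1":
        return True     # user asked not to be tracked
    if value == "0":
        return False    # user consented to tracking
    return None         # no preference expressed
```

The three-valued result matters: "no header" is not the same as "no, you may track me," which is exactly the kind of distinction the working group had to hash out.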

Peter framed it rather nicely during the panel by asking two questions. The first question he asked the audience was whether they wanted to be tracked without their knowledge as they surfed the internet. Only one hand was raised, and that one was in jest. The second question, to which many people, myself included, answered affirmatively, was whether they wanted a more personal experience on the internet. Those two questions framed the debate as a conflict between something people don't want (tracking) and something people do want (personalization). Though I personally haven't engineered a system to do it, my gut reaction is that these two concepts need not be at odds. Certainly, you can design a system (as most are) where tracking supports personalization, but the fact that tracking is sufficient for personalization does not mean it is necessary.

Consider, for instance, a decentralized design where a person's dossier is kept browser-side. When an ad network wants to serve an ad, the browser could request an ad targeted at sports enthusiasts who like dogs. If someone visited a fishing website, that site might add a tag to the person's dossier. A more interactive implementation might actually say, "Hey, this website wants to identify you as a fisherman." Later, the person could modify or even wipe out their personal dossier. Family members could switch dossiers depending on who is using the browser, or individuals could maintain multiple personas (the dog-loving sports enthusiast and the business professional into golfing). This would solve one problem the ad industry is struggling with, which is transparency. Some ad networks may reject this idea because they won't be able to throttle ads (not true: the browser could tell the ad network not to send some ads) or because they can't resell their data about customers. As to the second point, just because something is more cost effective for the business (and thus the consumer) doesn't make it an acceptable practice. Cold calling is an effective sales technique (even with only a fraction of a percent acceptance rate), but we as a society reject it because the benefit to consumers is not apparent. Under that argument, any privacy-invasive technique that saves a company money could be argued to be beneficial to the consumer, no matter how privacy invasive.
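The browser-side dossier above can be sketched in a few lines. This is a hypothetical design of my own, not an existing API: the browser, not the ad network, holds the interest tags, and the user can inspect, edit, switch, or wipe them. The ad request carries tags only, no identifier.

```python
class LocalDossier:
    """Hypothetical browser-side interest profile supporting multiple
    personas, user editing, and a track-free ad request."""

    def __init__(self):
        self.personas = {"default": set()}
        self.active = "default"

    def tag(self, interest):
        """Add an interest tag, e.g. a fishing site suggesting 'fisherman'."""
        self.personas[self.active].add(interest)

    def switch(self, persona):
        """Switch personas, e.g. between family members sharing a browser."""
        self.personas.setdefault(persona, set())
        self.active = persona

    def wipe(self):
        """User control: erase the active persona's profile entirely."""
        self.personas[self.active] = set()

    def ad_request(self):
        """What the browser would send the ad network: interest tags only,
        with no user identifier, so it can target without tracking."""
        return {"interests": sorted(self.personas[self.active])}
```

The key design choice is that `ad_request` is the only thing that ever leaves the browser, which gives the user the transparency and control that server-side tracking cannot.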

Tracking clearly falls within the umbrella of a surveillance system. Many types of surveillance could bring about more cost efficiencies, but that doesn't make them legitimate. The question is: are there other business models that bring the cost benefits of targeted advertising but don't carry the privacy harms of tracking?

Before the W3C working group can come to an agreement, it must realize that tracking and personalization are not inherently at odds and that creative design solutions are possible. Industry must be willing to explore all technological options and fully understand the privacy risks and tradeoffs of different solutions. I'd love the opportunity to work with any ad network interested in exploring the options.