Breaking down “Personal Data”

I’ve railed for years against the use of PII (or Personally Identifiable Information) as unhelpful in the privacy sphere. The term appears in some US legislation and has unfortunately made its way into the vernacular of the cyber-security industry and privacy professionals. Use of the term PII is necessarily limiting and doesn’t allow organizations to see the breadth of privacy issues that may accompany non-identifying personal data. This post is meant to shed light on the nuances in different types of data. While I’ll reference definitions found in the GDPR, this post is not meant to be legislation specific.

Personal Data versus Non-personal Data

The GDPR defines Personal Data as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.” The key term in the definition is the phrase “relating to.” This broadly refers to any data or information that has anything to do with a particular person, regardless of whether that data helps identify the person or whether that person is known. This contrasts with non-personal data, which has no relationship to an individual.

Personal Data: “John Smith’s eyes are blue.”

In this phrase, there are three pieces of personal data. The first is the first name, John, which relates to an individual, John Smith. The second is his last name, Smith. Finally, the third is blue eyes, which also relates to John Smith.

Anonymous Data: “People’s eyes are blue.”

No personal data is indicated in the above sentence as the data doesn’t relate to an individual, identified or identifiable. It relates to people in general.

Identified Data versus Pseudonymous Data

Much consternation has been exhibited over the concept of pseudonymized data. The GDPR defines pseudonymisation as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.” The key phrase in this definition is that the data can no longer be attributed to an individual without additional information. Let me break this down.

Identified Data: “John Smith’s eyes are blue.”
The same phrase we used in our example for Personal Data is identified because the individual, John Smith, is clearly identified in the statement.

Pseudonymous (Identifiable) Data: “User X’s eyes are blue.”
Here we have processed the individual’s name and replaced it with User X. In other words, it’s been pseudonymized. However, it is still identifiable. From the definition above, Personal Data is data relating to an identified or identifiable individual. Blue eyes are still related to an identifiable individual, User X (aka John Smith). We just don’t know who he is at the moment. Potentially, we can combine information that links User X to John Smith. Where some people struggle is in understanding that there must be some form of separation between the use of the User X pseudonym and User X’s underlying identity. Store both in one table without any access controls and you’ve essentially pierced the veil of pseudonymity. WARNING: Here is where it can get tricky. Blue eyes are potentially identifying. If John Smith is the only user with blue eyes, it becomes much easier to identify User X as John Smith. This is a huge pitfall, as most attributable data is potentially re-identifying when combined with some other data.
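
To make the separation concrete, below is a minimal sketch in Python of one way to pseudonymize a record while keeping the re-identification mapping in a separate, access-controlled store. The names, keys and tables are hypothetical, purely for illustration.

  import hashlib, hmac, secrets

  # Key used to derive pseudonyms; in practice this would live in a separate,
  # access-controlled secrets store, not alongside the working data.
  PSEUDONYM_KEY = secrets.token_bytes(32)

  def pseudonymize(name: str) -> str:
      # Keyed hash so pseudonyms can't be recreated without the key
      return hmac.new(PSEUDONYM_KEY, name.encode(), hashlib.sha256).hexdigest()[:12]

  user_x = pseudonymize("John Smith")

  # Working dataset: attributes tied only to the pseudonym ("User X's eyes are blue")
  working_table = {user_x: {"eye_color": "blue"}}

  # Re-identification mapping, held separately under its own access controls.
  # Put it in the same table as working_table and the pseudonymity collapses.
  identity_table = {user_x: "John Smith"}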

Identifying Data versus Attributable Data

In looking at the phrase “John Smith’s eyes are blue” we can distinguish between identifying data and attributable data.

Identifying Data: “John Smith”
Without going into the debate over the number of John Smiths in the world, we can consider a person’s name fairly identifying. While the name John Smith isn’t necessarily uniquely identifying, a name, as a type of data, can be uniquely identifying.

Attributable Data: “blue eyes”
Blue eyes are an attribute. They can be attributable to a person, as in our phrase “John Smith’s eyes are blue.” They can be attributable to a pseudonym: “User X’s eyes are blue.” As we’ll see below, they can also be attributed anonymously.

Anonymous Data versus Anonymized Data

GDPR doesn’t define anonymous data, but Recital 26 refers to “anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.” In the first example, I distinguished Personal Data from Anonymous Data, which didn’t relate to a specific individual. Now we need to consider the scenario where we have clearly Personal Data which we anonymize (or render anonymous in such a manner that the data subject is not or no longer identifiable).

Anonymous Data: “People’s eyes are blue.”
For this statement, we were never talking about a specific individual; we’re making a generalized statement about people and an attribute shared by people.

Anonymized Data: “User’s eyes are blue.”
For this statement, we took Identified Data (“John Smith’s eyes are blue”) and processed it in a way that is potentially anonymous. We’ve now returned to the conundrum presented with Pseudonymous Data. Specifically, if John Smith is the only user with blue eyes, then this is NOT anonymous. Even if John Smith is one of a handful of users with blue eyes, the degree of anonymity is fairly low. This is the concept of k-anonymity, whereby a particular individual is indistinguishable from k-1 other individuals in the data set. However, even this may not give sufficient anonymization guarantees. Consider a medical dataset of names, ethnicities and heart conditions. A hospital releases an anonymized list of heart conditions (3 people with heart failure, 2 without). Someone with outside knowledge (that those of Japanese descent rarely have heart failure, and the names of the patients) could make a fairly accurate guess as to which patients had heart failure and which did not. This revelation brought about the concept of l-diversity in anonymized data. The point here is that unlike Anonymous Data, which never related to a specific individual, Anonymized Data (and Pseudonymized Data) should be carefully examined for potential re-identification. Anonymizing data is a potential minefield.
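
As a rough illustration of the k-anonymity and l-diversity checks described above, here is a small Python sketch; the records and column names are hypothetical.

  from collections import Counter, defaultdict

  records = [
      {"eye_color": "blue",  "heart_failure": True},
      {"eye_color": "blue",  "heart_failure": False},
      {"eye_color": "brown", "heart_failure": True},
      {"eye_color": "brown", "heart_failure": True},
      {"eye_color": "brown", "heart_failure": False},
  ]

  def k_anonymity(rows, quasi_ids):
      # k = size of the smallest group sharing the same quasi-identifier values
      groups = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
      return min(groups.values())

  def l_diversity(rows, quasi_ids, sensitive):
      # l = fewest distinct sensitive values within any quasi-identifier group
      groups = defaultdict(set)
      for r in rows:
          groups[tuple(r[q] for q in quasi_ids)].add(r[sensitive])
      return min(len(values) for values in groups.values())

  print(k_anonymity(records, ["eye_color"]))                   # 2: only two blue-eyed records
  print(l_diversity(records, ["eye_color"], "heart_failure"))  # 2: each group has both outcomes

If every blue-eyed record had heart failure, k would still be 2 but l would drop to 1, which is exactly the outside-knowledge problem in the hospital example.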

If you need help navigating this minefield, please feel free to reach out to me at Enterprivacy Consulting Group

Bots, privacy and suicide

I had the pleasure of serving last week on a panel at the Privacy and Security Forum with privacy consultant extraordinaire Elena Elkina and renowned privacy lawyer Mike Hintze. The topic of the panel was Good Bots and Bad Bots: Privacy and Security in the Age of AI and Machine Learning. Serendipitously, on the plane to D.C. earlier that morning someone had left a copy of the October issue of Wired Magazine, the cover of which displayed a dark and grim image of Ryan Gosling, Harrison Ford, Denis Villeneuve, and Ridley Scott from the new dystopian film, Blade Runner 2049. Not only was this a great intro to the idea of bots (in the movie’s case, human-like androids), but the magazine contained two articles pertinent to our panel discussion: “Q: In, say, a customer service chat window, what’s the polite way to ask whether I’m talking to a human or a robot?” and “Stop the chitchat: Bots don’t need to sound like us.” Our panel dove into the ethics and legality of deception, in, say, a customer service bot pretending to be human.

While the idea was fresh in my mind, I wanted to take a moment to replay some of the concepts we touched upon for a wider audience and talk about the case study we used in more detail than the forum allowed. First off, what did we mean by bots? I don’t claim this is a definitive definition, but we took the term, in this context, to mean two things:

  • Some form of human-like interface. This doesn’t mean they have to have the realism of Replicants in Blade Runner, but some mannerisms by which a person might mistake the bot for another person. This goes back to the days, as Elena pointed out, of Alan Turing and his Turing test, years before any computer could even think about passing. (“I see what you did there.”) The human-like interface potentially has an interesting property: are people more likely to let their guard down and share sensitive information if they think they are talking to another person? I don’t know the answer to that, and there may be some academic research on that point. If there isn’t, I submit that it would make for some interesting research.
  • The second is the ability to learn and be situationally aware. Again, this doesn’t require the super sophistication of IBM’s Watson, but any ability to adapt to changing inputs from the person with whom it is interacting. This is key, like the above, to giving the illusion that a person is interacting with another person. As a counterexample, Tinder is littered with “bots” that recite scripts with limited, if any, ability to respond to interaction.

Taxonomy of Risk

Now that we have a definition, what are some of the heightened risks associated with these unique characteristics of a bot that, say, a website doesn’t have? I use Dan Solove’s Taxonomy of Privacy as my go-to risk framework. Under the taxonomy I see five heightened risks:

  1. Interrogation (questioning or probing of personal information): In order to be situationally aware, to “learn” more, a bot may ask questions of someone. Those questions could go too far. While humans have developed social filters, which allow us to withhold inappropriate questions, a bot lacking a moral or social compass could ask questions which make the person uncomfortable or are invasive. My classic example of interrogation is an interview where the interviewer asks the candidate if they are pregnant or planning to become pregnant. Totally inappropriate in a job interview. One could imagine a front-line recruitment bot smart enough to know that pregnancy may impact immediate job attendance of a new hire but not smart enough to know that it’s inappropriate to ask that question (and certainly illegal in the U.S. to use pregnancy as a discriminatory criterion in hiring).
  2. Aggregation (combining of various pieces of personal information): Just as not all questions are interrogations, not all aggregation of data creates a privacy issue. It is when data is combined in new and unexpected ways, resulting in the disclosure of information that the individual didn’t want to disclose. Anyone could reasonably assume Target is aggregating sales data to stock merchandise and make broad decisions about marketing, but the ability to discern the pregnancy of a teenager from non-baby-related purchases was unexpected, and uninvited. For a pizza-ordering bot, consider the difference between knowing my last order was a vegetable pizza and discerning that I’m a vegetarian (something I didn’t disclose) because when I order for one it’s always vegetable, but if I order for more than one, it includes meat dishes.
  3. Identification (linking of information to a particular individual): There may be perfectly legitimate reasons a bot would need to identify a person (to access that person’s bank account, for instance), but identification becomes an issue when it’s the perception of the individual that they would remain anonymous or at the very least pseudonymous. If I’m interacting with a bot as StarLord1999 and all of a sudden it calls me by the name Jason, I’m going to be quite perturbed.
  4. Exclusion (failing to let an individual know about the information that others have about her and participate in its handling or use): As with aggregation, a situationally aware bot, pulling information from various sources, may alter its interaction in a way that excludes the individual from some service without the individual understanding why, and based on data the individual doesn’t know it has. For instance, imagine a mortgage loan bot that pulls demographic information based on a user’s current address and steers them towards less favorable loan products. That practice sounds a lot like red-lining and, if it has discriminatory effects, could be illegal in the U.S.
  5. Decisional Interference (intruding into an individual’s decision making regarding her private affairs): The classic example I use for decisional interference is China’s historic one-child policy, which interferes with a family’s decision making about their family make-up, namely how many children to have. So you ask, how can a bot have the same effect? Note the law is only influential, albeit in a very strong way. A family can still physically have multiple children, hide those children or take other steps to disobey the law, but the law is still going to have a manipulatory effect on their decision making. A bot, because of its human-like interface and advanced learning and situational knowledge, can be used to psychologically manipulate people. If the bot knows someone is psychologically prone to a particular type of argument style (say, appealing to emotion), it can use that, and information at its disposal, to subtly persuade the person towards a certain decision. This is a form of decisional interference.

Architecture and Policy

I’m not going to go into a detailed analysis of how to mitigate these issues, but I’ll touch on two thoughts: first, architectural design and second, public policy analysis. Privacy friendly architecture can be analyzed along two axes, identifiability and centralization. The more identified and more centralized the design, the less privacy friendly it is. It should be obvious that reducing identifiability reduces the risk of identification and aggregation (because you can’t aggregate external personal data from unidentified individuals) so I’ll focus here on centralization. Most people would mistakenly think of bots as being run by a centralized server, but this is far from the case. The Replicants in Blade Runner or “autonomous” cars are both prominent examples of bots which are decentralized. In fact, it should be glaringly apparent that a self-driving car being operated by a server in some warehouse introduces unnecessary safety risks. The latency of the communication, potential for command injections at the server or network layer, and potential for service interruption are unacceptable. The car must be able to make decisions immediately, without delay or risk of failure. Now decentralization doesn’t help with many of the bot specific issues outlined above, but it does help with other more generic privacy issues, such as insecurity, secondary use and others.

Public policy analysis is something I wanted to introduce with my case study during the interactive portion of the session at the Privacy and Security Forum. The case study I presented was as follows:

Kik is a popular platform for developing bots. https://bots.kik.com/#/ Kik is a mobile chat application used by 300 million people worldwide, and an estimated 40% of US teens at one time or another have used the application. The National Suicide Prevention Hotline, recognizing that most teens don’t use telephones, wants to interact with them in the services they use. The Hotline wants to create a bot to interact with those teens and suggest helpful resources. Where the bot recognizes a significant risk of suicide, rather than just casual inquiries or people trolling the service, the interactions will first be monitored by a human who can then intervene in place of the bot, if necessary.

I’ll highlight one issue, decisional interference, to show why it’s not a black and white analysis. Here, one of the objectives of the service, and the bot, is to prevent suicide. As a matter of public policy, we’ve decided that suicide is a bad outcome and we want to help people who are depressed and potentially suicidal get the help they need. We want to interfere with this decision. Our bot must be carefully designed to promote this outcome. We don’t want the bot to develop in a way that doesn’t reflect this. You could imagine a sophisticated enough bot going awry and actually encouraging callers to commit suicide. The point is, we’ve done that public policy analysis and determined what the socially acceptable outcome is. Many times organizations have not thought through what decisions might be manipulated by the software they create and what public policy should guide the way they influence those decisions. Technology is not neutral. Whether it’s decisional interference or exclusion or any of the other numerous privacy issues, thoughtful analysis must precede design decisions.

Financial Cryptography, Anguilla, Hettinga and Cate

As many of you probably know, the Lesser Antilles in the Caribbean were battered this week by Hurricane Irma, one of, if not the, most powerful storms to hit the Caribbean. Barbuda was essentially annihilated (90% of buildings damaged or destroyed). Anguilla, a country of only 11,000, hasn’t fared too well either.

What some, but not all, in the Financial Cryptography/crypto-currency community may not know is that Anguilla was home to the first and several subsequent International Financial Cryptography Conferences, beginning in 1997. The association, IFCA, still bears that legacy with the TLD of .ai, ifca.ai. The papers presented at IFCA conferences and the discussions held by many of its participants paved the ground for Tor, Bitcoin and the crypto-currency revolution. Eventually, the conference grew too big for this tiny island.

More directly, two of the founders of that conference, Bob Hettinga and Vince Cate, currently live on Anguilla. While they have since separated from IFCA, their early evangelism was instrumental in the robust, thriving and important work developed at the conference throughout the years. Bob and Vince were both directly affected by the hurricane, though not their spirit. Here is a quote from Bob about the aftermath:

Have a big wide racing stripe of missing aluminum from front to back. Sheets of water indoors when it rains. Gonna tarp it over tomorrow though. Sort of a rear vent across the back of the great room. See the videos. Blew out three steel accordion shutters covering three sliding glass doors. 200mph winds? No plywood on the island, much less aluminum.

Airport has no tower anymore. SXM airport had its brand new terminal demolished. Royal Navy tried to land at the commercial pier but the way was blocked. Rum sodomy and the lash.

We need money of course, blew a bunch prepping, etc. No reserves to speak of even then. Gotta pay people to do a bunch of stuff, but the ATMs are down so people are just doing shit anyway. Works in our favor except to buy actual stuff. We hadda a little cash stashed in the house and paid about half of that boarding up three of the six demolished sliding glass doors. Tomorrow we’ll put the MICR numbers off one of [Mrs RAH’s] checks here, so you can wire us money if you want. Heh. It still might be weeks before we get it. Or days. Just no idea right now.

If we can get the materials replacing the aluminum on the roof can be done in a couple days. Getting it here under normal circumstances would take weeks.

Certainly months now.

Then we have to rip out and replace all the Sheetrock in the house, basically all the closets and bathrooms and the inner bedroom walls. Again days to fix, months to do.

All the gas pumps on the island were flattened. Literally blown away. That makes thing interesting. A couple grocery stores are open, but who knows how long there inventories will hold up.

See “Royal Navy” above.

Both cisterns fine. 38000 gallons total. Full. Wind ripped off the gutters so no discernible salt in the water. And little floating litter on top. We’ve got water, and we can pull buckets out of the cisterns until the cows come home. I’d been looking at building a hand pump out of PVC and check valves, and now I have some incentive.

Car needs brake work. Threw the error codes the day after the storm. Supposed to park it and have the shop come and get it. Heh. No shop anymore. No money so it works out. 🙂 Car exists to charge the smartphones at the moment. Full tank. Life is good

See the rain from inside Bob’s house during the Hurricane.

I’m making this post to request that the community support Bob, Vince and the island of Anguilla. We can’t help the entire Caribbean but we can donate to efforts to support them and our adoptive island of Anguilla.

For donations to support the island of Anguilla, please consider donating to the Help Anguilla Rebuild Now fund or individual recipients on this page

To help Bob or Vince, please contact them directly (or you can contact me). I will update this page with funding options as I’m made aware of them.

My contact information:

Email: rjc at privacymaverick

Twitter @privacymaverick

LinkedIn

 

Recruiters…..

The whole employment market seems FUBAR (look it up if you don’t know). Not only am I constantly inundated with spam and calls telling me about a great new Sharepoint developer a staffing agency can place with me, recruiters send me desperately mismatched job opportunities. One in particular recently came across my email for a “Security Analyst” role. What struck me wasn’t the badly formatted main part of the message but the hilarity of the footers.

First was this:

The information transmitted in this email is intended solely for the individual or entity to which it is addressed and may contain confidential and/or privileged material. If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this transmission is prohibited. If you have received this transmission in error, please contact the sender and delete the material from your system.

The email “may” contain confidential information? I’m “prohibited” from disclosing the contents of the email? By what law, regulation, contract, theory or act of God am I prohibited? This type of language is reminiscent of the blind leading the naked. It’s the same silliness that I get sometimes when someone explains to me I have to answer their question because “It’s the law!” Really? What law? Where did you go to law school? Often times, it’s a refrain people use to make someone else compliant with their needs and wishes. If the recipient is as ignorant of the law as the sender, then compliance is assured.

The second part of the footer was even funnier:

Note: We respect your Online Privacy. This is not an unsolicited mail. Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail cannot be considered Spam as long as we include Contact information and a method to be removed from our mailing list. If you are not interested in receiving our e-mails then please enter “Please Remove” in the subject line and mention all the e-mail addresses to be removed, including any e-mail addresses which might be diverting the e-mails to you. We sincerely apologize for any inconvenience.

Let’s tally up the errors in this, shall we?

We respect your Online Privacy. Really? If you respected my privacy, you wouldn’t be spamming me with unsolicited messages, regardless of the law.

This is not an unsolicited mail. I didn’t solicit it, therefore it is unsolicited. You might be able to argue (though wrongly) that it doesn’t meet the definition of spam or isn’t illegal, but you can’t truthfully say it is not unsolicited.

Under Bill s.1618 Title III passed by the 105th U.S. Congress this mail cannot be considered Spam as long as we include Contact information and a method to be removed from our mailing list.
Somewhat technically true. Under that bill passed by the Senate in 1998, an “unsolicited commercial electronic mail message” must contain specific contact information and must stop further messages upon a reply that includes remove in the subject line. Several problems though. First, doing so doesn’t make it not spam (in fact the bill didn’t define spam) but rather makes it illegal if you don’t do so. Second, this bill, though it passed the Senate, never became law. While the email I received never claimed it was the law, the implication is clearly there.  On a side note, they failed to include a physical address as required by this “bill.”

If you are not interested in receiving our e-mails then please enter “Please Remove” in the subject line and mention all the e-mail addresses to be removed, including any e-mail addresses which might be diverting the e-mails to you. Wait, I have to include ALL e-mail addresses that might be diverting email to me? I have like 50 of those. I’m not sending you a list of all my email addresses. Just remove the one you sent me this message from!

We sincerely apologize for any inconvenience. No you don’t. You can’t be remorseful in advance. Apology not accepted.

 

Purple purses, privacy and more

[Twitter rarely affords me the opportunity for a full discussion. I present the following in clarification of a recent tweet.]

A recent promoted ad campaign called Purple Purse on Twitter caught my attention. Notably, the ad uses purported hidden-camera footage of individuals finding a purse left in a cab. In the purse, the phone rings, and the cab rider, after rooting through the purse and then the phone, uncovers evidence of domestic (financial) abuse.

First off, I want to say that domestic abuse is a hideous and far too common crime in the world today. I can’t count the number of times I’ve personally witnessed it and been essentially helpless to do anything. Two recent incidents come to mind. Once, while sitting on the patio (alone) at a restaurant at lunch, I witnessed a young man following a woman (within inches). While not physically accosting her, he was certainly intimidating her and speaking to her in a manner to exert control over her. I couldn’t exactly tell what he was saying, but based on their interaction they did not appear to be strangers.

The second incident took place one night while staying at a friend’s apartment. I could hear, upstairs, the male occupant verbally and physically assaulting his girlfriend. I was set to call the police, but my friend said she had done so on several occasions with no positive outcome. I withheld calling, principally out of concern for my friend, as it was clear hers was the only apartment which could hear the altercation. I didn’t want my friend hurt based on my calling the police on this obviously violent individual.

On another occasion, I did call the police years ago when I heard my pregnant neighbor being beaten by her then boyfriend. He left before they arrived, but they later arrested him.

Privacy has long been a shield to protect domestic abusers against government invasions. In general, the right to make familial decisions and be free from government interference is a hallmark of federal privacy law. It’s the basis of the Roe v. Wade and Griswold v. Connecticut decisions, wherein the right to privacy is a right against government intrusion into the sanctity of family decisions. Unfortunately, in a historically patriarchal society, the same argument supported a man’s right to discipline his wife. That view, fortunately, has fallen out of favor, at least within the law in the U.S.

Financial dependence goes hand in hand with domestic abuse. Controlling the purse strings is one of the strongest ways that domestic abusers control their victims. So it’s perfectly appropriate for the group behind Purple Purse to focus on “financial” domestic abuse as a means of uncovering deeper problems. This is one of the reasons that the financial industry must find ways to support “financial privacy” not just in confidentiality of financial transactions but censorship resistant financial tools. It isn’t just the government that is prone to censor people’s financial choices.

[Image: Lock screen from my personal phone indicating a number to contact if found.]

On its face, it appears that Purple Purse is encouraging people to invade one type of privacy (confidentiality) to discover another (financial privacy), or at the least to offset a social evil (domestic abuse). One could make the argument that if a victim needed to covertly disclose her predicament without alerting her abuser, this would be a mechanism to do so. Most people with an interest in their own privacy lock their phone, even with a simple 4-digit PIN code. In the words of the courts, locking one’s phone is a manifestation of a subjective expectation of privacy in the phone. Locking one’s phone is something which an outsider can view as an affirmative act which says “Hey, this is private, keep out.” To further the legal analysis, locking one’s phone is a manifestation which society is willing to objectively recognize.

I’m not making the argument that one might not have a subjective expectation of privacy in a lost but unlocked phone, but certainly the case is stronger if the phone is locked. A phone left unlocked could be, as Purple Purse might be suggesting, an effort by a victim to seek help.

 

 

 

 

In re comment on Financial Privacy blog

This post is in response to a comment on my blog post about Financial Privacy. See https://www.linkedin.com/groups/42462/42462-6280511786831659008

Anonymization

I use terms like unlinkability and anonymity in the academic vernacular, not in respect to any legal definition. After all, the law can define a word to mean anything it wants. The technique used to anonymize the transaction is similar to Anonymous Lightweight Credentials (see https://eprint.iacr.org/2012/298.pdf for more information on ASL). Breaking the anonymity would require solving the discrete log problem. Solving that problem would put in jeopardy much of the cryptography upon which the world relies today, so I’m reasonably confident of its security for the moment.  Spending a token under the Microdesic system based on the technique allows the user to prove they have the right to spend a token without identifying themselves as a particular person who owns a particular token.

Now, as far as de-anonymization under fraud, if a user double spends the same token, they reveal themselves. If I were to offer a somewhat real world analogy, it would go like this: I walk into a store. If I’m minding my own business, the store can’t distinguish me from any other customer in the store. I can purchase what I want and remain anonymous (subject to the store taking other measures outside this scenario, like performing facial recognition). However, if I commit a crime (in this case fraud), the store forces me to leave my passport behind. (It is sometimes hard to create real world analogies of the strange world of cryptography, but this should suffice).

In other words, prior to committing that fraudulent act, I’m anonymous. In the act of committing that fraud (in order for the store to accept my digital token/money), I’m standing up and announcing my identity and revealing my past purchases.
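
For readers who want a feel for how a double spend can forcibly reveal an identity, here is a toy Python sketch of the classic offline e-cash idea. It is only an illustration of the principle, not the actual Microdesic construction or the credential scheme cited above, and the parameters are made up.

  import secrets

  P = 2**127 - 1  # prime modulus (toy parameter)

  def issue_token(user_id: int):
      # the blinding secret is held only by the spender
      return {"user_id": user_id, "blind": secrets.randbelow(P)}

  def spend(token, challenge: int):
      # the merchant's random challenge elicits one point on the line
      # f(x) = user_id + blind * x (mod P); a single point reveals nothing
      return (challenge, (token["user_id"] + token["blind"] * challenge) % P)

  def identify_double_spender(resp1, resp2):
      (x1, y1), (x2, y2) = resp1, resp2
      if x1 == x2:
          return None  # only one spend observed: the spender stays anonymous
      slope = (y2 - y1) * pow(x2 - x1, -1, P) % P
      return (y1 - slope * x1) % P  # two points determine the line; its intercept is the id

  token = issue_token(user_id=424242)
  first = spend(token, secrets.randbelow(P - 1) + 1)
  second = spend(token, secrets.randbelow(P - 1) + 1)
  print(identify_double_spender(first, first))   # None: a single spend reveals nothing
  print(identify_double_spender(first, second))  # 424242: the double spender unmasks himself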

Returning now to the law, and specifically Recital 26 of the GDPR, it states: “To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.” There is clearly a temporal element. In other words, we need not account for a supercomputer in the distant future, or someone solving the discrete log problem. I also doubt the GDPR contemplates forcing the user to re-identify him or herself as a reasonable means of re-identification. Surely they aren’t saying that if you rubber-hose the user and tell them to identify when and where they made a purchase, that’s re-identification. The data subject always knows that information; the question is whether anyone else can ascertain it without the user’s assistance. Under the Microdesic system, at the time of a non-fraudulent transaction, there is no reasonable means of re-identification (i.e., you must solve the discrete log problem).

The Middle Man

The subject of my previous post was financial privacy vis-à-vis decisional interference. The comment to which this post replies posed the question of whether Microdesic becomes the middle-man with the ability to interfere in the decision-making capabilities (i.e., spending decisions) of the user. Let me first explain by counter-example. When a payment authorization request comes in to PayPal, it knows the account of the spender, the account of the recipient, who those parties are, how much is being transferred and some extra data collected (such as in a memo, etc.). At that point, PayPal could, based on that information, prevent the transaction from occurring. Maybe they think the amount is too high. Maybe the memo indicates the person is purchasing something against PayPal’s AUP. The point is, they can stop the transaction at the point of transaction. The way Microdesic works is different. A user in the Microdesic system is issued fungible tokens. From the system’s perspective, those tokens are indistinguishable from user to user. In fact, the system uses ring signatures, which mix a user’s tokens with other users’ tokens, to reduce correlation through forensic tracing. The tokens are then spent “offline” without the support of the Microdesic server. All the merchant knows is that they are receiving a valid token. Microdesic has no ability to prevent the transaction at the time of transaction.

Now for a bit of a caveat. Because the tokens are one time spends, the Merchant must subsequently redeem the tokens, either for other tokens or for some other form of money held in escrow against the value of the tokens. Microdesic could at this point require the Merchant to identify themselves and prevent redemption. Merchants that weren’t approved by Microdesic might therefore be excised from the system by virtue of being unable to redeem their tokens. However, the original point remains. Unlike a PayPal or credit card system, which authorizes each and every transaction, Microdesic has no ability to approve or disapprove of a particular transaction at the point of the transaction.
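
To make the architectural contrast concrete, here is a schematic Python sketch of the two models: an intermediary that authorizes every transaction at spend time versus a bearer-token issuer that only sees redemption after the fact. The classes are hypothetical stand-ins, not the actual PayPal API or the Microdesic protocol.

  class AuthorizingIntermediary:
      """Sees every payment detail at transaction time and can refuse it."""
      def __init__(self, blocked_merchants):
          self.blocked = set(blocked_merchants)

      def pay(self, payer, merchant, amount, memo=""):
          if merchant in self.blocked:
              raise PermissionError("transaction declined by intermediary")
          return {"payer": payer, "merchant": merchant, "amount": amount, "memo": memo}

  class BearerTokenIssuer:
      """Issues fungible tokens; the spend itself happens offline, out of its sight."""
      def __init__(self):
          self.issued, self.redeemed = set(), set()

      def issue(self, count=1):
          # stand-ins for blinded, fungible tokens
          tokens = [f"token-{len(self.issued) + i}" for i in range(count)]
          self.issued.update(tokens)
          return tokens

      def redeem(self, merchant, token):
          # the issuer's only leverage is here, after the transaction has already occurred
          if token not in self.issued or token in self.redeemed:
              raise ValueError("invalid or already-redeemed token")
          self.redeemed.add(token)
          return f"credited {merchant}"

The first class can block a purchase it dislikes before it happens; the second can, at most, refuse redemption later, which is the caveat described above.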

Financial Privacy and CryptoCurrencies

Financial privacy is most often conceptualized in terms of the confidentiality of one’s finances or financial transactions. That’s the “secrecy paradigm,” whereby hiding money, accounts, income and expenses prevents exposure of one’s activities, which could subject one to scrutiny or reveal tangential information one wants to keep private. Such secrecy can be paramount to security as well. Knowing where money is held, where it comes from or where it goes gives thieves and robbers the ability to steal that money or resource. Even knowing who is rich and who is poor helps thieves select targets.

Closely paralleling “financial privacy as confidentiality” is identity theft, the use of someone else’s financial reputation for one’s own benefit. In the US, the Gramm-Leach-Bliley Safeguards Rule provides some prospective protection against identity theft, while the Fair Credit Reporting Act (FCRA), the Fair and Accurate Credit Transactions Act (FACTA) and the FTC’s Red Flags Rule provide additional remedial relief.

However, as I’m fond of saying in my Privacy by Design Training Workshops, “that’s not the privacy issue I want to talk about now.” All of the preceding examples are issues of information privacy. Privacy, though, is a much broader concept than that of information. In his Taxonomy of Privacy, Professor Daniel Solove categorized privacy issues into four groups: information collection, information processing, information dissemination and invasions. It’s that last category to which I turn the reader’s attention. Two specific issues fall under the category of “invasions,” namely intrusion and decisional interference. Intrusion is something commonly experienced by all when a telemarketer calls, a pop-up ad shows up in your browser window, you receive spam in your inbox or a Pokemon Go player shows up at your house; it is the disturbance of one’s tranquility or solitude. Decisional interference may be a more obscure notion for most readers, except for those familiar with US constitutional law. In a series of cases, starting with Griswold v. Connecticut and more recently and famously in Lawrence v. Texas, the Supreme Court rejected government “intrusion” into the decisions individuals make about their private affairs, with Griswold concerning contraceptives and family planning and Lawrence concerning homosexual relationships. In my workshop, I often discuss China’s one-child policy as an example of such interference.

The concept of decisional interference has historical roots in US privacy law. Alan Westin’s definition of information privacy (“the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others”) includes a decisional component. Warren and Brandeis’ right “to be let alone” also embodies this notion of leaving one undisturbed, free to make decisions as an autonomous individual, without undue influence by government actors. With this broader view, decisional interference need not be restricted to family planning but could be viewed as any interference with personal choices, say a government restricting your ability to consume oversized soft drinks. I guess that’s why my professional privacy career and my political persuasion of libertarianism are so closely tied.

But is decisional interference solely the purview of government actors, then? Until recently, I struggled to come up with a commercial example of decisional interference, owing to my fixation on private family matters. A recent spark changed that. History is replete with financial intermediaries using their position to prevent activities they dislike, and since many modern individual decisions involve spending money, the ability of an intermediary to disrupt a financial transaction is a form of decisional interference. A quick look at PayPal’s Acceptable Use Policy provides numerous examples of prohibited transactions that are not necessarily illegal (illegality is covered by the first line of their AUP). Credit card companies have played moral police, sometimes but not always at the behest of government, more often being overly cautious and going beyond legal requirements. Even a financial intermediary’s prohibition on illegal activities is potentially problematic, as commercial entities are not criminal law experts and will choose risk-averse prohibition more often than not, leading to the chilling of completely legal, but financially dependent, activity.

This brings me to the subject of crypto-currencies. Much of the allure of a decentralized money system like Bitcoin is not in financial privacy vis-à-vis confidentiality (though Zerocoin provides that) but in the privacy of being able to conduct transactions without interference by government or, importantly, a commercial financial intermediary. What I’m saying is not an epiphany. There is a reason that Bitcoin begat the rise of online dark markets for drugs and other prohibited items. Not because of the confidentiality of such transactions (in fact, the lack of confidentiality played into the take-down of the Silk Road) but because no entity could interfere with the autonomous decision making of the individual to engage in those transactions.

Regardless of your position on dark markets, you should realize that in a cashless world, the ability to prevent, deter or even discourage financial transactions is the ability to control a society. The infamous pizza privacy video is disturbing not just because of the information privacy invasions but because it attacks the individual’s autonomy in deciding what food they will consume, by charging them different prices based on an external intermediary’s social control (here a national health care provider). This is why a cashless society is so scary and why cryptocurrencies are so promising to so many. They return financial privacy of electronic transactions vis-à-vis decisional autonomy to the individual.

[Disclosure: I have an interest in a fintech startup developing anonymous auditable accountable tokens which provides the types of financial privacy identified in this post.]

Trust by Design

I’ve fashioned myself a Privacy and Trust Engineer/Consultant for over a year now and I’ve focused on the Trust side of Privacy for a few years, with my consultancy really focused on brand trust development rather than privacy compliance. Illana Westerman‘s talk about Trust back at the 2011 Navigate conference is what really opened my eyes to the concept. Clearly trust in relationships requires more than just privacy. It is a necessary condition but not necessarily sufficient. Privacy is a building block to a trusted relationship, be it inter-personal, commercial or between government and citizen.

More recently, Woody Hartzog and Neil Richards’ “Taking Trust Seriously in Privacy Law” has fortified my resolve to push trust as a core reason to respect individual privacy. To that end, I thought about refactoring the 7 Foundational Principles of Privacy by Design in terms of trust, which I present here.

  1. Proactive not reactive; Preventative not remedial  → Build trust, not rebuild
    Many companies take a reactive approach to privacy, preferring to bury their head in the sand until an “incident” occurs and then trying to mitigate the effects of that incident. Being proactive and preventative means you actually make an effort before anything occurs. Relationships are built on trust. If that trust is violated, it is much harder (and more expensive) to regain. As the adage goes “once bitten, twice shy.”
  2. Privacy as the default setting → Provide opportunities for trust
    Users shouldn’t have to work to protect their privacy. Building a trusted relationship occurs one step at a time. When the other party has to work to ensure you don’t cross the line and breach their trust, that doesn’t build the relationship but rather stalls it from growing.
  3. Privacy embedded into the design → Strengthen trust relationship through design
    Being proactive (#1) means addressing privacy issues up front. Embedding privacy considerations into product/service design is imperative to being proactive. The way an individual interfaces with your company will affect the trust they place in the company. The design should engender trust.
  4. Full functionality – positive sum, not zero sum →  Beneficial exchanges
    Privacy, in the past, has been described as a feature killer, preventing the ability to fully realize technology’s potential. Full functionality suggests this is not the case and that you can achieve your aims while respecting individual privacy. Viewed through the lens of trust, commercial relationships should constitute beneficial exchanges, with each party benefiting from the engagement. What often happens is that because of asymmetric information (the company knows more than the individual) and various cognitive biases (such as hyperbolic discounting), individuals do not realize what they are conceding in the exchange. Beneficial exchanges mean that, despite one party’s knowledge or beliefs, the exchange should still be beneficial for all involved.
  5. End to end security – full life-cycle protection → Stewardship
    This principle was informed by the notion that sometimes organizations were protecting one area (such as collection of information over SSL/TLS) but were deficient in protecting information in others (such as storage or proper disposal). Stewardship is the idea that you’ve been entrusted, by virtue of the relationship, with data and you should, as a matter of ethical responsibility, protect that data at all times.
  6. Visibility and transparency – keep it open → Honesty and candor
    Consent is a bedrock principle of privacy and in order for consent to have meaning, it must be informed; the individual must be aware of what they are consenting to. Visibility and transparency about information practices are key to informed consent. Building a trusted relationship is about being honest and candid. Without these qualities, fear, suspicion and doubt are more prevalent than trust.
  7. Respect for user privacy – keep it user-centric  → Partner not exploiter
    Ultimately, if an organization doesn’t respect the individual, trying to achieve privacy by design is fraught with problems. The individual must be at the forefront. In a relationship built on trust, the parties must feel they are partners, both benefiting from the relationship. If one party is exploitative, because of deceit or because the relationship is achieved through force, then trust is not achievable.

I welcome comments and constructive criticisms of my analysis. I’ll be putting on another Privacy by Design workshop in Seattle in May and, of course, am available for consulting engagements to help companies build trusted relationships with consumers.

 

Price Discrimination

This post is not an original thought (do we truly even have “original thoughts,” or are they all built upon the thoughts of others? I leave that for others to blog about). I recently read a decade-old paper on price discrimination and privacy by Andrew Odlyzko. It was a great read and it got me thinking about many of the motivations for privacy invasions, particularly this one.

Let me start out with a basic primer on price discrimination. The term refers to pricing items based on the valuation of the purchaser, in other words discrimination in the pricing of goods and services between individuals. Sounds a little sinister, doesn’t it? Perhaps downright wrong, unethical. Charging one price for one person and a different price for another.  But price discrimination can be a fundamental necessity in many economic situations.

Here’s an example. Let’s say I am bringing cookies to a bake sale. For simplicity, let’s say there are three consumers at this sale (A, B and C). Consumer A just ate lunch so isn’t very interested in a cookie but is willing to buy one for $0.75. Consumer B likes my cookies and is willing to pay $1.00. Consumer C hasn’t eaten and loves my cookies but only has $1.50 on him at the time. Now, excluding my time, the ingredients for the cookies cost $3.00. At almost every price point, I end up losing money:

Sale price $0.75 -> total is 3x$0.75 = $2.25
Sale price $1.00 -> total is 2x$1.00 = $2.00 (Consumer A is priced out as the cost is more than they are willing to pay)
Sale price $1.50 -> total is 1x$1.50 = $1.50 (Here both A and B are priced out)

However, if I was able to charge each Consumer their respective valuation of my cookies, things change.

$0.75+$1.00+$1.50= $3.25

Now, not only does everyone get a cookie for what they were willing to pay, I cover my cost and earn some money to cover my labor in baking the cookie. Everybody is happier as a result, something that could not have occurred had I not been able to price discriminate.
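
The arithmetic above can be summarized in a few lines of Python, using the hypothetical valuations from the example.

  valuations = {"A": 0.75, "B": 1.00, "C": 1.50}  # each consumer's willingness to pay
  cost = 3.00                                     # cost of ingredients

  for price in sorted(valuations.values()):
      buyers = [c for c, v in valuations.items() if v >= price]
      print(f"uniform price ${price:.2f}: {len(buyers)} buyers, revenue ${price * len(buyers):.2f}")

  print(f"perfect price discrimination: revenue ${sum(valuations.values()):.2f} vs cost ${cost:.2f}")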

What does this have to do with Privacy? The more I know about my consumers, the more I’m able to discover their price point and price sensitivity. If I know that A just ate, or that C only has $1.50 in his pocket, or that B likes my cookies, I can hone in on what to charge them.

Price discrimination, it turns out, is everywhere, and so are mechanisms to discover personal valuation. Think of discounts to movies for students, seniors and military personnel. While some movie chains may mistakenly believe they are doing it to be good members of society, the real reason is that they are price discriminating. All of those groups tend to have less disposable income and thus are more sensitive to where they spend that money. Movie theaters rarely fill up, and an extra sale is a marginal income boost to the theater. This is typically where you find price discrimination: where the fixed costs are high (running the theater) but the marginal cost per unit sold is low. Where there is limited supply and higher demand, the seller will sell to those willing to pay the highest price.

But what do the movie patrons have to do to obtain these cheaper tickets? They have to reveal something about themselves….their age, their education status or their profession in the military.

Other forms of uncovering consumer value also have privacy implications. Most of them are very crude groupings of consumers into buckets, just because our tools are crude, but some can be very invasive. Take the FAFSA, the Free Application for Federal Student Aid. This form is not only needed for U.S. Federal loans and grants, but many universities rely on this form to determine scholarships and discounts. This extremely probing look into someone’s finances is used to perform price discrimination on students (and their parents), allowing those with lower income, and thus higher price sensitivity, to pay less for the same education as another student from a wealthier family.

Not all methods of price discrimination affect privacy, for instance, bundling. Many consumers bemoan the bundling done by cable companies who don’t offer an à la carte selection of channels. The reason for this is price discrimination. If they offered each channel at $1 per month, they would forgo revenue from those willing to pay $50 a month for the Golf Channel or those willing to pay $50 a month for the Game Show Network. By bundling a large selection of channels, many of which most consumers don’t want, they are able to maximize revenue from those with high price points for certain channels as well as those with low price points for many channels.

I don’t have any magic solution (at this point). However, I hope by exposing this issue more broadly we can begin to look for patterns of performing price discrimination without privacy invasions. One of the things that has had me thinking about this subject is a new App I’ve been working on for privacy preserving tickets and tokens for my start-up Microdesic. Ticket sellers have a problem price discriminating and tickets often end up on the secondary market as a result.

[I’ll take the bottom of this post to remind readers of two upcoming Privacy by Design workshops I’ll be conducting. The first is in April in Washington, D.C. immediately preceding the IAPP Global Summit. The second is in May in Seattle. Note, the tickets ARE price discriminated, so if you’re a price sensitive consumer, be sure to get the early bird tickets. ]

Pokemon Goes to Church

In case you haven’t read enough about Pokemon Go and Privacy

In the past, you knew you’d arrived on the national scene if Saturday Night Live parodied you. While SNL still remains a major force in television, the Onion has taken its place for the Internet set. Just as privacy issues have graced the covers of major news sites around the world, so too have they made their way into plenty of Onion stories. The latest faux news story involves the Pokemon Go craze sweeping the nation like that insidious game in Star Trek: The Next Generation that took over crew members’ brains on the Enterprise.

“What is the object of Pokemon Go?” asks the Onion in their article. And their response was “To collect as much personal data for Nintendo as possible.” That may or may not have been part of Nintendo’s intent, but the Onion found humor in it because of its potential for truth. Often times comedians create humor from uncomfortable truthfulness. In a world of Flashlight apps collecting geolocation, intentions for collecting data are not always clear, as was the case with Nintendo’s potential collection through its game. Much has already been written about this. So much attention has been focused on Nintendo that it stirred frequent pro-privacy Senator Al Franken to write a letter. I’d like to focus, though, on something that another news story picked up.

The privacy issue I’m talking about isn’t about the collection of information by Pokemon Go or even the use of the information that was collected. The privacy issue I want to relay is something even the most astute privacy professional might overlook in an otherwise thorough privacy impact assessment. As mentioned by Beth Hill in her previous post on the IAPP about Pokemon Go, a man who lived in a church found players camped outside his house. The App uses churches as gyms, where players converge to train. This wouldn’t normally be problematic, but one particular church was converted years ago into a private residence. The privacy issue at play here is one of invasion, defined by Dan Solove as “an invasive act that disturbs one’s tranquility or solitude.” We typically see invasion issues crop up related to spam emails, browser pop-ups, or telemarketing.

This isn’t the first time we’ve seen this type of invasion. In order to personalize services, many companies subscribe to IP address geolocation services. These address translation services translate an IP address into a geographic location. Twenty years ago, the best one could do would be a country or region based on assigned IP address space in ARIN (the American Registry for Internet Numbers). If your IP address was registered to a California ISP, you were probably in California. The advent of smartphones and geolocation has added a wealth of data granularity to these systems. Now, if you connect your smartphone to your home WiFi, the IP address associated with that WiFi could be tied to your exact longitude and latitude. Who do you think that “Flashlight” application was selling your geolocation information to? The next time you go online with your home computer (without GPS), services still know where you are by virtue of the previously associated IP address and geolocation. Among the subscribers to these services are law enforcement agencies, lawyers and a host of others trying to track people down. Behind on your child support payments? Let them subpoena Facebook, get the IP address you last logged in from and then geolocate that to your house to serve you with a warrant. Now that’s personalization by the police department! No need to be inconvenienced and go down to the station to be arrested. But what happens when your IP address has never been geolocated? Many address translation services just pick the geographic center of whatever area they can determine, be that a city, state or country. Read about a Kansas farm owner’s major headaches because he’s located at the geographic center of the U.S. at http://fusion.net/story/287592/internet-mapping-glitch-kansas-farm/
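
As an aside, the centroid fallback that caused the Kansas farm problem is easy to picture in code. This is a toy sketch with made-up addresses and coordinates, not any real provider’s database or API.

  # Toy lookup table: precise coordinates learned from devices that paired
  # an IP address with GPS/WiFi location data (values are hypothetical).
  KNOWN_LOCATIONS = {
      "198.51.100.7": (47.61, -122.33),
  }

  # Fallback centroids used when nothing more precise is known.
  COUNTRY_CENTROIDS = {
      "US": (39.83, -98.58),  # roughly the geographic center of the contiguous U.S.
  }

  def geolocate(ip: str, country: str = "US"):
      # Unknown IPs get pinned to the centroid, which is how an unlucky
      # household near that point ends up "located" at millions of addresses.
      return KNOWN_LOCATIONS.get(ip, COUNTRY_CENTROIDS[country])

  print(geolocate("198.51.100.7"))  # precise, previously associated location
  print(geolocate("203.0.113.9"))   # never seen before: centroid fallback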

Many privacy analysts wouldn’t pick up on these types of privacy concerns, for no less than four reasons. First, it doesn’t involve information privacy, but intrusion into an individual’s personal space. Second, even when looking for intrusion-type risks, an analyst is typically thinking of marketing issues (through spamming or solicitation), in violation of CAN-SPAM, CASL, the telephone sales solicitation rules or other national laws. Third, the invasion didn’t involve Pokemon Go user privacy but rather another distinct party. This isn’t something that could be disclosed in a privacy policy or adequately addressed by App permissions settings. Finally, the data in question didn’t involve “personal data.” It was the addresses of churches at issue. If you haven’t been told by a system owner, developer or other technical resource that no “personal data” is being collected, stored or processed, then you clearly haven’t been doing privacy long enough. In this case, they would be more justified than most. Now, this isn’t to excuse the developers for using churches as gyms. An argument could easily be made that people are just as deserving of “tranquility and solitude” in their religious observances as in their homes.

Ignoring the physical invasion of the religious institution’s space for one moment, one overriding problem in identifying this issue is that it is rare. Most churches simply aren’t people’s homes. A search on the Internet reveals a data broker selling a list of 110,000 churches in the US (including geolocation coordinates). If the one news story represents the only affected individual, this means that only approximately 1 in 100,000 churches was actually someone’s home. If you’re looking for privacy invasions, this is probably not high on your list based on a risk-based analysis.

There are two reasons that this is the wrong way to think about this. First off, if your company has millions of users (or is encouraging millions of users to go to church), even very rare circumstances will happen. Ten million users with a one-in-a-million chance of a particular privacy invasion means it is going to happen, on average, to ten users. The second reason this is extremely important to business is that these types of very rare circumstances are newsworthy. It is the one Kansas farm that makes the news. It is the one pregnant teenager you identify through big data that gets headlines. The local auto fatality doesn’t make the front page, but if one person poisons a few bottles of pills out of the billions sold, then your brand name is forever tied to that tragedy. Corporations can’t take advantage of the right to be forgotten.
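
The expected-incidence arithmetic is worth making explicit; the numbers are the hypothetical ones from the paragraph above.

  users = 10_000_000
  p_invasion = 1 / 1_000_000     # a one-in-a-million privacy invasion
  print(users * p_invasion)      # 10.0 affected users, on average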

Assuming you can identify the issue, what do you do? Despite the rarity of the situation, the fact that it doesn’t involve information, it isn’t about marketing, it isn’t about your customers or users of your service, and, on its face, it doesn’t involve personal data, is all hope lost? What controls are available at your disposal to mitigate the risks? Pokemon Go’s developers were clearly cognizant enough not to include personal residences as gyms. They chose locations that were primarily identified as public. At a minimum, then, they could potentially have done more to validate the quality of the data and confirm that their list of churches didn’t actually contain people’s residences. Going a step further, they could have considered excluding churches from the list of public places. This avoids not only the church-converted-to-residence issue but also the invasion into religious practitioners’ solitude. Of course, the other types of locations chosen as gyms still need to be scrubbed for accuracy as public spaces. However, even this isn’t sufficient. Circumstances change over time. What is a church or a library today may be someone’s home tomorrow. Data ages. Having a policy of aging information and constantly updating it is important even when it may not be, on its face, personal data. A really integrated privacy analyst or a development team that was privacy aware could even have turned this into a form of game play. Getting users to, subtly, report back through an in-game mechanism that something is no longer a gym (i.e., no longer a public space) would keep your data fresh and mitigate privacy invasions.

No-one ever said the job of a privacy analyst was easy, but with the proper analysis, the proper toolset and the proper support of the business, you can keep your employer out of the news and try keeping your customers (and non-customers) happy and trusting your brand.