Pokemon Goes to Church

In case you haven’t read enough about Pokemon Go and Privacy

In the past, you knew you’d arrived on the national scene if Saturday Night Live parodied you. While SNL remains a major force in television, the Onion has taken its place for the Internet set. Just as privacy issues have graced the covers of major news sites around the world, so too have they made their way into plenty of Onion stories. The latest faux news story involves the Pokemon Go craze sweeping the nation like that insidious game in Star Trek: The Next Generation that took over the brains of the Enterprise crew.

“What is the object of Pokemon Go?” asks the Onion in its article. Its response: “To collect as much personal data for Nintendo as possible.” That may or may not have been part of Nintendo’s intent, but the Onion found humor in its potential for truth. Oftentimes comedians create humor from uncomfortable truthfulness. In a world of flashlight apps collecting geolocation, intentions for collecting data are not always clear, as was the case with Nintendo’s potential collection through its game. Much has already been written about this. So much attention has been focused on Nintendo that it stirred Senator Al Franken, a frequent privacy advocate, to write a letter. I’d like to focus, though, on something that another news story picked up.

The privacy issue I’m talking about isn’t the collection of information by Pokemon Go, or even the use of the information collected. The privacy issue I want to relay is something even the most astute privacy professional might overlook in an otherwise thorough privacy impact assessment. As Beth Hill mentioned in her previous post on the IAPP about Pokemon Go, a man who lived in a church found players camped outside his house. The app uses churches as gyms, places where players converge to train. This wouldn’t normally be problematic, but one particular church had been converted years ago into a private residence. The privacy issue at play here is one of invasion, defined by Dan Solove as “an invasive act that disturbs one’s tranquility or solitude.” We more commonly see invasion issues crop up in relation to spam emails, browser pop-ups, or telemarketing.

This isn’t the first time we’ve seen this type of invasion. In order to personalize services, many companies subscribe to IP address geolocation services, which translate an IP address into a geographic location. Twenty years ago, the best one could do was a country or region based on assigned IP address space in ARIN (the American Registry for Internet Numbers). If your IP address was registered to a California ISP, you were probably in California. The advent of smartphones and geolocation has added a wealth of granularity to these systems. Now, if you connect your smartphone to your home WiFi, the IP address associated with that WiFi can be tied to your exact longitude and latitude. Who do you think that “flashlight” application was selling your geolocation information to? The next time you go online with your home computer (without GPS), services still know where you are by virtue of the previously associated IP address and geolocation. Subscribers to these services include law enforcement, lawyers and a host of others trying to track people down. Behind on your child support payments? Let them subpoena Facebook, get the IP address you last logged in from, and then geolocate it to your house to serve you with a warrant. Now that’s personalization by the police department! No need to be inconvenienced and go down to the station to be arrested. But what happens when your IP address has never been geolocated? Many address translation services just pick the geographic center of whatever region they can determine, be that a city, state or country. Read about a Kansas farm owner’s major headaches, because his farm sits at the geographic center of the U.S., at http://fusion.net/story/287592/internet-mapping-glitch-kansas-farm/
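The centroid fallback described above can be sketched in a few lines. This is a hypothetical illustration, not any real geolocation vendor’s logic; the IP addresses (from reserved documentation ranges), city names and lookup tables are all invented for the example.

```python
# Hypothetical sketch of how an IP geolocation service might fall back to
# region centroids when it lacks precise data -- the behavior behind the
# Kansas farm story. All data below is illustrative, not a real database.

# Precise fixes previously learned (e.g., from a smartphone on that WiFi)
PRECISE_FIXES = {
    "203.0.113.7": (37.4419, -122.1430),  # documentation-range IP, made up
}

# Fallback centroids, from more specific to less specific regions
CITY_CENTROIDS = {"Palo Alto, CA": (37.4419, -122.1430)}
COUNTRY_CENTROIDS = {"US": (39.8283, -98.5795)}  # geographic center of the U.S.

def locate(ip, city=None, country=None):
    """Return (lat, lon, source), falling back to ever-coarser centroids."""
    if ip in PRECISE_FIXES:
        return (*PRECISE_FIXES[ip], "precise")
    if city in CITY_CENTROIDS:
        return (*CITY_CENTROIDS[city], "city centroid")
    if country in COUNTRY_CENTROIDS:
        return (*COUNTRY_CENTROIDS[country], "country centroid")
    return (None, None, "unknown")

# An IP the service has never geolocated precisely resolves to the center
# of the country -- which, for the U.S., happens to be near a Kansas farm.
print(locate("198.51.100.1", country="US"))
```

The design flaw is visible in the return value: the coarse centroid is indistinguishable, to downstream consumers, from a precise fix unless they also check the `source` field, which is exactly how a farm at the centroid ends up receiving visits meant for millions of other addresses.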

Many privacy analysts wouldn’t pick up on this type of privacy concern, for no fewer than four reasons. First, it doesn’t involve information privacy, but intrusion into an individual’s personal space. Second, even when looking for intrusion-type risks, an analyst is typically thinking of marketing issues (spamming or solicitation) in violation of CAN-SPAM, CASL, the telephone sales solicitation rules or other national laws. Third, the invasion didn’t involve Pokemon Go users’ privacy but rather that of a distinct third party. This isn’t something that could be disclosed in a privacy policy or adequately addressed by app permission settings. Finally, the data in question didn’t involve “personal data”; it was the addresses of churches at issue. If you haven’t been told by a system owner, developer or other technical resource that no “personal data” is being collected, stored or processed, then you clearly haven’t been doing privacy long enough. In this case, they would be more justified than most. Now, this isn’t to excuse the developers for using churches as gyms. An argument could easily be made that people are just as deserving of “tranquility and solitude” in their religious observances as in their homes.

Setting aside the physical invasion of religious institutions’ space for a moment, one overriding problem in identifying this issue is its rarity. Most churches simply aren’t people’s homes. A search on the Internet reveals a data broker selling a list of 110,000 churches in the U.S. (including geolocation coordinates). If the one news story represents the only affected individual, then only about 1 in 110,000 churches was actually someone’s home. If you’re looking for privacy invasions, this is probably not high on your list under a risk-based analysis.

There are two reasons that this is the wrong way to think about it. First, if your company has millions of users (or is encouraging millions of users to go to church), even very rare circumstances will happen. Ten million users with a one-in-a-million chance of a particular privacy invasion means it is going to happen, on average, to ten users. The second reason this matters immensely to business is that these very rare circumstances are newsworthy. It is the one Kansas farm that makes the news. It is the one pregnant teenager you identify through big data that gets headlines. The local auto fatality doesn’t make the front page, but if one person poisons a few bottles of pills out of the billions sold, your brand name is forever tied to that tragedy. Corporations can’t take advantage of the right to be forgotten.
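The “rare events happen at scale” arithmetic above is worth making explicit. With n users and a per-user probability p of hitting a rare privacy invasion, the expected number of affected users is n × p, and the chance that the invasion happens to at least one user is 1 − (1 − p)ⁿ. The numbers below mirror the example in the text; the per-user probability is, of course, an illustrative assumption.

```python
# The math behind "rare events happen at scale": expected count and the
# probability of at least one incident. n and p mirror the text's example.

n = 10_000_000      # users
p = 1 / 1_000_000   # assumed chance any one user hits the rare circumstance

expected = n * p                     # average number of affected users (~10)
p_at_least_one = 1 - (1 - p) ** n    # probability the issue occurs at all

print(expected)
print(p_at_least_one)
```

Running this shows the expected count is about ten users, and the probability that the invasion happens to *someone* is over 99.99 percent: at this scale, the “one in a million” framing is a near-certainty for the business, not a remote edge case.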

Assuming you can identify the issue, what do you do? Despite the rarity of the situation and the facts that it doesn’t involve information, isn’t about marketing, isn’t about your customers or users of your service and, on its face, doesn’t involve personal data, is all hope lost? What controls are at your disposal to mitigate the risks? Pokemon Go’s developers were clearly cognizant enough not to include personal residences as gyms; they chose locations primarily identified as public. At a minimum, then, they could have done more to validate the quality of the data and confirm that their list of churches didn’t actually contain people’s residences. Going a step further, they could have considered excluding churches from the list of public places altogether. This avoids not only the church-converted-to-residence issue but also the invasion of religious practitioners’ solitude. Of course, the other types of locations chosen as gyms still need to be scrubbed for accuracy as public spaces. However, even this isn’t sufficient. Circumstances change over time. What is a church or a library today may be someone’s home tomorrow. Data ages. Having a policy of aging information and constantly updating it is important even when it may not be, on its face, personal data. A truly integrated privacy analyst, or a development team that was privacy aware, could even have turned this into a form of game play. Getting users to subtly report back through an in-game mechanism that something is no longer a gym (i.e., no longer a public space) would keep the data fresh and mitigate privacy invasions.

No one ever said the job of a privacy analyst was easy, but with the proper analysis, the proper toolset and the proper support of the business, you can keep your employer out of the news and keep your customers (and non-customers) happy and trusting your brand.

Essentialism and Privacy

I first learned about essentialism while listening to an audiobook of The Greatest Show on Earth by Richard Dawkins. Essentialism has its roots in Plato’s Idealism, though I would suggest that our being drawn to it may be a result of the way the human brain functions. For those unfamiliar, essentialism, simply put, is the notion that “things” have an essential form behind them. Thus in Plato’s world, a circle is defined by a perfect ideal of a circle, and while real-world circles may have variations, bumps and such, a circle is essentially a line drawn around a point, at all times equidistant from that point.

There are a large variety of geometric shapes (triangles, squares, dodecagons) to which humans have assigned monikers. However, there are an infinite number of shapes that defy such simplistic definition. While a line equidistant from a point is the perfect circle, a random squiggle is the perfect whatever-it-is, despite our not having a name for it. Now, I don’t claim to be a neurobiologist, but in my rudimentary understanding, our brains store things in a way that provides simple categorization. Language is built on defining things we can relate to. We see something round, and our brain fires off the neurons that represent a circle. We can also abstract by grouping things together: we see a 12-sided shape, and we may know it is a polygon but not a dodecagon. Our brains are really good at analogizing as well. We learn by analogy. We see something big and strong, with fangs, baring its teeth; we may not know what it is, but we can recognize it’s probably a predator.

Dawkins discussed essentialism in the context of evolution. Prior to Charles Darwin, living creatures were broken into a taxonomy. In 1735, Carl Linnaeus, in his seminal work Systema Naturae, started with three kingdoms of nature (only two of which, animals and plants, were living), divided into classes, then orders, genera and species. We still use a form of this taxonomy today when we talk about life, only now, thanks to Thomas Cavalier-Smith, we have six kingdoms. Dawkins’ beef with essentialism is that by categorizing we make it more difficult to see evolutionary changes. Take a rabbit, defined as a furry creature with fluffy ears, a bushy tail and strong hind legs. But that’s the ideal; every rabbit is different, and if you go back in the ancestry of rabbits, when does it cease to be a rabbit? In the future, as generations are born, when does the descendant of a modern-day rabbit cease to be a rabbit? Humans have a hard time conceptualizing large spans of time, so we can analogize (again, using that great learning technique) to relatives and aging. My brother is clearly my relative, as are my first and second cousins. Though I don’t know them, I know I have third cousins and more distant relations. At what point, though, are we no longer “relatives”? One young girl even claimed to show that all but one of the presidents were related, tracing their lineage back to an English king. When I meet someone on the street, do I only fail to put them in the “relative” bucket in my brain because nobody has done the analysis? Aging provides a similar means of clearly showing the continuity of life and the breakdown of our taxonomy of age. We are born as babies, grow to be infants, then toddlers, next children, then young adults, then adults; then we’re labeled old, and perhaps elderly after that. But what defines those classifications? When do I become old? Do we wake up one day suddenly “elderly”? Isn’t 60 the new 30?

Once I learned about essentialism, I started seeing the dichotomy everywhere: the breakdown between where people try to classify or categorize things and the reality that there is a continuous line. One of my first epiphanies occurred when I was trying to clean up my vast MP3 collection. Many of the songs had no associated genre, or the genre was way off. I set about to correct that and started labeling all my music. But then I ran into a clear conundrum. Was Depeche Mode “new wave” or “80s pop”? Was Billy Bragg punk, folk or some crossover folk-punk? Clearly the simplistic labeling system provided by Windows was the problem, as it only allowed me to pick one genre. I needed something more akin to modern-day tagging, where I could tag a song with one or more related genres. But was that really the problem?
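The difference between a single-genre field and tagging is really a data-modeling choice: one value per song versus a set of values. A minimal sketch, using the artists from the text (the song titles and tags are my own illustrative picks):

```python
# Single-genre field vs. tag-style labeling: a set of genre tags lets one
# song live in several categories at once, which a one-genre schema (like
# the old Windows MP3 genre field) cannot express.

library = {
    "Depeche Mode - Enjoy the Silence": {"new wave", "80s pop"},
    "Billy Bragg - A New England": {"punk", "folk"},
}

def songs_tagged(library, genre):
    """Return every song carrying the given genre tag, sorted by title."""
    return sorted(title for title, tags in library.items() if genre in tags)

print(songs_tagged(library, "folk"))     # finds Billy Bragg
print(songs_tagged(library, "new wave")) # finds Depeche Mode
```

Even this, as the next paragraph argues, only pushes the essentialism problem down a level: tags multiply the boxes without eliminating the need for boxes.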

I started realizing this problem (though not in the way I’ve characterized it now) about 20 years ago in relation to techno music. There seemed to be all sorts of subgenres: jungle, synth, ambient, acid, trance, industrial. It seemed every time I turned around there was a new subgenre: darkwave, dubstep, trap; the list goes on. Wikipedia lists over a hundred genres of electronic music. I couldn’t keep up, and I have trouble distinguishing between many of them. SoundCloud has millions upon millions of songs, many of which defy categorization. What we’re learning from this is that we can like a song without pegging it into a specific category, and with the power of suggestion, SoundCloud can find other songs we like without our needing to search the “Pop-Country” section of the local record store.

So now I come to privacy. You may be thinking that I’m going to talk about personalization and privacy, and how, in order to suggest an uncategorizable song, I have to know about your musical taste. While that is a valid topic for conversation, I’ll leave it for another post. What I want to talk about today is privacy’s taxonomy. I’ve been a big fan of Dan Solove’s privacy taxonomy for quite some time. I think it does a good job of pinpointing privacy issues that people don’t normally think about, and it gives me ground to explore when talking with others. Going through the taxonomy allows me to illustrate types of privacy invasion that aren’t just about insecurity and identity theft. Talking about surveillance allows me to discuss how it can have a chilling effect, even if you’re not the target of the surveillance or “doing anything wrong.” I can talk about how interrogation, even if the subject doesn’t answer, may make them uncomfortable.

But I’ve also been thinking about the taxonomy and essentialism. What are we missing in the gaps between the categories? I’ve been working on a book, hopefully to be published later this year, on a theory of privacy that I hope will fill those gaps. A unified field theory of privacy, if you will. Stay tuned.