Social Security Numbers — Password or Identifier?

Social Security Numbers have the unenviable position of being both identifiers and passwords.  They are designed to uniquely identify individuals (in the US), yet they are supposed to be secret enough that companies attempt to rely on them as passwords, the keys to a person’s account.  However, unlike passwords in online systems, which (if proper protection is taken) are transmitted and stored as hashes to prevent eavesdroppers or hackers from learning the password, SSNs are most often transmitted and stored in plain text.  The SSN is usually given to an employee of the company who must be trusted not to reveal it or use it for disallowed purposes.  When one considers that this same password is shared among many companies, the vulnerability is clear.

Social Security Numbers should really only be used as unique identifiers, and then only to correlate accounts and tie an account to a specific individual.  At no time, however, should knowledge of an SSN be used to authenticate a physical person as the person behind that number.  Just because someone knows an SSN does not mean that they should be authorized as the owner of that SSN.

To push this notion through society, I would like to propose a law that would force companies to stop relying on the SSN as proof of identity.  How do we do this?  Not by imposing sanctions and regulations on companies for misuse, but simply by publishing Social Security Numbers and the names of their corresponding owners.  The simple feat of making this information widely accessible, and known to be widely accessible, would quickly force companies relying on the false security of the SSN to reengineer their processes.

Drivers Licenses vs Driverless Cars

The recent revelation that Google is applying to allow driverless cars on the road in Nevada, combined with the stink over E-Verify (the backdoor National ID attempt) and its collection of driver’s license data à la REAL ID, got me thinking.  What happens to the identity infrastructure in this country if drivers, and necessarily driver’s licenses, go by the wayside?  It has always been a pet peeve of mine that driver’s licenses have become the de facto identification in our society, because so many people have them and must have them to drive.  It’s a classic case of mission creep: driver’s licenses, which once were issued solely to show that the bearer had a license to drive (i.e. met the minimum standards), are now used in all sorts of scenarios to verify identity.  In the future, it seems, people might not need this ubiquitous item.  It’s quite possible that most people will just transition to state-issued ID cards, but it is interesting to pontificate on the alternative.

What is PII?

There has been a lot of discussion on privacy lists recently about whether IP addresses, email addresses, etc. are PII (personally identifiable information).  Clearly, with regard to a specific law, you’d have to reference that law (or the corresponding case law) to make a determination.  In general, though, I’d like to suggest a way of thinking about information in making this determination.  Much of the discussion has revolved around whether the information, by itself or in conjunction with other information, can “identify” an individual.  Does an email address identify John Smith and, if so, which one?  What about an IP address?  Does a dynamic IP address identify an individual on its own, or only when combined with the logs of the ISP?

The approach I suggest looks at information in terms of relationships to individual persons. Borrowing from the relational database world, information can be related One-to-One, One-to-Many or Many-to-Many. 
Some examples would be:
1. SSNs generally exhibit a One-to-One relationship: each person has one and only one SSN.
2. Physical addresses generally exhibit a One-to-Many relationship: several people could live at a particular address but most people only have one residential address.
3. First names generally exhibit a Many-to-Many relationship: at any given time there are millions of people named John and most people have many names (surname, given name, nickname, etc).
Hopefully you’ll see that almost anything COULD exhibit a many-to-many relationship.  Just as we change IP addresses, we change physical addresses, and some people have multiple residences.  Even SSNs, though most people will only ever have one, are used and reused by identity thieves.
A recent California court case, Pineda v. Williams-Sonoma, considered whether zip codes were PII.  Clearly, the relationship is many-to-many, as many people reside in a single zip code and people move and change zip codes several times throughout their lives.  It’s clear from the alleged facts of the case that Williams-Sonoma used the zip code procured from Pineda in combination with additional information to identify her and her address, and used that to contact her to solicit additional sales.  While in and of itself the zip code did not uniquely identify her, that information was useful in identifying her.  Without it, they may not have been able to track her down.
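
To make the relationship idea concrete, here is a minimal Python sketch that classifies a set of (person, value) observations by cardinality.  The function name and the sample data are my own illustrations, not taken from any law or dataset, and real data would obviously be messier.

```python
from collections import defaultdict


def classify_relationship(pairs):
    """Classify (person, value) pairs as one-to-one, one-to-many or many-to-many."""
    values_per_person = defaultdict(set)
    people_per_value = defaultdict(set)
    for person, value in pairs:
        values_per_person[person].add(value)
        people_per_value[value].add(person)

    # Does any person map to several values, or any value to several people?
    person_has_many = any(len(v) > 1 for v in values_per_person.values())
    value_has_many = any(len(p) > 1 for p in people_per_value.values())

    if person_has_many and value_has_many:
        return "many-to-many"
    if person_has_many or value_has_many:
        return "one-to-many"
    return "one-to-one"


# Hypothetical sample observations, for illustration only.
ssns = [("alice", "ssn-1"), ("bob", "ssn-2")]
addresses = [("alice", "12 Oak St"), ("bob", "12 Oak St"), ("carol", "9 Elm Ave")]
zip_codes = [("alice", "33101"), ("bob", "33101"), ("alice", "94105")]

print(classify_relationship(ssns))       # one-to-one
print(classify_relationship(addresses))  # one-to-many
print(classify_relationship(zip_codes))  # many-to-many
```

Note that the classification depends entirely on the observations you happen to hold: add one identity thief to the SSN data and it stops being one-to-one, which is the point made above.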

Other questions arise about whether car VINs, license plates, etc. are Personally Identifiable Information.  I would have to argue: absolutely.  While in isolation the numbers don’t point to particular individuals, they do relate to individuals in various ways, as owners, drivers, passengers, etc., at particular times.

In our information-driven world, we must recognize that any descriptive information, when combined with what it’s describing, is PII and should be treated as such.  “Blue” is not PII, but “blue car” in context could very well describe (again) owners, passengers, sellers, drivers, etc.

Security is not Privacy

I read this blog post last week titled “Privacy is not Security” and it got me thinking about the relationship between the two fields.  As the author states, “Privacy is a whole lot more than security.”  I would like to clarify that statement by saying that security is a necessary but not sufficient component of privacy: you cannot have privacy without security, but security alone does not guarantee privacy.  What’s really required is a different mindset.

Toll roads have recently come up twice in my reading: first, while researching Privacy by Design, I came across a description of how to travel anonymously on Canada’s 407; and second, in a recent article about a lawsuit commenced when the local toll roads in Florida started asking for invasive information to combat counterfeiting of cash, in addition to the existing privacy-invasive electronic toll system.

The Florida toll system may (and I stress may) have all manner of security precautions on who has access to the data, how it’s stored, when to destroy it, etc., but the fact remains they have a culture that is inherently privacy invasive.  They haven’t even considered how to design their system in a way that is innately protective of privacy.

IAPP KnowledgeNet Miami

I’ll be attending the IAPP KnowledgeNet in Miami next Tuesday.  This KnowledgeNet event will focus on a few topics:

Key points from the FTC Privacy Report issued in December 2010
Understanding the changes in the PCI DSS v2.0
Best practices when performing a privacy assessment
Incorporating MA 201 CMR provisions into your third party due diligence process

I’m really interested to learn the best practices for privacy assessments, since that’s probably most pertinent to me at the moment.  It’ll also be nice to get my first continuing education credits towards my CIPP.  The good and the bad thing about being a Certified Information Privacy Professional is that they require you to keep abreast of changes in the law to keep the certification active.  It’s good because potential employers know that your knowledge is not stale.  The flip side of the coin, for an independent privacy professional such as myself, is the expense involved.

I’ll certainly report back any interesting items I learn. 

Pre-hire Workplace Privacy

I just read a blog post about the case of the Maryland Department of Corrections requiring potential employees to turn over their FB username and passwords so the Department could look for possible illegal activity.  The blog didn’t allow comments, so I’m posting my comments on my blog.

The post did make a couple valid points.  The ACLU argument was weak and didn’t really have any justification in the law.  Oh, a quick summary for those too lazy to go read the full post.  As mentioned above the Maryland DOC is pseudo requiring applicants to provide their FB information.  It’s not a definitive requirement for employment, they say, but realistically you’ll probably get passed over if you don’t provide it.  The ACLU made the argument that since applicants did not have a choice, any access made by the department was unauthorized and therefore a violation of the federal Stored Communications Act.

As the post points out, the argument is weak because the applicant can walk away: they have no right to the job and the interaction is entirely voluntary.  The case is distinguishable from the one cited by the ACLU, which concerned an existing employment relationship that was in jeopardy if the employee didn’t reveal her username and password.  The argument there was that the employer violated the SCA because the threat of losing a job was sufficient to make the employee’s revelation of the username and password unauthorized.  I’m not sure I wholly agree, given the voluntary nature of employment, but there is some principle in employment law (and this may only apply to state employment) which gives employees an expectation that their employment will not be terminated without due process.

This issue also came to mind recently when the Florida Board of Bar Examiners considered rules for requiring applicants to the Florida Bar to provide access to their accounts.  It’s unclear if that means giving up a username and password, providing authorization to require Facebook to provide the information or just “friending” the Florida Bar with all privacy settings turned off.

Even without regards to the law, the practice seems particularly troublesome in light of a couple of considerations.

1)  Assume for a moment that your Social Media account (I don’t want to pick on Facebook though they are the elephant in the room), is locked down so tight that essentially all you use it for is private messaging friends.  What is the difference between requesting access to those private messages on the social networking site and requesting access to your private email account?
2) If the employer is not prepared to go that route and says it only wants “public information” you’ve posted, how do you distinguish between what’s public and private?  If I only allow 100 friends to read my wall post is that public or private?  10 friends? 1 friend?
3) The FBBE request is seemingly generic.  What if they ask you to list ALL your personal websites?  Are you supposed to give them access to your account?  What if you’ve misstated your height and weight or other information as most people do on dating sites, are they obliged to deny your admission for being less than honest?  Are you supposed to reveal even more, potentially embarrassing, but also irrelevant websites that you’re a member of?
4) Finally, what if providing such access is a violation of the site’s Terms of Service?  From Facebook’s terms: “4.8 You will not share your password, (or in the case of developers, your secret key), let anyone else access your account, or do anything else that might jeopardize the security of your account.”

I think employers, or state licensing boards, head down an extremely slippery slope if they start asking for non-public access to our digital lives.  I’m certainly not suggesting there ought to be a law; we already have too many of those.  I am suggesting that, as employees, we need to draw a line in the sand.

Border Stylo

Well apparently my phone interview didn’t go as well as I thought and they determined my technical skills didn’t match the position.  Onward and upward!  I still think my proposed solution to their password resetting problem was at least mildly innovative.  I certainly haven’t seen any other social networking type sites utilizing a similar system, basically getting your friends to authenticate you.  Granted, it has little application outside social networking sites (though I am brainstorming on a similar implementation for my current employer dealing with initial passwords). 

Finding a job in Privacy is certainly proving more difficult than I expected.  I guess my job constraints are really limiting the market for me.  That’s why I was hopeful about the Border Stylo job because at least in terms of job responsibilities the mix was complementary with my skills. Note to self: next time study up on the standard interview questions. 

It’s interesting that the IAPP report, A Call for Agility: the next-generation privacy professional, seems to reiterate that a combination of legal and technical skills is crucial, and that the next-generation privacy professional is a privacy engineer, utilizing privacy by design to influence organizations across functional lines.

While there is little debate as to whether privacy professionals ought to have a basic grasp of legal and technical concepts around data privacy and security, experts’ opinions diverged on whether tomorrow’s privacy professional would by necessity need a legal or technical degree. The central role of regulatory and IT drivers shaping the privacy profession almost ensures an ongoing need for privacy professionals to be conversant in not one, but both of these disciplines . . .
Quite simply, if people do embed these types of innovations into their daily lives, a new role may materialize: the privacy engineer. Companies that hope to market their innovations to a public more informed about their privacy risks will need to hire engineers who are also privacy experts. Their task will be to “bake in” privacy to their product designs.

It does not appear this privacy engineer position has gained much traction.  The vast majority of open privacy positions I’ve encountered are geared more toward assuring compliance than toward proactive use of privacy as a differentiator.  I’m sure, as the IAPP report suggests, that the role of Privacy Engineer will be more prevalent over the next decade, but it sure isn’t the status quo yet.

Morrison & Foerster: A Legal Framework for moving Personal Information to the Cloud

Morrison & Foerster recently released this summary of the legal issues surrounding Privacy in the Cloud.

The summary suggests, as I always encourage, that organizations should encrypt their data before uploading it to the cloud.  (See my post on doing this with RackSpace file hosting).  Of course, this may not always be possible when utilizing SaaS (Software as a Service) because the software may need raw access to your data to perform necessary functions.  My primary word of advice is don’t take companies’ word on what they can deliver in terms of data protection and security: test, test and retest.

H/T: Next Practices

Interview and Social Verification

I had an interview with an interesting company yesterday, Border Stylo. They have two products currently, both social networking applications: Write on Glass and Retrollect (a mobile app).  During the interview (for their Privacy Architect position) the question arose of how to secure password retrieval.  In a subsequent letter to their VP of Engineering, I proposed a solution which I include below.

The criteria are as follows:

  • The company doesn’t store the password in plaintext and therefore can’t retrieve it directly.  
  • The company doesn’t want to collect extraneous information (such as a person’s elementary school), choosing instead to minimize information collected to the absolute minimum.  This policy precludes a series of “security” questions, which are often presented to those needing to reset passwords. [The security risk in this approach is that an attacker with access to details about a person could emulate that person’s response].
  • The company doesn’t want to send an email with a link to reset the password.  [This common method also runs the risk of a compromised email account being used to compromise the user’s account on this system].

Authentication typically requires a user to supply one of the following: something they know, something they have or something they are (a biometric identifier).
Strong authentication involves providing 2 of the 3 authentication factors above.
In trying to solve this problem, I began thinking: what information does this social networking company have that the user can provide to authenticate themselves?  It dawned on me that beyond a name and email (which wasn’t usable for the above reason) they have information about the friends or contacts in the user’s social network.
Now the simple response would be to require the user to identify some of their friends, but this leaks information to a potential attacker as to who the friends are.  The better approach is to require the user to contact their friends and get their friends to verify the user.  We can do this using m-of-n secret sharing.

Here is the process:

The most common method of securely storing passwords in databases is through hashing.  A user will enter a password on a website and that password is hashed and compared against the hashed password the system has for the user.  If the hashes match, the user is authenticated.  Even if someone were to gain access to the database, they may not be able to authenticate in the system because they would have to solve for x from h(x), and the hashing function is considered one-way.
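
As a minimal sketch of that hash-and-compare flow, assuming SHA-256 as the hash function h (the post doesn’t name one) and using hypothetical function names:

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return h(x): the hex SHA-256 digest of the password (assumed hash)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the candidate and compare it with the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# The database keeps only the hash, never the password itself.
stored = hash_password("correct horse battery staple")

print(verify_password("correct horse battery staple", stored))  # True
print(verify_password("wrong guess", stored))                   # False
```

This mirrors the simple scheme described above and nothing more; an unsalted, single-pass hash like this is only meant to illustrate the comparison step.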

A more complex method involves adding a random salt and iterative hashing to each authentication request in order to prevent replay attacks, but I leave that for another post.
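
For reference, the salt-and-iterate part on its own can be sketched with PBKDF2 from Python’s standard library.  The iteration count and function names below are assumptions for illustration.  This covers the storage side only: it makes precomputed tables and brute force more expensive, while stopping replay would additionally need a fresh challenge on every request, which this sketch does not attempt.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, tune for your hardware


def make_record(password: str, salt: bytes = None):
    """Return (salt, digest) for storage; a fresh random salt is used by default."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare."""
    _, candidate = make_record(password, salt)
    return hmac.compare_digest(candidate, digest)


salt, digest = make_record("hunter2")
print(check_password("hunter2", salt, digest))  # True
print(check_password("hunter3", salt, digest))  # False
```

Because each record gets its own salt, two users with the same password end up with different stored digests.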

Assuming a social networking system uses the password hashing method discussed above and the user has forgotten the password, we can authenticate them by making their social circle authenticate them for us. During account setup, the user is prompted to set a security level, M (essentially the number of friends that need to authenticate them).

When the user attempts to log in and can’t (because they’ve forgotten the password), they are given the option of authenticating through friends.  They are supplied a URL to give to their friends.  The friends, once they visit the URL, are informed that the user is unable to log into the network and that, if they want to authenticate their friend, they should supply them with a code.  The code is actually one slice of an m-of-n secret share of the password hash.  The friends are encouraged to “verify” their friend before giving them the code.  The assumption is that while an impostor may be able to dupe a few friends, the chance of duping M friends decreases as M increases.  Once the user collects M codes, they enter them into the login page, which reconstitutes the hash of their password, authenticates them and allows them to set a new password.

Choosing M fairly low, say 1 or 2, is mostly insecure since it allows one or two friends to collude to enter a user’s account.  Choosing M high may be unacceptably bothersome to most users but the benefit is that users can set their own security level.
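
The m-of-n splitting itself can be sketched with Shamir’s secret sharing, which is one standard way to do it (I didn’t commit to a particular scheme above, so treat the choice as an assumption).  The prime, the function names and the use of SHA-256 for the password hash are all illustrative, and the modular inverse via pow needs Python 3.8 or later.

```python
import hashlib
import secrets

# A Mersenne prime comfortably larger than any 256-bit hash value.
PRIME = 2**521 - 1


def split_secret(secret: int, m: int, n: int):
    """Return n shares (x, y); any m of them are enough to recover the secret."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# The secret is the stored password hash, treated as a big integer.
password_hash = int.from_bytes(hashlib.sha256(b"my password").digest(), "big")
codes = split_secret(password_hash, m=3, n=5)  # 5 friends, any 3 suffice

print(recover_secret(codes[1:4]) == password_hash)  # True
```

Each friend would be handed one (x, y) pair as their code.  With fewer than M codes the interpolation yields an unrelated number, which is what makes the user-chosen security level M meaningful.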

Comments are most welcome.  

Simple cloud file storage security

I’m in the process of implementing a fax solution for a customer that involves storing the faxes in a cloud-based solution.  It’s fairly simple, but here is basically what I’m doing (using ColdFusion).

A fax is received at the customer’s eFax number.  eFax converts the facsimile to a PDF and parses out any information encoded as a barcode in the fax (a handy way of automating which faxes go to which queues in their application).  eFax then posts the fax via an XML POST over SSL to the customer’s web server.

The web server parses the XML and extracts the relevant meta information (phone number, bar code data, etc).  It then makes an entry of this information into a database for later retrieval.  The unique key that the database generates is concatenated with a hard-coded password to encrypt the actual file contents (the raw PDF).

Those file contents are, in turn, uploaded to cloud file hosting with Rack Space Cloud.  The file name is the unique identifier associated with the eFax the customer received.  In other words, someone must have access to our database table, with the eFax unique id and our primary key, in order to find a file related to a particular customer and decrypt that file (they also have to have our hard-coded password).
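
The actual implementation is in ColdFusion, but the per-file key idea can be sketched in Python.  Everything named here is a placeholder: the secret value, the function name, the separator and the use of SHA-256 to turn the concatenated string into fixed-length key material are my assumptions for illustration.  The encryption call itself is left out, since it depends on whichever vetted cipher library you use.

```python
import hashlib

# Hypothetical stand-in for the hard-coded password in the application code.
APP_SECRET = "hard-coded-password"


def derive_file_key(record_id: int, app_secret: str = APP_SECRET) -> bytes:
    """Combine the database primary key with the app secret into a 32-byte key."""
    material = f"{record_id}:{app_secret}".encode("utf-8")
    return hashlib.sha256(material).digest()


# Each database row yields a different key, so one leaked key exposes one file.
key_a = derive_file_key(1001)
key_b = derive_file_key(1002)

print(len(key_a))      # 32
print(key_a != key_b)  # True
```

An attacker who only sees the cloud container gets file names (the eFax ids) and ciphertext, but neither the primary key nor the secret needed to rebuild the key.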

Now granted, this isn’t rocket science, and anybody (namely us) who has access to the code and the database could decrypt all the files, but the files aren’t a high security risk.  The most sensitive data is credit card information on some of the faxes, which are order forms, but those are meant to be viewed by us (to process the orders).