Morrison & Foerster: A Legal Framework for moving Personal Information to the Cloud

Morrison & Foerster recently released this summary of the legal issues surrounding privacy in the cloud.

The summary suggests, as I always encourage, that organizations should encrypt their data before uploading it to the cloud.  (See my post on doing this with RackSpace file hosting.)  Of course, this may not always be possible when utilizing SaaS (Software as a Service) because the software may need raw access to your data to perform necessary functions.  My primary word of advice: don’t take companies’ word on what they can deliver in terms of data protection and security. Test, test, and retest.

H/T: Next Practices

Interview and Social Verification

I had an interview with an interesting company yesterday, Border Stylo. They have two products currently, both social networking applications: Write on Glass and Retrollect (a mobile app).  During the interview (for their Privacy Architect position) the question arose of how to secure password retrieval.  In a subsequent letter to their VP of Engineering, I proposed a solution which I include below.

The criteria are as follows:

  • The company doesn’t store the password in plaintext and therefore can’t retrieve it directly.  
  • The company doesn’t want to collect extraneous information (such as a person’s elementary school), choosing instead to reduce the information collected to the absolute minimum.  This policy precludes the series of “security” questions often presented to those needing to reset passwords.  [The security risk in that approach is that an attacker with access to details about a person could emulate that person’s responses.]
  • The company doesn’t want to send an email with a link to reset the password.  [This common method runs the risk of a compromised email account being used to compromise the user’s account on this system.]
Authentication typically requires a user to supply one of the following: something they know, something they have, or something they are (a biometric identifier).  
Strong authentication involves providing two of the three authentication factors above.
In trying to solve this problem, I began thinking: what information does this social networking company have that the user can provide to authenticate themselves?  It dawned on me that beyond a name and an email address (which wasn’t usable for the reason above), they have information about the friends or contacts in the user’s social network.
The simple response would be to require the user to identify some of their friends, but this leaks information to a potential attacker about who those friends are.  The better approach is to require the user to contact their friends and have the friends verify the user.  We can do this using m-of-n secret sharing.

Here is the process:

The most common method of securely storing passwords in databases is hashing.  A user enters a password on a website; that password is hashed and compared against the hashed password the system has on file for the user.  If the hashes match, the user is authenticated.  Even if someone were to gain access to the database, they may not be able to authenticate in the system, because they would have to solve for x given h(x), and the hashing function is considered one-way.
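The hash-and-compare check above can be sketched in a few lines of Python using the standard library’s hashlib.  This is a minimal illustration of the concept, not production code (real systems add salt and a slow, iterated hash); the names and the sample password are my own.

```python
# Minimal sketch of hash-and-compare password checking.
# Uses a plain SHA-256 hash purely for illustration.
import hashlib

def hash_password(password: str) -> str:
    """Return the hex digest h(x) that would be stored in the database."""
    return hashlib.sha256(password.encode()).hexdigest()

# At signup, only the hash is stored -- never the plaintext password.
stored_hash = hash_password("correct horse battery staple")

def authenticate(attempt: str) -> bool:
    # Hash the attempt and compare; recovering the password from
    # stored_hash would require inverting the one-way function.
    return hash_password(attempt) == stored_hash

assert authenticate("correct horse battery staple")
assert not authenticate("wrong guess")
```

Note that the database never sees the plaintext: even an administrator can only compare hashes, which is exactly why the password can’t simply be emailed back to a forgetful user.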

A more complex method involves adding random salt and iterative hashing to each authentication request in order to prevent replay attacks, but I leave that for another post.

Assuming a social networking system uses the password hashing method discussed above and the user has forgotten the password, we can authenticate them by making their social circle authenticate them for us. During account setup, the user is prompted to set a security level, M (essentially the number of friends that need to authenticate them).

When the user attempts to log in and can’t (because they’ve forgotten the password), they are given the option of authenticating through friends.  They are supplied a URL to give to their friends.  The friends, once they visit the URL, are informed that the user is unable to log into the network and that, if they want to authenticate their friend, they should supply them a code.  The code is actually one slice of an m-of-n secret share of the password hash.  The friends are encouraged to “verify” their friend before giving out the code.  The assumption is that while an impostor may be able to dupe a few friends, the chance of duping M friends decreases as M increases.  Once the user collects M codes, they enter them into the login page, which reconstitutes the hash of their password, authenticates them, and allows them to set a new password.
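The m-of-n splitting itself can be done with Shamir’s secret sharing.  Here is a toy sketch of the idea: the secret (which in the scheme above would be the password hash, encoded as an integer) becomes the constant term of a random polynomial of degree M−1, each friend’s code is a point on that polynomial, and any M points reconstruct the secret by Lagrange interpolation.  All names and parameters here are illustrative, not Border Stylo’s implementation.

```python
# Toy Shamir m-of-n secret sharing over a prime field.
import random

PRIME = 2**127 - 1  # a Mersenne prime, large enough for a hash digest

def make_shares(secret, m, n):
    """Split `secret` into n shares; any m of them reconstruct it."""
    # Random polynomial of degree m-1 with constant term = secret.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(m - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, PRIME - 2, PRIME) is the modular inverse of den.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

# 5 friends hold codes; any 3 of them suffice (M = 3, N = 5).
shares = make_shares(123456789, m=3, n=5)
assert recover(shares[:3]) == 123456789
assert recover(shares[1:4]) == 123456789
```

Fewer than M shares reveal nothing about the secret, which is what keeps any small group of colluding friends from reconstituting the hash on their own.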

Choosing M fairly low, say 1 or 2, is mostly insecure, since it allows one or two friends to collude to enter a user’s account.  Choosing M high may be unacceptably bothersome to most users.  The benefit of this approach is that users can set their own security level.

Comments are most welcome.  

Simple cloud file storage security

I’m in the process of implementing a fax solution for a customer that involves storing the faxes in a cloud-based service.  It’s fairly simple, but here is basically what I’m doing (using ColdFusion).

A fax is received at the customer’s eFax number.  eFax converts the facsimile to a PDF and parses out any information encoded as a barcode in the fax (a handy way of automating which faxes go to which queues in their application).  eFax then posts the fax to the customer’s web server as XML over SSL.

The web server parses the XML and extracts the relevant meta information (phone number, barcode data, etc.).  It then makes an entry of this information into a database for later retrieval.  The unique key that the database generates is concatenated with a hard-coded password to form the key used to encrypt the actual file contents (the raw PDF).

Those file contents are, in turn, uploaded to cloud file hosting with Rackspace Cloud.  The file name is the unique identifier associated with the eFax the customer received.  In other words, someone must have access to our database table (with the eFax unique id and our primary key) in order to find the file related to a particular fax, and they must also have our hard-coded password in order to decrypt it.
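The key scheme can be sketched as follows.  This is a hypothetical Python rendering of the idea, not the customer’s ColdFusion code: the database’s auto-generated id is concatenated with an application-wide secret, and that material is hashed down to a fixed-size per-file key.  The secret and function names here are placeholders.

```python
# Hypothetical sketch of the per-file key derivation described above.
import hashlib

HARDCODED_SECRET = "example-app-secret"  # placeholder, not a real secret

def file_key(db_id: int) -> bytes:
    """Derive a 256-bit per-file key from the db row id + app secret.

    The raw PDF would then be encrypted under this key before upload;
    the cloud object name is just the eFax id, so an attacker needs the
    database row AND the hard-coded secret to locate and decrypt a file.
    """
    material = f"{db_id}{HARDCODED_SECRET}".encode()
    return hashlib.sha256(material).digest()

key_for_row_42 = file_key(42)
assert len(key_for_row_42) == 32
assert file_key(42) == key_for_row_42   # deterministic per row
assert file_key(43) != key_for_row_42   # distinct key per file
```

Because each file gets its own key, compromising one decrypted fax doesn’t expose the others, though (as noted below) anyone with both the code and the database can derive every key.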

Now granted, this isn’t rocket science, and anybody (namely us) who has access to the code and the database could decrypt all the files, but the files aren’t a high security risk.  The most sensitive data is the credit card information on some of the faxes, which are order forms, but those are meant to be viewed by us (to process the orders).

Public, Private and Privacy

Some of the confusion surrounding the term privacy lies in the definition of private.  Private and public are antonyms.  Whereas private means secure, unreleased, secret, and hidden, public means shared, open, visible, and available.

By contrast, privacy is, as has been discussed in the previous two posts, about the decision-making process: the understanding of the parties and their expectations as to whether information should remain private or be shared, open, and public.  Privacy concerns the transformation of the state of information from private to public.

What does “expectation of privacy” mean?

Alan Westin’s definition of privacy, as outlined in my last post, is that “Privacy is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others.”

That claim can come from multiple sources: a legal regime, a contract, a written privacy statement, a standard industry practice, a custom, or a social norm.  It is the last of these sources, unwritten customs or social norms, that lays the foundation for the “expectation” of privacy.  While a written law, contract, or statement will clarify the roles and responsibilities of the subject and possessor of the information, the expectation lies in the subject’s belief that a certain social norm governs the actions of the possessor.

[NOTE: I often use the term subject to refer to the individual or group to whom the information relates, and possessor to refer to the holder of the information who is contemplating transferring it to a third party.  Possessor can also loosely refer to a potential possessor who is contemplating coming into possession of the information in question.]

“Expectation of privacy” is synonymous, in the US, with the 4th Amendment, which guarantees a right to be free from unwarranted invasion of one’s “persons, houses, papers, and effects.”  The purposefully vague language has led courts to adopt the “reasonable expectation of privacy” standard, deferring essentially to one’s expectation that the government adhere to custom or social norm.  Clearly, tensions arise when there are no customs or social norms, when those norms are in flux, or when courts have to analogize to established norms.  See Warshak v. US, 6th Circuit, 12-14-2010 (stating that there is a reasonable expectation of privacy in email because of its similarity to postal mail and telephonic conversations).

What is privacy?

I’d like to start this blog with a little primer on the term privacy.  Most people conflate privacy with the notion of hiding something, or secrecy.  However, Alan Westin, in his seminal work Privacy and Freedom (1967), said it succinctly: “Privacy is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others.”  In other words, privacy is about autonomous decision making.  Respecting privacy means respecting and complying with the decisions individuals make about information about them, without regard to the nature of that information.

Understanding this distinction is paramount to understanding privacy.  No matter how insignificant, no matter how negligible the consequences, the fact that one ignores or acts contrary to the wishes of the subject of the information is the violation.  It is the defiant act itself, not the consequences of shame, embarrassment, identity theft, etc., that defines a violation of privacy.

Perhaps an example is in order.  Consider a lady at a dinner party who whispers her birthday in confidence.  The recipient then announces to the party that Ms. X was born in 1950.  Clearly, most are aghast at the revelation of such personal information.  What if the recipient had merely announced that Ms. X was born in August?  No longer is there social stigma attached to the information, as there is with revealing a lady’s age.  The lady, the subject of the information, may not be as ashamed of the revelation, but just as great a violation has occurred.  The recipient of the information violated her trust, as well as the unwritten agreement they had.  Now, there may be other social norms or agreements at play between the giver and the receiver of the information, but in this simple scenario, no matter what information is released, the further release of that information by the recipient failed to adhere to the complete understanding of the parties.