Two recent articles highlight what is probably old news to most security professionals: the weakest security link is the user. This is why it’s important to help users help themselves. They aren’t security (or privacy) experts. This is especially true when circumventing what users trust to be a secure connection (i.e. a supposedly helpful man in the middle). I find it especially interesting that most users, who view going to a webpage as a very solitary and private experience, aren’t even aware of all the other users who can go to the same website as them. This is why they choose “password” as their password: to them, they are alone in going to the site, and no one is around to see them type in “password.” They just aren’t cognizant that other people might try to type in their username and “password.”
Humans often learn by analogy. You take something you don’t understand and try to analogize it to something you do understand. This is one of the reasons wave-particle duality is so difficult to understand: light is analogous to two incongruous models we already know, waves and particles.
Cryptography is also very difficult to understand because people have a hard time bridging the gap between things they know and are familiar with and things they don’t know. Some cryptographic techniques just don’t have very good real world analogies. It’s like quantum science, unfathomable to most people.
M of N secret splitting is just such a creature: something that has no real world analogy and is hard for people to grasp. Even harder to grasp, it seems, is the multitude of applications such a neat technique has for us.
Say you have a sentence “Now the right to life has come to mean the right to enjoy life, — the right to be let alone.” which you want to split up and store in N locations. The most obvious option, store the entire sentence in each location, isn’t very secure since any one of those locations could be compromised and give up your secret. You could split the sentence into N parts and store each part in each location and then it would take someone compromising each location to reconstruct your sentence. That’s better but now we suffer from another problem. What happens if we lose one of our locations (due to corruption or destruction)? Now we can’t reconstruct the sentence, because we’ve lost data. What’s a nice medium solution?
M of N secret splitting allows us to split the sentence into N parts such that any M of them will reconstruct the entire sentence. For example, we could split it into 10 parts so that it takes 5 to reconstruct the sentence. This gives us the security of an attacker needing to compromise 5 locations and the assurance that we could lose 50% of the locations and still recover the secret. Without cryptography it’s difficult to understand how this could be done. What’s even more interesting is what else can be accomplished with this technique, beyond the simple idea of secret sharing. I gave one example last month, but I’m going to give another example this month that I thought about extensively during the Privacy Academy in Dallas.
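To make the 5-of-10 split concrete, here is a minimal sketch in Python. It uses Shamir's secret sharing, which is one common way to do M of N splitting; the post doesn't name a particular scheme, so treat that choice as an assumption. The names `split_secret`, `recover_int` and `recover_secret`, and the choice of prime, are made up for illustration.

```python
import secrets

# Assumption: the Mersenne prime 2**1279 - 1 is the field modulus,
# which leaves room for secrets of up to 159 bytes.
PRIME = 2**1279 - 1

def split_secret(secret: bytes, n: int, m: int):
    """Split `secret` into n shares; any m of them can rebuild it."""
    value = int.from_bytes(secret, "big")
    if not 1 <= m <= n or value >= PRIME:
        raise ValueError("bad threshold or secret too long")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [value] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_int(shares):
    """Lagrange interpolation at x = 0 over the given (x, y) shares."""
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

def recover_secret(shares, length: int) -> bytes:
    return recover_int(shares).to_bytes(length, "big")

secret = b"the right to be let alone"
shares = split_secret(secret, n=10, m=5)
# Any 5 of the 10 shares are enough; here we keep the last 5.
assert recover_secret(shares[5:], len(secret)) == secret
```

With only 4 shares the interpolation still returns a number, but it is unrelated to the secret, which is the property the 10-location example relies on.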
Assume your customers have your mobile phone application and you want to be able to alert them to crowded conditions so they can avoid them (like traffic) or go to them (like hot nightclubs), but you don’t want to worry about tracking customers’ location information. You could have some sort of polling system that just ticked off where customers are without storing the information, but the customers’ phones still have to let you know where they are, right? This leads to problems of hacks, leaks or employee malfeasance. Wouldn’t it be better to have a system that gave you the information you needed without customers needing to tell you where they were? Enter M of N secret splitting.
Let each customer be identified by a unique customer name (email address or username). Take the user identifier and compress it down to one of 100 slots. In other words, take the numeric equivalent of the identifier and modulo it by 100. [100 is an arbitrary number here]. That way each customer falls into an essentially random bucket out of 100 buckets.
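As a rough sketch of that bucketing step in Python: the post only says to take the "numeric equivalent" of the identifier, so using a SHA-256 digest as that number is my assumption, as are the function name `slot_for` and the lowercasing of the identifier.

```python
import hashlib

NUM_SLOTS = 100  # arbitrary, as noted above

def slot_for(identifier: str, num_slots: int = NUM_SLOTS) -> int:
    """Map a customer identifier to a slot in the range 0..num_slots-1."""
    # Assumption: normalise case and whitespace so the same email
    # always lands in the same slot.
    normalised = identifier.strip().lower().encode("utf-8")
    digest = hashlib.sha256(normalised).digest()
    # Treat the digest as one big integer and reduce it modulo the slot count.
    return int.from_bytes(digest, "big") % num_slots

slot = slot_for("alice@example.com")
assert 0 <= slot < NUM_SLOTS
```

Hashing first, rather than converting the raw characters to a number, is just one way to get the "essentially random" spread across buckets.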
Now describe the location, say “El Gaucho Inca Restaurant” (where I recently had a delicious Peruvian meal). Perform M of N secret splitting on the location, where N is 100 and M is 5. This way, you have 100 parts and any 5 will give you the location. Now, pick the piece whose number matches the slot you chose for this customer and upload that.
The company now has one piece of data that it can’t do anything with on its own. If it collects 4 others, it can reconstruct the location, but not until then. By splitting it into 100, there is a 99/100 chance that the next piece it gets won’t be in the same bucket. We have met the criterion of not knowing where each person is, yet we can tell, in the aggregate, if at least 5 people are at this location.
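Here is a hedged end-to-end sketch of the idea in Python, again assuming Shamir-style splitting. One detail the description leaves open is how separate phones end up holding pieces of the same split; this sketch assumes each phone derives the polynomial deterministically by hashing the location text. It also assumes the server is handed pieces that belong to the same place. All function names here are hypothetical.

```python
import hashlib

PRIME = 2**521 - 1          # Mersenne prime; location text must fit in 65 bytes
N_SLOTS, THRESHOLD = 100, 5  # N and M from the text

def _coeff(location: str, k: int) -> int:
    # Assumption: coefficients come from hashing the location, so every
    # phone at the same place computes the same polynomial.
    data = f"coeff\x00{k}\x00{location}".encode("utf-8")
    return int.from_bytes(hashlib.sha512(data).digest(), "big") % PRIME

def slot_for(customer_id: str) -> int:
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N_SLOTS

def share_for(customer_id: str, location: str):
    """The single (x, y) piece this customer's phone would upload."""
    secret = int.from_bytes(location.encode("utf-8"), "big")
    if secret >= PRIME:
        raise ValueError("location text too long")
    coeffs = [secret] + [_coeff(location, k) for k in range(1, THRESHOLD)]
    x = slot_for(customer_id) + 1  # x = 0 would be the secret itself
    y = 0
    for c in reversed(coeffs):     # Horner's rule
        y = (y * x + c) % PRIME
    return x, y

def reconstruct(shares):
    """Server side: returns the location, or None below the threshold."""
    points = list(dict(shares).items())  # duplicate slots collapse to one
    if len(points) < THRESHOLD:
        return None
    points = points[:THRESHOLD]
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total.to_bytes((total.bit_length() + 7) // 8, "big").decode("utf-8")
```

Two customers who hash to the same slot upload the same piece, so they count once. With 100 slots, five customers land in five different slots about 90% of the time (0.99 × 0.98 × 0.97 × 0.96 ≈ 0.903). Because the polynomial in this sketch is derived from the location text alone, anyone who can guess a location can recompute its pieces, so a real design would need to deal with that.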
While this may not seem like a major revelation, it seems that many system designers and business people have difficulty grasping that a system can meet their needs without storing or collecting data that would seem critical to it. I’ll be talking more about this in future posts.
As many others have pointed out, cloud computing is really nothing new. Before it was called cloud computing, application service providers (ASPs) provided software not as a downloadable product but as an online service. Really, what has changed is the acceleration of software (or infrastructure, data or platforms) as a much more modular and turnkey service. Service providers have minimized the transaction costs of software (or hardware). Whereas before, purchasing new or additional services took time and effort (i.e. transaction costs) on the part of both the seller and buyer, now they can be requisitioned and provisioned with a few clicks of a mouse. This is the so-called utility model: one just increases demand by adding more consuming devices and the utility provides.
However, shrinking transaction costs for efficiency means that there is no longer room for substantial negotiations between provider and consumer. This leads to a gap between the needs of the consumer for certain protections (e-discovery, retention, security, privacy, etc.) and the desires of the provider to limit liability and provide a one size fits all service. Bigger clients, which may command attention and have some bargaining power, make it more difficult for service providers to provide a simple cheap service because of the need for negotiation. I’m suggesting the end result is probably a stratification of service providers by industry (or geography) in order to limit the need for negotiation with clients who have differing needs.
I attended Privacy Academy 2011 in Dallas last week and it was quite interesting. I met a lot of people and have been contacting them furiously this week (while still trying to catch up on 2 weeks of missed work). While the seminars and lectures were thought inspiring (especially the one on the law of obscurity), the gap between the legal privacy types and the mathematical/computer science community is still problematic. I was inspired, though, seeing Marc Rotenberg of EPIC give the headlining speech at the Friday luncheon. He mentioned many people that I admire, such as Phil Zimmermann (PGP), David Chaum (DigiCash) and others. He spoke about the need for PETs and Privacy by Design, which, as I’ve mentioned, is sorely needed in the Privacy Professional community.
I did submit a proposal to give a talk on privacy engineering for non-engineers at the Global Privacy Summit next year in Washington. Crossing my fingers that it happens.
WPES is going on in mid-October. I just learned about it or I might have tried to go.