Humans often learn by analogy. You take something you don’t understand and try to analogize to something you do understand. This is one of the reasons wave particle duality is so difficult to understand, because light in analogous to two incongruous models we already know: waves and particles.
Cryptography is also very difficult to understand because people have a hard time bridging the gap between things they know and are familiar with and things they don’t know. Some cryptographic techniques just don’t have very good real world analogies. It’s like quantum science, unfathomable to most people.
M of N secret splitting is just such a creature. Something that really has no real world analogy and is hard for people to grasp. Even harder it seems, is the multitude of applications such a neat techniques has for us.
Say you have a sentence “Now the right to life has come to mean the right to enjoy life, — the right to be let alone.” which you want to split up and store in N locations. The most obvious option, store the entire sentence in each location, isn’t very secure since any one of those locations could be compromised and give up your secret. You could split the sentence into N parts and store each part in each location and then it would take someone compromising each location to reconstruct your sentence. That’s better but now we suffer from another problem. What happens if we lose one of our locations (due to corruption or destruction)? Now we can’t reconstruct the sentence, because we’ve lost data. What’s a nice medium solution?
M of N secret splitting allows us to split the sentence into N parts but any M of them will reconstruct the entire sentence. For example we could split it into 10 parts and it takes 5 to reconstruct the sentence. This gives us the security of an attacker needing to compromise 5 locations AND us the security that we could lose 50% of the locations and still find the secret. Without cryptography it’s difficult to understand how this could be done. What’s even more interesting, is what else can be accomplished with this technique, beyond the simple idea of secret sharing. I gave one example last month, but I’m going to give another example this month that I thought about extensively during the Privacy Academy in Dallas.
Assume your customers have your mobile phone application and you want to be able to alert them to crowded conditions so they can avoid them (like traffic) or go to them (like hot nightclubs) but you don’t want to worry about tracking customers location information. You could have some sort of polling system that just ticked off where customers are without storing the information but the customer’s phones still have to let you know where they are, right? This leads to problems of hack, leaks or employee malfeasance. Wouldn’t it be better to have a system that gave you the information you needed without customers needing to tell you where they were? Enter M of N secret splitting.
Let each customer be identified by a unique customer name (email address or username). Take the user identifier and compress it down to one of 100 slots. In other words, take the numeric equivalent of the identifier and modulo it by 100. [100 is an arbitrary number here]. That way each customer falls into an essentially random bucket out of 100 buckets.
Now describe the location, say “El Gaucho Inca Restaurant” (where I recently had a delicious Peruvian meal). Perform M of N secret splitting on the location, where N is 100 and M is 5. This way, you have 100 parts and any 5 will give you the location. Now, pick the piece number equivalent with the slot you chose for this customer and upload that.
The company now has one piece of data it can’t do anything else. If it collect 4 others, it can reconstruct the location, but not until. By splitting it into 100, there is a 99/100 chance that the next piece it gets won’t be in the same bucket. We have met the criteria of absolutely not knowing where each person is, yet we can tell, in the aggregate, if at least 5 people are at this location.
While this may not seem like a major revelation, it seems that many system designers and business people have difficultly grasping the ability to build a system to meet the needs without storing or collecting data it would seem critical to the system. I’ll be talking more about this in future posts.