**Note:**This blog entry is a cross-post from Digital Operatives' blog. You can read the original post here.

Digital spin locks like the Kaba Mas X-09 and X-10 are very common for high security applications like vault doors. US General Services Administration approval means that they are nearly ubiquitous in securing government filing cabinets that contain documents that are classified secret, as well as securing the doors to laboratories in which secret code is written and stored. Strolling through the hallways of any Department of Defense facility or government contractor’s office, one is likely to see many of these on the doors:

Kaba Mas X-09 |

These types of locks are very sophisticated: they are self-powered
by rotating the dial and they have a digital printout on the top of
the code as it is being entered. Security features to thwart automated
brute force attack include random jumps in the numbers as the dial is
being turned, algorithms for detecting “non-human” rotations of the
dial, and long timeouts for successive failed attempts. These types of
locks are usually configured to accept a six-digit code that is input
as a sequence of three numbers from 0 to 99. That means there are
$math$10^6 = 100^3 = 1,000,000$math$ possible combinations. With a
four minute timeout, brute forcing all of the combinations would take
almost *eight years!*

Humans are generally bad at remembering long sequences of numbers,
and in most facilities the combinations to these locks are changed
frequently. A single employee might have to memorize dozens of
different combinations (one for each safe, door,
*etc.*). Therefore, as a mnemonic, the security officers who set
the codes often use six-letter words which are translated into codes
using the mapping from a phone keypad. In fact, one will almost always
see a magnet or sticker like this right next to the lock:

For example, the word “HACKER” would equate to the code 42-25-37.

1 | 2abc | 3def |

4ghi | 5jkl | 6mno |

7pqrs | 8tuv | 9wxyz |

0 |

And therein lies a vulnerability. First of all, using a phone dial pad for the mapping from letters to numbers means that the numbers zero and one will never be used! This means the number of potential codes to test has been decreased from one million to $math$8^6 = 262,144$math$. That’s a decrease of almost 75%! But we can do even better.

Next, let’s assume that the code word is a six-letter English word
that occurs in the dictionary. Looking at
`/usr/share/dict/words`, there are only 17,706 six-letter
English words that have direct mappings using a phone keypad. That’s
*much* lower than the 262,144 possible codes without zero and
one, which means that we have reduced the number of possible codes to
brute force by *over *98% from the original! For example, the
code 55-57-59 doesn’t appear to correspond to any words in the
dictionary. That’s likely because almost all words need at least one
vowel, and the phone keys 5, 7, and 9 don’t provide any
vowels. Furthermore, many of those 17,706 words actually map to the
same code. For example, the words “DOSAGE”, “ENRAGE”, “FORAGE”, and
“FORBID” all map to the code 36-72-43. If we just count the number of
*distinct* codes generated by those six-letter English words,
we get only 14,684, a reduction of almost 99%! With a four minute
failure timeout, brute forcing those 14,684 codes would only take
about *forty days.*

But wait, there’s more!

Not all letters occur in English words with the same frequency; you’re much more likely to find the letter ‘e’ in a randomly chosen word than the letter ‘x’. For example, almost 10% of the six-letter English words in the dictionary end in the code “37”, just like in our example word “HACKER”. Therefore, if we cleverly order our brute force approach to try codes that occur most frequently in English words, then we can increase our chances of cracking the code quicker.

I sorted all of the 14,684 distinct codes in decreasing order
according to the number of words that map to that code. If a code word
is chosen uniformly from the set of all six-letter English words, then
this ordering will maximize our chances of cracking the code
quickest. In fact, after fewer than 250 attempts, we will have cracked
the code with about 5% probability. In other words, with only about
*half a day* of access to the lock, a brute force attempt using
this ordering will have a 5% chance of success. In contrast, if the
lock’s code were instead set using a randomly generated six-digit
number, it would require on average 50,000 attempts (~5 months)
to crack it with 5% probability. On the other hand, if the code was
set from a six-letter English word using a phone keypad mapping, the
lock can be cracked with 50% probability after about 16 days.

**Codes are not usually changed on a month-to-month
frequency, so an adversary with intermittent access to the lock
( e.g., after hours) could quite conceivably crack it before the
code were changed.**

If there exists an additional vulnerability that allows the
attacker to detect whether each successive two-digit subcode is
correct before entering the next two-digit subcode, then the expected
value for the length of time required to crack the code is on the
order of *minutes*.

So what can we do about it? The most obvious solution is to simply choose codes randomly from the set of all possible one million 6-digit numbers. If we really do want to have a mnemonic, though, we could do a lot better by using a letter-to-number mapping that makes use of zeros and ones, and also produces a relatively even distribution of codes from English words. Here is one such example of a mapping:

1a | 2e | 3br |

4co | 5dnq | 6ip |

7lmjz | 8hkt | 9su |

0fgvwxy |

It’s not necessarily the best one, though. Finding the optimal mapping—the one that ensures a uniform distribution across all six-digit code for six-letter English dictionary words—is actually an instance of the multiple knapsack problem (a variant known as the multiple subset sum problem) and is therefore NP-Complete, and likely intractable for problems even as “small” as this one. Let $math$L = \{a, b, \ldots, z\}$math$ be the set of letters in the alphabet and let $math$f : L \rightarrow \mathbb{N}$math$ be a function mapping letters to their frequency among six-letter words in the dictionary. Then we can relax the problem to the minimization of the mean squared error over the summed frequencies of the letters mapped to each number: Find a function $math$t : L \rightarrow \{0, \ldots, 9\}$math$ that maps letters to numbers such that $equation$\frac{1}{10}\sum_{i=0}^9 \left(\sum_{\ell \in t^{-1}(i)} \left(f(\ell) - \frac{\sum_{\ell \in L} f(\ell)}{10}\right)\right)^2$equation$ is minimized. The total number of legal mappings, $math$t$math$, is $math$10^{26} = 100$math$ million billion billion. With that said, the example mapping above will be at least a million billion billion times more secure than using a phone keypad mapping!

**The moral of the story here isn’t that these locks are
inherently vulnerable; if used properly, they are in fact incredibly
secure. The takeaway is that these locks are only as secure as the
codes humans choose to assign to them. Using a phone keypad mapping on
six-letter English dictionary words is the physical security
equivalent of a website arbitrarily limiting passwords to eight
characters.**