Evan A. Sultanik, Ph.D.

Evan's First Name @ Sultanik .com

Chief Scientist
Digital Operatives, LLC

Adjunct Professor
Drexel University College of Computing & Informatics
Department of Computer Science

Recent Content:

Social Signals

or, the basis for an article that was nominated for best paper and subsequently rejected.


Looking back on my previous job at The Johns Hopkins University, I am struck by and grateful for the breadth of different topics I was able to research: distrubted phased array radar resource scheduling, Linux kernel integrity monitoring, TLS, hypervisors, probabilistic databases, streaming algorithms, and even some DNA sequencing, just to name a few.

Toward the end of my—to abuse an overloaded term—tenure at JHU/APL, I became involved in their social media analysis work. For example, I had some success creating novel algorithms to rapidly geolocate the context of social media posts, based solely upon their textual content. But one of the most interesting things I encountered along these lines was a phenomena that we came to call Social Signals.

Discovery of Social Signals

As will become evident from the following chronology, I was not party to this line of research until after many of the initial discoveries were made, so my account of the earlier details may be inaccurate. To my knowledge, however, this is the first time any of these discoveries have been made publicly available in writing.

In 2011, my colleague Clay Fink had been mining Twitter for tweets for quite some time. He had gigabytes of them, spanning months and covering various worldwide events like the London riots, the Mumbai bombings, the Tuscaloosa—Birmingham tornado, the Christchurch earthquake, the Nigerian bombing, &c. These provided a very interesting and, as it turned out, fruitful source of data to research. They shared the common theme of being crises, yet they were each of a very different nature, e.g., some were man-made and others natural disasters.

Clay had created some algorithms to process the mass of textual data, but there was no good, usable way for a human to sift through it to do manual analysis. Step in another colleague, Jaime Montemayor, who created a data exploration tool called TweetViewer.

TweetViewer is very simple in concept: It plots the relative term frequencies used in a text corpus over time. Yet this simple concept—coupled with the algorithms and HCI workflow accelerators (as Jaime calls them) behind-the-scenes that allow real-time interaction with and instantaneous feedback from the data—enabled some interesting disoveries. Take the large spike in Twitter traffic around May 2nd as an example: it corresponds to the killing of Osama bin Laden—the event associated with the highest sustained rate of Twitter traffic up to that point. Selecting the term "bin Laden" instantly generates its time series in the lower graph, which correlates perfectly with the spike in overall traffic.

This example perfectly illustrates one of the challenges of social media analytics, for there were three other major events being concurrently discussed on Twitter that were masked by the popularity of the bin Laden killing: the British royal wedding, a $100k fine levvied on Kobe Bryant, the NBA finals, and a tornado that devastated several cities in the Southern United States. Monitoring "trending topics" based upon spikes in the frequency or popularity of terms would not detect these less popular topics of discussion.

By exploring Clay's corpora in TweetViewer, he and Jaime quickly noticed a pattern: there were certain terms that always seemed to spike in frequency during a crisis event. These included "help," "please," and "blood," as well as various profanities. In the Nigerian bombing corpora, for example, there was a large spike in Twitter traffic temporally correlated with the bomb explosion, and a subsequent spike minutes later correlated with a bomb scare (but no actual blast). The frequency timeseries for the term "blood" pefectly correllates to the actual bomb blast, but does not spike again during the bomb scare. Clay and Jaime dubbed these terms Social Signals.

Automating the Process

When I was first briefed on this phenomenon I was immediately intrigued, primarily around the potential of publishing an academic paper peppered with legitimate use of profanity.

Nam castum esse decet pium poetam
ipsum, versiculos nihil necessest;

The challenge was that these social signals were discovered by humans. Were there others we had missed? Would this phenomenon extend to languages other than English? Could an algorithm be designed to classify and track social signals?

I played around with the data in TweetViewer for a while, and I eventually developed a theory as to what made a term a social signal. My underlying hypothesis was that the human-identified social signals are a subset of a larger class of abnormal terms whose frequency of usage does not correlate with the overall frequency of Twitter traffic. For each geographic region, Twitter traffic ebbs and flows in a pattern much like a circadian rhythm. In the Twitter posts captured from the Tuscaloosa tornado event, for example, traffic peaks around 23:00 every night and then rapidly falls—as people go to sleep—and reaches a nadir at about 09:00—when people are arriving at work. It is actually very regular, except for when people are discussing an unexpected event.

As an example, consider the terms "please" and "blood," which we have observed to temporally co-occur with certain types of disaster events. Under normal circumstances these terms are not temporally correlated to each other; however, during a disaster scenario they are. Furthermore, the term "blood" does not have a very high frequency, so methods based strictly on the frequency domain or burstiness of the terms might ignore that signal. Therefore, we first identify the set of maximally abnormal term signals, vi&., those that are temporally uncorrelated with the overall Twitter traffic. Next, these abnormal signals are sorted according to a significance metric which is based on how temporally correlated they are to each other. Finally, an unsupervised graph clustering algorithm is used to merge similar signals.

For the sake of brevity, I am going to save the technical details for a later post. I will likely also be publishing a related manuscript on the arXiv.


We evaluated our approach on a number of Twitter datasets captured during various disaster scenarios. The first corpus we examined includes tweets from the April 2011 tornados in Tuscaloosa, Alabama. This is the interesting corpus mentioned above in which the tornados co-occurred with the British Royal wedding, a much discussed $100,000 fine of NBA player Kobe Bryant, and the killing of Osama bin Laden—the event associated with the highest sustained rate of tweets per second to date. The second corpus includes tweets from the February 2011 earthquake in Christchurch, New Zealand. The final corpus includes tweets leading up to and including the July 2011 bombings in Mumbai, India. This corpus is interesting because it includes tweets from the prior thirteen days leading up to the bombings.

The time bucket size was set to 20 minutes. The number of maximally aberrant signals discovered was 300. For each corpus, the resulting automatically detected signals included the human-identified social signals "help" and "please." The Mumbai corpus also included the "blood" social signal. The clustered signals of highest significance are given in the following table, in which green signals are ones that are clearly related to an important event:

Dataset Start Date End Date # Tweets # Terms
Tuscaloosa April 28, 2011 15:46:35 CDT May 8, 2011 21:43:48 CDT 97409 58232

Top Detected Signals (sorted by decreasing significance)

tuscaloosanews greek relief need disaster followers text tell redcross like needs just back alabama tornado laden osama news dead obama know tuscaloosa really come thanks will help youtube helpttown right right love lakers think supplies mothers mother happy first volunteers give spann time donation make donate please courtside tickets
Christchurch February 17, 2011 23:10:03 NZDT February 24, 2011 11:51:11 NZDT 12973 15223

Top Detected Signals (sorted by decreasing significance)

need safe #eqnz earthquake still people know help please chch family anyone christchurch will quake #chch jobs #job canstaff victims darkest donate zealand towards donation make redcross just today power thanks everyone news home thank water christchurchcc open yeah tumblr photo another bexielady haha dairy nzne wellington nzpa #christchurch want
Mumbai July 1, 2011 21:39:06 IST July 15, 2011 03:29:30 IST 164820 123516

Top Detected Signals (sorted by decreasing significance)

need govt terror good every people varsha love twitter thanks hear time help please police anybody says serious right work control mumbai karia going think blood like #mumbai injured kasab will even make hospital room blasts stop home call parents blast today well looking life india dadar want just safe #mumbaiblasts stay news rahul come indian #here back know take anyone spirit also #bse #stocks #tips turn following sexy #instantfollow bitch #images #india #free

The Aftermath

Excited that we had developed an automated, completely unsupervised, language-agnostic technique for detecting social signals—which also happened to detect the same signals discovered by human analysis—we quickly wrote up a paper and submitted it to a conference. A few months later the reviews came back:

Reviewer 1

I read this in detail and it is amazing!
Accept Nominate for Best Paper

Reviewer 2

My graduate advisor pawned this review off on me, and I'd rather not be doing it, but this paper looks interesting and I find no technical faults in it.

Reviewer 3

I don't see any real technical faults, but why didn't you cite [a tangentially relevant paper that was likely written by Reviewer 3]?

As you might have guessed, the paper was ultimately rejected.

Shortly after, I left JHU and didn't continue this line of research, so it's just been bit-rotting for the past few years.

In summary, our paper introduced the discovery of social signals: elements of text in social media that convey information about peoples' response to significant events, independent of specific terms describing or naming the event. The second contribution of the paper was a model of the mechanisms by which such signals can be identified, based on the patterns of human-identified signals. Finally, a novel unsupervised method for the detection of social signals in streams of tweets was presented and demonstrated. For a set of corpora containing both natural and human-caused disasters, in each instance our model and algorithms were able to identify a class of signals containing the manually classified social signals.

Future work includes optimizing the algorithms for efficiency in processing streaming Twitter traffic. Incorporating estimates of provenance and trustworthiness from the Twitter social network, as well as accounting for the propagation of signals through re-tweets, might improve the relevance of the detected signals. Finally, we hoped to perform a sensitivity analysis of the algorithms by detecting signals across domains, e.g., using a baseline from one disaster-related event to detect signals during a similar event at a different time.

Success in OS X

10 easy steps (and lots of unnecessary prose) on how to set up a new PowerBook in 48 hours or more.

Five years ago I wrote about my travails solving some networking issues on my Linux laptop. I equated the experience to a classic XKCD comic in which a simple software fix escalates to a life-and-death situation. The same thing happened to me again, this time on Mac OS X. Read on to find out.

Hashing Pointers

or: How I Learned to Stop Worrying and Love C++11

For as long as I've understood object-oriented programming, I've had an ambivalent relationship with C++. On the one hand, it promised the low-level control and performance of C, but, on the other, it also had many pitfalls that did not exist in other high-level managed languages like Java. For example, I would often find myself in a position where I wouldn't be quite sure exactly what the compiler would be doing under-the-hood, particularly when passing around objects by value. Would the compiler be "smart" enough to know that it can rip out the guts of an object instead of making an expensive copy? Granted, much of my uncomfortableness could have been remedied by a better understanding of the language specification. But why should I have to read an ~800 page specification just to understand what the compiler is allowed to do? The template engine and the STL is incredibly powerful, but it can make code just as verbose as Java, one of Java's primary criticisms. Therefore, I found myself gravitating toward more "purely" object-oriented languages, like Java, when a project fit with that type of abstraction, and falling back to C when I needed absolute control and speed.

A couple years ago, around the time when compilers started having full support for C++11, I started a project that was tightly coupled to the LLVM codebase, which was written in C++. Therefore, I slowly started to learn about C++11's features. I now completely agree with Stroustrup: It's best to think of C++11 like a completely new language. Features like move semantics give the programmer complete control over when the compiler is able to move the guts of objects. The new auto type deduction keyword gets rid of a significant amount of verbosity, and makes work with complex templates much easier. Coupled with the new decltype keyword, refactoring object member variable types becomes a breeze. STL threads now make porting concurrent code much easier. That's not to mention syntactic sugar like ranged for statements, constructor inheritance, and casting keywords. And C++ finally has lambdas! C++11 seems to be a bigger leap forward from C++03 than even Java 1.5 (with its addition of generics) was to its predecessor.

As an example, I recently needed an unordered hash map where the keys were all pointers. For example,

std::unordered_map<char*,bool> foo;
I wanted the keyes to be hashed based upon the memory addresses of the character pointers, not the actual strings. This is similar to Java's concept of an IdentityHashMap. Unfortunately, the STL does not have a built-in hash function for pointers. So I created one thusly:
/* for SIZE_MAX and UINTPTR_MAX: */
#include <cstdint>
namespace hashutils {
  /* hash any pointer */
  template<typename T>
  struct PointerHash {
    inline size_t operator()(const T* pointer) const {
      auto addr = reinterpret_cast<uintptr_t>(pointer);
      /* size_t is not large enough to hold the pointer's memory address */
      addr %= SIZE_MAX; /* truncate the address so it is small enough to fit in a size_t */
      return addr;
Note that I am using auto here to reduce verbosity, since it is evident that addr is a uintptr_t from the righthand side of the assignment. The hashutils::PointerHash object allows me to do this:
std::unordered_map<char*,bool,hashutils::PointerHash<char>> foo;
The neat part is that C++11 has a new using keyword that essentially lets me generically alias that definition:
template<typename K,typename V>
using unordered_pointer_map = std::unordered_map<K,V,hashutils::PointerHash<typename std::remove_pointer<K>::type>>;

unordered_pointer_map<char*,bool> foo;
Note the use of std::remove_pointer<_>, a great new STL template that gets the base type of a pointer.

In another instance, I wanted to have a hash map where the keys were pointers, but the hash was based off of the dereferenced version of the keys. This can be useful, e.g., if you need to hash a bunch of objects that are stored on the heap, or whose memory is managed outside of the current scope. This, too, was easy to implement:

namespace hashutils {
  template<typename T>
  inline size_t hash(const T& v) {
    return std::hash<T>()(v);

  /* hash based off of a pointer dereference */
  template<typename T>
  struct PointerDereferenceHash {
    inline size_t operator()(const T& pointer) const {
      return hash(*pointer);

  /* equality based off of pointer dereference */
  template<typename T>
  struct PointerDereferenceEqualTo {
    inline bool operator()(const T& lhs, const T& rhs) const {
      return *lhs == *rhs;

  template<typename K,typename V>
  using unordered_pointer_dereference_map = std::unordered_map<K,V,PointerDereferenceHash<K>,PointerDereferenceEqualTo<K>>;
Note that, through the magic of the C++ template engine, this code supports keys that are pure pointers as well as C++11's new smart pointers.

As another example of the afforementioned auto keyword and ranged for statements, this is how easy it is in C++11 to hash an entire collection (e.g., a std::vector or std::set):

namespace hashutils {
  class HashCombiner {
    size_t h;
    HashCombiner() : h(0) {}
    template <class T>
    inline HashCombiner& operator<<(const T& obj) {
      /* adapted from boost::hash_combine */
      h ^= hash(obj) + 0x9e3779b9 + (h << 6) + (h >> 2);
      return *this;
    operator size_t() const { return h; }

  /* hash any container */
  template<typename T>
  struct ContainerHash {
    size_t operator()(const T& v) const {
      HashCombiner h;
      for(const auto& e : v) {
        h << e;
      return h;
Then, to make all sets hashable (and thereby valid to be used as keys in a map), simply add this:
namespace std {
  template<typename... T>
  struct hash<set<T...>> : hashutils::ContainerHash<set<T...>> {};

I realize that this post is a collection of rather mundane code snippets that are nowhere near a comprehensive representation of the new language features. Nevertheless, I hope that they will give you as much hope and excitement as they have given me, and perhaps inspire you to (re)visit this "new" language called C++11.

Killing Programs Softly

A quick script to gently kill intermittently unresponsive programs on OS X.

Seven months ago I asked the following question on StackExchange:

Sometimes, when I have many applications open doing many memory and IO-intensive things, my computer inevitably starts to thrash a bit. While waiting for things to settle down, I often decide to close some applications that don't really need to be open. The trouble is that many of the applications (especially ones that have background/idle processes) tend to be intermittently unresponsive until the thrashing subsides, so it either takes a very long time for them to get focus to send +q, or when I go to close them by clicking on their icon in the dock I am only presented with the option to force quit. I know that those applications aren't permanently unresponsive, so I'd prefer to send them a gentle TERM signal and have them quit gracefully when they are able. I usually end up killing them by using pkill from the terminal, however, that's not always feasible, especially if the terminal is also hosed.

What is the easiest way to gently send the signal to kill a process if/when that process is intermittently unresponsive? (In a situation in which access to the terminal and/or starting a new application is not convenient.)

The question didn't get much fanfare on StackExchange, and the only answer so far has been to use AppleScript to essentially programmatically send the +q signal to the application. I'll bet it's basically equivalent to using pkill from the terminal to send a SIGTERM to the process, but it might work in the event that my terminal emulator app is also unresponsive. Anyhow, it was the best solution I had, so I matured the idea by making a friendly standalone AppleScript that enumerates all of the currently running processes and prompts the users for which to gently kill. Here it is:

local procnames
tell application "System Events"
	set procnames to (get the name of every process whose background only is false and name is not "GentleKill")
end tell
(choose from list procnames with prompt "Which applications would you like to gently kill?" with multiple selections allowed)
if result is not false then
	set tokill to result
	set text item delimiters to {", "}
	display dialog "Are you sure you want to gently kill " & tokill & "?"
	repeat with prog in tokill
		tell application prog to quit
	end repeat
end if

粤式蒸鱼 (Steamed Fish with Hot Oil)

In which I adulterate a classic Cantonese dish.

Steamed fish that is finished with a drizzle of hot oil is a classic Cantonese dish. The way I like to make it includes a few Japanese ingredients, and my method is a bit unorthodox.


  • 12 inch frying pan with lid
  • A small raised rack that will fit inside the pan that can provide a centimeter or so of clearance above the bottom of the pan; the removable rack that came with my toaster oven works perfectly
  • A very small bowl
  • A small saucepan (it can be tiny)
  • A platter for serving that is large enough for the fish and deep enough to hold some sauce


  • 2 tbsp. soy sauce
  • 1 tbsp. mirin
  • 1 tbsp. shaoxing cooking wine, or you can substitute an additional tbsp. of mirin
  • 1 tbsp. sake
  • 1/4 cup of either water, katsuo dashi stock, or water plus half of an instant dashi stock packet
  • 1 fillet of a large flaky fish. Approximately 1 pound. I often use rockfish or wild striped bass. If you cannot find a large fillet, a smaller whole fish can be used.
  • 1/4 cup coarsely chopped cilantro
  • 2 in. knob of ginger, peeled and julienned
  • 2 scallions, both whites and greens, julienned (or thinly sliced at an oblique angle, similar to the thicker Chinese "horse ear" cut)
  • hot chili pepper, sliced thin at an oblique angle (optional)
  • 1/8 tsp. five spice powder
  • 1/4 tsp. toasted sesame oil
  • 2 tbsp. vegetable oil


  • Mix the soy sauce, mirin, shaoxing wine, sake, and water/dashi in the frying pan
  • Place the fish into the pan and marinade for 15 minutes, flipping once half way
  • Meanwhile, combine the ginger, cilantro, scallion, optional chili pepper, five spice and toasted sesame oil in a small bowl and mix
  • Remove the fish from the marinade, place the rack into the pan, and the fish onto the rack
  • Bring to a boil
  • Once boiling, cover the pan tightly, reduce the heat to simmer, and steam for 8 minutes
  • Meanwhile, start heating the vegetable oil in the small saucepan
  • When the fish is done steaming, remove it and the rack from the pan, placing the fish on the platter
  • Increase heat to high and let the sauce reduce to the consistency of Grade A (runny) maple syrup
  • Pour the sauce over the fish and put the herb mixture on top of the fish
  • When the vegetable oil is very hot, spoon it over top of the herbs

Lenticrypt: a Provably Plausibly Deniable Cryptosystem

or, This Picture of Cats is Also a Picture of Dogs

Back in 2009, I wrote about a thought experiment on how to subvert copyright law via plausible deniability. A couple years ago I expanded on that thought experiment by proposing a seedling idea on how to accomplish it via cryptography. Since then, I've slowly been developing that seedling into a functioning proof-of-concept, which has culminated in the creation of the Lenticrypt project:

Lenticrypt can generate a single ciphertext file such that different plaintexts are generated depending on which key is used for decryption:

$ python lenticrypt.py -e key1 plaintext1 -e key2 plaintext2 -o output.enc

$ python lenticrypt.py -d key1 output.enc | diff - plaintext1 -s
Files - and plaintext1 are identical

$ python lenticrypt.py -d key2 output.enc | diff - plaintext2 -s
Files - and plaintext2 are identical

Unlike alternative plausibly deniable cryptosystems like the recently discontinued TrueCrypt—whose ciphertext size grows in proportion to the number of plaintexts (i.e., hidden volumes) it encrypts—Lenticrypt's ciphertext size is proportional to the largest plaintext it encrypts. This is because Lenticrypt shares bytes in the cyphertext between each of the plaintexts it encrypts; they are not stored in separate regions of the ciphertext. Therefore, there is no straightforward way to estimate the number of plaintexts that are "hidden" inside a single ciphertext.

In fact, Lenticrypt has the theoretical property that, under reasonable assumptions, there is always a near 100% probability that there exists an key in the public domain that will decrypt a given ciphertext to any desired plaintext, even if that key is not known. Therefore, even if an incriminating plaintext is revealed, the author of the ciphertext can plausibly deny having created it because there is a non-zero probability that the plaintext was legitimately decrypted by random chance.

More technical details on the cryptosystem as well as additional use-cases are described in Issue 0x04 of The International Journal of PoC||GTFO.

Note that Issue 0x04 of PoC||GTFO is a polyglot: Among other things, you can also treat the .pdf file as if it were a .zip. If you extract it, there are some neat Leticrypt-related Easter eggs inside the feelies.

Random Solutions are Often Good Enough

Sometimes taking the easy way out isn't nearly as bad as it might seem!

Hard Problems

Computer Scientists and Software Engineers deal with computationally hard problems all the time. But those challenges don't solely apply to those people creating the software; they are equally relevant to those people analyzing it. For cybersecurity researchers, finding a hash collision… determining the user inputs that can assign a certain value to a tainted EIP… deciding whether a black-box binary is malicious… they're all really hard! As our work becomes more and more automated with the help of program analysis techniques, these computationally hard problems become more and more prevalent. But what makes one problem harder than another? In the remainder of this post I'll answer that question, and introduce some surprising new results of my research that demonstrate how good solutions to inherently very hard problems can be quickly generated. Read on for more.

Physical Security Followup

These Locks are Everywhere!

First of all, thanks for all of your positive feedback on our recent post on physical security. One of the comments we've received multiple times is that these types of locks and the practice of using mnemonics for their codes is primarily limited to government facilities. It turns out that's not really a limitation. Like a muted post horn, now that these locks have piqued our curiosity we're starting to see them everywhere we look. Check out this find we made today:

Kaba Lock on an ATM

Don't bother checking, we scrubbed the Exif.  Bank of America: Feel free to contact us directly if you'd like to know the location ;-)

Another thing we neglected to emphasize in the original post was that our time estimates are upper bounds based off of the assumption that the lock will have a four minute timeout for successive failed attempts. For locks that do not have such a timeout (e.g., basically all mechanical locks, and presumably the one on this ATM), you can divide the brute-force cracking time by at least a factor of four.

Exploiting Password Weaknesses in Physical Security

In which I channel the spirit of my eighth academic cousin thrice removed, Richard Feynman.

Digital spin locks like the Kaba Mas X-09 and X-10 are very common for high security applications like vault doors. US General Services Administration approval means that they are nearly ubiquitous in securing government filing cabinets that contain documents that are classified secret, as well as securing the doors to laboratories in which secret code is written and stored. Strolling through the hallways of any Department of Defense facility or government contractor's office, one is likely to see many of these on the doors:

Kaba Mas X-09 Lock
Kaba Mas X-09

These types of locks are very sophisticated: they are self-powered by rotating the dial and they have a digital printout on the top of the code as it is being entered. Security features to thwart automated brute force attack include random jumps in the numbers as the dial is being turned, algorithms for detecting "non-human" rotations of the dial, and long timeouts for successive failed attempts. These types of locks are usually configured to accept a six-digit code that is input as a sequence of three numbers from 0 to 99. That means there are $math$10^6 = 100^3 = 1,000,000$math$ possible combinations. With a four minute timeout, brute forcing all of the combinations would take almost eight years!

Humans are generally bad at remembering long sequences of numbers, and in most facilities the combinations to these locks are changed frequently. A single employee might have to memorize dozens of different combinations (one for each safe, door, etc.). Therefore, as a mnemonic, the security officers who set the codes often use six-letter words which are translated into codes using the mapping from a phone keypad. In fact, one will almost always see a magnet or sticker like this right next to the lock:

For example, the word "HACKER" would equate to the code 42-25-37.



And therein lies a vulnerability. First of all, using a phone dial pad for the mapping from letters to numbers means that the numbers zero and one will never be used! This means the number of potential codes to test has been decreased from one million to $math$8^6 = 262,144$math$. That's a decrease of almost 75%! But we can do even better.

Next, let's assume that the code word is a six-letter English word that occurs in the dictionary. Looking at /usr/share/dict/words, there are only 17,706 six-letter English words that have direct mappings using a phone keypad. That's much lower than the 262,144 possible codes without zero and one, which means that we have reduced the number of possible codes to brute force by over 98% from the original! For example, the code 55-57-59 doesn't appear to correspond to any words in the dictionary. That's likely because almost all words need at least one vowel, and the phone keys 5, 7, and 9 don't provide any vowels. Furthermore, many of those 17,706 words actually map to the same code. For example, the words "DOSAGE", "ENRAGE", "FORAGE", and "FORBID" all map to the code 36-72-43. If we just count the number of distinct codes generated by those six-letter English words, we get only 14,684, a reduction of almost 99%! With a four minute failure timeout, brute forcing those 14,684 codes would only take about forty days.

But wait, there's more!

Not all letters occur in English words with the same frequency; you're much more likely to find the letter 'e' in a randomly chosen word than the letter 'x'. For example, almost 10% of the six-letter English words in the dictionary end in the code "37", just like in our example word "HACKER". Therefore, if we cleverly order our brute force approach to try codes that occur most frequently in English words, then we can increase our chances of cracking the code quicker.

I sorted all of the 14,684 distinct codes in decreasing order according to the number of words that map to that code. If a code word is chosen uniformly from the set of all six-letter English words, then this ordering will maximize our chances of cracking the code quickest. In fact, after fewer than 250 attempts, we will have cracked the code with about 5% probability. In other words, with only about half a day of access to the lock, a brute force attempt using this ordering will have a 5% chance of success. In contrast, if the lock's code were instead set using a randomly generated six-digit number, it would require on average 50,000 attempts (~5 months) to crack it with 5% probability. On the other hand, if the code was set from a six-letter English word using a phone keypad mapping, the lock can be cracked with 50% probability after about 16 days.

Codes are not usually changed on a month-to-month frequency, so an adversary with intermittent access to the lock (e.g., after hours) could quite conceivably crack it before the code were changed.

If there exists an additional vulnerability that allows the attacker to detect whether each successive two-digit subcode is correct before entering the next two-digit subcode, then the expected value for the length of time required to crack the code is on the order of minutes.

So what can we do about it? The most obvious solution is to simply choose codes randomly from the set of all possible one million 6-digit numbers. If we really do want to have a mnemonic, though, we could do a lot better by using a letter-to-number mapping that makes use of zeros and ones, and also produces a relatively even distribution of codes from English words. Here is one such example of a mapping:


It's not necessarily the best one, though. Finding the optimal mapping—the one that ensures a uniform distribution across all six-digit code for six-letter English dictionary words—is actually an instance of the multiple knapsack problem (a variant known as the multiple subset sum problem) and is therefore NP-Complete, and likely intractable for problems even as "small" as this one. Let $math$L = \{a, b, \ldots, z\}$math$ be the set of letters in the alphabet and let $math$f : L \rightarrow \mathbb{N}$math$ be a function mapping letters to their frequency among six-letter words in the dictionary. Then we can relax the problem to the minimization of the mean squared error over the summed frequencies of the letters mapped to each number: Find a function $math$t : L \rightarrow \{0, \ldots, 9\}$math$ that maps letters to numbers such that $equation$\frac{1}{10}\sum_{i=0}^9 \left(\sum_{\ell \in t^{-1}(i)} \left(f(\ell) - \frac{\sum_{\ell \in L} f(\ell)}{10}\right)\right)^2$equation$ is minimized. The total number of legal mappings, $math$t$math$, is $math$10^{26} = 100$math$ million billion billion. With that said, the example mapping above will be at least a million billion billion times more secure than using a phone keypad mapping!

The moral of the story here isn't that these locks are inherently vulnerable; if used properly, they are in fact incredibly secure. The takeaway is that these locks are only as secure as the codes humans choose to assign to them. Using a phone keypad mapping on six-letter English dictionary words is the physical security equivalent of a website arbitrarily limiting passwords to eight characters.

Unambiguous Encapsulation

Defending Against "Packet in Packet" Attacks

A couple days ago, Dominic Spill and Michael Ossman presented an interesting talk at Shmoocon on using specially crafted error correcting codes to have unambiguous encapsulation, preventing attacks like "Packet in Packet." This appears to be the culmination of some work that was first proposed to the LANGSEC mailing list back in March. The underlying problem that they are trying to address is that the physical layer is almost always error-prone: bits often get flipped. The data link layer addresses this by using error-correcting codes: methods of encoding the bits to be transmitted such that certain types of errors can be caught and corrected, at the expense of some additional bandwidth overhead.

Packet in Packet Attacks

If a sufficient number of bit-errors occur, however, error correcting codes can still fail to correct or even detect an error. Protocols usually address this by adding a checksum in the packet header/preamble. To illustrate this, let's consider a made-up link layer protocol in which there is a header that starts with the magic number 0x1337 followed by a checksum and the message body. If a detectable but uncorrectable error occurs, then the packet can be dropped. If an undetected bit-error occurs in the body, then the checksum will fail and the packet will be dropped. If an undetected error occurs in the header, then either the magic will be broken and it will fail to be detected as a frame or the checksum will fail, in both cases causing the packet to be dropped. The one exception to this latter case is the basis for Packet in Packet attacks: If we were to start the message body with 0x1337 followed by a checksum and our desired body, then the link layer protocol will skip over the corrupted header and think that the actual frame starts at the beginning of our hand-crafted body! In effect, we have embedded a complete packet inside the body of another. It turns out that if we transmit this carefully crafted packet-in-a-packet a bunch of times, then eventually (in what turns out to be a reasonable amount of time) a bit-error will occur causing our payload to be interpreted as the real link layer frame. Attacks like this have been demonstrated on real data link layer protocols like IEEE 802.15.4.

Isolated Complementary Binary Linear Block Codes

Spill and Ossman propose using a special error correcting code that has a property they call "isolation:" the code can be split into multiple subcodes that have additional error-correcting properties between them. One of the subcodes could be used strictly for bits of the header/preamble, while another subcode could be used strictly for the body. This provides additional error correcting capability specifically for cases like that which the attack above exploits, without increasing the bandwidth overhead of the code. They call this family of codes "Isolated Complementary Binary Linear Block Codes" (ICBLBC).

As an example, consider the following eight code words:


Note that each code word differs from the others by at least two bits. This means that we can detect single-bit errors. For example, if 01110 is corrupted in the physical layer to 01010, then we can immediately know that an error occurred because that code does not exist in our set of eight. If two or more bit-errors occur, then an error may go undetected, which is the case that Packet in Packet attacks exploit. This is exactly like the types of error correcting codes used in most data link layer protocols. The difference, however, is that this particular set of eight code words has an additional property: it is an isolated complementary binary linear block code. Let's call the first row of codes "Subcode 1" and the second row of codes "Subcode 2." Now, note that every code in Subcode 1 differs from every other code in Subcode 2 by at least three bits. That means that we can not only detect but also correct single-bit errors between the two codes, giving the two subcodes properties of a Hamming Code. If we encode the header/preamble of the packet using one subcode and the body using the other, then we can circumvent Packet in Packet attacks with little to no extra bandwidth overhead!

Generating the Codes

Spill and Ossman have provided some code that can enumerate these codes. Currently, they have reported discovery of 6-bit ICBLBCs with up to twenty-two codes in subcode 1 and two codes in subcode 2, or up to ten codes in each if they are divided evenly. While their implementation does work, it is limited by the fact that it essentially performs a brute-force search to discover them, causing them to investigate a hardware-based approach using FPGAs.

Out of curiosity, I wondered if this problem could be solved more efficiently using constraint reasoning. Therefore, we implemented a generator that encodes the ICBLBC constraints as a Satisfiability Modulo Theories problem and used DOCO (Digital Operatives' proprietary constraint optimizer) to enumerate all feasible solution codes. Sure enough, our solver was able to almost instantaneously reproduce Spill and Ossman's results, as well as enumerate codes of larger bit length that would likely be intractable to discover using brute force search. For example, in less than a second it was able to discover this 7-bit ICBLBC consisting of fifty codes in subcode 1 and two codes in subcode 2:

Subcode 1Subcode 2
37, 118, 44, 21, 35, 28, 107, 61, 41, 25, 94, 13, 56, 84, 124, 31, 110, 117, 50, 76, 62, 49, 47, 81, 7, 19, 109, 87, 103, 22, 93, 82, 70, 26, 127, 115, 55, 52, 59, 73, 11, 91, 88, 67, 69, 42, 121, 14, 38, 1220, 96

as well as this one with nineteen codes in each subcode:

Subcode 1Subcode 2
23, 89, 17, 27, 9, 65, 51, 5, 83, 85, 0, 18, 80, 54, 36, 33, 113, 3, 11676, 122, 61, 127, 103, 14, 107, 104, 79, 56, 109, 98, 28, 70, 94, 47, 110, 74, 42

This 8-bit ICBLBC was discovered in a matter of seconds and consists of eighty codes in one subcode:

Subcode 1Subcode 2
1, 4, 7, 8, 13, 16, 19, 25, 30, 32, 35, 37, 41, 42, 44, 47, 49, 54, 56, 59, 61, 69, 76, 81, 84, 87, 88, 97, 100, 104, 109, 110, 112, 121, 122, 124, 127, 128, 131, 133, 137, 142, 145, 146, 148, 152, 155, 161, 162, 164, 168, 171, 173, 176, 179, 181, 185, 190, 193, 196, 199, 200, 205, 208, 211, 217, 218, 220, 223, 224, 227, 229, 233, 234, 236, 239, 241, 246, 248, 25366, 75

I have reimplemented this approach using Microsoft's "free" (as in beer) Z3 constraint solver and pushed it to a fork of Spill and Ossman's project, available here. As always, feedback on this new approach is greatly appreciated.