Presentations

Talks and Demonstrations

From Polyglots to Prompt Injections

Parsing is Still Execution

2025-10-11 Presented at JawnCon 0x2

Remember when we thought parsing untrusted data was hard? Welcome to 2025, where your PDF is also a Nintendo ROM, your resume photo contains hidden SQL, and your helpful AI assistant just `rm -rf`'d your home directory because someone asked it nicely. This talk bridges classic file format exploitation techniques with modern LLM security through the lens of Language-theoretic Security (LangSec). We'll start with examples of polyglot files that execute differently depending on the parser, then show how the exact same principles apply to AI vulnerabilities like model backdoor, prompt injection, multimodal, and Model Context Protocol attacks. Bring your polyglot files, your prompt injections, and your sense of humor. The parsers are still broken, they're just fancier now.

In Pursuit of Silent Flaws

Dataflow Analysis for Bugfinding and Triage

2024-04-15 Presented at Purdue CERIAS Seminar

In this presentation, I provide a thorough exploration of how dataflow analysis serves as a formidable method for discovering and addressing cybersecurity threats across a wide spectrum of vulnerability types. For instance, I'll illustrate how we can employ dynamic information flow tracking to automatically detect "blind spots"—sections of a program's input that can be changed without influencing its output. These blind spots are almost always indicative of an underlying bug. Furthermore, I will demonstrate how the use of hybrid control- and dataflow information in differential analysis can aid in uncovering variability bugs, commonly known as "heisenbugs." By delving into these practical applications of dataflow analysis and introducing open-source tools designed to implement these strategies, the goal is to present practical steps for pinpointing, debugging, and managing a diverse array of software bugs.
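To make the blind-spot idea concrete, here is a minimal sketch in Python. It does not use dynamic information flow tracking; it swaps in brute-force byte mutation, which is far slower but shows the same property. The toy format, `toy_parser`, `find_blind_spots`, and `SAMPLE` are all hypothetical names invented for this illustration.

```python
from typing import Callable, List, Sequence, Tuple


def toy_parser(data: bytes) -> bytes:
    """Hypothetical format: 1 length byte, 2 reserved bytes, then the payload."""
    length = data[0]
    payload = data[3:3 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def observe(parse: Callable[[bytes], object], data: bytes) -> Tuple[str, object]:
    """Run the parser and fold exceptions into the observed output."""
    try:
        return ("ok", parse(data))
    except Exception as exc:
        return ("error", type(exc).__name__)


def find_blind_spots(
    parse: Callable[[bytes], object],
    data: bytes,
    masks: Sequence[int] = (0x01, 0x80, 0xFF),
) -> List[int]:
    """Return offsets whose mutation (by every mask tried) leaves the output unchanged."""
    baseline = observe(parse, data)
    blind = []
    for offset in range(len(data)):
        unchanged = True
        for mask in masks:
            mutated = bytearray(data)
            mutated[offset] ^= mask  # flip some bits of one input byte
            if observe(parse, bytes(mutated)) != baseline:
                unchanged = False
                break
        if unchanged:
            blind.append(offset)
    return blind


# Length 4, two reserved bytes, payload "DATA", two trailing bytes.
SAMPLE = bytes([4, 0xAA, 0xBB]) + b"DATA" + b"??"

if __name__ == "__main__":
    print(find_blind_spots(toy_parser, SAMPLE))
```

On this sample the candidates are offsets 1, 2, 7, and 8: the two reserved bytes and the two trailing bytes that the parser never reads. Mutation only tests the masks it tries, so these are candidates rather than proofs.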

A Sermon on the Indulgences of Computational Sacrifice

The Superabundant Benedictions of Programming an Absurd NES Game

2020-12-17 Presented at A Midwinter Night's Con

One of my quarantine projects was completing the NES portion of my résumé. Among other things, that PDF is also a valid NES ROM containing a playable game. The game—which, to be honest, is only of minimal playable enjoyment—boasts a variety of fatuous Easter eggs and tricks. For example, it prints out the MD5 hash of the PDF. It also has a BF interpreter that steps through the execution of a quine. This talk explains how all of the tricks were achieved. We will also cover how everything was implemented on the NES’s puny 6502 processor with only 2kB of RAM, how its bytes were Tetris'd into 128kB of ROM, and how the file is also both a valid PDF and a valid ZIP. (Did I mention it’s also a ZIP? It’s also a ZIP.) Along the way, we’ll recite some parables, perform a devotional on the divinity of BASIC, meditate on what malware shellcode has to do with Brezhnev-era Soviet public architecture, and conclude with an allegory on how this all applies to a state-of-the-art LLVM taint analysis instrumentation framework. It turns out that forcing yourself to work in extremely constrained environments teaches you how to be a better hacker all around.

See Also: The NES Game Itself

Toward Automated Grammar Extraction via Semantic Labeling of Parser Implementations

2020-05-21 Presented at The Sixth Workshop on Language-Theoretic Security at the 41st IEEE Symposium on Security and Privacy Workshops

This is the presentation for the paper of the same name at the LangSec Workshop at IEEE S&P 2020. It is about mapping a ground-truth parse tree to an execution trace of a parser.

The Treachery of Files, and Two New Tools that Tame It

2019-12-10 Presented at Empire Hacking

Parsing is hard, even when a file format is well specified. But when the specification is ambiguous, it leads to unintended and strange parser and interpreter behaviors that make file formats susceptible to security vulnerabilities. What if we could automatically generate a “safe” subset of any file format, along with an associated, verified parser? This talk explores that question, provides examples of malicious files, examines some troublesome parsers, and introduces two new tools for reverse engineering files and parsers. PolyFile is a tool for exploring the contents and structure of files to detect funky file tricks like steganography, polyglots, and chimeras. PolyTracker can instrument parsers to perform efficient universal taint tracking, recording which bytes of the input file are operated on by which functions. Used in conjunction, these tools will permit us to specify safer subsets of file formats.
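As a rough illustration of what mapping input bytes to functions means, here is a toy sketch in Python. It is not PolyTracker's interface or mechanism: the bytes are wrapped by hand rather than tracked through an instrumented program, and the `Tainted` class, the `tracked` decorator, `TAINT_LOG`, and the "TOY" format are all hypothetical.

```python
from collections import defaultdict
from functools import wraps

# Maps function name -> set of input offsets that function was handed.
TAINT_LOG = defaultdict(set)


class Tainted:
    """A byte string that remembers which input offsets its bytes came from."""

    def __init__(self, data: bytes, offsets=None):
        self.data = data
        self.offsets = list(range(len(data))) if offsets is None else list(offsets)

    def __getitem__(self, key):
        if isinstance(key, int):
            index = range(len(self.data))[key]  # normalizes negative indices
            key = slice(index, index + 1)
        return Tainted(self.data[key], self.offsets[key])


def tracked(func):
    """Record the offsets of every Tainted argument passed to func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        for arg in list(args) + list(kwargs.values()):
            if isinstance(arg, Tainted):
                TAINT_LOG[func.__name__].update(arg.offsets)
        return func(*args, **kwargs)
    return wrapper


@tracked
def parse_magic(field: Tainted) -> bool:
    return field.data == b"TOY"


@tracked
def parse_length(field: Tainted) -> int:
    return field.data[0]


@tracked
def parse_body(field: Tainted) -> str:
    return field.data.decode("ascii")


def parse(raw: bytes) -> str:
    """Hypothetical format: 3-byte magic, 1 length byte, then the body."""
    source = Tainted(raw)
    if not parse_magic(source[0:3]):
        raise ValueError("bad magic")
    length = parse_length(source[3])
    return parse_body(source[4:4 + length])


if __name__ == "__main__":
    print(parse(b"TOY\x02hi!!"))
    print({name: sorted(offsets) for name, offsets in TAINT_LOG.items()})
```

For the input `b"TOY\x02hi!!"`, the log attributes offsets 0 to 2 to `parse_magic`, offset 3 to `parse_length`, and offsets 4 and 5 to `parse_body`. Offsets 6 and 7 appear nowhere in the log, which is the kind of gap that hints at unparsed or ignored input.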

Fantastic Bugs and How to Squash Them;

or, the Crimes of Solidity

2019-07-25 Presented at Philadelphia Ethereum Blockchain Meetup

This talk covers the many ways the Solidity programming language allows you to shoot yourself in the foot. Topics include the common mistakes, as well as the deeply insidious idiosyncrasies that can trip up even the most seasoned developer. It concludes with a brief survey of open-source tools you can use to help you write secure smart contracts.

Anatomy of an Unsafe Smart Contract Programming Language

2018-12-12 Presented at Empire Hacking

This talk dissects Solidity: the most popular smart contract programming language. Various examples of its unsafe behavior are discussed, demonstrating that even an experienced, competent programmer can easily shoot themselves in the foot. These serve as a cautionary tale of how not to create a programming language and toolchain, particularly one that shall be trusted with hundreds of millions of dollars in cryptocurrency. The talk concludes with a retrospective of how some of these issues could have been avoided, and what we can do to make smart contract development more secure moving forward.

Introducing Etheno

A Tool for Simplifying Formal Methods

2018-10-07 Presented at TruffleCon 2018

Etheno is the Ethereum testing Swiss Army knife. It’s a JSON-RPC multiplexer, analysis tool wrapper, and test integration tool. It eliminates the complexity of setting up analysis tools like Manticore and Echidna on large, multi-contract projects. In particular, custom Manticore analysis scripts require less code, are simpler to write, and integrate with Truffle.

Parsing and Interpreting are Hard:

A Tragedy in Two Acts

2018-09-25 Presented at DC 215 Gathering 0x2

Act I covers file format trickery like polyglots and how they aren’t just nifty parlor tricks. Act II applies the lessons from Act I to some new formats and languages created for smart contracts, providing examples of why it’s a terrible idea to write your own parser and, generally, why we should burn all of this blockchain stuff with fire.

File Polyglottery;

or, This Proof of Concept is Also a Picture of Cats

2017-12-08 Presented at BSidesPhilly

A polyglot is a file that can be interpreted as multiple different filetypes depending on how it is parsed. While polyglots serve the noble purpose of being a nifty parlor trick, they also have much more nefarious uses, e.g., hiding malicious printer firmware inside a document that subverts a printer when printed, or a document that displays completely different content depending on which viewer opens it. This talk does a deep dive into the technical details of how to create such special files, using examples from some of the recent issues of the International Journal of PoC||GTFO. Learn how we made a PDF that is also a valid NES ROM that, when emulated, displays the MD5 sum of the PDF. Learn how we created a PDF that is also a valid PostScript document that, when printed to a PostScript printer, produces a completely different document. Oh, and the PostScript also prints your /etc/passwd file, for good measure. Learn how to create a PDF that is also a valid Git repository containing its own LaTeX source code and a copy of itself. And many more!
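For a taste of why polyglots are possible at all, here is a minimal sketch in Python. It is not one of the PoC||GTFO constructions: it only prepends a PDF-style header stub (not a renderable PDF) to a ZIP archive. It relies on the fact that a magic-byte check reads the start of a file while a ZIP reader locates its directory from the end. The helper names `build_polyglot`, `sniff_magic`, and `list_zip_members` are hypothetical.

```python
import io
import zipfile
from typing import Dict, List


def build_polyglot(prefix: bytes, members: Dict[str, bytes]) -> bytes:
    """Write a ZIP archive in memory, then prepend arbitrary bytes to it."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return prefix + archive.getvalue()


def sniff_magic(data: bytes) -> str:
    """Naive type detection that trusts only the first few bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        return "zip"
    return "unknown"


def list_zip_members(data: bytes) -> List[str]:
    """Open the same bytes as a ZIP; the reader finds its directory from the end."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def read_zip_member(data: bytes, name: str) -> bytes:
    """Extract one member from the same bytes, prefix and all."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read(name)


# A PDF-style header stub followed by a ZIP containing one file.
POLYGLOT = build_polyglot(b"%PDF-1.4\n% stub header only\n", {"cat.txt": b"meow"})

if __name__ == "__main__":
    print(sniff_magic(POLYGLOT))        # what a header sniffer believes
    print(list_zip_members(POLYGLOT))   # what a ZIP reader finds
```

The header sniffer reports a PDF while the ZIP reader happily lists and extracts `cat.txt` from the very same bytes. A real polyglot goes further and makes both interpretations fully valid.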

Automatic Construction, Maintenance, and Optimization of Dynamic Agent Organizations

Evan Sultanik’s Ph.D. Dissertation Defense

2010-09-08

The goal of this dissertation is to generate organizational structures that increase the overall performance of a multiagent coalition, subject to the system's complex coordination requirements and maintenance of a certain operating point. To this end, a generalized framework capable of producing distributed approximation algorithms based on the new concept of multidirectional graph search is proposed and applied to a family of connectivity problems. It is shown that a wide variety of seemingly unrelated multiagent organization problems live within this family. Sufficient conditions are identified in which the approach is guaranteed to discover a solution that is within a constant factor of the cost of the optimal solution. The procedure is guaranteed to require no more than linear (and in some well-defined cases logarithmic) communication rounds. A number of examples are given as to how the framework can be applied to create, maintain, and optimize multiagent organizations in the context of real-world problems. Finally, algorithmic extensions are introduced that allow for the framework to handle problems in which the agent topology and/or coordination constraints are dynamic, without significant consequences to the general runtime, memory, and quality guarantees.