Dustin Fraze is a program manager at the Defense Advanced Research Projects Agency (DARPA), where he focuses on cybersecurity issues. 74&WEST spoke with him about his work at DARPA, especially a new research program called CHESS (Computers and Humans Exploring Software Security), which attempts to combine the forces of artificial intelligence systems and human hackers to eradicate threats to cybersecurity.
Let’s start out just by clarifying some common misunderstandings about DARPA, since a lot of people picture it as a skunkworks with government scientists all hunkered down in a building somewhere doing secret research. In fact, DARPA is a funding agency, right? So rather than carrying out the research itself, DARPA dreams up these technological and scientific programs and selects “performers” (scientists from academia and the private sector) to actually do the work, correct?
Yeah. DARPA is an interesting place. We don’t have any labs or lab space to speak of in our building. The program managers here are very bright in their fields, but the actual research takes place at the performers’ sites, not in our building. So while we come up with ideas and while we constantly brief our leadership and proselytize our ideas out to the community at large and try to get the best response from the people doing the work, you find very little code written here or prototypes developed on site. It’s a lot of travel to the performer sites to get that stuff done.
Obviously it’s an honor to be selected as a program manager at DARPA. Where did you get your start in the field of cybersecurity?
Early on in my career, I started at a small cybersecurity company called SI Government Solutions. They were doing a lot of things that look like more primitive versions of what the cybersecurity community has become today. I feel I was really lucky to get there as cybersecurity was unfolding and we were just coming to understand what vulnerabilities look like and what liabilities and opportunities those vulnerabilities afforded somebody who was aware of them. They got acquired by Raytheon. I stuck around for a year or two and then jumped to a smaller company where I was continuing to do the same sort of work.
Something called “Capture the Flag” figures largely in your bio. Can you tell us about that?
I attend [an annual hacker] conference called DEF CON. I think this was my 13th year this year. They have a competition called Capture the Flag, which is an adversarial race to find, fix and patch flaws in software. So a referee team generates synthetic software that’s never been seen before by any of the competitors, and they insert vulnerabilities into the software and then generally over 48 to 72 hours, teams of hackers reverse-engineer the software. They don’t get any source code, so they have to do binary analysis techniques to find vulnerabilities and then synthesize patches for those vulnerabilities to protect their systems and generate exploits to hack into their adversaries’ systems. They prove to the referees that they were able to compromise their adversary’s system by reading a file that the game calls a “flag.” And that file contains data that should be secret between the referees and the team. If I get into your machine and am able to show the referees the contents of your flag file, the referees know that I was able to break into your machine. So there are a lot of Capture the Flags, but the DEF CON Capture the Flag is one of the biggest and longest-running ones. I’ve played that for a few years. I’ve won it once or twice. And then I went on to host it and I hosted it for the last five years. That’s kind of helped keep me connected. The community of CTF players is pretty insular, but at the same time, when you’re in, people talk pretty freely about what they do and how they do things. So it was a good place to learn and make some connections in that community. And then DARPA found me out of that community. You know, they saw my work at CTF, put out feelers and said, “Hey, if you’re interested in coming to DARPA, we’d love to have you.” And here I am.
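[Editor's sketch: the flag mechanism described above can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the DEF CON referees' actual system. The `Referee` class, its method names, and the team names are hypothetical and chosen only for illustration.]

```python
import hmac
import secrets


class Referee:
    """Hypothetical referee that issues one secret flag per team."""

    def __init__(self, team_names):
        # Each flag is random data known only to the referee and the owning team.
        self.flags = {name: secrets.token_hex(16) for name in team_names}

    def flag_for(self, team):
        # Stands in for the contents of the flag file on that team's machine.
        return self.flags[team]

    def verify_capture(self, attacker, victim, submitted_flag):
        """Return True if the attacker shows the contents of the victim's flag."""
        if attacker == victim:
            return False  # presenting your own flag proves nothing
        expected = self.flags.get(victim)
        if expected is None:
            return False  # unknown team
        # Constant-time comparison of the submitted value with the real flag.
        return hmac.compare_digest(expected, submitted_flag)
```

[A team that has actually read its adversary's flag file can present the exact contents and be credited. A guess, or a team presenting its own flag, is rejected.]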
We probably all have a reasonable sense for how important cybersecurity is, but can you just talk a little bit about the state of the field, what today’s threats look like, and how they might compare to those of recent years?
Yeah. I think the big thing that’s changed recently has been how ubiquitous computing is. And with that ubiquitous computing comes ubiquitous vulnerability. So 20 or 30 years ago, you might have had fairly simple computers in things like your car and [other] things we use every day. Today, there are cars on the road that have 15 or 20 cameras in them that are constantly calculating where the road is, where the lines in the road are, the probability of collision, and they’re feeding that into the navigation systems, using it to apply emergency brakes in situations it identifies as potentially threatening to the health or life of the person driving. And these computer systems are embedded everywhere. They’re not just in cars. They're in our stoplights. They're in your elevator. They're in your coffeemaker. I think it’s hard to find a dumb appliance today. And this is good. It has made life easier. But at the same time, we now live in a world where we have to worry about ransomware attacks on all of these things and people who would do us harm taking over these things and holding them at risk such that the things we’ve come to depend on and rely on are not as reliable as they should be.
And what has changed in terms of the ways in which people like you address cybersecurity threats?
I would just say that there is more complexity everywhere. We have all of these systems that now have very small computers in them that could be vulnerable, and it’s kind of scary to talk about that. But also, things have gotten a little better. We have operating systems that are more secure and more resilient to attacks than they were 10 or 20 years ago. At the same time, I think the model of the adversary has evolved. So we have things like data execution prevention and address space layout randomization that make it hard for unsophisticated attackers to exploit vulnerabilities in these systems. Yet the more sophisticated attackers are just being more clever in how they attack these systems. It’s almost a cat-and-mouse game where some of the attackers will go away but those who remain will just be more sophisticated in how they attack things.
Interesting. It brings to mind a medical analogy where use of antibiotics ends up creating superbugs, bacteria that are antibiotic-resistant. You get rid of the weak ones but then you’re left to contend only with the super-sophisticated attackers.
Yeah. It’s my belief that if we apply this almost evolutionary pressure where you just have to be a better attacker in order to succeed, we could potentially end up in a situation where we end up, to get a little flowery, “breeding super-hackers.” So one of my design goals in [DARPA’s] CHESS [program] was not the application of patches which should make exploitation of a vulnerability more difficult, but the actual eradication of the root cause of the vulnerability in the software, to entirely disappear it from the software. I think the eradication of vulnerability from software prevents that sort of arms race.
So let’s talk about the CHESS program. The aim of the program is to team human hackers with AI systems in order to increase the scale of threat discovery and remediation. Is that a fair one-liner?
Yeah, that’s fair.
Tell us about its genesis. Whose idea was it and how did it come about?
I came up with the idea for CHESS a few years ago. DARPA ran an effort that ended in 2016 called the Cyber Grand Challenge, where we wanted to see what pure automation could do in the realm of cybersecurity. And so they had a final event at DEF CON in August of 2016. I’d been involved as a performer in that program, and so I was watching this final event take place. I thought it was really interesting and exciting, but I also thought that there were some fundamental limitations to what the automation could do. There were attacks such as memory corruption and information disclosure attacks that are a bit finicky. There’s almost a precise mathematical way to think about reasoning about these vulnerabilities and how one could exploit them. But the Cyber Grand Challenge didn’t consider the deeper semantic vulnerabilities of an application. Things like logic errors and data misuse, things where you kind of have to understand the intent of the application in order to reason about [whether something] is a vulnerability or not. Intent and nuance are not the domain of automation today. And so I thought, “Well, what would happen if we were somehow able to impart some of this world knowledge that would allow a machine to infer intent and reason about data types, not in an abstract way but in a way bound to intent and real-world meaning of these data structures? Could we use this sort of symbiosis to allow semi-automation to reason precisely about these somewhat fuzzy concepts?”
Can you offer up a real-world example of the type of vulnerability you’re describing, where in order to recognize it you’d need a human’s sense of nuance and knowledge of the world?
Yeah, I’ll give it a shot with one of the examples I’ve tossed around a bit. Consider an online banking application. It’s reasonable to consider a banking application as having all the features we would think about when we think about banking: online bill pay, the ability to transfer money between accounts, check a balance, all the things like this. Now imagine a very buggy online banking system. Imagine that I went to do a balance transfer and I said, “I would like to transfer negative 500 dollars from my account to your account.” It might check. It might say, “Does Dustin have more than negative 500 dollars? Yes. All right. So let’s send that request.” And that reasonable but buggy-programmed application would take that request and go to your account. It would add negative 500 to yours, it would subtract negative 500 from mine and it would call it a day. To a program analysis suite, my bank account balance and your bank account balance are just numbers. It doesn’t understand that there’s a socially defined norm [in play]. Dustin really shouldn’t be able to just take your money without your consent and things like this. But humans understand pretty well that your money is yours, my money is mine, and by sending you negative money, it shouldn’t credit me and debit you.
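[Editor's sketch: the negative-transfer bug described above can be reproduced in a few lines of Python. This is a minimal illustration under stated assumptions, not code from any real banking system. The function names, the `balances` dictionary, and the account names are hypothetical.]

```python
def buggy_transfer(balances, sender, recipient, amount):
    """Checks only that the sender can cover the amount.

    A negative amount always passes that check, so the sender is
    credited and the recipient is debited.
    """
    if balances[sender] >= amount:
        balances[sender] -= amount      # subtracting a negative adds money
        balances[recipient] += amount   # adding a negative removes money
        return True
    return False


def checked_transfer(balances, sender, recipient, amount):
    """Same transfer, with the intent made explicit: amounts must be positive."""
    if amount <= 0:
        raise ValueError("transfer amount must be positive")
    if balances[sender] < amount:
        return False
    balances[sender] -= amount
    balances[recipient] += amount
    return True


balances = {"dustin": 100, "you": 1000}
buggy_transfer(balances, "dustin", "you", -500)
print(balances)  # {'dustin': 600, 'you': 500}
```

[Nothing in `buggy_transfer` is a memory error or a crash. The arithmetic is valid, and only the rule that a sender cannot take money by sending a negative amount marks it as a flaw.]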
And so in this example, an attempt to do this kind of analysis and discover that vulnerability in an automated way would likely fail, because of the analysis suite’s failure to understand that basic kind of nuance?
Right. What it comes down to is that money has a meaning that you and I understand just based on our context of the world, and automation doesn’t understand it. Like I said, your bank account balance is just a number stored in a database somewhere. And the fact that bank accounts belong to different people is a somewhat abstract concept.
Obviously, the way that the program will actually take shape will depend on what the performers in the program come up with, but are we basically talking about a program that goes out on its own looking through code for vulnerabilities but then is smart enough to know its own limitations, to essentially go, “Hmm, this is beyond my ability as a machine. I should ask a human for help here?” And then prompt a human for insight?
So that’s half of it. Although “goes out on its own” may be a bit aggressive. We imagine this being something that happens in a lab. Today, it’s not my vision that this is deployed roaming the internet looking for vulnerabilities. But it’ll kind of run in a lab. And so a human would point it at software that it wanted to make sure was internet-worthy. You know, we have ways to verify that vessels are seaworthy; why not have ways to verify that software is internet-worthy? So you could start doing analysis and, like I said, you hit half of it. But here’s the other half. Modern program analysis tools do a pretty poor job at emitting intermediate artifacts. If you take a modern program analysis tool and you point it at what it can do well today, which is discovery of a memory corruption vulnerability, it will either tell you it can and did discover a memory corruption vulnerability or it doesn’t yet know if there is a memory corruption vulnerability. And there’s no intermediate state, really, that comes out of these tools. And so while the tool being able to recognize it’s stuck and request help from a human is interesting, I also think it’s interesting to try to get these tools to [deliver] more intermediate state to humans such that the automation is able to help the humans just as the humans are able to help the automation. It’s a matter of dumping out intermediate state and recognizing that you don’t have to necessarily get to a complete solution before dumping your state out to a human.
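[Editor's sketch: one way to picture "dumping out intermediate state" is a toy analysis that returns its partial observations along with its verdict. This is a hypothetical illustration, not a CHESS design or the interface of any real tool. The `Verdict` and `Report` types, the `analyze` function, and the input records are all assumptions made for the example.]

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    VULNERABLE = "vulnerable"
    UNKNOWN = "unknown"


@dataclass
class Report:
    verdict: Verdict
    # Intermediate artifacts: partial observations a human could follow up on.
    notes: list = field(default_factory=list)


def analyze(copies):
    """Toy analysis over hypothetical (name, buffer_size, copy_length) records.

    A copy_length of None means the tool could not bound the length.
    """
    notes = []
    verdict = Verdict.UNKNOWN
    for name, buffer_size, copy_length in copies:
        if copy_length is None:
            # The tool is stuck here; say so instead of staying silent.
            notes.append(f"{name}: could not bound copy length; needs human review")
        elif copy_length > buffer_size:
            notes.append(f"{name}: copy of {copy_length} bytes into {buffer_size}-byte buffer")
            verdict = Verdict.VULNERABLE
    return Report(verdict, notes)
```

[Even when the verdict is still "unknown," the notes tell a human where the automation got stuck, which is the kind of two-way hand-off described above.]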
So kind of flagging gray areas, going in both directions -- humans flagging gray areas for the automation and vice versa?
Okay. So looking at the literature on CHESS, it appears that the first technical area, the first step in building this thing, is to study human hackers to identify their reasoning processes. And those insights are meant to be leveraged to “create a basis for developing new forms of highly effective communication and other human-computer interactions.” Can you kind of connect those two dots for us? In other words, how does understanding humans’ reasoning lead to better human-computer interaction?
I think that we as a community don’t really understand how experts find vulnerabilities in software and reason about software when they’re looking at these vulnerabilities. We have some pretty rough ideas, but most of them are driven by self-reporting, and self-reporting is notoriously inaccurate. I think that until we have a really solid understanding of how this process works in the minds of a few humans, trying to get this process to work where a team of humans can actually collaborate with cyber-reasoning automation – I don’t see a path forward without really decomposing the problem and understanding how reasoning is done at the individual level.
And how do you study it without relying on self-reporting?
There are a bunch of physiological signs we can use to measure things like cognitive load, and a suite of programs we can use to monitor what users are doing as they're running through an application. Combining that with some self-reporting allows us to more or less get some debugging on that information. So we can start to see if there are correlations between what is self-reported and what we actually empirically observe. You know, “Everybody self-reports this, but our instrumentation says that that’s not nearly as important as people say it is.”
So that’s step one. Step two is to actually develop some of these automated vulnerability discovery and patching technologies?
And then the other technical areas have to do really with establishing measurement and actually what the rollout of these programs would look like, right?
Yes, that’s correct.
Now, DARPA held a proposer’s day back in April. What kind of update are you able to give us about the current state of the program?
We have finished source selection and we are currently in negotiations and the contracting process on this 42-month program.
What else can you tell us about what makes this program unique?
I hope that one of the interesting things I’ve done in creating this program is I’ve attempted to make it a truly collaborative effort. I feel like a lot of times, programs almost end up being a competition between several performers in a given technical area competing for who has the best technology to solve the problem that DARPA put forward. And that’s a fine way to run a program. But sometimes some of the research can be stifled by this feeling of competition or this feeling that, “If I share my secret sauce with a performer in the same technical area, they might steal it and do better than me and then I’ll get kicked off the program.” My goal in building this program was to get several bright people in a room, and I absolutely want them sharing ideas with each other and building the best system for CHESS that we can build collaboratively. Not min/maxing for our set of challenge problems and cheating the metrics so that they can survive on the contract for another phase. We’re following this intense model of collaboration, this kind of hack-a-thon model where we just get everybody into a room and we put out some interesting problems and we see what comes out [in addition to] the traditional research that they’re doing back at their respective sites.
What other programs do you have going?
Right now, I have a program called STAC, which is Space/Time Analysis for Cybersecurity, which looks at vulnerabilities in side-channel timing and algorithmic complexity. This was, in its time, a somewhat esoteric class of vulnerabilities. You can see the echo of this class of vulnerabilities in some of the high-profile bugs that have been talked about recently, like Meltdown and Spectre. This was a program I inherited, so it wasn’t purely my design. But back when this program started, vulnerabilities of this class were few and far between and certainly not in the headlines for several days at a time.
So when you daydream about the success of CHESS, what do you envision? How do you picture it being of service?
I like to think of a world where the government or even companies running large services that are available to the public aren’t dependent on a pool of very expensive and very overworked cybersecurity experts in order to assert that their software is free of vulnerabilities or safe to be run out on the open internet, such that their users don’t have to worry about compromise of their personal data and things like this. And so I hope CHESS empowers passionate, intelligent people who just don’t have the years of experience it takes to find these vulnerabilities today to help find and eradicate these vulnerabilities from our systems.
Dustin Fraze is a program manager within the Information Innovation Office at the Defense Advanced Research Projects Agency (DARPA). There, his focus is cyberspace operations automations. Prior to DARPA, Mr. Fraze worked with Raytheon, as well as Cromulence LLC, for which he was the founder and president.
Copyright 2018 74&WEST LLC All Rights Reserved.
Do not reproduce without written permission from 74&WEST LLC.