Lightning fast CTF solving - Automatic Exploit Generation & Side Channel Analysis

Name: Lightning fast CTF solving - Automatic Exploit Generation & Side Channel Analysis
Uploaded: 2018-11-02
Duration: 47 min 43 s
Description: Christopher Roberts discusses automating CTF problem-solving using symbolic execution frameworks (Angr, Intel Pin) and binary instrumentation to speed up reverse engineering and exploitation. He presents Zeratool, a framework that automates vulnerability discovery, exploit generation, and weaponizat

BSides DC 201847:431.3K viewsPublished 2018-11Watch on YouTube ↗

Speakers

Christopher Roberts

Tags

CategoryTechnical

TopicCTF Reverse Engineering Vulnerability Research

DifficultyAdvanced

TeamRed

StyleTalk

Mentioned in this talk

Tools used

American Fuzzy Lop

Frameworks

angr PinCTF Pwntools Triton Zeratool

Hardware

Intel PT

About this talk

Christopher Roberts discusses automating CTF problem-solving using symbolic execution frameworks (Angr, Intel Pin) and binary instrumentation to speed up reverse engineering and exploitation. He presents Zeratool, a framework that automates vulnerability discovery, exploit generation, and weaponization, then demonstrates how these CTF techniques translate to real-world embedded-systems vulnerabilities in routers and firmware.

Show original YouTube description

Solving CTFs the wrong way has never felt so good. Program analysis frameworks are hard, Zeratool and PinCTF are no exception. I'm going to discuss the challenges that go into building a reliable framework for consistently solving reverse engineering and exploitable CTF problems. We'll talk about why so many reverse engineering problems are vulnerable to side channel analysis attacks and how you can reap all the points in your next CTF. Adding more computing resources to a program analysis framework won't guarantee that you find more bugs. I'm going to show you how these frameworks can be expanded beyond adding resources to solve real-world issues and how exploitable CTF problems don't model high impact vulnerabilities from a program analysis standpoint. We'll talk about what it looks like to expand these frameworks into vulnerability hunting and weaponizing machines! Christoper Roberts (Cyber Scientist at Battelle Memorial Institute) Christopher Roberts is a security researcher at Battelle Memorial Institute. He has extensive vulnerability research experience in embedded systems and program analysis frameworks. He is an active competitor and speaker at George Mason's MasonCC CTF club.

Show transcript [en]

besides DC would like to thank all of our sponsors and a special thank you to all of our speakers volunteers and organizers for making 2018 a success today we're gonna be talking about how to solve CTF problems really really really really fast we're going to also talk about what it looks like to build tools to solve these things really fast and then we're also gonna talk about what it looks like taking these tools these techniques that you're developing for CTF problems and applying them to real real programs real situations real systems so like every speaker I'm gonna start with a little bit of background information about myself my name is Christopher Roberts I'm a vulnerability researcher slash

reverse engineer for Battelle Memorial Institute I'm also a CTF member at the George Mason cybersecurity Club we've had a lot of recent competitions a lot of recent wins and those are in part to some of the tools that I'm going to talk about today and in particular for my research I love looking at low-level stuff I like understanding how programs work how all the bits and bytes work when I see assembly I'm like yes I don't have to deal with Java so I like finding bugs and systems I like trying to analyze systems faster because as it is systems are only getting more and more complicated and it is just becoming harder and harder on me the reverse

engineer to try and find bugs to try and find out how these systems are working so I need to build an up-to-date toolset to try and understand these things a little bit faster so let's start out with CTS there are a couple problems in CTS right now that lets you use these tools where you drop in a CTF problem when you get out of flag and the first one is reuse there are so many problems right now that have the varies I'd have a very similar approach every single time you break right at the function check you dump out your memory and you say hey there's a flag right there and that'll work for a lot of low point

problems for other ones they'll try and do some broken encryption stuff and do a check against that and have you reverse engineer that and then yeah if you're going to bigger problems a lot of times the low point problems might have some cool or a novel thing that just eats up your time you're trying to score as many points as you can in these problems and you're wasting time trying to figure out how to run a an nes ROM on your computer and that just eats up so much time and then finally some of the harder ones your 400 point your 500 point problems can just be tedious it might be doing the same thing over and over and over

and over and over again and you're just bored by the time you've actually solved it and so I like to try and reduce the amount of time you have to invest in those so that you can get to some of the cooler or more interesting or the harder problems so how do we fix this we automate everything and we have a number of tools that have come out in the past several years that help us do this two tools from debuggers and debugger scripts and symbolic execution frameworks like anger and Triton and binary instrumentation frameworks like Triton and intel's pin but that's way too many frameworks so for this talk we're gonna only talk about two and

those are two that i've been using for a while in my professional life and in my CTF life so we're gonna talk about anger and we're going to talk about intel's pin so these these systems are extremely complex and explaining every single piece of it is out of the scope of this talk so I'm gonna talk at a very high level about what they do and how they impact us in this talk so first up we have anger anger is what's called a symbolic execution framework and if anyone really understands anger they're gonna cringe it back because really it's can colic execution but we're not going to get into the Nitty Gritty it's a symbolic execution framework

and we can pull a control flow graph so what does that mean symbolic execution is the idea that we can represent a program as a series of equations every register every memory region every file input whatever can be represented as a series of equations because every time you're incrementing through the program you're saying you know add four to my program counter if it's you know a RISC assembly or if you're adding an input and you're trying to understand what your memory looks like you can say if it passes this check my input should be flag 1 2 3 4 if my file needs to be read we can also model that as an equation and so symbolic execution allows us to

model programs and that's the big takeaway from it we don't actually have to run the program we can start at any arbitrary point inside of there and model what's going on we can say that from this slice to this slice we know that we need to add 142 our our ax register or this memory agent changed by I don't know shifting something and so you get a really powerful framework for understanding problems that you can't run so if we're talking about that NES ROM and it has executable code instead of trying to Google NES emulators and then having to sort through the first 20 pages of Google to filter out all the viruses you can just tell anger lift

this code right here and run it and it will tell you to some accuracy what it should be doing and in the second big piece that we want from anger today is control flow graph recovery so what does that mean the control flow graph is a visual representation of how a program runs so if I send you input and you check it it's gonna branch we're gonna have one branch going one way that says you fail one branch that goes one way that says you succeed so every time there's an if a loop a while and ended call a jump whatever we have a different block to represent that code that we're going to execute and so you get a really

nice visual representation of what's going on and after I talk about Intel I'll show you an example of a control flow graph and why they're so important so now let's briefly about an Intel's pin it's a instrumentation framework that only works on x86 and AMD 64 sits so just modern system or excuse me not a modern system that's that's wrong on Intel chips right now but binary instrumentation lets you slot in code into other programs that does something that you would want so you could track memory reads and writes if your program wouldn't normally print those out you can hook different functions if there's a key check you could hook that and have that printed out but what we're

interested in is something called instruction counting and so instruction counting in this case is going to tell us literally how many instructions the processor ran when it's running through a program and that's what's going to be important for the tool that I'll be talking about in a little bit so colors aren't as good as I had hoped but on the right we have an example of a control flow graph it's ugly and hard to read but the big takeaway is we have these blocks that represent code and each of these blocks does something to registers or to memory we don't really care what those specifics are but anger does and so we can tell anger given this program

if I want to get to the end and I want to avoid some sort of bad condition if I want to avoid your flag is not correct I can tell it to start here and end here I can give it an exact slice that I want it to run through without actually having to run it and tell it to trace through these specific basic blocks and tell me how to get there and when you're representing a program like this you're able to get so much more insight into what's going on you can see what your memory needs to look like what your registers need to look like you can ask very specific questions can you even get

there from there oh yeah and then does one variable influence another yeah you get a small concept of taint analysis but that's also out of the scope of this we we don't care about that for the tools and techniques we'll be talking about here so the the next one I have pin I tried to blow it up as much as I can we're just running an example program and at the very end we have a count that count is the number of structions that this program executed from starting to linking to mapping memory to running through linked libraries to finally exiting every single one and pin let us instrument the code to increment that count every single time we hit an

instruction so I have a CTF problem here and I ran a into it and it gave me some bogus count I don't really care about that but what happens when oh it's supposed to pop up

Oh No my video is failing okay well I'll just talk to you a little bit looking back at that first one that's a bad demo Oh No so when you give it different inputs a b c d e f a a BB whatever you want you're still going to get that instruction count you're going to get something that's maybe changing maybe it's the exact same and in CTF problems there's a code recycling problem back to that very first point where they're generally doing what's called incremental checking every character in your flag is getting checked uniquely and we're actually kind of using some of libs C's optimizations against it so we jump past the video so

when we talk about a reverse engineering problem and we're trying to understand how it works we pull up that control flow graph again and if this is an easier problem or the CTF authors like us or maybe they just don't hate us that much you might get something that looks like this you get a couple blocks here and a little bit of logic there's like two if statements that's not too bad you can use traditional tools to solve this you can attach a debugger you can start statically analyzing the code but what if the CTF authors really hate us well they could turn these two checks into 160 and I know I wouldn't want to attach

to the debugger and stop at every single one of those checks and try and figure out what's going on and CTF authors don't have to be that nice they can do this that that is I think I had counted it out to 25,000 individual checks and for people who have worked with malware that this is actually the LLVM obfuscated post processor it is literally designed to make it nearly impossible to reverse engineer it now there are tools online to try and help that but that sucks so we need a way to try and solve problems like this so that we can get all of the points I want to come in first I don't want to just ignore this

one and say this one's too hard so we need to divide up the problem we need to understand what's going on in these CTF problems and generally they're kind of broken up into three parts you got to figure out how to run it you got to figure out what the check is and then you got to figure out how to break the check or how to make it work for you and so generally your steps kind of look like this trying to find a library trying to see if it's an NES emulator trying to find a weird environment variable you need to find your checks usually a lot of math and that usually whatever is the biggest function in

there is probably your checking function and then you go in gdb and you start typing in this a bunch and then you get a close out of gdb and you got to reopen and start doing this again and then a yeah eventually you might get repetitive stress syndrome but so because I want to have my hands forever I'm instead gonna look at what's going on so at some part in that check function we're gonna use one of these if you ever program didn't see if you've ever taken a CS class you've seen one of these and it's a comparison function they're pretty simple you give it two inputs and it says yay or nay yes they're equal now no

they're not equal and I think there's a handful more but they all work on the exact same premise so the one that was the smallest that was the easiest to bring up here was string compare whoo so given two pointers start at the beginning of the pointer check to see if the first two are equal then exit out and you do this again and again and again and again and again and you you run this until you find an alternate that that's just a normal thing for strings but what happens if only the first one is equal to that second one that first character in both strings is equal but the second character isn't well neck sits out early

so you might be able to follow along with what I'm saying here if my first character is right I execute slightly more instructions if my first two characters are right I execute slightly more instructions and this happens over and over again for each of these comparison functions so if there were some way to count instructions we could pretty quickly figure out what's going on so hopefully this next video works and all of my videos are failing me oh I just needed to double click it oh cool okay what if to reverse engineer it is it running man okay well I have the videos on here and once we get past this part I'll pull up the videos I've got them in a

different folder but what we're doing here is we're literally running through the alphabet for that first character a-b-c-d-e-f-g all the way down you know the alphabet and we're gonna see that one of those characters execute s'more instructions than the other ones what a coincidence and so this process can be pretty hard oh you can see the command up there I love gross bash one-liners but I know the security community doesn't so I decided to compress all of this and do a tool and put it all online so that you don't need to write a bash loop that loops over every single character in the alphabet and then runs pin against it to try and instrument it and do all of this

funkiness so I built a tool called pin CTF and all it is is its automating that process of checking the next input against your flag you can do parallel processing you can do command line arguments you can do standard in and you can also recover from canceled sessions because it can suck when halfway through you figure out you have an issue in your bash loop and you got to start all over again oh man that's why you shouldn't write bash one-liners all right so this next one is the one that I will pull up if it's not coming up got some technical difficulties all of my demos are failing me today alright and give me just one moment and

I will open up the videos for you

that cancel out there we go

you guys get to see all of my personal documents pin CTF video I took all this time to record these videos for you guys okay cool oh yes we can see it perfect so we're running it through once and we can see we're checking the very end character here and it's kind of going a little bit slow I wanted to give you guys lightning fast CTF solving so instead of running it on one core I'm going to run 22 concurrent executions of this thing on a beefy system I've got somewhere else oh the videos jump ahead of me and you can see this flag printed out before I could even finish that sentence and they wanted you to use Hardware

break points instead of instruction counting but because they used a mem compare somewhere down the line it didn't matter all right so let's find where we

that was real time yeah there was no speeding up in that video who forgot time I can run it a couple times and you can you can gotta get pull out your stopwatch if you want so that's cool when there's one issue to overcome in reverse engineering problems there's a mem copy or aster compare or oh whatever some sort of byte for byte comparison but what happens when we come to pony balls to exploitable there's so many different things you need to overcome you might be given a Lib C you might have to fight a soul or you might have to fight DEP you might have a weird puzzle to solve before you even get the

to the the function you I don't know you might have to make your own primitive and try and figure out how all of that works and then you have this nonsense of trying to connect to a port and exploiting it holy cow that is way too much effort so let's talk about a little bit about some of the most common vulnerabilities you see in CTF problems so the two big ones outside of heap exploitation are buffer overflows and format string vulnerabilities if you've taken a CS class that does C programming they probably talked about buffer overflows you have some variable on the stack that has too many inputs or I guess too much of an input fed into it

and it writes values all the way up the stack possibly corrupting your program counter and then you can point this program counter wherever you want so sometimes in CTF problems that make it convenient for you and they have a win function so that's one type you can have sometimes they're not as nice and then trying to emulate problems from the 1990s and you have to use shellcode okay that's not too bad you can Google shellcode and you can try and finagle what is little-endian and trying to get all this swapping done and then if they're trying to emulate more modern systems they might make you do something called a wrap chain and i won't go into

the specifics but just knowing that these are the high level techniques towards a lot of these problems is enough for this demonstration so that can be a lot of work I got to run all sorts of tools or I might have to build one of these manually yeah that sucks okay so now that we've identified the first sucky problem let's look at the second one format strings so if you do printf and you have your user input just right there in printf you can do whatever you want you can print out values from memory you can write to arbitrary locations you own the whole thing and so generally they're kind of in two parts one is you leak a flag from somewhere in

memory who knows and then the second one is you overwrite some writable portion of memory to make it point to something malicious like some shell code or in function or a rap chain whoa awesome okay and usually we overwrite the got the global offset table or the PLT the procedure linkage table and holy cow that's a lot of work that's way too much work for me okay so let's talk about a framework that makes this a little bit easier pone tools it's a Python library if you google pone tools format strings or poned tools buffer overflow they have a whole bunch of things that make it so much easier they can fix all of your

Indian swapping endianness swapping they can build some of your format string payloads for you and then it just sounds kind of cool yeah you can interact with network sockets you can run processes it's got the whole nine yards and it makes it really really really easy yeah it's probably my shortest slide so again we got to break up this problem if we want to solve this thing really really really fast we have to understand how to run the program is it taking in a command-line argument is it standard in what is it we have to overcome some sort of small puzzle it might be entering in our name it might be a menu some times

that make you do some math for them who knows I certainly don't and then we need to find the vulnerability we need to accurately identify what this vulnerability is and we need a reliable method to do this and then finally we need to weaponize it and throw it that's a lot of steps so reason why a lot of poem problems have a lot of points associated with them well conveniently we've got a couple tools that can help us with that so anger can run the forever the symbolic execution stuff that was tracing a program for us can help us overcome that puzzle and it can also to some extent help us find a vulnerability so all that we're left

with now is weaponizing the vulnerability and throwing it so if we had some sort of framework that made it easy to run these things or build format string payloads I guess would be in luck so there's a couple examples online on what what they call automatic exploit generation and I'll get to the talking about those and talking about why they suck in just a moment but we want to try and make this process easier so what are the hard part's finding the input like I don't know what we're trying to send into this program to make it crash or make it do whatever we need to fix common mistakes cuz I can't keep that in

my head I don't remember how it what kind of endianness I'm on and then there's a concept of bad characters in shell code so we talked about strings for a second earlier and we talked about a null terminator it's just a null byte in there that tells you to stop the string well if we put shell code into there it's gonna truncate or Shulkin it won't run properly so we need a method of identifying those bad bytes and then cuz because I'm a lazy human being I want it to send the exploit for me and I want it to print out the flag too so as you guys guessed it the reason why you're sitting here there is a tool that

can do all this for us I built it and it's called zeratul it was originally a slur that someone had called me because I was playing a character named zeratul zeratul 2ul and so it's the process of automating every single one of these components so that we can get all the points because that's what we want we want to solve all these problems really really fast so that we can spend our time trying to solve the harder problems so it's actually kind of easy to figure out what kind of input you're looking for you can just see what functions are in there if it has open it's probably interacting with a file if it's doing guests or scanf for read it's

standard in and anything else you can just kind of throw into a bucket and say it's probably an argument or or maybe I should just you know exit out as arrow tool and try and figure out what's actually going on alright so now that we've done the hard part let's talk a little bit about memory and how traditional or at least online automatic exploit generation works so this is our program here running in memory no videos this time so should work and so we have to stack all the way down there so we're gonna talk about stack overflows in this example cuz I think those ones are really visual when you're calling a function you have this concept of a

stack frame you have your return address you have any arguments you put in there you have your air local variables there and usually one of those guys one of the local variables usually gets overwritten for a buffer overflow so every single time you're running a program in say anger you need to ask did that return address somehow become symbolic is there's some value in one of those programs state representations that can overwrite that thing cool this seems pretty intuitive okay that doesn't seem too bad okay so we've got our symbolic return address well what next what we need to ask anger or our SMT prover a component and anger that tells us whether something can or can't happen

can this point back into some form of user input okay so we're asking it two things now it is a return address symbolic and can we point that program counter to user input okay so this isn't too complicated I can handle this kind of math and then generally you're trying to get some sort of execution not everyone is gonna be so nice as to give you a win function so we need to see if there's a way we can point it at some shellcode or a wrap chain and the current online examples show you some shellcode some crappy shellcode but I won't get into that so I've got some animation 2.0 going on yeah so we just

replace our input with shellcode we ask anger if that's possible so we have a couple issues here this is pretty bad like in theory it's nice and everything but buffer overflows are like almost never this clean what if you only have partial control what if you have bad bites what well what if your argue or your corruption is up there in the BSS what what if it's doing something in the heap like this is awesome for academic use oh they're here I was talking about the issues on the next page cool all right one shell code choice what what if you're one shell code doesn't work what what if that they're blocking Cisco's or you're in a

sandbox for the using P trace or one of any modern technique set trying to mitigate that stuff you need a way to overcome those techniques maybe try multiple shell codes or try and run encoders on your shell code and what if you don't have full control over the program counter sometimes that's okay sometimes you only need three bytes or two bytes or one byte there are no one byte examples in my examples but in theory it could work and then yeah I'm lazy it doesn't throw the exploit for you like it just kind of prints it out and it says good luck and if you run the current one on anger it just breaks it

doesn't do anything and then they don't have anything for format strings holy cow we've got issue after issue after issue with all these public examples so if you're willing to code up all of these edge cases you can use pwned tools to cover all of those pwned tools can give you all sorts of shell code it can give you arm shell code x86 shell code I think it can give you met shell code it can also run encoders on it so if you have a null byte or if you have a newline character and you can't have that in your input we can run encoders and get rid of that it's easy and so one

of the things that I added into Xero tool was to do just that yeah there's a concept of stack based shell code versus a non stack based shell code we don't care about that but that's another issue that can happen here that we need to be able to solve yeah can I have that point twice there cool so by combining these two tools together I can ask anger if it's possible to use a shell code I can say hey anger is are there any issues with putting a new line in my input are there any issues with doing a null character in my input and it's really good at saying yes or no and so

that's the SMT proved part of that again so this is all kind of cool but where are the examples so I'm going to talk about a super-duper simple binary with a buffer overflow it has a check and it wants you to have dead beef somewhere on the stack and then it lets you do your overflow if you've done Epona before this is really easy but if you're trying to automate that that could be really really hard you won't be able to do that with the online automatic exploit generation stuff because of that check it'll say yeah I can corrupt it but so what what happens I don't know so let's show you how it should be done with a

video that won't work alright well luckily yeah don't look at my Spotify okay now that we already have our videos folder up here I'm glad I have good folder management for this one cool so I gave it up I got to share this screen too

and presenting is hard all right cool okay so all these are super short videos so it shouldn't be too bad I'm giving it just the program I'm giving it this URL that they gave me and I'm giving it the port that's it I'm telling zeratul to find the vulnerability in this program overcome their puzzle and send an exploit for me and get me that flag so we're running through the program right now we're checking to see what the input is we want to see if it's a standard in an argument a file whatever we then are trying to find that vulnerability in there it already found it it'll build a proof of concept for us if it fails

somewhere down the line its overcome that hard check already and right now it's trying to use the point to win function technique so this is where it had a win function in it earlier and this one demos really well cuz it's works just every time oh cool and then I suck at programming so the endianness was messed up so it's gonna go back through there and fix it for me it's gonna test it locally yeah it's just stuff is flying by and it said hooray we did it now let's throw it remotely so it's fixing the endianness for me again because I'm still really bad at programming and it's throwing that right now for me and as soon as that

connections made it's going to run LS it's gonna try and find a flag and cat it yeah cool and then I guess CTF flag spoilers this one is called no you suck and it's currently up on the University of Central Florida's CTF training stuff I think it's a 100 or 150 point boning problem it's not too hard if you're a expert at this kind of stuff but to automatically solve that how long was that video that video was a minute and four seconds I didn't have to open up Ida I didn't have to open up gdb I didn't have to open up anything this is time I can now spend solving the harder problems

all right cool oh right and then format strings kind of suck right now in anger these are the only five lines that represent format strings as soon as you get to there it just kind of says yeah that sucks bro I'm sorry and then it exits out says I'm done stopping no more if we're trying to find format string vulnerabilities that's that's not cool that's not acceptable so what does it look like to model a format string vulnerability well generally it's kind of composed into any of these three bottom right primitives you you want to see if it's symbolic and then if that input to that printf can contain any of this this is nonsense

this long string here then you're gonna try and build that and every single time you're adding into it you're using a right primitive for the percent and that just makes it harder so as you're trying to build this and debug this all these addresses are changing and your buffers getting longer and it's a lot to manage if you haven't done a lot of format strings before but pone tools it's awesome they can build us a format string payload we can say I want to write this value at this address and it'll be like here you go it's awesome holy cow and so what we're gonna do is we're gonna pair up pone tools with anger and we're going to ask anger if we

can build this entire format string payload as the first argument into print F sounds like it'd be a good format string vulnerability to me look at that I already talked about stuff that was on this slide yeah one other piece here is as you're testing different addresses for your shell code or your point to win or your rap chain the size of your format string can change so if you're trying to port this to other CTF problems or if you recompile the problem the answer can be completely different but anger is really good at maintaining those addresses for us that's awesome so we can ask anger to build and solve those potential inputs automatically and

we're using the exact same logic we had before it overcome that puzzle or that challenge of that problem step we had if it's a menu or whatever and then if I didn't start these last week this is where my other video would be so these tools are all online the format string one takes a couple minutes it's three or four minutes I didn't think it'd be a very good demo to just drop that one in here but where do you go next how does all of this stuff translate to real-world problems well we have an issue of finding vulnerabilities we have an issue of trying to find memory leaks for trying to pair for even harder vulnerabilities

and my rob chain building stuff and there's kind of bad and there's better tools online to fix that so you can use modern buzzers like AFL or driller or tifa's to find these vulnerabilities for us you don't have to can kala Klee analyze it and run it at some random program slice and do all of that jazz because usually they just run with a good old dot slash my Ponyville problem and then info leaks if you're working against a SLR you need to know where some address is and generally you accomplish this through some sort of information disclosure you get some pointer that gets leaked out to you they can use to calculate the base of Lib C

and everything is awesome because now you can call a system VIN Sh that's something that I don't handle here but with more programming it could be handled and then I'm very bad at automating rap chains oh and then if someone is ambitious one day maybe heap exploitation can be automated so back to that so what these CTF problems are kind of cool I guess you can get some points you can translate that into bragging rights maybe you get a trophy at some CTF competition but how does that impact the real systems the systems we're running the systems we're using today and I like to think that a lot of CTF problems translate easily to embedded devices if you look

at routers if you look at smart lightbulbs or smart fridges or smart whatever and you google command injection router you're gonna get like 20 hits from just this week it's ridiculous you're gonna see 30 hits of buffer overflows you these things are wide open and they translate really well into these embedded systems but we need to scale them out even farther the next step beyond that is very likely trying to handle large systems and programs and even entire firmware or hypervisor slowly Kyle that'd be really hard so it's kind of nice talking about all of this stuff and being wishy-washy and saying this is what we should do so I thought it'd be cool to actually

talk about the translation of all of this into an actual vulnerability now I really like this one because the POC fits on a PowerPoint bullet like you go to the address of this thing you slot in a semicolon and then you just type in whatever you want it can be reboot it can be execute whatever who knows and this one impacted 11 of the most common home routers in the world the statistics on it were somewhere between 1.5 and 2 million routers were impacted by this like one one semicolon right there so it's really really hard to find if you actually go through the whole thing and it's just really easy to exploit so oh

no it's another video crap all right let's see if I was prepared oh I do I do have that video ok

so this one's using a more proprietary tool that's taken these techniques even farther we're doing representations across an entire firmware for this router we're eventually going to stumble across the web server in this example we're going to model every input that's going into every function we're going to distribute that across several cores I'm using a much faster computer in this demo and we're gonna check for those exact same conditions we're checking for memory corruption we're checking for format strings and for this tool I've even added command injection tests and we're seeing if any form of input could corrupt input that gets passed into a system call or an exec call or a P open or whatever have you and even before I

finished it it found it buzz so because you're representing these programs symbolically you have all of this tracing information you can find out what is going on behind the scenes with these programs and in this example right here I can see what all the registers should be to get to that command injection vulnerability well you say that's nice who cares I want to see where the actual vulnerability gets well we're also tracking memory we're tracking the stack we can see every single piece that's going on here oh no I built in time for talking and I paused it yes so we can track every single action coming out of our common functions we can see that s printf

copied something into something that gets called into system and because I like living dangerously I always make a copy and reboot if it can it's a very visual indicator to see if you can exploit it so if you do : reboot against one of these routers it just goes down you can see all the lights go down you don't have to worry about spaces or anything and that's again something we can see right here because we're symbolically executing it we have greater insight into where these bugs are and what the program state needs to look like to get there and at the very bottom I have a very pitiful example of what the stack looks like yeah I think because it

went into the the PLT or the got you had like one little value there on the stack but what one day that view will be better so you can find these vulnerabilities so much faster when you apply tools and techniques for CTFs towards systems like the embedded here and I'm thinking as we're moving forward we need to translate our tools our fuzzers our analyzers to handle more things like this we need to be able to just put more compute resources into finding bugs I don't want to have to go through and look at a giant embedded firmware and try and open up every single one in Ida Pro trying to get every single one oh it loops look at

that and trying to find those explodable conditions myself I can make the tools we have today do that for me and that makes it a lot easier to find these bugs and vulnerabilities so I tried to rush through this one a little bit I've got ten minutes awesome so two days ago I finished writing a tool for even more CTF solving and I've named it rocket shot and it's got this really complicated thing in here called backwards program slice stitching I think I saw like ten people's eyes roll over so what does that mean I have no idea we're talking a little bit about a control flow graph earlier and each of those little blocks

on there can be a slice of the program and so I asked myself is there a way that you can go inside out in a program instead of having to do all this crazy initialization all of this hard checks is there a way that we can skip all the hard stuff and go right to the very end where it prints the function or prints the flat excuse me yeah and then another reason why I put it on here it's really really fast and so I've got two examples I've got one video for rocket shot that's going to do just that it's going to take this control flow graph we have here it's going to run from the start to

the end of every single basic block it's gonna say that one's bad this one's good we're gonna track it we're gonna say oh okay if this is a graph what do our predecessors look like I'm gonna say okay well let's step it out again let's unwind it even more it's hard to see so I gave us some labels so we can see where we were the left ones good right ones bad so we can start here we say okay I want this one what's above it how do I get to it and so I can start my execution on a program slice program slice at the top of that first basic block going down I can do this again and

again and again and I'm running slices of programs that run so much faster than the actual programs themselves what is this one work oh it's not it's just full screen right better believe I've got a video for this one

so this is a challenge from Peco CTF 2014 where we're gonna iterate over every single node in there doing that small slice execution this is in real time and we're checking for some sort of file descriptor activity we want to see if there's some flag something going on and if we get it it's gonna print out the flag ins gave us some garbage because we have uninitialized variables yeah so as soon as we see that we remove all the other ones when we say okay what are the predecessors of that one and we start doing this recursive thing here it's already finished I kind of slowly sits down and so I'm gonna ctrl C here

cuz it says the flag is done right well I was very tired last night when I was building this one oh there it is says solving equations is lots of fun I should have zoomed in on that but over the space of 45 seconds about a minute we're able to run this program with next to no arguments and have it spit out this flag from this technique or unwinding backwards up all of these problems and so here I'm just confirming that we rerun solving equations as lots of fun and we get our flag

then I'm very quickly running out of time three minutes so all of these tools are up on github I posted rocket shot this morning I started running it through the entire suite of the anger CTF challenge problems and it's headed surprisingly good return on finding flags for me and so I'm hoping going forward that by giving you guys these tools that we can encourage CTF writers to build us even harder problems hopefully ones that model more vulnerabilities that model more real-world systems so that we in turn can build even better techniques and tools to solve those so with that thank you [Applause]

Lightning fast CTF solving - Automatic Exploit Generation & Side Channel Analysis

Related talks