The Aftermath of a Fuzz Run: What to do about those crashes?

Name: The Aftermath of a Fuzz Run: What to do about those crashes?
Uploaded: 2017-06-27
Duration: 50 min 17 s
Description: Fuzzing is a highly effective means of finding security vulnerabilities - new, easy to use and highly effective fuzzers such as American Fuzzy Lop and libFuzzer have driven its increased popularity. Once a fuzz run has found cases that crash the target application, each must be reduced, triaged and

BSides SLC · 2017 · 50:17 · 146 views · Published 2017-06

Speakers

David Moore

Tags

Category: Technical

Style: Talk

Mentioned in this talk

Tools used

AddressSanitizer, American Fuzzy Lop, GDB, libFuzzer, Radare2, Valgrind

About this talk

Fuzzing is a highly effective means of finding security vulnerabilities - new, easy to use and highly effective fuzzers such as American Fuzzy Lop and libFuzzer have driven its increased popularity. Once a fuzz run has found cases that crash the target application, each must be reduced, triaged and the root cause found to enable a fix. In this presentation, David Moore will describe tools, tactics and techniques for performing post fuzz run analysis on the resulting crashes with the goal of fixing the vulnerabilities. The first section of the talk will introduce/review fuzz testing and memory corruption bugs. Then a complete crash triage/root cause analysis workflow will be outlined including the use of corpus and test case minimizers, debuggers and reverse debuggers and automated memory analysis and crash triage tools such as Valgrind memcheck, Crashwalk, and Address Sanitizer. Finally, examples of memory corruption bugs of varying degrees of exploitability will be presented. This talk is suitable for anyone with some C programming experience and an interest in using fuzzers to find security vulnerabilities. Attendees will learn how to effectively analyze, triage and fix crashing cases.

Transcript [en]

[Music] All right, good afternoon. My name is David Moore, and I'm very happy to be here. First, I want to say thank you so much to the staff and volunteers, and all the participants as well, at BSides. I'm a huge fan of BSides, I love BSides, so it's really nice to be here. Just to get started, a few details about my background: I became a professional software developer in 1994 and had the opportunity to work for some pretty great companies, with great people, in engineering, sales, consulting and business development roles. That was very cool; I was having a really cool career in that. Early on in my career I got the chance

to work for NeXT, to work with Steve Jobs. Not directly, but I was actually one manager away from him: one of my managers left and it took months to set up a new manager, so I reported to the VP of Professional Services and he reported to Steve. So that was interesting. That went great, though; NeXT got bought by Apple in one of the biggest acquisitions ever. I went on to some other really cool companies, but after about 10 years of that I decided to take a break and move into consulting, and so there I am consulting in Indonesia, and I got a chance to do that for a little

while. And continuing my break, I actually also spent a few years as an opera singer. I trained in opera and was able to get to the professional level, semi-professional frankly, and that was cool, but I took it about as far as I could. I wasn't really loud enough and sometimes I had trouble with the high notes, so that's not going to work. At the same time I was always following tech and wanting to get back into tech, and I was seeing a lot of amazing things happen in the security space, even going back as far as Stuxnet and some of the really big hacks of 2012. And so I started doing security as

kind of a hobby thing, just doing vulnerable web apps, things like that. About three years ago, though, I decided to really go into it full-time and concentrate 100% on offensive security. So I started participating in bug bounty programs. Uh, my apologies,

small slideshow glitch, sorry about

that. And so these bug bounty programs are awesome. I was working mostly through Bugcrowd, Cobalt and HackerOne, and these let you hack websites without worrying about being sued or being interrogated. If you find a vulnerability and report it responsibly, it's really nice because you can get some money back from it, but really more importantly it's the recognition that you get from it, as well as the training: you learn a lot hitting real websites. So I did that for a while. I found stuff in Google; full disclosure, that was in a couple of Google acquisitions, a couple of different ones. And so that went great, but then I was

thinking about getting my Offensive Security certificate, the OSCP, and I knew they had material on memory corruption and things like that. I didn't know that much about it at the time, but I decided to start studying it and I really got into it. So I essentially started doing only that, moved away from the web stuff and pivoted into fuzzing and memory corruption. I was mostly using a new fuzzer called AFL, short for American Fuzzy Lop. It's developed at Google by a guy named Michał Zalewski, and it's a groundbreaking new fuzzer and very, very

powerful. And then I got so into fuzzing that last summer I started a company called Fuzz Station, and we offer fuzz testing at scale in the cloud. Okay, so enough about me. We're going to talk about the talk outline today. When I first started fuzzing, I wondered to myself: okay, what am I going to do if I find a crash? The fuzzer is powerful, you can make things crash, but I didn't really know how I'd handle that, what I would do if I found one. So I worked it out through just reading and trying lots of stuff, and this talk is the result of

that. So today we're going to be addressing memory corruption issues in C and C++ programs, and I kind of call this the middle section of the process. I'm not going to talk about doing a fuzz run, or what I call the art of fuzzing, which is choosing a fuzzer, choosing seed files and deciding how long to do a fuzz run for. That's been pretty well covered in a couple of talks in 2016. We're also not going to talk about exploit development; that's definitely far beyond the scope of this talk, or what I know. So it's really the middle: you've done the fuzz run, you've got a bunch of crashes, what do you do?

The goal is to get to the point where you can triage those crashes, understand if they're exploitable or not, and also, to an extent, start down the road of maybe finding the bug, if debugging is your goal. So we're going to do a quick review of memory corruption bugs, go through the workflow I developed, and then look at a couple of real-world examples from my research. So here's a quick review of memory corruption bugs. What do we really mean by memory corruption? To really reduce it down, it's invalid reads or writes. If you can coerce a

program to read or write outside the bounds of where it's supposed to on any given buffer or variable, that's what we consider an invalid read or write, also called an out-of-bounds read or write, sometimes abbreviated as OOB reads and writes, you'll see. Just real quickly, the broad causes of this kind of memory corruption: so often it's off-by-one errors; I see off-by-one errors in over half of the crashes that I find. Then certainly unvalidated input can cause problems, and in some cases still the use of known unsafe functions in C like strcpy and gets can cause quite a few

problems. In a process there are two areas of memory, the stack and the heap, and either can be corrupted. Local variables are stored on the stack, so if you write int x = 5, that's going to go on the stack. Memory obtained using malloc, like a buffer, is going to be located on the heap. Any heap memory that's been obtained with malloc has to be explicitly freed with a call to free; if not, things can go south quickly. We'll cover that a little bit later. Heap and stack buffer overflows are still pretty common, so we see a lot of both. In general, stack buffer overflows are getting harder and harder

to exploit, so most attention is being pointed at the heap at this point. One quick note: you'll hear the term stack overflow. Strictly speaking, that's when recursion gets out of control; that's not a memory corruption bug. If recursion just keeps going and you keep pushing frames and frames on the stack, eventually the stack is either going to meet the heap or something else is going to go wrong. If it's in a browser, in JavaScript, sometimes that will get caught. So this is distinct from what we're going to talk about today, which is a stack-based buffer overflow. In fact, here's a quick example of one, really simple: if you have a program with main, it takes

an argument from the command line as input, creates a character array of size eight, and then uses a known unsafe function like strcpy to copy the argument from the command line into the buffer. That'll work fine up until you send it more than eight characters. Here we're sending it 12 capital A's, and that will definitely overflow by 4 bytes. So that's a simple example of a stack buffer overflow. And so what happens when free is not called properly on memory obtained with malloc? This is another kind of memory corruption bug called use after free, commonly called a UAF, and this is just what it sounds like: when a program continues to use a

pointer that's already been freed. These are highly likely to be exploitable, especially in C++ code. They often show up during error conditions and corner cases, so any time it might be unclear who, or what part of the program, is responsible for freeing malloc'd memory, if that's not really clearly laid out, you can definitely run into a use-after-free vulnerability. For instance, who frees, the caller or the callee of a function? If that's not clearly laid out, you're going to see these. Here's a quick example of a UAF. Here we have a pointer to char, four bytes; we do something with it, we free it, some more stuff happens, and we forget we freed it, or

maybe there's an error condition, maybe there's a bug, and so we dereference that again, and that's your standard UAF error. Then there are other kinds of memory corruption bugs as well. Certainly a double free, or an invalid free as it's also called, is when you call free a second time. So it's distinct from a UAF, where you call free and then dereference the pointer; in this case you're calling free and then you call free on it again. Those are pretty hard to exploit directly; they can be, under race conditions in multi-threaded applications. However, almost more important is that a lot of double frees can be leveraged into a use after free. There might be a code path that

results in a double free, but if you see one, there may well be another code path where, instead of being freed a second time erroneously, the pointer is dereferenced, and then you have a UAF. So any time I see a double free, I'm definitely going to be pretty interested in trying to leverage that into a UAF, somehow trying to find another path through the program. Another kind of memory bug is when a conditional depends on an uninitialized variable. If you just say int x and then immediately say if x then do something, you have a conditional, or a branch, that's based on completely uninitialized values, and that can make a control flow attack

possible. Then finally, the last other kind of memory bug is memory leaks, and we're all probably pretty familiar with those. That's if free never gets called: if the programmer forgets to call free, or the control flow path never reaches a free, that's just going to use up lots and lots of memory. In that case you're potentially exposed to a DoS attack: if an adversary can leverage that again and again and use lots and lots of memory, it'll exhaust the system, exhaust the process and bring it to a halt. All right, and so next: what is exploitability, what do we mean by it?
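As a recap of the bug classes reviewed above, here is a toy Python model of a heap, not real C memory management. The `ToyHeap` class, its method names and the `diagnose` helper are all made up for illustration. It tracks which handles are live so that an out-of-bounds write, a use after free, a double free and a leak are each reported by name instead of silently corrupting memory, which is roughly the kind of bookkeeping memory analysis tools do for a real process.

```python
class ToyHeap:
    """Toy allocator that tracks handle state so misuse is reported, not silent."""

    def __init__(self):
        self._next_handle = 1
        self._live = {}       # handle -> bytearray backing the allocation
        self._freed = set()   # handles that have already been freed

    def malloc(self, size):
        handle = self._next_handle
        self._next_handle += 1
        self._live[handle] = bytearray(size)
        return handle

    def free(self, handle):
        if handle in self._freed:
            raise RuntimeError("double free")
        if handle not in self._live:
            raise RuntimeError("invalid free")
        del self._live[handle]
        self._freed.add(handle)

    def write(self, handle, offset, value):
        if handle in self._freed:
            raise RuntimeError("use after free")
        buf = self._live[handle]
        if not 0 <= offset < len(buf):
            raise RuntimeError("out-of-bounds write")
        buf[offset] = value

    def leaked(self):
        """Handles never freed: what a leak checker would report at exit."""
        return sorted(self._live)


def diagnose(action):
    """Run a callable and return the reported bug class, or 'ok'."""
    try:
        action()
    except RuntimeError as err:
        return str(err)
    return "ok"


heap = ToyHeap()
p = heap.malloc(4)
heap.write(p, 0, 0x41)                            # valid write
oob = diagnose(lambda: heap.write(p, 4, 0x41))    # one past the end (off by one)
heap.free(p)
uaf = diagnose(lambda: heap.write(p, 0, 0x41))    # handle used after free
double_free = diagnose(lambda: heap.free(p))      # freed a second time
q = heap.malloc(16)                               # never freed: a leak
print(oob, uaf, double_free, heap.leaked(), sep=" | ")
```

In real C code none of these mistakes raises an error by itself, which is why the analysis tools covered later in the talk are needed.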

Actually, before we go on to this, I want to see if there are any questions. Anybody have any questions about anything here so

far

Yeah. Yeah, the question is: what about languages like Rust and Go that are memory managed, or even Java or C#? This kind of fuzzing really just targets crashes, and so it's really only leverageable against C and C++ programs. The idea of fuzzing generally is that you throw lots of garbage data, lots of random data, unexpected data at an application and then you monitor for some anomalous condition. Normally, in what we're talking about today and in most fuzzers for C and C++, the anomalous condition is a crash, so you send it lots of weird data and try to make it crash. But certainly I have an idea that maybe you

could do the same thing with memory-managed programs, but you'd just have to monitor for other anomalous conditions, and the tricky part is defining those. So that's kind of the idea of how you might be able to fuzz them in memory-managed cases. Yeah, thank you. Yes, over

here. So the question is, if a conditional depends on an uninitialized variable, where do you see that? I haven't seen it a whole lot; I've seen it a few times. It's sort of just a standard programming error. You're always supposed to initialize, right? And so it's really just bad

programming. Uh-huh. Okay, yeah, so the point is that the compiler will catch that, and that's a good point. I have seen it a few times in code I've worked on, but thank you for that. Anything else? Okay. So what we mean by exploitability, or one way to think about it, is reprogramming with input data and not code. If we can trick the program into executing attacker-controlled input data as if it were code, then we have a code injection exploit. There's a pretty famous hacker named Halvar, he's at Google Project Zero now, and he has a way to

express this. He says input streams become instruction streams, or, from an attacker's perspective: can I make your program run my program? This typically involves controlling the instruction pointer, either EIP in 32-bit programs or RIP in 64-bit. If you get control of that, then you can point it at maybe some code you've injected. However, injection attacks have been really well mitigated over the years. It's really hard to get shellcode or custom code into a process now because of a lot of mitigations that have gone in. So most of the exploits that we're seeing now are reprogramming with existing code in the process: rather than injecting new code, we're doing a

code reuse attack. So as opposed to a code injection attack, this is a code reuse exploit, and this is called return-oriented programming. It relies on manipulating the return pointer on the stack to make this happen. It's also called ROP, and it's also technically called weird machine programming, and it's not easy. The idea is that you leverage code that's already compiled, and you need to string a bunch of that stuff together to get what you want, typically popping a shell. There are tools that will essentially search a process for different chunks of code you might want to use, and you have to tie

them together, and it can be like a Rube Goldberg machine, if you've seen these, like when the ball comes down and bounces. It can be really hard to run these things, but this is how most exploits go these days. And then, does exploitability matter? Why does it matter? Sort of yes and no. A crash should be fixed; if you're aware of a memory corruption vulnerability or bug, it's good to fix it. However, prioritization is important too. Perhaps there are many bugs to fix; it's important to know which are exploitable and which are not if you're involved with maintaining some

code. And certainly if you're doing white hat work and reporting the bugs, it's pretty important to know how exploitable they are, for a couple of reasons. First of all, you want to motivate the vendors or the maintainers to fix them, so if you can show, or demonstrate, or give a good idea that there's some kind of exploitability, that's going to motivate them to fix it. Another point is that when you report a bug, in a lot of open source cases, or other situations as well, you have sort of a choice where you can call it a security bug or not. In open source code, a lot of times if you report it as a security

bug, it won't be in the bug tracking system, it won't be publicly available, which is good, because if you were to report it otherwise, so that it is in the bug tracking tool, you might have just dropped a zero day. If it's a really critical bug and you didn't know that and you just reported it, it's like boom, it's out there. The other side of the coin is that you don't want to cry wolf. If you report something as a security vulnerability, you want it to really be one, because sometimes the developers are sort of pushing back a little bit, right?

So you want to have a good case that it is exploitable. And then it's a matter of exploitable by whom: there's a broad range of exploit dev capability out there. Certainly Project Zero, I talked about them before, they have some great hackers, and other groups certainly have a very, very high level of really great people and lots and lots of resources. This is one agency we heard about very recently doing similar things, the CIA, and who knows who else; I believe that probably every country has some kind of offensive capability, and there are other groups and whatever. So the general idea, just to reiterate, and we all know

this, is that security is never 100%. What we're trying to do is raise the cost to the attacker: we want it to be more expensive to run the hack than the value of the data, or whatever nefarious purpose is going on, might be to the attacker. One other important point is that most exploits nowadays are bug chains. It's not just one bug; it's chaining two, three, four bugs together to reach an exploitable situation. And so a bug that's definitely not exploitable on its own could still play a critical role in a bug chain that might lead to an exploit like an RCE. Then one more important point here is that some bugs are really

surprisingly exploitable; they wouldn't seem that way at first. There was a bug recently disclosed in a DNS library called c-ares, which is in Chrome OS, and an anonymous researcher or researchers found a remote code execution that was pretty bad. What's interesting about it is that it was only a single-byte overwrite past the buffer, and not only that, but it was always the digit one, so the attacker in this case could not control what got written in there. There's a 37-page report, which I didn't read, but this was reported by an anonymous researcher or

researchers, and it was initially rated as moderate security impact, but the 37-page report proved that it could gain complete remote code execution, so a very, very serious bug. Here's another page about it. The way this kind of bug would be exploited is with a technique called heap grooming, where you make lots and lots of calls ahead of time to arrange the heap in an exploitable manner. And here's the trigger, this is what triggered this bug. This is a DNS library, so it's a typical DNS name, but if there was a trailing escaped dot, that would trigger this bug. This is the kind of thing where, you know, we don't know how they

found this, but it strongly suggests that fuzzing might have played a role in

this. Okay, and then I'd like to go over a few mitigations that have made exploitation a lot harder over the last 10 or so years. The first one is stack canaries. A stack canary is just a random integer pushed onto the stack in between stack frames. Just like a canary in a coal mine, a stack canary indicates when something has gone very wrong and it's time to stop the program. Here's one way to look at how it works: stack frames, and you've got your canaries in between there. It is a random integer, so this is maybe a little bit better way to visualize it. And so if an attacker can

make a stack-based buffer overflow attack, they can't cross the boundary into the next stack frame, and that's typically what you want to do, because that's where the juicy pointers are a lot of the time. In this case, if an attacker wanted to do an overwrite, a bad write, the attacker would have to have these numbers and recreate them, but the attacker doesn't have them and can't get them. This is in the OS; it's not available to the attacker. The operating system will check these numbers every time a stack is accessed, so if something's gone wrong with these numbers, if they're not what they used to be, that's a clear indication that there's a

hack going on and it's time to stop the program. Next up is data execution prevention, and this simply marks some region of memory, or several, as non-executable. It says there's never going to be any executable code in here, it's always going to be data, so if the execution pointer ever points here, that means something is going on and it's time to stop the program. This is supported at the hardware level by the NX bit, which is present in modern CPUs. The combination of stack canaries and data execution prevention has made exploiting stack bugs especially a lot more difficult. One more really important mitigation is ASLR, address space layout randomization. In this case the OS

scrambles the memory, it kind of shuffles the memory. Just like when you buy a deck of cards, they're in order by number and suit, but you shuffle them and that throws them out of order. A little nicer way to maybe visualize it: just as we map virtual memory to physical memory, ASLR adds another level of remapping over that. What this does is that even if an attacker controls the execution pointer, they don't know where to jump to, maybe to start their ROP chain. Maybe they have a target where they know some good code is that they want to run, but they don't have the real address of it, they have a scrambled address of it, and so it

makes it very difficult for them to jump there and run the attack. Now, one thing about ASLR: it's not very effective on 32-bit systems. 32-bit systems have only a 4 gig heap size, and so random heap spraying attacks can sometimes work. This is sort of a brute force, try-it-many-times attack: you're targeting a section of code you're hoping to jump to, and if on one of the executions you do jump there, then you're in business. 64-bit is a different matter; there ASLR is very, very effective. Okay, and that kind of concludes the memory corruption review. We're going to talk about the workflow, but before we do, are there any questions

on anything we've covered so far?
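Of the mitigations reviewed above, the stack canary is the easiest to model. The sketch below is a toy simulation in Python, not how a real compiler or OS lays out a frame: the frame layout, the 4-byte canary width, the placeholder return address and all function names are assumptions made for illustration. It reuses the 8-byte buffer and the 12 capital A's from the earlier overflow example to show that an overwrite running past the buffer has to trample the canary before it can reach the saved return address. In common compiler implementations the equivalent check runs just before a function returns.

```python
import secrets

BUF_SIZE = 8        # matches the 8-byte buffer from the earlier overflow example
CANARY_SIZE = 4     # assumed width for this toy model


def make_frame(canary=None):
    """Lay out a toy frame: [buffer | canary | saved return address]."""
    if canary is None:
        canary = secrets.token_bytes(CANARY_SIZE)
    return_address = bytes.fromhex("deadbeef")   # placeholder value
    frame = bytearray(BUF_SIZE) + canary + return_address
    return frame, canary


def unchecked_copy(frame, data):
    """Copy with no bounds check on the buffer, so long input spills past it."""
    n = min(len(data), len(frame))   # stay inside the toy frame itself
    frame[:n] = data[:n]


def canary_intact(frame, canary):
    """Compare the canary slot in the frame with the value stored at setup."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + CANARY_SIZE]) == canary


frame, canary = make_frame()
unchecked_copy(frame, b"A" * 8)      # fits: canary untouched
print("after 8 bytes, canary intact:", canary_intact(frame, canary))
unchecked_copy(frame, b"A" * 12)     # 4 bytes too many: lands on the canary
print("after 12 bytes, canary intact:", canary_intact(frame, canary))
```

The attacker's problem is that the canary is random per run, so writing the right 4 bytes back over it requires first leaking the value some other way.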

Yeah. Um, is that... when did that come out, pretty recently? Last week or two, or February 15th? Yeah, I'm not really familiar with that. I remember when it came out. Yeah, there was a JavaScript attack on ASLR, I do remember what you're talking about, and it's important. I didn't get a chance to really study it as much as I would have liked, but it was a really interesting attack. I don't know how likely it is to be leverageable, but that was a really cool thing. Yeah, there

was a bug that allowed JavaScript to defeat ASLR in some cases. Anything else? Cool. Okay, so in the next section we're going to go through the workflow itself: what do you do once you've done a fuzz run and you've got a bunch of crashes, how do you deal with it? The steps are to minimize the crash corpus, use several memory corruption analysis tools to get more information about what's going on, and then finally determine exploitability or, if need be, work on trying to find the root cause of the bug. So the first thing is minimization. There are two kinds of minimization, but before we go into that, the idea is

you've got a bunch of cases, a couple dozen cases let's say, that create the crashes, and these cases are files. Many times the fuzzers do their best to try to make each crash a unique case, but it's hard for them to do that during the course of a fuzz run, so you tend to have a lot of cases that really are the same bug. So it's important to minimize those, and that's called minimizing the corpus of crashes. There's a tool in AFL called afl-cmin, C-M-I-N, and there's also a tool known as C-Reduce that will do this

as well. The idea is to just have fewer cases, to make sure that every crashing case is as distinct as possible. So that's minimizing the corpus. Once you've done that, it's also important to minimize each crashing case individually. The fuzzer is just making up random stuff trying to make things crash, and a lot of times the cases that it comes up with have a lot of extraneous bytes, meaning there's material in that file that has nothing to do with the crash. To make things easy to debug, it's important to minimize that. There's a tool called afl-tmin which handles that for you, and the way it works is

that it takes all the bytes and sequentially tries to remove them and see if the same crash still happens. If the same crash still happens with that byte removed, then afl-tmin knows that it's an extraneous byte, and it's removed. If the crash doesn't happen, or if a different crash happens, that means that byte is very important to this case, and it stays in the file. So when you're done with both of these, you have fewer cases, and each one has only the bytes that are relevant to the crash. Then one final tool that I like to use is called fdupes, and so

even after all of this, there are cases where you can have multiple byte-for-byte identical crashing files. What fdupes does is an MD5 hash on every file in a directory, and if there are multiple files that have the same MD5 hash, meaning they're byte-for-byte identical, it'll remove all but one of them for you. So that's another nice way just to clean things up and get down to the minimum set of cases.
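The two clean-up ideas described above, dropping bytes that don't matter to the crash and discarding byte-identical files, can be sketched in a few lines of Python. This is a minimal illustration of the description given in the talk, not the actual implementation of afl-tmin or fdupes, which are more sophisticated. The function names are made up, and `toy_same_crash` is a hypothetical stand-in for "run the target on this input and check that the same crash occurs".

```python
import hashlib


def minimize_case(data, same_crash):
    """Drop bytes one at a time, keeping a removal only if the crash survives."""
    current = bytearray(data)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if same_crash(bytes(candidate)):
            current = candidate   # byte was extraneous; retry the same index
        else:
            i += 1                # byte matters to the crash; keep it
    return bytes(current)


def dedupe_by_md5(cases):
    """Keep the first case seen for each MD5 digest, preserving order."""
    seen = set()
    unique = []
    for case in cases:
        digest = hashlib.md5(case).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(case)
    return unique


def toy_same_crash(data):
    """Hypothetical oracle: the 'target' crashes whenever the input contains b'AB'."""
    return b"AB" in data


crashing_cases = [b"xxAByy", b"AB", b"zzABzz", b"AB"]
minimized = [minimize_case(case, toy_same_crash) for case in crashing_cases]
print(dedupe_by_md5(minimized))
```

In a real workflow the oracle would execute the target binary for every candidate, so the cost is one program run per byte tried; that is the main reason production minimizers remove larger blocks first.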

Right, now we're going to go through a few memory corruption analysis tools. Before we get into that, it's important to note that all bets are off when things go south in a C program. When memory gets corrupted, crazy stuff can happen, and it can even corrupt the tools or cause the tools to give you, in some cases, erroneous information, so a little bit of skepticism is always very useful. The first memory corruption tool to talk about is called AddressSanitizer, and this is a facility in compilers like GCC and Clang. That is the flag that you use to make sure that you compile with AddressSanitizer. It's

often abbreviated as ASan. AddressSanitizer operates at both compile time and runtime. At compile time it adds instrumentation into the binary that lets it track memory use. At runtime it actually replaces the malloc allocation libraries with its own runtime library, and again that enables it to really track what's going on with the memory and bring a high degree of accuracy to the things that it finds. Here's a quick view of it, a little bit hard to read maybe for some people in the back, but it's nice because it tells you: AddressSanitizer, it says use after free on address so-and-so, and it gives you information about

the stack pointer and the base pointer as well, and a stack trace. In this case it says a read of size four was found at this memory location. It finds invalid reads and writes, UAFs, double frees, memory leaks, issues in both the stack and the heap, and several other things as well. Another really important tool is called Valgrind. That's actually a family of tools; the tool itself is called Memcheck, but a lot of people just say Valgrind when they mean Memcheck. Memcheck is distinct from AddressSanitizer because there's no need to recompile; you can run it on any binary. It does put out lots and lots of output, which needs

to be interpreted. It's also distinct from ASan in that it'll allow the process to run until the crash, whereas ASan will stop as soon as it finds any memory corruption error at all; it'll simply stop. So with Valgrind you might see more things going on before the crash actually happened, or maybe it's not even a crash; there definitely can be memory corruption bugs that don't result in a crash. I like to use them both, Valgrind and ASan, because they give different perspectives on the crash. One important thing to note, though, is that they don't play very well together. If you try to run Valgrind

on an ASan-compiled binary, you're going to get an error. It doesn't work out; they don't like each other very much. Valgrind's very accurate as well, and here's a quick example of the output. It's a little tough to read, but it's saying invalid write of size eight, and it gives a stack trace for that as well as the memory location, and then down a little lower you have an invalid read of size four. Any time you see a bad write in the heap, that's a bad thing, especially if you have a write and a read; then the attacker a lot of times can manipulate things and turn that into an exploitable

bug. And then finally, a last tool, simply called exploitable. This was developed by CERT; it's now been open sourced and is available on GitHub. It's maintained by a person named Jonathan Foote, F-O-O-T-E, so you'll find it under jfoote/exploitable. It's an extension to GDB, but it also includes a script to run it outside the context of GDB, and it's written in Python. What I really like about it is that it does its best to categorize the crash. This is an example of the output of the tool exploitable; in this case it's saying that this is in the category exploitable, and it gives you a really nice explanation as

well. It's saying here that the target's backtrace indicates libc has detected a heap error, or that the target was executing a heap function when it stopped. So it's a pretty handy tool. Again, you want to be a little skeptical; you want to do your best to verify these things. The tool exploitable offers four different categories for crashes. Those are: exploitable; probably exploitable, which could be a stack buffer overflow; probably not exploitable, so that's things like a null pointer dereference or maybe a floating point exception; and then it'll categorize things as unknown as well.
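The four categories just listed lend themselves to a simple triage queue. The Python sketch below is only an illustration of that idea and does not reproduce the rules the exploitable tool actually applies. The crash descriptions are hypothetical labels, the mapping uses nothing beyond the examples given in the talk, and ranking unknown crashes ahead of probably-not-exploitable ones is an assumption about priority, on the reasoning that an unknown still needs someone to dig into it.

```python
from enum import IntEnum


class Category(IntEnum):
    """The four categories named in the talk; the numeric order is an assumed priority."""
    EXPLOITABLE = 0
    PROBABLY_EXPLOITABLE = 1
    UNKNOWN = 2
    PROBABLY_NOT_EXPLOITABLE = 3


# Hypothetical crash descriptions, mapped using only the examples from the talk.
EXAMPLES = {
    "heap error": Category.EXPLOITABLE,
    "stack buffer overflow": Category.PROBABLY_EXPLOITABLE,
    "null pointer dereference": Category.PROBABLY_NOT_EXPLOITABLE,
    "floating point exception": Category.PROBABLY_NOT_EXPLOITABLE,
}


def categorize(description):
    """Look up a description; anything unrecognized is reported as UNKNOWN."""
    return EXAMPLES.get(description, Category.UNKNOWN)


def triage_queue(descriptions):
    """Sort crashes so the most urgent categories come first (stable for ties)."""
    return sorted(descriptions, key=categorize)


crashes = ["floating point exception", "timeout", "heap error", "stack buffer overflow"]
for crash in triage_queue(crashes):
    print(f"{categorize(crash).name:26} {crash}")
```

Whatever the tool reports, the category is a starting point for prioritization and still has to be verified by hand.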

There are times when it just doesn't have enough information to say what's going on with a crash; in that case you need to dig down a little bit further. Altogether these tools give a great deal of information about what's going on, and that'll put you in good shape to manage the next step, which is determining exploitability and digging down to the root cause. Before we go on, any questions at this

point? Okay, so one important thing to do before you start working is to disable ASLR. The reason for that is, if you don't, you're going to have different memory locations on every execution of the program, and that's going to make it extremely difficult to debug. And certainly you want to do this carefully; you want to use this with caution, behind a NAT router or a firewall or something like that. This is the command on Ubuntu to disable ASLR temporarily, and it's in a file called randomize_va_space, and that's under /proc, so you can't just use vi or another editor; you have to echo a zero

into it to turn it off. If you turn it off this way, when you reboot you'll have ASLR back in shape. And another thing is, I like to fuzz on 32-bit, I like to work on 32-bit a lot, and so if you're already on 32-bit anyway, as I said before, ASLR is not as effective there, so disabling ASLR is not necessarily as bad of a thing on 32-bit. And then really the next step that I like to do, and I even write these down, is to really understand what the critical memory locations are, because it can get a little bit confusing: you're

looking at lots of hex, lots of memory locations, dumping lots of things, and so it can be confusing about where the crash happened, what instruction, where in the heap, whatever. So I like to write these down, and the first step is to understand, is it a memory location for code or is it for data? Where the crash happened itself, that's going to be a code location. Where an invalid read or write occurred, that's going to be in data. Where the memory location was allocated or freed, that's where in the code that happened. And then maybe during the course of the program the

data is reassigned to another variable, or copied, moved around, things like that, and that's actually going to be both code and data that you're going to need to track in that case. Once you do that, it's time for GDB, and I spend a lot of time in GDB. The first thing to mention is you definitely want to recompile with -g and -O0 (capital O, zero): -g will get you the symbols, get you the function names, and you definitely want things also not optimized. The usual approach is to set a breakpoint where the crash happened and then run the target with what I call canary values. So in this

case capital A's are really nice: they turn out to be hex 41, that's the ASCII number for capital A. The reason to do this is that it's really easy to see once you start dumping memory and trying to figure out where the input wound up in memory, or, if you're trying to do an overflow, where the overflow wound up. It's really easy to see lots of 41s together; if you see that in the wrong place, you've got something going there. And so the general approach is to set a breakpoint where the crash happens and work backwards, and a lot of times you need to set

breakpoints a little sooner in the control flow to really understand how things got to the point where the memory corruption happened. Other than that, digging into GDB is definitely outside the scope of today's conversation. One issue with GDB is that you often have to run it over and over again. A lot of times it can be pretty tedious: you keep approaching it, you're running the program from start to finish, and it can take a lot of time and it can get pretty confusing. So there's another tool that I like using a lot, and it's called rr. This is again a plugin to GDB; this is developed by the Mozilla

Foundation, an open source tool, and the way it works is that you first run the program under rr with the crashing case input, and it will record the execution. Once that's done you can set a breakpoint, again probably where the crash happened, but you can step backwards; you can reverse time, so it's almost like a time machine. You can reverse back up through the control flow of the program, and that can be very powerful. For difficult-to-triage bugs it's a way to really understand things in a more efficient manner, rather than using GDB and going forward in time every time and just trying to kind

of work your way back up and figure out where things went wrong. A couple of caveats with rr: you can't alter variables like you can in GDB, you can't say set x equal to 5. You also cannot make calls; from the GDB command line you can actually just call a function. Either one of those things will screw up the state of the program which has been recorded by rr and cause it to crash. And here's a screenshot from the rr GitHub that talks a little bit about how this works. Very cool tool. I don't use it every time, but when things get difficult and tricky, then I

definitely go for it. Okay, another thing to keep in mind is that this is tough. I mean, this stuff is gnarly, and a lot of times you're looking at assembly code, in some cases you're looking at lots and lots of data. It takes persistence, but it also takes taking breaks and just working through things. So I kind of compare it to doing a manual source code review, just literally looking at code with your eyes trying to find bugs in it. It's a great way to find bugs, but you have about maybe an hour of time in your brain where you can really concentrate at the level

necessary to do that, and I think this kind of crash triage can get to that point too. One more thing about solving these bugs: once the crash has been fixed, it's important to re-fuzz the target, because a crash is a dead end for that code path, and so there could well be more crashes, more bugs further down the code path that did not get discovered because the program crashed. Especially if you're in an area of the code where there's already a memory corruption bug, that means that there could be more memory corruption bugs. So it's important to go back and fuzz those

things. Okay, and in the last section we're going to talk about some real-world examples I found. Before we do, any questions at all on any of this?
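
As a recap of two of the setup steps described above, here is a minimal Python sketch, under stated assumptions: it checks whether ASLR is currently on by reading `/proc/sys/kernel/randomize_va_space` (Linux only, where 0 means disabled), and it finds the longest run of 0x41 canary bytes in a chunk of dumped memory. The helper names `aslr_enabled` and `longest_canary_run` and the sample dump are hypothetical, not part of GDB or any tool from the talk.

```python
from pathlib import Path

# Linux-specific location of the ASLR setting mentioned in the talk.
ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")


def aslr_enabled(text):
    """Interpret the file's contents: 0 means ASLR is off, any other value means on."""
    return int(text.strip()) != 0


def longest_canary_run(data, canary=0x41):
    """Return (offset, length) of the longest run of `canary` bytes, or (-1, 0) if none."""
    best_offset, best_length = -1, 0
    run_offset, run_length = 0, 0
    for index, byte in enumerate(data):
        if byte == canary:
            if run_length == 0:
                run_offset = index      # a new run starts here
            run_length += 1
            if run_length > best_length:
                best_offset, best_length = run_offset, run_length
        else:
            run_length = 0              # run broken by a non-canary byte
    return best_offset, best_length


if __name__ == "__main__":
    if ASLR_PATH.exists():
        print("ASLR enabled:", aslr_enabled(ASLR_PATH.read_text()))
    # Made-up memory dump: three unrelated bytes, six capital A's, two more bytes.
    dump = bytes.fromhex("00ff10") + b"A" * 6 + bytes.fromhex("7f00")
    print("longest run of 0x41 (offset, length):", longest_canary_run(dump))
```

Scanning for the run programmatically is just a convenience; eyeballing a hex dump for a block of 41s, as the speaker describes, does the same job.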

Yeah. Okay, the question is, if using rr improperly can make it crash, is that exploitable? And I mean, it might be. It actually takes down everything: it's running as a plugin within GDB, and you're running a program too, and so if you do either of these things the whole thing just stops. It's immediately in an unknown state, and so I'm not sure how you could exploit that directly in terms of attack surface. But I mean, you

know, maybe. Yeah, exactly. I actually found, as a quick aside, that for a while in AFL, in about version 1.96 or so, there was an overflow in AFL. AFL names files in a particular manner: when it finds a crash it names it in a very particular way that lets you kind of manage it. It's a long file name, it's not pretty, it's pretty ugly. But I had a couple of cases, and I could never reproduce it, where instead of that file name it was garbage, or it actually looked like the data in the file, it looked like stuff that had been fuzzed. So there was definitely

an overflow in AFL itself too. AFL's written in C. A fuzzer, I don't think, has to be written in C, but it's not a bad idea to look at this stuff. I mean, really any security tool should be fuzzed and audited heavily, because we're the ones that use them. Anything else? Yeah. The question is, do I use PEDA, P-E-D-A, and the answer is yes. Actually I should maybe mention that; that's another great tool. PEDA is also a plugin to GDB, but it gives you a really nice printout of

of the stack and other information about it, and it does it every time you do a step or a next, things like that; it immediately updates the whole situation so you see everything at once, where in normal GDB you do a step and you're still just at the command line, so then you've got to dump this memory, dump that memory, look at the registers, things like that. So yeah, it's called GDB PEDA, I believe, and it's a plugin for GDB. Highly recommended. All right, and so just to wrap things up, we're going to look at a couple of bugs that I found in my research. The first one I found in PHP;

it was a low bad read, low meaning low in the memory space, definitely not exploitable. I was fuzzing the PHP ini file, which is a weird thing to do because that's typically associated with running in the context of a web server; I was running PHP from the command line, driving it with a fuzzer. But I figured, you know, what the hell, and I found a crash and reported it. However, it's a low read and it requires a crafted ini file. This slide is not quite as important as the next one; this is the more interesting slide on this. This is the result of memcheck,

and it said there's an invalid read of size four, but down at the bottom it says address 0x10 is not stack'd, malloc'd or recently free'd. That means that the read was very low in the process space; it was at hex 10, or 16. There's never anything down there; there's never anything interesting to either read or corrupt or anything like that. This is a null pointer dereference, so not exploitable. In the olden days sometimes you could exploit these, but those days are gone. And so this is a little tough to read, but this is the diff, this is their fix, and all they did was check if

they're operating in the context of a web server or not, and do the right thing accordingly. So that was cool; it did get fixed. These all have gotten fixed. So I turned my attention to Ruby and started fuzzing that. This is the standard Ruby interpreter, MRI I think it's called, and I found actually four bugs in this, in the regex engine and one in unmarshalling. Of the regex bugs, this is the more relevant one for today. I was fuzzing the regex itself, meaning what a programmer would write, and so I was fuzzing essentially the compilation of those regexes, and I found a pretty nice bug in

that, a heap buffer overflow. Here are the results of ASan: AddressSanitizer said there's a heap buffer overflow at this address, a read of size four, and that is in the heap, so we have a bad read there. Bad reads are hard to exploit; I mean, they're bugs, but they're not really that bad standalone. However, then I ran Valgrind on it, and Valgrind's indicating an invalid write as well, of size four, and an invalid read also of size four. ASan just quits at the first corruption it finds, so running Valgrind shows that there's a write too.
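
When a memcheck log reports several errors, it can help to pull the invalid reads and writes into one list with their addresses. The following is a minimal Python sketch that assumes the usual memcheck message wording ("Invalid read/write of size N" followed by an "Address 0x... is ..." line). The function name `invalid_accesses` is hypothetical, and the sample log is made up for illustration; it is not the output shown in the talk.

```python
import re

# Patterns for the two memcheck lines of interest.
ACCESS_RE = re.compile(r"Invalid (read|write) of size (\d+)")
ADDRESS_RE = re.compile(r"Address (0x[0-9A-Fa-f]+) is")


def invalid_accesses(log_text):
    """Return a list of (kind, size, address) tuples; address is None if not reported."""
    results = []
    current = None
    for line in log_text.splitlines():
        access = ACCESS_RE.search(line)
        if access:
            # Start a new record; the address arrives on a later line.
            current = [access.group(1), int(access.group(2)), None]
            results.append(current)
            continue
        address = ADDRESS_RE.search(line)
        if address and current is not None and current[2] is None:
            current[2] = int(address.group(1), 16)
    return [tuple(record) for record in results]


if __name__ == "__main__":
    # Illustrative, made-up excerpt in memcheck's message format.
    sample = """\
==1234== Invalid write of size 4
==1234==    at 0x4005F4: main (example.c:10)
==1234==  Address 0x5204068 is 0 bytes after a block of size 40 alloc'd
==1234== Invalid read of size 4
==1234==    at 0x400610: main (example.c:12)
==1234==  Address 0x520406c is 4 bytes after a block of size 40 alloc'd
"""
    for kind, size, address in invalid_accesses(sample):
        where = hex(address) if address is not None else "unknown"
        print(f"{kind:5} size={size} address={where}")
```

With the accesses side by side, it is easy to compare the addresses of a bad read and a bad write in the same run.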

And these are fairly close together in memory. I apologize, it might be a little hard to read, but the memory locations of the read and write are close, and that suggests that an attacker could probably leverage that, probably write something and then later read it. In fact, anytime you have a bad write in the heap, that's considered exploitable, and so this is a pretty good bug. I would consider this definitely exploitable; the good news is that it's tough to exploit. There's not many applications that allow you to upload arbitrary regular expressions, which is also a good thing for an

unrelated reason: it's an easy way to potentially cause a DoS attack. If you allow untrusted regexes to be uploaded into your system, a malicious actor could use essentially a regex bomb or something like that, so it's already an insecure practice anyway. So hopefully the attack surface on this is pretty small, and that somewhat mitigates the fact that this is probably an exploitable bug. I reported that and they fixed it. The interesting part about this bug as well is it was a really weird corner case: if in the regex you opened a character class that never got closed, and there's also an octal number

in the regex, that would cause this crash. So super weird corner case; I mean, it's not surprising there's stuff in here, but this is the kind of thing, again, fuzzers are really good at finding. And then the last one I found in an open source tool called Netflix Dynomite; I found an invalid write here. Netflix Dynomite is a replicator and sharder for key-value storage systems like Redis and Memcached. Netflix uses it, other people do; Netflix uses it definitely at scale in production. I believe they use it essentially between the internet and Redis and Memcached, perhaps to store metadata related to

people watching movies. And so I fuzzed it, and I fuzzed it in kind of an oblique manner. I didn't fuzz it head on; I figure highly audited code bases are pretty well fuzzed, and maybe not, but I didn't fuzz it head on, meaning trying to bring in stuff from the outside internet. I decided just to fuzz the config file, which is a YAML-style file. In a way it's not a great thing to fuzz, because even if I find something, it's going to take an admin or somebody already authorized to load this, so you're only looking at essentially cases where

you have a compromised admin, or a malicious admin could leverage a thing like this. And I figured maybe it'd be a bug in the YAML as well, so I fuzzed it, and I found a cool crash. You can see here, sort of on the right-hand side, there's a bunch of A's and some other garbage; that is the crash case. And then down at the bottom, if you can see it, there's a bunch of 41s that I dumped out of GDB. So I had six bytes of contiguous write into the heap, and that's pretty exploitable. But again, it's like, okay, whatever, it's just in the config file. So I reported it, and to

my surprise they traced the bug into their string functions, the string duplication function, how they're handling all strings. They had a string library, but they weren't always completely using it; they were sort of writing a little extra string dupe code over that, and that's how they got into trouble. So I was pretty surprised by this. I haven't confirmed it, haven't tried to really run the hack, but I think that this could be a pretty serious attack. It's been fixed for a long time, hopefully, and everybody's using it. I reported this in May, and this is the diff where they fixed it, and it's a little

hard to see again, but it's a classic off-by-one error. They're trying to manage lengths of things in doing string dupes by hand in the code, rather than relying on the library that they were using, and that's kind of how they got themselves into trouble. But I reported it, they fixed it, and I got into the Netflix Hall of Fame, which is a nice place to be; some other pretty good researchers in there. And so that's going to kind of wrap things up. Just the last thing to say is there's a few really important references. Rensselaer Polytechnic Institute has a great course called Modern Binary Exploitation, and they very

generously open sourced the course, so on GitHub they have a bunch of vulnerable programs to practice against, and I think it's on the RPI site somewhere, you have to search around for it, but they have all of the slides and PDFs. So they essentially have the complete course, and you can easily do what I did, which is just take the course on your own and learn a great deal, not just about exploitation and memory corruption; they cover reverse engineering very well also. And then the book is Hacking: The Art of Exploitation by Jon Erickson, pretty much a must-read for any work like this. I got the

version one, the first edition, in 2005; there's another edition now. Highly recommend that. And then Project Zero is a great blog, and then Sean Heelan, right now I believe he's a PhD student, but he finds a lot of great stuff, and he wrote a great blog post on using rr. And so that's going to do it. Any questions? [Music]

All right, that's great. Thank you guys very much, thank you, appreciate it. Is there a happy hour? Does anybody know if there's any evening activity tonight or what? Any
