
Batch Firmware Analysis

BSides Toronto · 2014 · 32:02 · 1.5K views · Published 2014-12
Category: Technical
Style: Talk
About this talk
Jeremy says: Finding vulns in firmware is like shooting fish in a barrel. Watch as layers of obfuscation are peeled away with binwalk and the juicy SquashFS file systems, MIPS binaries, and source are used to bug hunt like it's 1997. Laugh with me as we analyze injection, overflow, and disclosure bugs on routers, NAS, and other devices. These vulns are easy to understand, incredibly powerful, and sit on embedded systems. Firmware is rarely updated unless the user is having a problem, making this a persistent problem. Bonus fun is analyzing extracted MIPS ELF binaries in IDA Pro and performing remote GDB with qemu-mipsel-static, if there is time.

[Music] So this talk is called Batch Firmware Analysis. Basically, what I've been doing for the last three months or so is messing around with firmware. Quickly, who am I? Can you guys hear me? I have a bunch of people saying pick up your mic. All right, is that better? Yeah? All right. So I go by diagnosis on Twitter, if you want to follow me after this. I do vulnerability research, reverse engineering, and exploit development, so all of that kind of works together, and it goes together with what I'm going to be talking about. I wrote code for SAINT Corporation. This is the only, like, commercial slide

you're going to see, and I thought it would be nice to throw it in because every single Friday they let me work on messing around with hardware and firmware, and then they let me talk about it. I've been doing research for about five or six years, and I haven't been able to get on stage and talk about it because the companies think, oh, this is IP, this is important, we want to keep this secret. So instead, now I can actually talk about it. So it's SAINT Corporation that's letting me do this. Okay, so, this official-looking slide; there won't be too many of these. I'm going to tell you what I've been working on,

how I've been doing it, what I've found so far, and then we're going to get into a research drill-down, which is really me just showing you about six or seven 0-day on some routers, and then the conclusion. Well, you'll see my conclusion. The research isn't done yet. I'm still doing tons and tons of reverse engineering. I downloaded about 200 gig worth of firmware, so I'm not even close to done yet, but I've done a lot of static analysis, and that's what this talk's going to be about. So I've done some static analysis, I found a whole bunch of backdoors, buffer overflows, etc., and so

I'm going to show you those and how I'm planning on moving forward. All right, so my project thus far. The objectives have been to collect a large sample of vendor firmware images, take those firmware images, and extract file systems out of those images, and I'll talk about that a little bit more later. Then I leverage that file system access to really take a look at what the device does. So I'm going to be looking at SquashFS file systems, and when I extract one, it's got your standard etc, www; these are little Linux devices. So I do some static analysis on that and find out, you know, what's

running, the binaries that are running, are they vulnerable to things like SQL injection, buffer overflows, etc. So, like I said, I've gathered 174 gig worth of firmware updates for TRENDnet, D-Link, and Netgear. I harbor no grudge against these companies. I don't hate them at all. They just made it really, really easy for me to download their firmware, so that's why I focused on them. The unfortunate trend is that vendors are moving towards signing and encrypting their firmware rather than improving their code, but I'll talk more about that later. All right, so the methodology. First, grab a [ __ ] ton of firmware images, extract

the file systems, perform diffs. So BinDiff, does anyone know BinDiff? Hands? BinDiff? Yeah, cool. So BinDiff is an IDA Pro plugin. IDA lets you reverse a binary into assembly, and BinDiff will allow you to take the disassembly of two binaries, one pre- and one post-patch, and it will compare them and show you the code that's changed. So if you've got firmware updates for five or six different devices, you can do this BinDiff against all these different versions and find out what they're addressing in between firmware images, and a lot of those things are vulnerabilities. So we'll look at that. I do do a little bit of

exploitation and exploit development using QEMU. Does anyone use QEMU here? Oh, awesome, cool. So I use qemu-mipsel-static, and then I chroot into the extracted SquashFS file system. So basically what we can do here is we can take a CGI or a program that's supposed to be running on your router, and we can run it on a Linux system and attach a debugger. That's really, really useful if you want to do something like fuzzing. If you want to send a whole bunch of data to that process and take a look at how it handles that input, you can do that on a Linux system in

parallel and look at how it handles that malicious input. All right, so my key findings. There are vulnerabilities in third-party libraries and third-party services. So these routers that are running things like an HTTP server, they're running something called lighttpd, and they're running really, really, really old versions of it, so we'll look at that. They also run custom CGIs. A CGI is just a binary that runs on the system; it's like an exe for Linux. I will reverse engineer those and take a look at the strings inside of those. UPnP and TR-069, those are both protocols that routers have to deal with, and

the amount of vulnerabilities in these two protocols is unbelievable. Jamie says I need to lift the mic. Thanks, Jamie. All right, so everything I've looked at so far has a vulnerability, like a remote code execution vulnerability. Everything is trash. All right, so I'll talk about the data collection really quickly. TRENDnet has a website, downloads.trendnet.com, and you can wget -r that [ __ ]. D-Link, they shut down their FTP server. ftp.dlink.com no longer exists. ftp.dlink.ca does. Oh, Canada. All right, and the other one, Netgear. Netgear was a little bit more contrived. They've got, like, a web interface where you can, like, pick

your product, so I had to write some Selenium web scripts to automate a browser, and so I told the browser, basically, hey browser, download all this [ __ ]. So I downloaded all of the firmware for TRENDnet, Netgear, and D-Link, and that's what I'm going to be talking about today. The scope of the project is much bigger than that. I'm looking at NASes, refrigerators, TVs, all kinds of [ __ ], but for this talk we're just talking about D-Link, Netgear, and TRENDnet. Okay, so TRENDnet, they really suck at their naming conventions. So if you look, yeah, see... no, you can't see, because this projector is so... I spent more time doing

normalization of data than I did anything else, really. So I downloaded all of this firmware, and I have to have it in a naming convention so that, programmatically, when I consume it, I can understand where it came from. So basically I needed to take the firmware and say: the firmware is this vendor, this hardware version, this firmware version. Right now, if you look at what I just

showed, yeah, okay, so they're really [ __ ] at that. You've got a whole bunch of them that start with FW_, you've got one that starts with TEW, there's brackets, there's a bunch of [ __ ]. So I had to manually, or not so manually, I could, like, mass rename [ __ ], but I had to go in and make sure that all of this made sense so that my later Python scripts, when I'm reverse engineering it and auto-extracting it, kind of all matched up. Oh [ __ ], now what?
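The renaming step described here (vendor, hardware version, firmware version) could be sketched roughly as below. This is only an illustration under stated assumptions: the two sample filenames, the regular expressions, and the `vendor_model_hwX_fwY` output convention are all hypothetical, since the talk does not show the real names or the speaker's scripts.

```python
import re
from typing import NamedTuple, Optional


class FirmwareId(NamedTuple):
    vendor: str
    model: str
    hw_version: str
    fw_version: str


# Hypothetical patterns; real vendor naming is messier than this.
MODEL_RE = re.compile(r"TEW-\d+[A-Z]*", re.IGNORECASE)         # e.g. TEW-652BRP
HW_RE = re.compile(r"v(\d+(?:\.\d+)?)R?", re.IGNORECASE)       # e.g. v3.2R or [v2]
FW_RE = re.compile(r"\d+(?:\.\d+)+(?:b\d+)?")                  # e.g. 3.00b13, 1.0.8.0


def normalize(vendor: str, filename: str) -> Optional[FirmwareId]:
    """Pull model, hardware version and firmware version out of a filename."""
    rest = re.sub(r"\.(zip|bin|img)$", "", filename, flags=re.IGNORECASE)
    model = MODEL_RE.search(rest)
    if model is None:
        return None
    # Only look for versions after the model name.
    rest = rest[model.end():]
    hw = HW_RE.search(rest)
    hw_version = hw.group(1) if hw else "unknown"
    if hw:
        # Cut the hardware version out so it is not mistaken for the firmware version.
        rest = rest[:hw.start()] + rest[hw.end():]
    fw = FW_RE.search(rest)
    if fw is None:
        return None
    return FirmwareId(vendor.lower(), model.group(0).upper(), hw_version, fw.group(0))


def canonical_name(fid: FirmwareId) -> str:
    """Render one consistent name that later scripts can parse trivially."""
    return f"{fid.vendor}_{fid.model}_hw{fid.hw_version}_fw{fid.fw_version}"


if __name__ == "__main__":
    # Made-up filenames imitating the FW_ prefix, TEW prefix and brackets mentioned above.
    for name in ["FW_TEW-652BRP_v3.2R(3.00b13).zip", "TEW-812DRU[v2]_1.0.8.0.zip"]:
        parsed = normalize("TRENDnet", name)
        print(name, "->", canonical_name(parsed) if parsed else "needs manual review")
```

Anything the patterns cannot parse returns `None`, which matches the "go in and make sure it all made sense" manual pass described above rather than guessing.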

All right, so these URLs are here because somebody's going to want to download the slides later and know what I said. So these are the URLs to download all of the firmware. I don't expect them to last too long, but there they are. So we've got 53 gig total worth of firmware for routers. This is a screenshot that I took after I organized all that garbage. Took a long time. Okay, and now we move on to binwalk. Show of hands, who's heard of binwalk? Awesome, cool to you too. So binwalk is absolutely fantastic. We're going to be using binwalk to extract file systems. So

there's something called SquashFS, the squash file system. It's a file that holds an entire Linux file system. That's what I focused on, just for this presentation. There's all kinds of file systems, there's all kinds of ways that you can package a firmware, but for the purposes of this talk I've just focused on SquashFS. So the way to extract that, very simple: binwalk -e and then the binary file name, right? What that will do is take the file and scan through it, look for some magic bytes that say a squash file system starts here, and then dump that out. It also does a whole bunch of other

really cool things that I just want to mention really quickly, because Craig is a really cool guy. Craig's the guy that wrote binwalk; that's not me. Mark said something about standing on the shoulders of giants; this is a great example of it. So yes, with -e we can extract the file system. -M does that, but recursively, so after it extracts a file system it will also extract any [ARJ?], gzip, etc. compressed files inside. What? What, again? I swear it's legit. It's not, but it's a virtual machine, so what do you want from me? Okay. [unclear] Is anyone here from Microsoft? Camera? Honestly, it's legit. I bought a

copy, it's just, you know... Okay, it also does binary diffing, but not in the way that BinDiff for IDA will do binary diffing. It does do binary diffing, but it'll show you a hex dump, so you're stuck with that. It's really nice, though. You can identify a whole bunch of binaries that have changed between releases and then throw those in IDA Pro. So I'll show you how I went about doing that in a really ghetto way. Okay, so when you first run binwalk -e on a binary file... that's really

blurry. No, garbage, garbage, I can't do anything. The top block here shows that it runs on a MIPS CPU. Second, it shows a bunch of stuff about the squash file system: that it's little endian, 3.0, the size, so it lets you know how many files are in there. And then it tells you that it's extracted it into squashfs-root. Anyone that knows anything at all about Linux will recognize this. It's got a bin, etc, sbin; this looks pretty normal. So basically what we've done here is we've taken the firmware, we've extracted all the files out, and we've got the little Linux system that the

device is running available to us to analyze. So in this case, the first place that I looked was the etc directory. Again, people that are familiar with Linux will know that the rc.d directory and the rcS file is actually an initialization script, so that's where all of your files live that run when your device starts up. So I open that up, because that's the first place I want to look, and from top to bottom we see a bunch of stuff about mounting. So we take a RAM file system and we mount that into the etc directory. Not really all that interesting, but it does explain why etc is so

empty. And then we see here cp -a /mnt /etc, so that means we've got a RAM file system, take everything, and copy it into the etc directory. We'll ignore all of this garbage, and then we'll come down here. There's two binaries that launch when this router starts up: one is system manager and the other one is tftpd. Right, okay, tftpd is running. So knowing that, I run strings, the ultimate elite hacker tool. I list all of the strings that are in system manager and I grep out etc. What I'm looking for here is files that I can steal with TFTP, because I assume since tftpd is running I can just download files. So the

system manager references a bunch of files: /etc/rt.db, ap.db, and apc.db. These are SQLite 3 database files that have the configuration information for your router, so that's all of the encryption keys and everything else. What else do we see here? We see rm -f /etc/www.tgz and the untar, so the tar zxvf of /etc/www.tgz. So this is the extraction of all of the web files, all of the HTML, all the stuff that gets run on the router's web interface. Anything else interesting here? Well, there's tons of interesting things here, but that's what we're going to talk about for now. So my first ultimate amazing zero-day

exploit was to TFTP into the router, switch to binary, and get rt.db, and like I said, that had all of the router's configuration information, and the device is owned. So that's garbage, so we'll pretend that that doesn't happen. What they did, they actually patched it. So I let them know, and they're like, oh, that's not good, we'll patch that, don't worry. So they patched it, they released a firmware, they sent me an email, and they're like, hey, we fixed it, check it out. So I looked at it: yep, okay, you removed that from the script, perfect, good job. So anyway, let's pretend that you couldn't do that, because you can't now.
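The strings-and-grep step that led to this finding can be approximated in a few lines of Python. This is a minimal sketch, not the speaker's tooling: it mimics the default behaviour of the `strings` utility (runs of at least four printable ASCII characters) and filters for a substring. The sample bytes are made up and merely stand in for a binary pulled out of the extracted file system.

```python
import re
from typing import Iterator, List


def strings(data: bytes, min_len: int = 4) -> Iterator[str]:
    """Yield runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def grep_strings(data: bytes, needle: str) -> List[str]:
    """Return the printable strings in data that contain needle."""
    return [s for s in strings(data) if needle in s]


if __name__ == "__main__":
    # Made-up bytes imitating a binary that references files under /etc.
    sample = b"\x7fELF\x00\x01/etc/rt.db\x00ab\x00tar zxvf /etc/www.tgz\xff"
    for line in grep_strings(sample, "/etc/"):
        print(line)
```

To run it against a real file, read the bytes with `open(path, "rb").read()` and pass them to `grep_strings`; any path referenced by a boot-time binary is a candidate for a file worth looking at.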

Well, you can, but I'll talk about that later. So let's pretend that you can't do that. So we're going to search the file system. I ran find -name http, and what that did is it found a whole bunch of binaries in that directory, and one of them is lighttpd. That's the web server that runs on the router. Okay, it also found... oh sorry, I did a search for CGIs as well, and it found my_

cgi.cgi. [Music] You just go to their website and we see that there are a bunch of security fixes for this, the most recent March 12th, 2014. So there's been a few months; I'm sure they've patched it, right? I love you guys. All right, so again, I did a strings, the ultimate hacker tool. So this is strings and then the lighttpd binary, and I grepped out "Server:", right, because inside the binary... you know when you connect to a web server and it's like "Server: IIS 7.0"? So it's the same deal: you connect to a lighttpd server and it's lighttpd, whatever, 1.4.18. So you look here, there's a security fix in

1.4.35 and they're running 1.4.18. They haven't fixed that yet. So moving on. There's going to be a whole bunch of zero-day that I'm going to be talking about; you guys deal with it. So next up I was like, okay, let's run my ultimate hacker tool, strings, against my_cgi, and down at the bottom here with the red arrow is wc -l, which just counts the number of lines. So that's me showing you that there are 984 strings that were recognized. There's no way anyone could see that. This is all that's important: after the 900-and-some-whatever strings scrolled by, this was at the very bottom. So there's two things here,

right? Cameo sw5 and Superman, right after each other; that kind of made me..., and then the next one is ping -c 1 %s, and then it pipes that output to a file, %s. So that's probably not going to be good. All right, so it turns out Cameo sw5 is a backdoor and the password's Superman. Yeah, thank you. Just to, like, confirm this: after you look at firmware long enough, you see [ __ ] and you're like, oh, I know what you did there. But this is kind of just to confirm it to you guys. So I opened it up in IDA Pro, and IDA is really cool in that

if you see a string that's interesting, you open it up in IDA Pro, Shift+F12 opens up the strings, you double-click on the string, and then you right-click and you go cross-references: show me everywhere in this program that this string is referenced. So that's what I did here. See... no, you can't see. So there's a function called admin login, and it checks to see if the user is Cameo sw5 and if the password is

Superman. You don't need to understand assembly to know this is

bad. All right, the next one: ping -c %s. Really? So here's the function. You want some feedback for next year? Okay. So this ping test function can be called unauthenticated, unfortunately, and down in the bottom right-hand corner, I'm sorry you can't see it, but right here, that says system, and this is ping, and this is %s, and then this is a sprintf. Basically what happens here is this function takes whatever you say and concatenates it with this ping command. So of course you can chain together commands using a semicolon, so if the user provides ping 10.0.1.1; run my [ __ ], it runs your [ __ ]. And what's really

cool about this vector is that the rest of the command really helps you out, because it pipes the output to a text file, and then the CGI takes that text file and includes it in the response. So it's like, you can't get any better. All right, so this is... what can I... all right, Burp proxy, yeah. Burp proxy is what I used to kind of show this. So the IP address that I supplied is 192.168.10.2, then %3B, which I'm pretty sure is hex for semicolon, I mean, it must be, and then I ran BusyBox. So on the right-hand side you can see in the

HTTP response it has the whole BusyBox output, like, you didn't run me properly, give me some parameters. So I give it some parameters. Son of a [ __ ]. So here are the parameters. I'm like, why don't you run a telnet server on port 444 and link /bin/sh? It says cool. So now I'm SSHed into the box. What's really cool about this is that it doesn't ask for credentials; it just, like, spawns a shell. So don't worry about that, you've got a shell. All right, so this is my_cgi.cgi. So complex, how could I ever... So the first problem starts right here. The way a CGI works, let me just explain it

really quick. I've got no idea how much time I have, I'm just blabbing. All right, cool. So the way CGI works is you've got your web server, right, and then you've got a program. It's like an exe, you just run it, right? So what the web server does is it sets environment variables. Okay, so the HTTP service says set environment variable content length equals blah, set environment variable blah blah blah. So you get to know what the client requested, and that's how the CGI and the web server talk back and forth. So the biggest problem with a router, and they haven't fixed it yet... actually, if you

haven't noticed yet, anything that I have in red text has been reported but they haven't fixed it, and I've given them a ton of time, so don't feel bad for them. So anyway, the Content-Length: when you make a web request, you know that HTTP header, Content-Length? That gets passed to this program, and then it statically allocates a buffer based on that Content-Length, and then it copies whatever you sent into that buffer. So there's a buffer overflow just by specifying Content-Length. We'll pretend that that doesn't happen, because if we end it there, then that's really boring. So let's just pretend that the Content-Length isn't a buffer

overflow that we can exploit pre-authentication and gain complete control over the router. Let's just pretend. Okay, so the next section here is a bunch of string compares, right? The way it works is you make a request, and in the URL it's like request equals blah, question mark, blah equals blah; you guys know what a URL looks like. So this is just a bunch of string tests to see, hey, was this in the URL? You can't see that, so let me try and zoom in again. Here's one: noauth. Zoom in a little bit more. It says noauth. Guys, come on, that's funny. noauth. You supply the string noauth and you don't

need to authenticate. All right, so you supply the string noauth and you don't need to authenticate, and then next you can say noauth, but I want to do an admin login, and then that's this Cameo blah blah blah. So noauth, and then you log in as admin with Cameo, and then after that you launch the admin telnet interface, all of this with no authentication required. I'm not going to bother zooming in; you guys believe me, right? Okay, the slides are going to be available, you can, like, zoom in and be like, oh, okay. All right, and so this really bothered me. I was going to say grinds my gears, but you've totally

screwed that up. So this really bothers me. I told them about this backdoor in their firmware, right, and I email them, I talk to them, Johnny and I, we go back and forth, and then he sends me a new firmware and he's like, how's this? So I throw it in IDA, I'm like, cool, you guys did it, you removed the backdoor. I don't know why the backdoor was there in the first place, but cool, good job guys. And then November 10th they released a new firmware and they reintroduced it. Not by mistake. Not by mistake: they actually encoded tftpd this time as Base64, so they tried to

hide it. No longer does it happen when you request my_cgi.cgi. That CGI listens, so if you request seark 524 CGI and you supply the parameter Accu MP 524, and pass is KM Mark 43, this goes on for a little bit, the code needs to be SM smwc, the hash needs to be /lj9 W blah blah blah, then commands, and then this, where the arrow is, that's a Base64 ASCII string: tftpd. So they inserted an obstacle course in front of the vulnerability? Yes, and they lost my respect. Like, I actually talked to the people that are paying me to do the research and I was like, this happened, what do we do

from this point forward? And they're like, screw them. Basically what they're doing is they're reimplementing a backdoor, but they're trying to make it harder to find. So [Netgear?] will not be getting any vulnerability reports from me anymore. So anyway, they've reintroduced it on May 28th,

2014. Zero minutes? I'm done? Holy [ __ ]. All right, I'm gonna [ __ ], excuse me, I'm going to power through these slides really quick. So I did a find -name. With binwalk I extracted all of the binaries, and then I'm like, hey, let's do a find for all my_cgi, right? We know this is vulnerable; how many routers is it running on? Okay, good, just two. Wrong, actually it's one, two, three, four, five, six, and it turns out they're also running that CGI in other ways, so there's another two routers that are vulnerable. Oh, there's a shadow file, and yeah, there's a whole bunch of built-in accounts. I didn't have

time yet to throw these against oclHashcat, but I'm sure they're crackable. Here's a whole bunch of HTML files, and it turns out that if I put those in Burp and do a www enumeration, on the Netgear 6300, if you request the file congratulations to.htm, you get the WPA keys for all configured network devices. So, okay. There's a whole bunch more stuff I want to do. I want to do cross-referencing of vulnerabilities against all known platforms. I'm just going to leave... all these slides are going to be available online, please feel free to check them out. I'm going to be doing a bunch of MIPS exploits.
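The "how many routers is it running on" check, and the cross-referencing goal mentioned here, amount to searching every extracted file system for a known-vulnerable filename. A rough sketch follows, assuming a hypothetical layout in which each firmware image was extracted into its own subdirectory of one root directory; the directory names are illustrative, not the speaker's actual tree.

```python
import sys
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def images_containing(extract_root: Path, filename: str) -> Dict[str, List[str]]:
    """Map each extracted image (top-level directory) to the paths where filename occurs."""
    hits: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(extract_root.rglob(filename)):
        if not path.is_file():
            continue
        relative = path.relative_to(extract_root)
        if len(relative.parts) < 2:
            continue  # a file sitting directly in the root belongs to no image
        image = relative.parts[0]
        hits[image].append(Path(*relative.parts[1:]).as_posix())
    return dict(hits)


if __name__ == "__main__":
    # Usage (hypothetical): python find_cgi.py /path/to/extracted my_cgi.cgi
    root, name = Path(sys.argv[1]), sys.argv[2]
    for image, paths in images_containing(root, name).items():
        print(image, paths)
```

Matching on filename alone will miss the case described above where the same CGI ships under a different name, so comparing file hashes or embedded strings would be a natural next step.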

What I really want to focus on is kernel exploitation, so... oh, they're blinking at me. I want to focus on kernel exploitation because it's not just these web

applications. These devices actually run wireless drivers, so you can connect to them and exploit them without actually having a WPA key. I also want to look at NASes, IP cameras, TVs, cars, refrigerators, and IP plugs. I've already done... so here's some zero

day. So if you want a QNAP device, don't. This is a bunch of other stuff. I'm

done. [Music]