
Exploiting Alpine Linux: From vulnerability discovery to code execution

BSides DC · 2017 · 40:34 · 1.1K views · Published 2017-10
About this talk
Alpine Linux has become ubiquitous in containers due to its small size and security focus. This talk reveals two critical vulnerabilities discovered in apk, Alpine's package manager, through fuzzing. The speaker demonstrates remote code execution via man-in-the-middle attacks on package updates, then walks through responsible vulnerability disclosure, CVE assignment, and vendor coordination.
Original YouTube description
Alpine is a Linux distribution promoted as lightweight and security-oriented. In recent years it has become widely popular, mainly thanks to its use in containers. In fact, Docker itself has hired Alpine's creator to migrate all official images from Ubuntu to Alpine. The official Alpine image has more than 10 million pulls! I've found two critical vulnerabilities in apk, Alpine's package manager. In my talk I plan to explain how I found the vulnerabilities (by fuzzing specific functions), and demonstrate the exploitation process that finally led to remote code execution. A full attack using the vulnerabilities consists of MITMing an Alpine machine or container and providing it a malicious, carefully crafted update file (See teasers 1, 2). I will also discuss the process of assigning CVE IDs, approaching the developers to responsibly issue fixes, and finally publicly disclosing the vulnerabilities. Ariel Zelivansky (Security Researcher at Twistlock) Ariel Zelivansky is a security researcher at Twistlock, dealing with hacking and securing anything related to containers. Ariel is a veteran of an elite Israeli intelligence unit, where he served in the role of a researcher.
Transcript [en]

The BSides DC 2017 videos are brought to you by ThreatQuotient, introducing the industry's first threat intelligence platform designed to enable threat operations and management, and DataTribe, a new kind of startup studio co-building the next generation of commercial cybersecurity, analytics, and big data product companies. Hello everyone, my name is Ariel Zelivansky. I'm a security researcher at Twistlock; we do security for containers. As part of my research, I started looking into Alpine Linux. How many of you know Alpine Linux? OK, that's a few hands. Alpine, in essence, is a very small distro that's used mostly in containers, but for embedded things too. The motto is being small,

secure, and very simple. The actual image itself is about five megabytes in size. They put a lot of effort into security: they patch the kernel with grsecurity and PaX, which is a bit less relevant with containers, but it's good to have. All userspace binaries are compiled as position-independent executables with stack-smashing protection, which is like a stack canary, and all these kinds of things. These are all features you can turn on at compilation time to protect your binaries, so that's pretty useful. All right, so why is Alpine relevant? It's become very popular for containers. On the official Docker Hub you can

see it has more than 10 million pulls, and you have tons of images based on Alpine. This one is from Hacker News: Docker, the company, is hiring Natanael Copa, the developer of Alpine, to migrate all of the official images to Alpine from Ubuntu or Debian. So essentially everyone is going to be using Alpine sooner or later, because it's so small and secure and very efficient, and that's going to make it an attack target for people to look into. So I was looking into it to find vulnerabilities inside it. To do that, I had to take an image and see what it consists of. The most basic thing is the libc. You

probably know glibc, the regular C library you use when you just run a normal gcc; musl is just a different implementation of that. You have BusyBox, a package that has all the basic Linux tools you know, like grep or ls, in one package, and it's very small, so Alpine uses that. And then you have apk-tools, the package manager. It's just like the apt-get you have on Ubuntu or Debian: you use it to get other packages from Alpine's official repository. So I wanted to see what people do

with Alpine. When you use containers, you take Alpine as your base and then you do something on top of it. One of the first things is getting more programs: you write FROM alpine, and on the next line you run apk to download something. That's why I wanted to look into apk. Now, apk was originally written as a shell script; around 2008 they rewrote everything in C. That was good for me as a researcher, because C code means more bugs, more memory bugs, all kinds of things that I might be able to exploit in the future. To use apk,

you run apk update and then apk add for the package you want, just the same as you do with any package manager. My thoughts were: can I somehow tamper with the packages that it downloads? Can I put my code in the packages it downloads? Can I maybe make it download packages with known vulnerabilities? That would be an attack vector: if I could somehow make it download bad packages or old packages, then the whole Alpine system would be vulnerable. To start, I went to the wiki; they have a pretty nice wiki with lots of details on how they build an image. So apk looks at /etc/apk/repositories for

the list of repositories. You can have local ones, you can have remote ones, and when you start a Docker image, by default it uses HTTP. That was the first bad sign, because if it uses HTTP, a man-in-the-middle attacker could just interfere with whatever they pass and put in something else. Now, that doesn't mean it's vulnerable, because they can still sign whatever they pass over HTTP; it just means someone could interfere with it. And indeed they do sign all packages: you have /etc/apk/keys, which has all the signing keys of the developers. So it's not as simple as just changing the packages over HTTP, but we might still

have an attack vector here. So I looked into what happens when you run apk update. From the wiki again: it downloads a file called APKINDEX.tar.gz. From that file it takes the list of the packages in the repository; it just parses that and takes only the latest ones. So I wanted to see what that file is made of, and again the wiki was pretty helpful. That's a bad screenshot, but essentially, to make an APKINDEX file, what they do is concatenate two files, two tar.gz files. The first one is a signature, and

the second one is the real text data with the list of all the packages. That wiki page was pretty useful, because otherwise I would just have run file on it, tried to see what it's made of, and untarred it; this just told me it's two concatenated files. Now, having that signature as a tar.gz meant that the package manager would have to take that file, gunzip it, untar it, and only then verify the signature. So that's a whole bunch of code that's going to run before they even look at the signature, and since it's passed over HTTP, I

could feed whatever I wanted to that piece of code. That led to my next step, which was fuzzing. How many of you have heard of fuzzing for security purposes? OK, so I'll go over it pretty quickly. Fuzzing means giving a lot of data to some code to make it crash. Let's say you have some piece of code that parses JPEG images: you just feed it tons of binary data until it crashes, and you check which data made it crash. Now, you have dumb fuzzing, which means trying random bytes, just modifying the bytes all the time until it crashes.
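A minimal Python sketch of the dumb fuzzing described above, assuming a toy target: `toy_parser`, its magic bytes, and its crash condition are hypothetical stand-ins made up for illustration, not code from apk or from any real JPEG library.

```python
import random


def toy_parser(data: bytes) -> str:
    """Hypothetical target: rejects bad magic, 'crashes' on one malformed shape."""
    if len(data) < 4 or data[:2] != b"\xff\xd8":
        return "rejected"  # bad magic: the parser bails out early
    if data[2] > 0x7F and data[3] == 0:
        raise RuntimeError("simulated crash")  # stand-in for a memory bug
    return "parsed"


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one to three randomly chosen positions of the seed."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 3)):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def crashes_target(target, data: bytes) -> bool:
    """Return True if the target raises on this input."""
    try:
        target(data)
    except Exception:
        return True
    return False


def dumb_fuzz(target, seed: bytes, iterations: int, rng: random.Random) -> list:
    """Throw blindly mutated inputs at the target and keep those that crash it."""
    return [
        candidate
        for candidate in (mutate(seed, rng) for _ in range(iterations))
        if crashes_target(target, candidate)
    ]


if __name__ == "__main__":
    seed = b"\xff\xd8\x10\x20"  # valid magic followed by two harmless bytes
    found = dumb_fuzz(toy_parser, seed, 5000, random.Random(0))
    print(f"{len(found)} crashing inputs out of 5000 attempts")
```

The loop has no feedback: a mutation that destroys the magic is wasted, and nothing is learned from inputs that get further into the parser. That missing feedback is the gap instrumentation closes.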

Then there is real fuzzing, which is instrumented fuzzing. You have some program that takes your code, your C code or whatever, and at compilation time, when it translates that to assembly, it puts in its own code to track which branches you take. So in essence, when you feed data to the binary, the fuzzer knows which branches were taken on every execution. Take the same example of a JPEG: you have some magic in the header. You would feed it all kinds of random bytes and it would just stop, because the header is bad. Then, once the fuzzer hits the magic, it sees that the program takes a

different branch, and it figures: I can just use that magic and get to that branch every time. If you do that with dumb fuzzing, you would hit that magic once, never use it again, and that's it. With smart fuzzing, you keep using that magic because it takes a different branch. What that means is you can get a lot of code coverage with smart fuzzing, and you can find crashes pretty quickly. Now, AFL, American fuzzy lop, is a very popular fuzzer that's open source. It's found lots of bugs in open-source projects; you should look it up. I had used it before, and I did

that again for apk, to see if I could find crashes in the code that parses the APKINDEX file. So I went and got the source code, and I had one more step to do before fuzzing, because I couldn't just take apk and give it to AFL to fuzz. apk has lots of features: one of them is downloading packages and updating, but it can do all kinds of stuff. So I had to find the piece of code that would download the packages, and feed the data from AFL to that. I found update.c, which sounded like it was going to be the relevant file,

but it had nothing inside it. The main function was in apk.c, which was another file. In update.c, though, at the end, there was this piece of code that defines the update applet, the definition of the applet, and inside it I found the APK_UPDATE_CACHE flag. So I just grepped for that in all the source files, and I found it in a file called database.c, in a function that's called from the main function. What it does, in essence, is update the repository cache: go to the remote repository, download the APKINDEX, parse it, and do everything

that I was looking for. I liked the concept of an applet, so I wrote my own applet for apk, just so I could fuzz it easily. I made this applet main function, and this was the function that would get all the data from the fuzzer. I used their own internal function to get data from a file, which was apk_bstream_from_file, and I fed that to apk_db_index_read. I compiled it, and I put AFL inside a container just so I could run it quickly, and whenever I changed the setup I could just rerun it easily. Well, the fuzzer found nothing. AFL

actually tells you when it's not finding anything and it's all stuck; it has a line that says whether it has started making progress. So I was stuck. I tried fuzzing some other functions, and I kept looking into the code to find the function that was relevant for me, because it does a lot of stuff before reaching the tar parsing. I found apk_tar_parse, which was the function that parses tarballs, but the fuzzing was still very slow. AFL gives you the time it takes for every execution, and it was a few tens of seconds, which is way too much for a

fuzzer; you want it to be milliseconds to actually get real results with it. So I had to find the bottleneck, what was making it so slow, and kill that. I found that it calls init_openssl, and it calls apk_db_init and apk_db_open. The most important one is init_openssl, because on every execution it starts with init_openssl, and this function relies on entropy. That means the Linux kernel has to generate randomness on every execution, and at some point it would run out of randomness and take a few seconds to generate enough to feed init_openssl. So since I didn't

need the signature parts, I just commented that out and went without it, and the execution was pretty quick after that: in milliseconds it would parse every tar file. So I went and fuzzed again, and I finally got crashes that I could deal with. Now, when AFL gives you crashes, it's just a bunch of files in a directory, each of which caused a crash in the binary, and you can get hundreds; you can get a lot of files to look at. What I first did was manually go through each file with gdb and try to see why it was crashing, but that was inefficient.
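A first pass over a directory of crash files can be partly scripted. A minimal Python sketch, assuming a fuzzing harness binary that reads its input from stdin and a directory of crashing inputs saved by the fuzzer; both paths are supplied by the user, the function names are made up for this example, and grouping by terminating signal is a much coarser stand-in for inspecting each crash in gdb.

```python
import signal
import subprocess
import sys
from collections import defaultdict
from pathlib import Path


def describe_exit(returncode):
    """Map a subprocess return code to a short label.

    On POSIX, subprocess reports death by signal as a negative return code.
    None is used here to mean that the run timed out.
    """
    if returncode is None:
        return "timeout"
    if returncode < 0:
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return "ok" if returncode == 0 else f"exit {returncode}"


def run_target(target, crash_file, timeout=5):
    """Run the harness once with the crash file on stdin (an assumed interface)."""
    with open(crash_file, "rb") as handle:
        try:
            proc = subprocess.run(
                [target],
                stdin=handle,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return None
    return proc.returncode


def triage(target, crash_dir, runner=run_target):
    """Group the crash files in crash_dir by how the target exits on each."""
    groups = defaultdict(list)
    for path in sorted(Path(crash_dir).iterdir()):
        if path.is_file():
            groups[describe_exit(runner(target, path))].append(path.name)
    return dict(groups)


if __name__ == "__main__":
    # Usage (hypothetical paths): python triage.py ./harness out/crashes
    if len(sys.argv) == 3:
        for label, names in triage(sys.argv[1], sys.argv[2]).items():
            print(f"{label}: {len(names)} file(s)")
```

This only says how each input kills the process, not which line is at fault; two different bugs that both end in a segmentation fault land in the same group.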

Because I had, again, lots of files, I used crashwalk. It's a nice tool, also open source. It does exactly what I did, just automatically: it runs every crash with gdb and checks the memory to see why it crashed, and it even goes as far as telling you whether it's exploitable or not and what the vulnerability is if it is. In this one it says it's a memory access violation, which was nice, because I could know exactly which crashes were relevant to which bug. I had six crashes that seemed to be from different sources. Eventually I found two

lines of code that these crashes originated from. But before I could do anything with that, I had to reproduce it with the original code, because I had changed the code: I had taken out a lot of functions and commented out a lot of stuff. I wanted to see that the crashes were not originating from the fact that I'd changed the code and removed functions. So I took the bad crash file, gzipped it, and put it as an APKINDEX.tar.gz file on my nginx server. I used the --add-host flag for Docker; what it does is change the hosts file in the container when it sets it up, so any time the container looked up the

Alpine Linux CDN host, it would go to my server. That's a very simple man-in-the-middle attack. I ran apk update, and as you can see I got a segfault. A segfault means a crash, which means I might have a vulnerability here. So this time I opened gdb with the original source and found the origin of the crash, and I'd like to explain which line was causing that crash. It's inside archive.c, in the apk_tar_parse function; this is its signature, the one I fuzzed. Now, to actually make sense of why it was crashing, I had to understand how a tar file is parsed, because what I was reading was the

source for parsing tar files, and I needed to understand what that format is before I could actually do anything with it. Tar is a very old format; the name stands for tape archive. It's very archaic, and it goes in blocks of 512 bytes. The first block is the header block, then you get data blocks, then a header again, and then data. So the header tells you: the next block is a file, this is its name, and this is its length; and then you get the data. One of the fields of that header is the typeflag, which will

tell you what the type of the next file is. One example is a normal file, another one is a link, and then you have all kinds of special flags. One of them is the long name extension. Now, why do you need that? It's an extension to tar, and they made it because otherwise you can only have files with names of up to a hundred bytes, since the part of the header that holds the file name is only a hundred bytes. So if you have anything bigger than that, you need to tell it: the next block is going to hold the file name, and this is the size of the file name.
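The header layout described above can be sketched in Python. This is an illustrative sketch, not apk's parser: the field offsets follow the common ustar layout (name at offset 0, size at 124, checksum at 148, typeflag at 156), the helper names are made up for this example, and the `././@LongLink` placeholder name and NUL-terminated payload are assumed to follow GNU tar's usual convention for long name entries.

```python
BLOCK_SIZE = 512


def octal_field(value: int, width: int) -> bytes:
    """Encode value as zero-padded ASCII octal digits plus a trailing NUL."""
    digits = f"{value:0{width - 1}o}"
    if value < 0 or len(digits) != width - 1:
        raise ValueError("value does not fit in the field")
    return digits.encode("ascii") + b"\0"


def make_header(name: str, size: int, typeflag: bytes) -> bytes:
    """Build one 512-byte header block with only the basic fields filled in."""
    name_bytes = name.encode("utf-8")
    if len(name_bytes) > 100:
        raise ValueError("the name field holds at most 100 bytes")
    if len(typeflag) != 1:
        raise ValueError("typeflag is a single byte")
    header = bytearray(BLOCK_SIZE)
    header[0:len(name_bytes)] = name_bytes
    header[100:108] = octal_field(0o644, 8)   # mode
    header[108:116] = octal_field(0, 8)       # uid
    header[116:124] = octal_field(0, 8)       # gid
    header[124:136] = octal_field(size, 12)   # size of the data that follows
    header[136:148] = octal_field(0, 12)      # mtime
    header[148:156] = b" " * 8                # checksum counts as spaces while summing
    header[156:157] = typeflag
    checksum = sum(header)
    header[148:156] = f"{checksum:06o}".encode("ascii") + b"\0 "
    return bytes(header)


def parse_header(block: bytes):
    """Return (name, size, typeflag) from a header block."""
    name = block[0:100].split(b"\0", 1)[0].decode("utf-8")
    size = int(block[124:136].rstrip(b"\0 ").decode("ascii") or "0", 8)
    return name, size, block[156:157]


def long_name_entry(long_name: str) -> bytes:
    """Header with typeflag 'L' followed by data blocks holding the long name."""
    payload = long_name.encode("utf-8") + b"\0"
    header = make_header("././@LongLink", len(payload), b"L")
    padding = (-len(payload)) % BLOCK_SIZE    # data is padded to whole blocks
    return header + payload + b"\0" * padding


if __name__ == "__main__":
    entry = long_name_entry("a" * 150)
    print(parse_header(entry[:BLOCK_SIZE]), len(entry))
```

Note that the parser reads the size straight out of the header as ASCII octal; whatever number the archive's author wrote there is what the reader will act on.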

That is what this extension is for, and I learned that because of the bug; I had no idea, I hadn't looked into it before. So this is from the apk source. What it does is check the typeflag: if it's 'L', then that's a GNU long name extension, and it does something pretty simple. It takes the long name buffer and allocates it with entry.size bytes. That entry.size is taken from the header, and it's just the size that the file name is going to be. That looks OK; I would implement it like that too. And then I noticed they use blob_realloc: they don't use realloc, they use something

internal that they've written for themselves. So I looked into it. blob_realloc takes an apk_blob_t and the new size, and then it just calls realloc on that buffer with the new size. But they did add this check: if b->len, which is the old size of that buffer, is bigger than or equal to the new size, then do nothing. Why would they do that? What they probably want to achieve is: if you want a smaller buffer than what you already have, if you're shrinking it, just don't shrink it. If you need more bytes,

then they expand it; they call realloc. And if you're going to use fewer bytes than what's already in that buffer, then it's OK: it can be bigger and you just use part of it. That was probably the idea behind this. But they have a signedness flaw here: newsize is an int and b->len is a long, and they're both signed, which makes that comparison signed. So if newsize is anything from 0x80000000 upward, it will be negative, because the most significant bit means the sign, and anything bigger than the maximum int is going to be a negative

integer. So newsize could be minus one, and then it would be smaller than b->len, whatever that was. So in that case the buffer just stays as it is. That would be OK as long as is->read treated entry.size as signed too, because if it were minus one, then is->read would say: all right, that's minus one, so I'm not copying anything, because it's negative and makes no sense. But what happened is that is->read went to the gzip read function, which treated the size as a size_t, which is unsigned, so if it were a

negative size, it would just be treated as a huge size. Minus one would become a very huge number, and that's the origin of the crash that I had. Now, you might think that to fix that you could just use size_t in blob_realloc and then it would be OK, but then you have the plus one, so you should also make sure it doesn't overflow the maximum size that you can get. The second vulnerability I had was the same thing I showed here, just with a different typeflag, which was a pax header; the difference doesn't really matter much.
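The signedness problem can be modelled with plain integer arithmetic. A minimal Python sketch, assuming a 32-bit int and a 64-bit size_t; the function names are hypothetical and only model the check as described in the talk, not apk's actual source.

```python
INT_BITS = 32      # assumed width of a C int
SIZE_T_BITS = 64   # assumed width of size_t


def as_signed(value: int, bits: int = INT_BITS) -> int:
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def as_unsigned(value: int, bits: int = SIZE_T_BITS) -> int:
    """Reinterpret value as an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)


def realloc_model(old_len: int, raw_size: int) -> int:
    """Model of the described check; returns the buffer length afterwards."""
    new_size = as_signed(raw_size)   # the size from the header lands in a signed int
    if old_len >= new_size:          # signed compare: any negative size passes
        return old_len               # buffer is left untouched
    return new_size                  # otherwise the buffer grows


def read_request_model(raw_size: int) -> int:
    """Byte count seen by a reader that takes the same size as an unsigned size_t."""
    return as_unsigned(as_signed(raw_size))


if __name__ == "__main__":
    raw = 0xFFFFFFFF                 # bit pattern of -1 in a 32-bit int
    print("buffer length after the check:", realloc_model(256, raw))
    print("bytes the reader is asked for:", read_request_model(raw))
```

With a 256-byte buffer and a size of minus one, the check leaves the buffer at 256 bytes while the reader is asked for 2**64 - 1 bytes, which is the mismatch behind the overflow.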

So I set out to develop an exploit for that, and I was going to try to reproduce everything myself. I started by building a small tar file; that's the one I made in my hex editor. You can see a little 'L' over there, I don't know if you can see that from here, which is the typeflag for the GNU long name extension. The size in a tar file is written in octal, in ASCII, which shows how archaic that format is. So it's just a run of sevens, which is essentially minus one if you look at it as a signed integer. And I

took that file, packed it again just as I did before for the crash, and I got a crash on execution. What happened here is that I went in and started with the long name vulnerability straight away, without having that buffer allocated before, so it starts as NULL. If you don't allocate anything to it, it's going to be zero at first. So entry.name was zero, and it just tried to copy all those minus-one bytes to address zero, and it crashed because it can't copy to NULL. That was all right; I just had to do the same as I did here, but put one block before

that, which would allocate a normal size, say a few hundred bytes, just to make that long name buffer non-NULL. So I did that: I had one block that would allocate the long name buffer, and a second one that would crash it. I managed to actually make it crash, and I used gdb and saw that it was overflowing: is->read was copying all the bytes, the minus-one-sized whatever, into the long name buffer, and I had an overflow. One thing that really helped me here was the gzip read function, the one that was called from is->read: it

would stop once its source ran out. So if I gave it minus one bytes to read, it wouldn't insist on reading that many bytes from the file. I could just put whatever I wanted in the file and it would stop when the file ended, rather than keep reading bytes that don't exist. That was pretty helpful for developing an exploit. And I had to do something with that heap overflow. Now, I don't know how many of you have studied heap overflows before, but there are traditional ways you can exploit a heap overflow with the normal libc, glibc, like the old malloc unlink vulnerability.
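The overflow primitive described above, a copy bounded by the end of the source rather than by the destination buffer, can be modelled with a toy heap. A sketch in Python under loud assumptions: the 16-byte buffer, the adjacent 8-byte slot, and all names are invented for illustration and do not reflect musl's or apk's real memory layout.

```python
def copy_until_source_ends(heap: bytearray, dest: int, source: bytes, requested: int) -> int:
    """Copy up to `requested` bytes into heap at `dest`, stopping when source runs out.

    Deliberately missing: any check against the size of the destination buffer.
    """
    count = min(requested, len(source))
    if dest + count > len(heap):
        raise MemoryError("write past the end of the modelled heap")
    heap[dest:dest + count] = source[:count]
    return count


NAME_BUF = slice(0, 16)    # hypothetical 16-byte file name buffer
CALLBACK = slice(16, 24)   # hypothetical neighbouring 8-byte function pointer slot


def demo() -> bytearray:
    """Overflow the name buffer and return the resulting heap contents."""
    heap = bytearray(32)
    heap[CALLBACK] = b"ORIGINAL"
    huge = 2**64 - 1                          # "minus one" seen as an unsigned size
    attacker_file = b"A" * 16 + b"HIJACKED"   # 16 filler bytes, then 8 chosen bytes
    copy_until_source_ends(heap, NAME_BUF.start, attacker_file, huge)
    return heap


if __name__ == "__main__":
    print(bytes(demo()[CALLBACK]))
```

Because the copy length is set by the attacker's file, the attacker chooses exactly how far past the buffer the write goes and what bytes end up in the neighbouring slot.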

There were lots of problems with the way glibc allocated the memory itself; you could just overwrite structs of glibc. But I didn't have glibc here, I had musl, and I didn't want to dig into musl to understand how it lays out the memory. So I decided to go with something simpler: just look into the heap and see if there's anything there I can overwrite and make use of. For instance, I thought, and I really wanted this to happen but it did not, that there might be a flag on the heap that would tell it to ignore the signature. Maybe they have a flag that says 'verify signatures',

and I would put false in there, and they would not check the signature, and I could put whatever code I wanted in the packages. But they didn't have it. The second idea I had was to go with callbacks. I thought they might have some callbacks on the heap. Callbacks are essentially function addresses, so I could put any address I wanted there and have it called. I had that idea because I knew they use is->read, which means that if is is on the heap, then it has all its callbacks on the heap too, and I might be able to overwrite them. Now, that

solution will work as long as we don't have ASLR, which is address space layout randomization. It's a defense method that means that on every execution of an ELF file, the whole address space is randomized, so you can't know beforehand which section is going to be at which address. So if I want to call an address of my own instead of one of their callbacks, I have to know its address, and with randomization that's not going to be possible. There are ways to bypass that; I think what most people do is find other memory leaks, and then they know the address space, and then

they calculate everything on the spot. But I just wanted a small exploit, a small proof of concept, so I disabled ASLR for this. Looking into the heap, I found is->read on the heap, as I thought. Its definition was just like that: it has three functions, get_meta, read, and close. Now, I don't know what get_meta is for, but it was never called, so I didn't look into it. read was the one that actually made the copy, so I put a breakpoint on it to see its address, and I found the is struct, and luckily it was

ahead of the buffer that I was overflowing. You can see the distance from the long name buffer, the entry.name buffer, to the is struct is something like 0x15380 bytes. So on the heap I'd have the buffer I'm overflowing, then all kinds of data that I don't know the makeup of, and after that the is struct. So I opened the tar file that I made before, and I put in that amount of bytes, with zeros or whatever random bytes, and after that I put just zeros. That was pretty risky, because if there was anything useful in those 0x15380 bytes that I overwrote,

I would crash it. But it just worked: it crashed on address zero, and the crash originated from that call to is->read. So it would essentially call the NULL address that I put there instead of the real is->read. That meant I could put whatever address I wanted in that callback and it would be called. With my parameters? Well, how would I pass parameters to the function I'm calling? It's going to receive whatever is->read was getting. The first parameter it gets is is itself. Now, let's look at is again: it starts with get_meta, so that's eight

bytes of data that I could put whatever I wanted in, and they would be the first parameter, so I used that. The function that I wanted to call was system. system is a libc function that takes a shell command string and just executes it, so that's pretty simple for a proof of concept. Let's say I just write echo 1 in it, and it would echo 1 when run. So I put that in place of the eight bytes of get_meta, I put the address of system in the read slot, and I got it running: I got 1 printed to the screen, and then a segfault. Now, I thought that was

good, and then I tried reproducing it on a different kernel version, and it didn't work. I tried to figure out what happened there. Since is->read copies in chunks, at one point it copied just four bytes instead of eight and then called is->read again, but the new address of is->read would be four bytes of the original address and four bytes of my overwritten address, and it would crash. So I had to find something else on the heap I could use, and I looked into the heap again. I put all kinds of random bytes in

there, tons of 'A's, and I found gis->bs, which was an internal pointer for the gunzip function. It worked just the same as the is one, only it also had a flags field in the struct, so instead of eight bytes to overwrite, I now had 16 that I could use for my payload string, because I don't really care about the flags it gets. That was a screenshot from my hex editor of making that exploit. I put echo twistlock inside the first 16 bytes: the first eight, as you can see, are the flags, the second eight are not read but the get_meta function, and after

that comes the read function, where I put the system address that I had looked up. Everything after that is irrelevant, because I didn't care about recovering the execution; I just wanted my code to run, my payload to do its work. So I ran it and I got twistlock printed, and that essentially meant that I could run whatever command I wanted within 16 bytes of shell string. All right, so I went and wrote a different exploit: instead of printing twistlock, which is pretty useless in itself, I put netcat there,

which listens on a port, and then I popped a shell from a different system, and I had it working: I could run whatever I wanted from the shell. When I closed it, it would crash, but a sophisticated attacker could take that and actually recover the execution. The image is pretty blurred, but what it shows is that a man-in-the-middle attacker could wait for someone to run apk update, put in their own payload just like I did there, and recover the execution, so you'd never know that it happened and you'd

think everything is OK when you run update, and the attacker gets their payload running there. That would happen any time you build a Docker image based on Alpine or update your current Alpine Linux system, and that's pretty nasty. I promised to talk a bit about what you do when you get that kind of vulnerability. I've found this, I've exploited this, and it's effective now; it could affect people right now. So how do you go about sharing that with people and closing it? In general, you have three ways to disclose a new vulnerability. The first one is non-disclosure, which

means you found something and you're keeping it to yourself. You're not sharing it with anyone, or, worst case, you're selling it to people who might use it maliciously. That's the black-hat side of disclosure. On the other hand you have full disclosure, which means taking everything you've found and just putting it out there, sharing everything, including your proof of concept, online and letting people use it. I know some researchers think that full disclosure puts a lot of pressure on vendors to fix the bugs. Well, that's true, it does put a lot of pressure on them, but it also creates a lot of

chaos. So what you should do is follow responsible disclosure, or coordinated disclosure, which is the process of contacting the vendor privately, letting them know of the problem, and asking them to fix it. There's no exact definition of how you do that, but the way I like to think of it: first you estimate the impact, then you write a proof of concept if you have enough time for that, and then you send it to them. Some researchers just send a crash report to the vendor and expect them to fix it, but just sending a crash report is not enough at all. You have to explain what the bug is, and then try to get them to fix it

properly. So just write a short paragraph explaining what happened there, and give them a few days to look at it and fix it. You'd expect a response back shortly. I think most vendors do have emails specifically meant for that; they have a security team that's there just for closing such vulnerabilities. So you would expect to be contacted within a few days and to learn when a fix is going to be released; you ask for an estimate, and that's it, it will be closed. At the same time, you'd want to get a CVE ID assigned. How many of you know CVE? Right, so

that's most of you. It's essentially a number that's assigned to each vulnerability, so you can look it up online and use that number whenever you talk about the vulnerability. That way everybody can refer to that specific bug unambiguously. To get a CVE, you have to contact MITRE. That's an American non-profit organization, and one of the things they do is assign CVEs and organize the whole CVE system. They first ask you to look for a vendor that's responsible for the product you found the bug in. So let's say you found a vulnerability in iOS: you would contact Apple,

and Apple would give you a CVE. But if it's an open-source project and there's no one specifically responsible for it, then MITRE is going to give you the CVE ID. Once it's fixed, once it's all done, you go and write about it online: you make a blog post about it, you publish an advisory, and there's always oss-security for open-source projects. Then people get to know about the vulnerability, fix it, and issue updates, and the whole process of patching bugs goes on. What I did was go on IRC, because hackers like IRC and the developers were on there. So I just went there and told them I found a

bug in Alpine, what do I do? I got to talk to Timo Teräs, who was the person responsible for apk itself. It took him just a day or two to fix it. He asked me a lot of questions about how I found it and how I exploited it, and instead of just fixing this specific bug, he actually went and removed all the callbacks from the heap, so my specific exploit wouldn't work again. That was nice: most developers just close the bug, but they went all the way and closed off the exploit too. I then wrote the advisory, like I said, and I made a blog post about it, and that was

pretty much the end of this specific research. But it's not really the end, because there's so much still left to do. I fuzzed only the tar parsing function, but there are so many other functions and data-parsing parts of apk that I might fuzz, and you might as well fuzz Alpine and Alpine's tools yourselves if you use Alpine. apk relies on libfetch, which is a BSD library for networking, for HTTP or HTTPS, and I don't think it's maintained, though I'm not sure. So I might as well check that and try to fuzz it, because network protocols sound like a good vector for vulnerabilities. And I'll

go to the demonstration now. Demos usually don't work live, so I prepared something beforehand, and I'll just open it. Nice. So that's me on the first machine, which I set up, and the second one is the vulnerable machine. All right, so here I go, and I just put my payload on the nginx server.

It's the same setup I used before. All right, and then I check that there's a connection. Yeah. So what I do here is run a Docker container that is being man-in-the-middled, because whenever it tries to reach the Alpine CDN it goes to my server instead. Then I run apk update, and it starts running fetch and stops there. Now, why did it stop? Because I put the payload there, with netcat listening. Then on the second machine I just run netcat, and I get a shell. I'm root, I'm running ls, doing

whatever I want. When I end it, it gets a segfault, but I really did do whatever I wanted. So that was it. If you have any questions, I'm going to be here; you can come and ask. Check out our Twitter: we just started a new Twitter account for our research team, along with a blog. That's it, thank you. [Applause]