Checking Your --privileged Container

BSidesSF 2020 · 24:29 · 675 views · Published 2020-03
Category: Technical
Style: Talk
About this talk
Sam "Frenchie" Stewart, Maya Kaczorowski - Checking Your --privileged Container
Transcript [en]

Hello everyone, welcome to our talk. Today we'll be talking about privileged containers and what those are. Yeah, and thank you, I'd say, for finally getting a monitor big enough. Cool. I'm Frenchie. Sam Stewart is my real name, but a lot of people know me as Frenchie on the internet. I'm the infrastructure security engineering manager at Cruise. Cruise is a company that makes self-driving cars. Infrastructure does many things; one of them is container security. Cool, and I am a product manager at GitHub working on software supply chain security. You may know me from more recently, when I was working on container security at Google. And we're talking about how to help protect privileged containers in your environment. So on today's agenda, we'll be

covering privileged containers. So why this talk? Well, using the privileged flag is exceedingly common, and it's also kind of dangerous, so we want to make sure that you're following best security practices here, good security hygiene. The idea is that you can take what you need to know and go back to your company or your clients to tell them to stop using the privileged flag. So we'll cover what a container is, and what things like containerd and seccomp and all these acronyms and terms you've heard before are; what the privileged flag actually does, in terms of all the features that you can control individually and the capabilities that you

can control using that flag, what they do, and what happens if you don't block those. And then we'll talk about isolation in Kubernetes and some options you have for further restricting what containers can do. So, a very scientific audience participation analysis. I only speak pirate, so please give me a big loud "yarr" when I call out your particular distinction. Who here has never heard of the term containers before today? One person. I don't believe you. Anyone else? No? Wonderful, good, so at least somewhat familiar. Who's heard of them maybe once or twice but is not super familiar, maybe used them once or twice? Big yarr, please. Excellent, wonderful. Who's familiar, using them in prod, or even

you know, maybe works at Docker or some other container company? Big yarr. Okay, cool, about an even split there. And who here just posts memes on Twitter? Most of the audience, exactly what I see. Thank you, brilliant. So, diving into what a container is: containers are just a logical extension of where the industry has been headed in general for many years, in terms of making applications easier to package and deploy. So first you were running a rack somewhere on-prem. This meant managing your own data center with your own servers, the OSes that you have for those servers, patching, all that stuff. You basically have, like, at

least two full things to manage: you're managing a data center, like actual hardware, and then you're also managing the apps that run on top of it. This is particularly annoying for your ops team, though. (Good morning, my name is Gustavo, I'm here to introduce her. Nice, I guess I am current speaker. Thank you.) So this was particularly annoying for your ops team: if a piece of hardware went down, something like that, your application was down, and you then had to have maintenance windows, like between 2:00 and 6:00 a.m. on Sunday, when everything went down, and that was totally normal, and somebody was in the data center and they hated their job. Then, a while back, virtual machines

came along on the scene. They became increasingly popular as a way to manage your applications. With a VM, you can move your workload virtually between several different servers, including between servers on-prem but also from on-prem to the cloud, and that just made your apps slightly easier to manage. And now we have containers. A container is just a way of packaging together your application with its libraries and dependencies. This lets you abstract away the underlying hardware as well as the underlying OS, so you don't even have to worry about what OSes you're running on those machines, and they can be different. The goal with containers is to be able to write and build once and run anywhere, where

anywhere is your on-prem data center, or private or public cloud, anything, your IoT device, wherever you want to. So let's quickly distinguish between a couple of different projects, if you're not as familiar with the container space. On the left-hand side here are container runtimes. A container runtime is a standard for what you use to run your containers. Many of these are now based on the container runtime interface, or CRI. Some common container runtimes you might have heard of include Docker, containerd, runc, etc. And a container runtime is where you can actually implement a lot of the controls that we'll be talking about today when it comes to restricting what a container can do, because those are

implemented at runtime, which makes sense. Now, if you only have a handful of containers, then you need a runtime like containerd, but you don't really need a whole lot else; you're okay managing those semi-manually. If you're running a lot of containers and running applications at scale on them, running anything in prod, then you're going to want something else, called a container orchestration system, on the right-hand side here, like Kubernetes, OpenShift, Docker Swarm, something like that. This lets you ensure that you have the right number of workloads running and that you're load balancing between these properly, and it can give you monitoring for your workloads, so you have a better idea of what's actually going on in your

environment. Container orchestration systems then use container runtimes to actually run those containers, for example in a pod in Kubernetes. Sometimes container orchestration systems also let you define additional controls that you'd like to implement, but those are still implemented at runtime. For what we're talking about today, we'll primarily be talking about how the privileged flag is implemented in containerd, and controls in Kubernetes. One important thing to add to that as well: you can think of it as going further down the stack, right? runc then talks to libcontainer, which is in the kernel itself, and then the kernel kind of does the container stuff below. But privileged by

itself, the flag, does not exist in the OCI spec. It is not a necessary part of containers. It was a convenience flag that is originally from Docker and has been inherited down into containerd for historical reasons, but to run a container in compliance with the spec, you don't need to have the privileged flag. Now back to the good security stuff. Containers are really just based on cgroups and namespaces. These are two Linux constructs that are used to isolate resources on the same host. Cgroups are resource limits that prevent any single process on your host from consuming too many resources, like memory and CPU. This is about preventing unnecessary or unlimited use of a

valuable resource on a particular host, so you can restrict one project from using another project's resources. Namespaces are a way to segment processes so that they're isolated from each other, for example for network resources or mounted file systems. This isn't security isolation; I don't think of this as a strong security boundary. It's more about restricting one project from accessing another project's information. And so when we talk about containers and protecting applications in a container, we also need to talk about capabilities. Linux capabilities are individual privileges that a process can use. These include anything and everything you might want your application to do, such as reading or writing audit logs,

bypassing file permission checks, or specifying configs for mandatory access controls, or MAC. An unprivileged container runs as, well, what's now considered kind of normal: first you verify that it has the appropriate restrictions and that those are met before allowing the process to run, and then you can ensure that it only runs with the given capabilities. However, a privileged process bypasses all permission checks, so it runs with a user ID, or UID, of 0, which is effectively root, or a superuser, and those privileged containers can perform actions with any capability. And if you think about this, why does that really make sense? Well, this came about historically: capabilities came after the concept of a superuser, so

before, you ran everything as a superuser, as root, and it was like, well, that doesn't really make sense; I want to restrict some of the capabilities, this doesn't actually need that many capabilities. So we started chunking up individual powerful actions to allow you to implement the principle of least privilege in your environment. Note that you can also still restrict some root user permissions, though. And so now that you have a rough understanding of capabilities, how do you actually grant and limit these? There are a couple of Linux security constructs here. The first is AppArmor, which is a Linux security module, or LSM, that lets you restrict your program's actions, things like file reads,

writes, and executions. It basically means that somebody can't run arbitrary commands in your environment that you might not want to run. SELinux, or Security-Enhanced Linux, is also a Linux security module that lets you restrict your program's actions, specifically around mandatory access controls. It's very similar to AppArmor, but instead of identifying files by their path, which is what AppArmor does, SELinux uses inode numbers. You don't need to use both; in fact you can't use both, because they use the same kernel interfaces. You can only use one of the two LSMs, so just pick whichever one is best for your environment, given your understanding of inodes and file

paths, et cetera. And then the third item listed here, seccomp, or secure computing mode, basically filters the set of syscalls that your application can run. It puts your application in a sort of one-way secure state, so that if your application tries to make a particular syscall that it's not allowed, like exit, then that process is automatically killed in your environment and can't execute. There's a pretty good default out there, the Docker seccomp default profile, that limits about 50 uncommon or potentially unsafe syscalls in your environment. Take a look at that if you're not familiar with what to do here, and use that to get started and iterate on it. Wonderful. So again, audience

participation: who recognizes these? Y'all know what? Konami code, all right, that's the easy one. What about the other ones? Doom, yes, excellent. What about the third one? You can, you can... Okay, no one else plays that. I'm still trying to find someone else who played Duke Nukem as a kid. But yes, those are the cheat codes. Each of those effectively, I mean, the Konami code did lots of things, but they all turn on god mode, right? So you can think of privileged as effectively god mode for containers. Alternatively, who knows Dan Walsh? He did a lot of talks around SELinux, was one of the tech leads on the project, and now also does a lot of talks on container security,

but he's partially known for stopdisablingselinux.com, which is a wonderful website that talks about using setenforce 0 versus setenforce 1, right? And privileged is effectively the setenforce 0 of the container world. Security teams may go to great lengths to invest in security controls, but as soon as you use --privileged, you've just thrown the baby out with the bathwater. So, taking those analogies back to what we were talking about before, using that privileged flag basically undoes all the good security work that you've just done using all the features that we just talked about, like AppArmor and SELinux and seccomp, et

cetera. It lets your processes run free with all of the capabilities. Privileged containers don't run with rules like AppArmor, and run as if they were a root user. Privileged containers give you access to everything, for example all the file mounts in your system. What's very dangerous is that allowing those privileged containers is just a single flag, so it's so easy to forget this when copy-pasting a command, or when you're typing it into your prod environment instead of your test environment, and that is extremely scary. So to scare you a little bit more, let's look at demos. Even more scary is whether the demo gods decide that I've made sufficient sacrifices and they want to work, so let's try that, and then

I know someone in the front row just said, what's he doing, he's changing HDMI cables in the middle of a presentation? Oh no. Did that just flash up for us? We got one. I can see it. Magic. Hey, there we are, wonderful. Okay, that's the end of the presentation, thank you for coming. No, a little bit of a code walk here. So this is the source code from containerd recently, and this is how the WithPrivileged flag in particular is implemented. So with each of these, we'll basically prettify it and then talk through each of those. WithAllCapabilities adds all the Linux capabilities, as Maya was talking about before. So this is commonly used

when someone tries to do a thing and then it doesn't work, right? Like, I tried to expose a port below 1024 and the container failed; oh, I tried to send some raw packets and the container failed. The intent of this section is that, as the audience here, you already probably care about security, so here are some alternatives that you can propose instead when you go back to your business. CAP_NET_BIND_SERVICE is a capability that you can add instead of CAP_SYS_ADMIN, the super privileged, god mode privilege. CAP_NET_RAW, yeah, it does a bunch of things, but hey, if that's not working for you, try CAP_NET_ADMIN; that's still less than CAP_SYS_ADMIN, which allows for full container breakouts.

There are a few others there as well: CAP_CHOWN, CAP_SYS_NICE. Typically, if your container needs to do something a little bit weird on the host, don't jump straight to CAP_SYS_ADMIN with --privileged; try one of these instead. Yeah, take a screenshot of this super stretched out thing to tell your developers what to do instead. That is intimidatingly large. Cool, so now for demo time. Let's try. Sweet. Can you all see that? Is that big enough? No, it actually isn't. So there we go. This is so satisfying. I highly recommend everyone submits a CFP next year, this is

good. Cool. So here we have... that's a bit ridiculous. So we're running a normal container, and here we're going to do some crazy container breakout magic. So we're trying to mount... oh, can't find mount, can't see that far. So we try and mount it: permission denied. And you'd say, well, I am root, right? So I was denied the capability of mounting file systems, and this is just one that we're really highlighting here. So let's try that again, and we'll do that with privileged and see what happens. So now we can actually see the files, so that's a device on the host. What's that? Oh, you can't see? Sorry, let me see

if I can do that. Is that better? Probably not. That was fine, yeah, but then you can see my personal financial documents below, I don't know if anyone else caught that, so maybe that's not so good. Let me open a new window and do that. And then, hey, so yeah, trying to mount a particular device here, now I can actually see that particular thing, and what I've mounted there is the host file system. So if I list out root, you can see the super sensitive passwords and the dad jokes that exist on the host file system that I don't want anyone to access. So there's your massive container breakout, but it's not really fair to

call it a container breakout, because everything preventing you from breaking out of a container has been disabled. This is just to highlight how easy it is, right? We mounted a file system. Now, one of the things that still exists with containers even in privileged mode is root filesystem isolation, and that isolation is trivial to bypass. You can do the same thing with namespaces associated with processes and all the other stuff as well, but this is the leet hax that we wanted to demonstrate, just how easy it was to mount the file system. Cool. So next up we have WithMaskedPaths. There are certain paths in Linux that are particularly sensitive, mainly under the proc file system. So

basically, under Linux everything is a file, and that's part of the philosophy that goes behind it, but they tend to put some sensitive information about system processes, including hardware configuration information, under a virtual file system called proc. So just to pick on one of these, for example, /proc/kcore allows for dumping of memory on the host, and that dump could then be passed into gdb or Volatility or other memory analysis frameworks that you can then use to possibly extract some values out of memory. We won't do the gdb side of stuff; we'll just demonstrate how we can do this next one. And cool, so normal container, try and hit proc, and we can

see here the file, or rather the size of it, is one, right? Nothing special, not particularly huge. Same thing again with the privileged container: now how big is it? 128 terabytes. So from within the container you can actually access the memory of the host, and then, yeah, you could dump that out and start doing shenanigans. Cool, so not great. Read-only paths: there are some particularly sensitive paths under the proc file path that are similar to masked paths, but some of these allow for greater system configuration if they're written to, so hence by default they're read-only. If you add privileged, however, this disables that. Probably the most interesting here is /proc/sys. The /proc/sys directory is slightly

different from others under proc, in that it not only provides information about the system but allows the administrator to immediately enable and disable kernel features. So in this demo we'll specifically demo /proc/sys/kernel/randomize_va_space, which is the configuration field for ASLR. Everyone's favorite, address space layout randomization, is an important memory protection feature which makes straight memory corruption bugs, 90s-style smashing the stack, impossible, because it randomizes the memory layout. It's still possible to have memory corruption issues, but significantly harder. And effectively the threat model in this case is you're running an application that has a known vulnerability in a container, say it has a buffer overflow; this makes it harder to get a shell out of that

particular vuln. So, starting with that. Cool, so here, running a normal container, putting zero in there, and it says read-only file system, denied. Okay, let's try that again, and we can see here, actually, cat /proc/sys/kernel/randomize_va_space says 2, which is enabled. So let's try that in privileged mode. Here, putting zero: success, no issues, it's zero. But then also let's exit out, and we'll check on the host as well. And here we can see cat /proc/sys/kernel/randomize_va_space on the host: it's now disabled for the entire system. So all ASLR is disabled because of one privileged container. Next up, writable sysfs. What happens if you thought of

it, what happens if you knew, hey, I know that there are some sensitive mounts, I'm going to read-only mount them instead of read/write? Well, conveniently, the privileged flag has thought of that, and will iterate through every single mount, take all the read-only ones, and make them read/write. Great, thanks. All right, next up we've got controlling cgroups. We mentioned cgroups; the particular thing here is resource utilization, and so the risk is effectively DoS, as we mentioned. But let's make a cgroup and see how that works. Cool. So, no, we've already done that one, maybe four. Cool, so here we've got a little test script, so this just says

hello BSidesSF, and so we'll run that and it says hello BSides, hello BSides. Thank you. And then here we'll make a particular cgroup. If you make a directory, that's it from the kernel's perspective. There are binaries that you can use to interface with this, but with the kernel, you just make your directory under /sys/fs/cgroup/memory and then the name that you want to call it. So we'll call this one big. Hello BSidesSF still running. And then if we go and just ls what was in that directory there, we can see, oh, there's a whole bunch of stuff. Yeah, it's created this; we didn't make any of these files, but

there are some interesting things here that we can play around with, so cgroup.procs and usage_in_bytes and so on. And so we'll see what the memory limit currently is, and that memory limit is over 9,000 terabytes. Over 9,000, yes, over 9,000, not scripted at all. And hello BSides. And so at the moment there's nothing in that cgroup, we just made it. If we look at the cgroup of the particular process, it's under this user slice, the default, but let's go and chuck it in that cgroup. So we echo that process ID into the memory cgroup's cgroup.procs, and then if we cat out, for example, we want to have a

look at, say, yeah, it's in there, great. And then also if we have a look at the same again, we can see now it's under this big cgroup. Wonderful. Big new cgroup, rather. Sweet. So memory usage in bytes: we can see it's now using four hundred and nine thousand bytes for that one particular process. Cool, so let's go and make another one; we'll make it small, and then we'll echo a limit of five thousand. Now, very astute observers in the audience will notice that five thousand is indeed less than four hundred and nine thousand. Very important. Cool, so limit_in_bytes, we'll set that up. So we've got a small cgroup.
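The cgroup steps in this part of the demo (make a directory, write a limit, echo a process ID into the group) can be sketched roughly as follows. This is an illustration, not what was typed on stage: it assumes a cgroup v1 memory hierarchy mounted at `/sys/fs/cgroup/memory`, and the helper names `create_cgroup`, `set_memory_limit`, `add_process`, and `read_value` are hypothetical.

```python
from pathlib import Path

# Assumption: cgroup v1 memory controller mounted here, as in the demo.
CGROUP_V1_MEMORY_ROOT = Path("/sys/fs/cgroup/memory")


def create_cgroup(name: str, root: Path = CGROUP_V1_MEMORY_ROOT) -> Path:
    """Create a cgroup by making a directory under the controller root.

    On a real cgroup filesystem the kernel populates the new directory
    with control files; nothing else needs to be created by hand.
    """
    path = root / name
    path.mkdir(exist_ok=True)
    return path


def set_memory_limit(cgroup: Path, limit_bytes: int) -> None:
    """Write a memory limit, like `echo 5000 > memory.limit_in_bytes`."""
    (cgroup / "memory.limit_in_bytes").write_text(f"{limit_bytes}\n")


def add_process(cgroup: Path, pid: int) -> None:
    """Move a process into the cgroup, like `echo PID > cgroup.procs`."""
    (cgroup / "cgroup.procs").write_text(f"{pid}\n")


def read_value(cgroup: Path, filename: str) -> int:
    """Read a single-integer control file such as memory.usage_in_bytes."""
    return int((cgroup / filename).read_text().strip())
```

Running this against the real hierarchy needs root on the host or a privileged container, which is the point of the demo. On hosts that use cgroup v2 the layout differs (the limit file is `memory.max`), so the paths above would need adjusting.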

We'll modify our... that's funky, there we go. For those who missed that, basically we just edited it from hello BSides to BSides sucks, which obviously we don't like. We'll trigger that one again, and then we'll echo that process ID into the small cgroup, and it doesn't run, and we can see here it's killed, right? So the small cgroup was memory constrained, basically. So that's how cgroups work; we just made one manually. And so yeah, the small one that used way too much memory was killed. Excellent. So some of the other ones here: the SELinux and AppArmor profiles, we'll kind of deal with

these both together, because they're both mandatory access control systems, effectively. This is a great security system. SELinux, the setenforce 0 we were talking about before: if it's enabled on the host, you have the capability to disable it, effectively, from within a container. So the demo for that is... that last one. Cool. So we're in a container here, we'll make a new file, and the thing we want to highlight here (you don't necessarily need to know how SELinux works) is that there's this container file label that's already applied to the file, since it's created in this particular container. If I try and change that label to something like super secret, if I could

type... yeah, it failed to change the context of new file to that particular thing, right? Can't do it. Okay, so let's go and now run the privileged container and do the same thing. And cool, so basically, as in our first step, we'll use the leet hax to break out of the container: we're going to mount the host file system again, because we want to access files on the host. So we'll go and do that, root, and then we can see here the labels: oh, this dad jokes text has a super-secret label, and effectively, in this hypothetical circumstance, say that denies read access to that file, right? As my process, I don't have the ability to look at it. So let's go

and change that to unconfined, which lets anyone read anything. Wonderful, great. And then I can see the label on that has successfully changed, so I can modify the SELinux label on the host from my container, and now I can read out the dad joke, which is ultimately what you're all here for, I'm sure. No reactions whatsoever, not even a groan. Okay, okay, wonderful. And then finally, seccomp unconfined. There's no particular demo for this, but in Docker by default, I think roughly fifty syscalls are blocked. It's actually a whitelisting approach, and so this removes that. For the full list, go look at the seccomp default profile.
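The whitelisting approach mentioned here can be illustrated with a small sketch that builds a profile in the JSON shape Docker accepts for custom seccomp profiles: a default action that denies, plus a rule that allows named syscalls. It is a simplification under stated assumptions. The helper names `build_allowlist_profile` and `is_allowed` are hypothetical, the three-syscall list is far too short for a real workload, and argument filters and architecture lists are ignored.

```python
import json


def build_allowlist_profile(allowed_syscalls):
    """Build a deny-by-default profile that allows only the named syscalls."""
    return {
        # Anything not explicitly allowed fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


def is_allowed(profile, syscall):
    """Return True if this simplified profile would allow the syscall."""
    for rule in profile["syscalls"]:
        if syscall in rule["names"]:
            return rule["action"] == "SCMP_ACT_ALLOW"
    return profile["defaultAction"] == "SCMP_ACT_ALLOW"


if __name__ == "__main__":
    # Illustrative only: a real profile needs many more syscalls.
    profile = build_allowlist_profile(["read", "write", "exit_group"])
    print(json.dumps(profile, indent=2))
    print("mount allowed?", is_allowed(profile, "mount"))
```

A profile file like this is passed to Docker with `--security-opt seccomp=<path>`. Running with `--privileged` skips seccomp filtering entirely, so `mount` and every other syscall goes through.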

So this is a security feature that is also disabled. Great, so we know what's scary about privileged containers; how do we actually prevent them from running? Keep in mind again that we're not talking about strict security isolation, like something you'd use to do malware analysis, but you still need multiple layers of isolation for your containers. You're going to want to have two sets of controls, so that if one layer fails, you're still protected, right? Defense in depth. And so, one of the things you can use: Kubernetes has several nested layers of isolation. There's a blog post on that, and I'm just going to skip through. You can also use, in Kubernetes, security context and pod

security policy, or, as an alternative to pod security policy, Open Policy Agent and Gatekeeper. Those are admission controllers that you can use to enforce what gets deployed. [inaudible] are talking on OPA tomorrow [inaudible], so come check it out. And then lastly, isolation in Kubernetes, using things like Kata Containers, gVisor, or other container sandboxes to actually restrict what's running in your environment. And I'm skipping this. And summary: so here's a summary slide. A bit of a lightning round through the technical details, but yeah, the idea is that the privileged flag lets your processes run free. You're going to want to restrict that. There are lots of different functions in your environment that

Frenchie went over, and then you can use various things in Kubernetes to actually provide those two layers of isolation. Check out some links. Yep, and then BSides will put the slides up later, with the links. Yeah, so you can take photos. If you've got some questions on Slido, please hit us up, or come say hi; Frenchie and I are here, somewhere down here. Sweet, thank you. [Applause]
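The summary's advice, to restrict the privileged flag in what gets deployed, can be sketched as a check over a Kubernetes pod manifest that has already been parsed into a dictionary. This is a minimal illustration of the kind of rule an admission controller enforces, not a replacement for one. The function name `privileged_containers` and the sample pod are hypothetical; the `securityContext.privileged` field is the one Kubernetes uses on containers.

```python
def privileged_containers(pod):
    """Return the names of containers in a pod manifest that set privileged: true."""
    spec = pod.get("spec") or {}
    flagged = []
    # Regular, init, and ephemeral containers can all carry a securityContext.
    for key in ("initContainers", "containers", "ephemeralContainers"):
        for container in spec.get(key) or []:
            security_context = container.get("securityContext") or {}
            if security_context.get("privileged") is True:
                flagged.append(container.get("name", "<unnamed>"))
    return flagged


if __name__ == "__main__":
    # Hypothetical manifest, as it would look after loading YAML or JSON.
    pod = {
        "metadata": {"name": "demo"},
        "spec": {
            "containers": [
                {"name": "app", "image": "example/app"},
                {
                    "name": "debug",
                    "image": "example/debug",
                    "securityContext": {"privileged": True},
                },
            ]
        },
    }
    print(privileged_containers(pod))
```

A fuller check would also look at added capabilities such as `SYS_ADMIN`, since the talk points out that a single broad capability can be nearly as dangerous as the flag itself.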