Secure Containers: Do Reduction Strategies Fix Your Nightmares? - Michael Wager and Michael Helwig

BSides Munich · 32:02 · Published 2023-10

Mentioned in this talk

Tools used

Maven, Trivy

Platforms

Docker, Kubernetes

Concepts

distroless


Thank you for the intro. We're going to give the talk together. I'm doing a bit of an intro to the topic, and then Michael, the other Michael, will present most of the presentation and the deep dive into container security. Our talk is about whether component reduction strategies fix your container security nightmares, because no doubt, if you worked with containers before, you probably have some security nightmares around them, at least if you took a look at containers. Let's quickly start with an introduction. Michael, do you want to?

Yeah, sure. My name is Michael and I'm working at secureIO, consulting for DevSecOps and automation of security assurance. Before I went to security I was working as a software engineer for many years, and there I was also focusing on automation for quality assurance, so it makes sense, and it was nice for me, to switch to the security and AppSec area.

Okay, my name is also Michael, Michael H. I'm more on the side of strategic consulting around building secure software development life cycles in the DevSecOps area. I'm interested in all things security, and I founded this small consulting company whose logo you see at the top, secureIO. Okay, let's deep dive into it. First we're going to talk a bit about what the container security challenge is,

what challenges containers bring in terms of security if you use them. Then we will deep dive into one solution strategy for many of those challenges, which is component reduction methods. This is where Michael will take over and talk about the distroless concept. We will see a quick demo of how distroless can help you avoid certain exploitation possibilities. Then we will go a bit into the research in this area, what has been researched on the market, in papers and so on, and arrive at a quick conclusion. So what are the security challenges with containers? We

often come into companies, do a bit of consulting, and find out that they are using containers but do not really have a strategy for it. There are no real processes for the usage of containers. That means there is a lack of transparency into the vulnerabilities within those containers: there is no scanning, or there are some scans but no one looks at them. There are also no trusted repositories, which means there is no limit on the base images that companies use. You do not know where containers originate from or where base images are pulled from. No control, no limit, so a lot of security headaches. At the

same time, containers are everywhere. Even your cloud services may run on containers, and you get containers from third-party vendors as software deliveries for commercial off-the-shelf applications. So there are a lot of places in your organization where containers may already be in use without you really being aware of it. At the same time, containers bring a bit of a responsibility shift, because they lead to developers taking over a larger part of the product stack. They suddenly have OS libraries or OS parts that they are responsible for, and they are not used to that, let's say it like this. So they are not

necessarily aware of it. They build the application, they put it in a container, they ship it, and everything seems fine, but they are not necessarily aware of all the baggage, all the overhead that they have in that container. One common reaction that we unfortunately also experience when talking to developers is: "It's not our code." We do the application, we just put it in a container, but the rest is not our problem; we are not responsible for the vulnerabilities inside the container. But Ops is also not responsible, so who is in the end? You do have complex attack surfaces with containers. We will not deep dive into

this, but just thinking about it: you have the application layer, you have OS layers, you have potential configuration issues, network communication and container networks that might be interesting, and you have hypervisor attacks. So you do add a lot of complexity when using containers. Another thing is that the more different base images and the more containers you use, the more headaches you will have in the future, because security is something that does not stay constant. Security degrades over time. There will always be new vulnerabilities and new issues appearing, and the more diverse your container landscape is, the more headaches you will have in the future to keep all that under control. If you look at statistics

about vulnerabilities: this report is not from us, it's from a vendor called Sysdig, but we found it very interesting. You see that 87% of images have high or critical vulnerabilities, and 13% just have low, medium, or no vulnerabilities. So there is a large number of vulnerable images out there. I'm not 100% sure how many they scanned. I read in the report that the number was at least 25,000, but there are other numbers, so maybe check out the report yourself if you're interested. So 87% have high or critical vulnerabilities, and only a small number of those vulnerabilities is actually exploitable. They said something like 2% is actually

exploitable. Of course you don't know which ones those are, so what does it help you to know that it's only 2% if they are inside your containers? The other interesting statistic was that there actually is a patch available for many of those issues: 71% are patchable, the patch is just not applied in the base images, so the base images are not up to date. But only 15% of the affected components are actually in use by the application at runtime. That means you have a lot of stuff in those container images that is not used but that contains

vulnerabilities. So what do you do about it as a security team? It's quite straightforward. You look at the OWASP design principles, for example, and you find: the more software I have and the more complex my landscape is, the more vulnerabilities I will have, so I'm looking into minimizing my attack surface. This is one of the strategies that we also explored in our company to bring to our customers, to tell them what we can actually do to minimize the attack surface of containers and to make sure there is not so much overhead that the application does not need and that just introduces headaches for the security team. Okay, that's it, and now I'm handing

over to Michael to show us how that can be done.

Yeah, thanks, Michael. I want to start with the sentence "It's secure because it's running in a container." We get that sentence a lot. There still seems to be the misconception that if you throw something in a container, it is secure. Maybe it's kind of true, because processes are isolated from each other and you have the Linux kernel features that make up the concept of a container, but there is still something running inside the container. I really love that illustration of it. So, just as an intro, to tell you there are still a lot of things that can go wrong. And we also get

another sentence, from stakeholders especially, stating that teams are not allowed to select any base image they want. But I want to compare it with open-source components, and the image illustrates that. For example, maybe you heard about Log4j: there was a critical vulnerability in a Java open-source library for logging. In the end, hackers don't care which part they use to exploit your application, and that's why we compare base images with open-source components. So teams are allowed to select any base image they want, and that's why they are responsible for the security of their selected base images. This is why it's important that we scan them. I guess you

all know security scanners. Container scanners are tools that statically scan your images for known vulnerabilities. They also find things like secrets or credentials lying around in the container, and they are very easy to integrate into CI/CD pipelines. So you can easily set up a rule that developers are not allowed to merge a pull request if there are security findings, because everyone's talking about shifting left and automating security assurance. At this point, a little story from our experience consulting in a centralized product security team. More and more companies are moving to the cloud, we're talking about cloud native, and then they have 10 or 15 years old

legacy Java applications that should now get containerized. And you have management policies dictating that teams are not allowed to deploy to production if there are critical or high severity findings. Then imagine us coming and scanning these 10-year-old applications, and they have around 200 findings of critical and high severity alone. In our experience that is often the WTF moment for the teams, who come to us saying: "Security team, first, we want to deploy in two weeks, and second, like Michael said, it's not our responsibility and not our code." This is one of the reasons why we came to these component

reduction methods, as we call them. They are basically tools that minimize the components lying around in your container, with the goal of minimizing the attack surface and the noise of container scanners. There are four important tools available, which I'm going to present now. First we have Google distroless. It's an open-source project by Google, and they made the buzzword "distroless" famous back in 2017; I guess some of you have heard of it. They provide production-ready images for certain runtimes like Java, Node.js, Python, and Go, and these are very small compared to regular container images. The static image is only about 2 MB, which is also

a nice side advantage. Then we have Red Hat UBI Micro, based on the Red Hat Universal Base Images. They don't provide as many production-ready images, so you have to build them yourself for certain runtimes, but the nice thing is that you get the same Linux security hardening and response team as you get with any Red Hat Enterprise Linux. The funny thing is that if you Google for distroless, you will certainly find a blog post from Red Hat bashing Google distroless. You read it and you think, okay, maybe distroless is not the best, but if you read it until the end, it's just an advertisement for Red Hat UBI Micro. Then there is Ubuntu Chisel.

It's pretty new, from 2023, and highly inspired by Google distroless. It's an open-source project by Canonical, the organization behind Ubuntu. They also don't provide many production-ready images, so you have to build them yourself, but a nice thing here is that you have the Ubuntu policy behind them, making sure that all critical and high severity vulnerabilities are fixed within 24 hours. Finally we have Chainguard Images. Chainguard is a security vendor founded in 2021, and the interesting thing here is that they provide images for all runtimes, even less popular ones like Ruby or PHP. They also really have zero vulnerabilities for all the runtimes

they provide; I will present that in a second. But for now, a little live hacking demo showing you an example of a vulnerable application and how distroless can prevent these classes of attacks. Let me see if that works. I made a recording so I could blur out the IP addresses. Okay, what you see here (I guess it should be large enough) is a simple Node.js application. It starts a listener accepting HTTP requests on the root route. If there's no query parameter, it just sends the HTTP response "no query parameter provided", but if there is one, it will directly execute a system command

with the input of the query parameter. That is of course a very bad example, but there are more complex implementations that come down to exactly this issue, so this is just to demonstrate. It sends the standard output of the system command back as the HTTP response. I already built it with a simple Dockerfile. It's based on Node version 16 on Debian Bullseye, and it basically copies package.json over, copies over the source, installs the dependencies, exposes the port, and runs the application. So let us try to run it, as you can see at the bottom. Now it should be up, and I want to test it with curl, so let's just

fire a simple HTTP request without any query parameter. Okay, the application is up and running. As we know, it executes a system command, so let's try some shell commands like listing a directory. At that point I guess you already get the point: we get directory contents from inside the container from the outside. But I want to present a slightly more sophisticated example. There's a remote machine where I have a netcat listener on port 4445, and I now want to exploit the application by getting a reverse shell to that machine. As netcat is inside that Node 16 image, I'm lucky. So the payload will

be a little bit more complex. I blurred out the IP address, but basically it connects to the IP of my remote machine on port 4445, passing /bin/bash to get the reverse shell connection. If you fire that HTTP request, you will see that no response is coming back. Line 14 here in the code will now connect via netcat to the remote server. We get the connection on that machine, and now we can try it: it's working, and we have a reverse shell here. So, whoami: we are even root. You see, from the outside we got a shell connection inside the container, and it's running as root.

Maybe you know this from Docker: if you build an image, it unfortunately runs as root by default, and root inside the container has the same UID as root on your host, which is a real issue. That's why it's a best practice to always add explicit USER declarations to your Dockerfile, so you don't build as root. We can do a lot of enumeration here, and as we are root we can install stuff; I guess you get the point. Now we just want to show how distroless can prevent these kinds of attacks in a very simple way. Let's have a look at our Dockerfile. In Docker you have a

concept of multi-stage builds, and in the second stage, as you can see, we're just switching to the Google distroless base image for Node.js version 16 and basically copying everything over. Then we expose the same port and run it. First I need to remove the old container, and now I'm going to build it again. As you can see, it's already fetching from the Google container registry. The build is done, so let's run

it and test it with curl again. Let's test our directory listing: it's not working anymore. The application itself is running fine, which is quite important; it's just running with distroless. Now let's try our more complex payload again. It's not working: the HTTP response comes back directly and we don't get any connection on the remote server. Of course we want to see why that is, so let's quickly have a look at the logs. I just need to find the ID of the container, and now we can look at the logs. I had to blur out the IP, but at the top here you can

see that it throws an error saying there's no shell. So basically that's the solution. To wrap up the demo: if we had scanned this with Trivy, which is an open-source container image scanner, we would have 238 publicly known vulnerabilities for the old image, of which 221 are high and 17 critical. So imagine the team coming to us asking, "How should we fix that?", when management dictates that you are not allowed to deploy to production with that. After you switch to distroless, you only have two findings, both high. A pretty nice outcome, I would say. Okay, coming back from the demo: we also think that it's a good idea

to have scientific proof for what we say. With secureIO we have a long-running collaboration with the University of Applied Sciences in Augsburg, and lately we joined a small research project on container security. We wanted to look at three questions. First, we want to see if these component reduction methods really drastically reduce the noise of container scan findings. Second, we want to see if typical findings are actually exploitable, that is, their exploitation probability. And third, we want to see the implications of integrating these methods into the development life cycle. We are in the middle of the research, but I can present

a little bit of the results already. For example, we scanned around 20 images, all from the methods I showed you. As you can see, if you look at the Chainguard images, the distroless images, or the UBI Micro ones for Java, they all have zero findings, really zero. This is also what Chainguard, for example, advertises with. So this looks good for research question one. For the second, we found 563 CVEs in total. Besides the CVSS score, which is typically used for prioritizing the remediation of vulnerabilities rated critical and high,

there's also the EPSS from the FIRST organization, the Exploit Prediction Scoring System. We queried that API for all of the CVEs we got. What the score expresses is the probability of exploitation within the next 30 days, and they take many more sources into account, like Metasploit, Exploit-DB, the GitHub Advisory Database, and more, to calculate the score. There's a nice paper explaining it. We see only one CVE with a probability of 38% and one with around 9%, and all the others of the more than 500 are under 5%, which is pretty interesting. So, at least according to that score,

exploitation probability is pretty low for most of the findings. So I hope you feel like that after you deployed the distroless image. Coming to the advantages: we get minimal images containing only the runtime, the application, and its dependencies. No shells, no package managers, no other stuff, maybe no netcat binary, which your application doesn't need anyway. That means a reduced attack surface and fewer findings from container scanners. It also pretty much removes entire classes of attacks, like I showed in the demo. And of course the side advantage is that you get faster transfer times and a smaller storage size, which also leads to lower costs, which is also a nice business case

you can argue with when you want to integrate this. And we get faster build times, which is always a good thing regarding DevSecOps. Coming to the disadvantages, or rather challenges: we have complexity and compatibility issues. Let me give you a quick example. We recommended distroless for a Node.js application. The team tried it, got runtime exceptions, and came back to us saying, "Hey, security team, this is not working." Then we had a look at the application and saw that it depends on a Kafka library, which is in turn a wrapper for a C++ library, which has a dependency on a

compression shared object binary that lies around in a normal Linux distribution but was not present in the Google distroless image. After some fiddling around with ldd we got the dependencies, and we found out that you can just install the shared objects you need in the first stage and copy them over in the second stage, which seems to be the accepted solution. The thing is just that this is a little bit complex, and that's why you need a really good understanding of the underlying systems, kernel features, Linux distributions, and that kind of thing. But that's always a good idea, and we learned stuff. Then there's this debugging issue.

For example, you can imagine that sometimes you need a shell to debug stuff. With Google distroless you don't have a shell, but they provide debug images. So there is a simple solution: within your dev environment you build with the shell, and for production you have another pipeline which builds without a shell. And there is no support for certain runtimes, at least for Google distroless, so if you need PHP or Ruby or something, you should rather go for Chainguard. Coming to the conclusion, I just want to say one more time that teams are responsible for the selection of their base images, so this saying that it's

not our code is just not really working. And distroless methods do make your applications more secure, even scientifically proven, so dig into them and use them if you can. Some final recommendations: use Google distroless if you can, or Chainguard; those would be the first we would recommend you have a look at. Or even Alpine if you can, but that depends on the C library you need. Scan your images and fail your build; we always say that. And don't build your image as root, because in our scanners about 90% of the compliance check findings are just "don't build your image as root".
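The "scan your images and fail your build" recommendation could be wired into a pipeline roughly as follows. This is a minimal Python sketch, not something shown in the talk. It assumes the scan result was exported in the JSON layout that Trivy produces with `--format json` (a top-level `Results` list whose entries carry a `Vulnerabilities` list with a `Severity` field). The function names, the `BLOCKING` set, and the sample report are hypothetical and hand-written for illustration.

```python
import json
from collections import Counter

# Severities the management policy described in the talk blocks on.
BLOCKING = {"CRITICAL", "HIGH"}


def count_severities(report: dict) -> Counter:
    """Count findings per severity in a Trivy-style JSON report (assumed layout)."""
    counts = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" can be missing or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def gate(report: dict) -> int:
    """Return a process exit code: 1 if any blocking finding exists, else 0."""
    counts = count_severities(report)
    return 1 if any(counts[severity] for severity in BLOCKING) else 0


# Hypothetical sample report with placeholder IDs; not real scan output.
SAMPLE = json.loads("""
{"Results": [
  {"Target": "app (debian)", "Vulnerabilities": [
    {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
    {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"}]},
  {"Target": "node_modules", "Vulnerabilities": null}
]}
""")

if __name__ == "__main__":
    print(dict(count_severities(SAMPLE)), "exit code:", gate(SAMPLE))
```

In a pipeline, a script like this would load the report file and pass the result of `gate(...)` to `sys.exit`, so that a non-zero code fails the job. Trivy can also enforce this directly through its `--severity` and `--exit-code` options; the sketch only makes the decision logic explicit.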

And finally, if you want to integrate these concepts, we had good experience with creating a community approach around them. If you have multiple teams integrating this and you run into issues like the one I described with the Kafka library, it's a good idea that once someone solves it, the others should not have to solve it again, so share knowledge. At this point I'm at the end and accepting questions. Thanks so much for listening.
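One piece of knowledge such a community might share is a small helper for the shared-library problem from the Kafka example. The following is a rough Python sketch, assuming the usual `name => /path (address)` line layout of `ldd` output. The function name and the sample text are made up for illustration and do not come from the talk.

```python
from typing import List, Tuple


def shared_objects(ldd_output: str) -> Tuple[List[str], List[str]]:
    """Split ldd-style output into resolved library paths and missing library names."""
    resolved, missing = [], []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # skip entries without a mapping, e.g. linux-vdso or the loader
        name, _, target = line.strip().partition(" => ")
        if target.startswith("not found"):
            missing.append(name)
        elif target:
            # Drop the trailing load address, e.g. " (0x00007f...)".
            resolved.append(target.split(" (")[0])
    return resolved, missing


# Hypothetical sample in the usual ldd layout; library names are placeholders.
SAMPLE = """\
\tlinux-vdso.so.1 (0x00007ffd0a5f2000)
\tlibz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f1c2a400000)
\tlibexample.so.2 => not found
\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a800000)
"""

if __name__ == "__main__":
    found, not_found = shared_objects(SAMPLE)
    print("copy into final stage:", found)
    print("missing:", not_found)
```

The resolved paths are candidates for copying from the build stage into the final stage of a multi-stage Dockerfile, which is the approach described for the Kafka library. Names reported as missing point at packages that still need to be installed in the build stage.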

So we have our first question.

Thank you so much for the presentation. Clearly you're making it hard for hackers to live off the land; basically there's no land. But still, do you think it's practical? Some developers or companies are building a complex, distributed application that needs to be maintained, with added features and so on. With minimal images it's going to be super hard for them to add features, which leads to this being maybe impractical for some applications, right?

So how would you say that adding features would be a problem, or that distroless would block adding new features?

No, I mean, for example, they build an application and after a while they need to add some features, meaning some libraries, dependencies, packages. That means the maintainer of the image has to go and add this package manually every time, and that leads to people being busy maintaining the packages themselves.

Yeah, but you're talking more about features lying around in a Linux distribution. For the normal features and dependencies you add to a project, think about Java: in Java you have Maven and the pom.xml defining your external dependencies, right? So you add new features by coding and adding new dependencies to your pom.xml for Maven,

then you just build it in your pipeline using Google distroless as a base and deploy it. So I would say you don't need a new system dependency with every new feature. It could be possible, but I would say it's rarely the case; that was just a special case.

What I meant is that at some point the developers would start adding packages or libraries that lead to more attack surface, and they may give up in the end, like, "Let's add a shell because we want to debug something." So is this practical, do you think? Have you seen cases where, in the long run, people are

saying, "Right, this is perfect, this is a good solution"?

To be honest, developer motivation for integrating this is quite low in our experience, because of these complexity topics, like the Kafka case we showed. Okay, this Node.js application is running with it in production. If I look at it from a security point of view, it's practical, of course. But maybe that's always the issue: how do you bring these guys into using it?

Yeah, I would add to that. Normally, adding new features and modifying the application itself is not the issue, I think. You change things, you deploy new features all the

time, and ideally you have a CI/CD pipeline that can run multiple times a day to give you new features, patches, and so on. What might be a hurdle, or what we got as feedback, is that developers need to start building their own images, so they need to maintain the Dockerfiles and everything, and that might be a bit less convenient for them than using predefined images from Node.js or whatever. That was one of the pieces of feedback that we got. But on the other hand, if you build your own images and you know what source you're building from,

you have way better control of what's inside these images. There has been other research, not done by us, showing that even in Node images you might have a curl that is pre-compiled and loaded into the image, and as a security person you have no idea where it comes from. But if you start with your clean images, you have way better control of what's in there. So that is a bit of overhead, but it's recommended to do it that way anyhow.

Sorry, thank you, I was just trying to open the Pepsi but I couldn't make it.

Hey, thanks for the talk, very interesting. I have a question. You mentioned that one of the disadvantages is

debugging, that you're less flexible when debugging, for example on prod. Say something is not working, maybe the connection to some upstream server isn't working on prod but works on all your other stages; that could happen. What would you recommend to teams adopting or trying to adopt this approach, where you don't have a shell or common debugging tools installed? Is there a way to still debug on those production images, or do you need to switch to a different kind of debugging image deployed in your Kubernetes cluster and then try to reproduce the same issues that they have on prod in this container?

To be honest, we never had the case you just described, where you really have to get into the shell of your production image. We recommend distroless from a security perspective and did some research on it, but we don't have enough experience to give recommendations for all the use cases. I guess that would be more for the developers integrating it: you come to that point and you find a solution. But currently I don't have a solution. What one team is doing, of course, is capturing the logs, in an ELK stack or something, but you don't really

can. But the Chainguard images do have a shell. Mhm, yeah, at least you have the choice whether you want it or not. And if you are concerned about the class of attack I showed in the demo, you can also integrate cloud workload protection that prevents unwanted shell execution or similar behavior in production. That way you could debug your image.

Okay, we have time for, I guess, one last question.

Just a reflection in terms of implementation, because everything around orchestration and cloud computing has many hidden costs that

an engineer is probably not aware of: patching, secure configuration in Kubernetes, et cetera. They are people who are good at writing code, but they are not infrastructure people. If we propose a solution like this distroless concept, they only see the cost of implementation, but they don't see the hidden cost of vulnerabilities, because unless they are required by policy to fix the vulnerabilities before they go live, they don't see the cost. For them, implementing distroless is extra work that is not there if we don't ask, because they just download something from Docker Hub. So probably before trying to

make this kind of implementation, we need to have a conversation with the engineering team so they understand the pros and cons and what we are trying to solve in the first place. Otherwise, based on personal experience, they will not buy into the project.

Yeah, that's always a good point: you need a business case for security, in a way, to make sure it's worth your time. But if you're really struggling with a high number of vulnerabilities in images, like we did, then you have the developer pain anyhow, and then they are very grateful for any approach that can solve

it. And then in the long run: what do you do if you cannot go for something like distroless? Then you have to go into the details, and you have security people sit there and assess the individual vulnerabilities, at least the critical ones, look into them, and decide at some point: can we go live with this, is this something we can ignore, or is it something that's really critical? This is, I think, the hidden cost that you're also mentioning, because then you have people spending time on it, and you pay your security team, and those are usually expensive people, for looking

into it. If you can solve it in a more technical way, solve it once, implement a pipeline, and then have far fewer headaches and far less cost to spend on people to fix or assess it, then that's also some kind of business case you could construct there.

Okay, so we are at the end. I noticed there are still some questions; I guess Michael and Michael would be happy to take them in the breaks. So thank you very much for the interesting talk. Thank you, everyone.
