
Falco: runtime security analysis through syscalls

BSides Athens · 2020 · 20:02 · 218 views · Published 2020-06
About this talk
Falco is a runtime security tool that traces system calls from the Linux kernel to detect anomalous behavior in containerized and cloud environments. This talk explores Falco's architecture, its use of eBPF probes and kernel modules to capture syscall events, and how it enriches raw kernel data with container and Kubernetes metadata to enforce security policies. The presentation covers the detection-vs.-prevention security model, the unique challenges of high-volume syscall processing, and live demonstrations of Falco detecting malicious container activity.
Original YouTube description
Abstract: How to secure things by tracing signals from the Kernel up? Our daily job, as Software Engineers, is commonly to build software, a.k.a. abstractions. While doing so, we hide some complexity, but at the same time, we also increase the entropy and often the attack surface too. It turns out that to secure things we need to dig deeper into the abstraction layers, uncovering all their complexities that we carefully tried to avoid, putting those abstractions in place. For example, to securely run our applications on our Kubernetes clusters we first need to understand how all the Kubernetes layers interface with the Linux kernel. To understand it, we need to have full visibility from the kernel up. A way to have broad and deep visibility into our systems, when doing security analysis, is going to look directly what's happening into the Linux kernel. This is what Falco does. Falco provides runtime security using an eBPF probe or a kernel module as the driver, plus a ring buffer, to trace syscalls caused by userspace processes. In every Linux system, we have the syscalls interface to trace what user space processes are doing at the upper level and eventually take action. Anyway, this is easier said than done. Because tracing and processing every system call in userspace results in a very unique set of challenges. Join this talk to discover exactly what those challenges are and how Falco approaches them using eBPF or a kernel module!

Bio: Leo is an Open Source Software Engineer at Sysdig in the Office of the CTO, where he's in charge of the Open Source methodologies and projects. He's a core maintainer of Falco, a Cloud Native tool for runtime security incubated by the CNCF. He is also involved in the Linux Foundation's eBPF project (IO Visor) as a maintainer of the kubectl-trace project. He's also the creator of go-syslog, a blazingly fast Go parser for syslogs and transports, and of kubectl-dig, a tool about deep visibility into Kubernetes directly from the kubectl. He's also involved from the early days into the CNCF SIG-Security.

Security BSides Athens 2020: CyberSecurity | InfoSec | Ethical Hacking | Computer Security | Evolving Threats | Threat Landscape | Privacy | Cyber Resilience. Security BSides is a community-driven framework for building events by and for information security community members. These events are already happening in major cities all over the world! We are responsible for organizing an independent Security BSides-Approved event for Athens, Greece. More: https://www.bsidesath.gr Follow on Twitter: @BSidesAth
Transcript [en]

Hello everyone. During this session we're going to explore Falco and its unique approach to tracing things from the kernel up, to provide the last line of defense in a cloud security context. If this were an in-person conference I would ask the audience for a show of hands on who already knows what Falco is, but I'm not sure that is going to work in a virtual conference, so let's move on. Before we start, briefly, a bit of history. Falco was created as an open-source project from day one by Sysdig in 2016. It builds on sysdig, another open-source project from Sysdig, to match security threats against syscalls. About two years later it was donated to the

Cloud Native Computing Foundation, a branch of the Linux Foundation, becoming the first CNCF sandbox runtime security project. It gained some traction, to the point that the other maintainers and I had to start a weekly community call, which by the way I invite anyone to join, every Wednesday at 5 p.m. [time zone unclear]. And this January, finally, Falco became the first-ever runtime security project to be promoted to the CNCF incubation level. Anyway, my name is Leonardo Di Donato. I'm an open source software engineer at Sysdig, where my daily job is to code Falco, evolve it, and maintain it. In case you're wondering, yes, I'm one of its maintainers; that's where I spend all day.

On top of that, you can usually find me on Twitter, GitHub, and generally the web under the nickname leodido. Feel free to drop me a line, follow me, send direct messages, ask questions, no problem at all. So the plan for the next 20 minutes is to first describe the context and the environment Falco is made for. Then we'll talk about how it approaches and solves the detection problem for security anomalies. And finally we'll have some fun together watching how it's able to detect potentially malicious behaviors. So what does security mean to me? How do we characterize the security problem? I don't know about you, but I personally don't want anything happening on my systems without me even noticing it. I want to control

the things that can happen and the things that cannot. Since that kind of control is not always possible, I also want visibility into my systems, to be able to know as soon as possible what just happened. Basically, I think of security in terms of two words: prevention and detection. What do the two words have in common? Policies. Both concepts use some kind of policy to describe the allowed or disallowed behavior of a process in terms of system calls, their arguments, and the host resources accessed. The difference is that the first word, prevention, is connected to the concept of enforcement: do not allow some actions by some processes to happen at all, because of the policies. Tools in this category change the behavior of a

process by preventing system calls from succeeding or, in some cases, by killing the process trying to perform those actions. On the other side, the second approach to security is to use the policy to monitor the behavior of a process and notify when it steps outside the policy. Thus we can also think of these two aspects of security in terms of enforcement versus auditing. Some examples of enforcement tools are seccomp, seccomp-bpf, SELinux, and AppArmor, which I bet everyone here knows. Even authorization mechanisms like admission controllers and role-based access control for Kubernetes fit this category, while tools like auditd and Falco itself belong to the auditing side of the security topic, a topic that

especially in cloud native environments has not been solved yet. Can Falco solve all our concerns? No. Security is made of layers, and Falco is here to be one of them. You will have to combine it with other layers in many ways. The first example that comes to my mind is to use Falco to identify malicious attempts to access sensitive resources by observing the overall behavior of your environments, and then write enforcement policies that will prevent the episode from happening again in the future, basically implementing a sort of feedback loop that continuously improves the security posture of our environments. Since Falco runs mostly in user space, this may make it a somewhat softer target, but on the other side this also makes it able to have a much

richer set of information powering its policies, and with it the kind of policies that are very difficult to implement completely at the kernel level. On the other side, are enforcement and prevention tools enough? Let me ask you some more questions. How do you trust cloud providers and their ability to detect malicious or compromised insiders? How do you prevent an undisclosed vulnerability or zero-day that allows someone to break into your systems? I mean, CVEs still happen. Linux, Kubernetes, whatever project: name a project that has not experienced a CVE. Software without exploits does not exist. What I want to say is that prevention alone is not enough, and these two approaches to security are and must be,

in my opinion, complementary, not mutually exclusive. So, since it's clear that there's no such thing as perfectly safe and perfectly secure software, runtime security is basically the last line of defense. Think of it this way: I put locks on my doors (compliance rules), but if I don't use them (policy violations), or if someone breaks my window (anomalies, zero-days), I'm also glad I have an intruder alarm (Falco) that can tell me exactly what happened in detail. Prevention is about locking the doors; detection is about monitoring the inside. And Falco's unique way of doing this is by tracing and detecting everything happening inside your box from the bottom, the kernel, up, instead of using the usual top-down

approach. How is that possible? By tracing syscalls where they happen, in the kernel, and asserting rules against events containing the syscalls plus other context info like arguments, Kubernetes metadata, container metadata, and so on. Somebody could argue at this point: but why trace syscalls? Let's look at what we have today. Here on the slide I tried to draw a diagram to represent today's usual setup. Our production environments are filled with so many pods, services, containers, and tools. It's very complicated. And, well, it turns out that this complexity is the exact reason we should go look underneath. If you think about it, in the end, whatever program you run, it will end up making a lot of syscalls. This holds

regardless of the syscall wrapper it uses. This is because system calls are the way programs ask the kernel, where everything really happens, to perform some task. Whether the task regards networking, I/O, processes, and so on does not really matter. So instead of looking into every one of the hundreds of layers and abstractions we run our applications on top of, which is an even more valid argument with the whole cloud native story, the Falco approach to detecting security threats is to go to the lowest possible level and trace all the footprints and the context in which they happened, finally combining these signals with metadata from other layers of interest, like Kubernetes audit logs, container metadata, and so on. Now, to do

this, a set of unique challenges arises. Syscalls are basically the API that abstracts all the hardware for us. They are a very powerful mechanism, but not so much from a user-space perspective. To bring every syscall that happens to user space, for example, you have to do another syscall. And guess what you have to do to know the time at which an event is happening in the kernel? I mean, time is important in a security context. Well, guess what: another syscall. Except for the brave ones playing with RDTSC and things like that, but that's another story. Also, syscalls are a lot, and with every new kernel release they can change or gain a new parameter, new syscalls can be

introduced, and old ones deprecated. Things change, so it can be really, really painful. To complete the complexity of the picture, try to imagine a system able to keep up and process millions of syscalls per second in real time, plus their context and their arguments, because this is what happens on our machines: millions of syscalls per second. Also, we want to combine this huge flow of kernel events with data from other various tools that are very common in today's very complex setups, because we know that syscalls alone will not be enough to create software that deals with runtime security. For example, we want to connect the events we're detecting with the Kubernetes metadata

or also with data from the container runtime. But the Docker daemon, containerd, the API, that kind of thing is way slower than syscalls, right? So Falco fetches that info asynchronously and reconciles it later. This is all Falco does, in words. We are going to deep dive into it right now, but before that let me ask myself a last question: how do we get syscalls to user space? What approaches do we have? What solutions can we implement? The first option is to write a kernel module for that, right? Which comes with a lot of downsides, both on the development side (kernel panics) and on the deployment side. Imagine telling your customer that a new version of Falco has been released with a new Falco kernel module,

so please reinsert the module on all your 300 machines. I mean, it's painful. An alternative that we have is to implement an eBPF probe that has feature parity with the kernel module. I think that today almost everyone has heard about eBPF, right? I cannot spend 20 minutes talking about eBPF alone. I would love to, but I cannot. In case you want to know more about eBPF, I suggest the book from my friend Lorenzo, or go listen to my podcast for Google about eBPF. I put links at the end of the deck. Basically, the advantage of eBPF is that you are able to hook the

kernel to obtain the syscalls in a safer way, since its code runs in a dedicated virtual machine with an emphasis on security. Except when this virtual machine has security bugs, but this too is another story. So we have this concept of Falco drivers that produce syscall input events for Falco. In reality we have three drivers: the kernel module, an eBPF probe, and, recently introduced, a ptrace-based producer called pdig, which you can find in the falcosecurity organization. It is slower and very hacky, but useful in some environments like managed Kubernetes clusters or Fargate, places where you cannot install kernel modules or eBPF probes. In such cases you cannot use the high-performance in

kernel tracing mechanisms, but with pdig you can still run Falco and let it see the syscalls, thanks to some magic with assembly. Clearly there are also other ways to implement this, and I invite anyone who has other ideas to come to the Falco weekly community calls and propose them. Here there's a diagram describing how Falco works when using the Falco kernel module as the input driver. As you can see, there's a separation between kernel space and user space, at whose boundary Falco sits. The kernel module attaches to the start of execution of every syscall and also to the end of syscall execution. It grabs the arguments, and it augments the info with contextual

information coming from the operating system with a mechanism we call fillers, and it finally puts all this inside a ring buffer. Well, it actually stores a pointer to the memory containing this data in the ring buffer, thus effectively sharing data. But wait a minute, what's this ring-like thing in the middle of the slide? It's a ring buffer: an elegant circular data structure with a fixed size, eight megabytes in our case, that acts as a first-in, first-out queue. It's a very important piece of Falco's architecture, because having millions of syscalls to trace also puts a hard requirement on performance in order for Falco to be effective. So the ring buffer is the ideal choice: it does not require elements to be shuffled around when one

is consumed, and it gives us access to data in constant time. The ring buffer is then consumed in user space, through the devices, by libscap. In fact, when using Falco with the kernel module you'll probably notice a bunch of devices, one per CPU: /dev/falco0, /dev/falco1, and so on. Those files are the concrete implementation of the communication mechanism between kernel and user space that I just described. After this step, libsinsp enriches all this data coming from the kernel, executing callbacks to grab things like container metadata, you remember. Finally, the Falco engine asserts the current rule set, the one loaded by the user, against that data and eventually emits alerts as outputs. Then you have companies that want to use

Falco but cannot install the kernel module, and they ask you what to do. And we reply to them: well, we've got you, no worries, we developed the same functionality with this new kick-ass technology called eBPF. In this case, as the diagram shows, there's no ring buffer involved, because eBPF by design does not allow you to move memory from kernel space to user space, and for good reasons. I mean, eBPF's purpose of being safe would completely fail in that case. But luckily eBPF has a mechanism called BPF maps. So in this scenario libscap, the library that before was reading the devices, loads the ELF of an eBPF program into the

kernel's BPF virtual machine, where the BPF verifier makes sure you don't mess with the kernel, and executes it. In reality libscap also sends little eBPF programs to be executed every time the probe traces a syscall: these are the famous fillers we mentioned before, so that in an event you see the actual value of a syscall parameter and not some random hexadecimal values. This is what fillers do. At this point the data flows back to libscap and libsinsp thanks to those fancy BPF maps, and libsinsp again does its magic enrichment on the data. What remains to say about Falco is that it is just a continuous loop that reads those input events and then

outputs alerts about the security threats you declared with Falco rules when the engine finds a match. We will see now that the language also has advanced operators and filters provided by libsinsp to make things easier. This is an example rule that we are going to see in action now. The language, as you can see, is very simple: it's a YAML subset, and this makes it very easy to learn and be productive with, unlike the policy languages of other tools in the security field. Well, since it's YAML, it's very hard to indent correctly, but that's not the rules' fault. Let's see this rule in action. OK, so here we have the rule that was on the slide. We created a drift YAML rule set

because we don't want to use the whole Falco rule set, which contains a lot of rules. As you can see, this rule is using is_open_exec, which is one of the routines that libsinsp provides us. We check the type of syscall happening, we check that it is happening in a container, and other good things. The idea is that when someone runs a container and tries to create an executable file, Falco should alert us. So let's run Falco with this rule set. OK, Falco started. Let's now run a container.
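The check this rule performs can be modeled roughly in Python. This is a minimal sketch and not Falco's engine or rule syntax: the `Event` fields, the `drift_detected` and `format_alert` functions, and the use of `"host"` as the container ID for non-container events are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical event shape for illustration; real Falco events carry many more fields.
@dataclass
class Event:
    type: str            # syscall name, e.g. "openat"
    is_open_exec: bool   # True when the open creates a file with execute permission
    container_id: str    # "host" (assumed sentinel) when the event is not in a container
    cmdline: str         # command line of the process making the syscall
    filename: str        # path passed to the syscall

OPEN_SYSCALLS = {"open", "openat", "creat"}

def drift_detected(evt: Event) -> bool:
    """Return True when a new executable file is created inside a container."""
    return (
        evt.type in OPEN_SYSCALLS
        and evt.is_open_exec
        and evt.container_id != "host"
    )

def format_alert(evt: Event) -> str:
    """Build an alert line carrying the command and the file name."""
    return (
        "Drift detected: new executable created in a container "
        f"(command={evt.cmdline} filename={evt.filename})"
    )

events = [
    Event("openat", True, "host", "gcc main.c", "/tmp/a.out"),         # on the host: ignored
    Event("openat", False, "3f2a9c", "cat /etc/hosts", "/etc/hosts"),  # not an executable: ignored
    Event("openat", True, "3f2a9c", "gcc main.c", "/tmp/a.out"),       # matches the rule
]
alerts = [format_alert(e) for e in events if drift_detected(e)]
for line in alerts:
    print(line)
```

Only the third event satisfies all three conditions, which mirrors the demo: the syscall type, the executable-creation flag, and the container check must all hold before an alert is produced.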

OK, let's create a C file with the execution permission. The file is named blabla. Let's now compile it and execute it. And as you can see: "Error Drift Detected (open+create), new executable created in a container". The output also tells us the command that has been executed and the name of the file. So yeah, let's now come back to the slides, because we have no more time left. Bad things happen, right? For example, recently a CVE in Kubernetes has been found, and it affects a lot of Kubernetes versions. I have no more time to talk about this, but I put the slides here so that you can dig into them and

detect, before you upgrade Kubernetes, how to mitigate this CVE, or at least get alerts about it. There are rules to do it here. This is the slide with the resources to learn more about eBPF, the ring buffer, fillers, and the Falco rule set. I hope that everyone enjoyed this presentation. Thank you everyone, ciao.
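The ring buffer described in the talk (fixed size, first-in first-out, constant-time access, nothing shuffled when an element is consumed) can be sketched as a toy model in Python. This is not Falco's kernel implementation: the `RingBuffer` class is hypothetical, capacity is counted in slots rather than bytes, and dropping new events when the buffer is full is an assumed policy.

```python
class RingBuffer:
    """Fixed-capacity FIFO queue; a toy model, not Falco's kernel implementation."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0     # index of the oldest unread item
        self._size = 0     # number of unread items
        self.dropped = 0   # events rejected because the buffer was full

    def push(self, item) -> bool:
        """Producer side, O(1). Assumed policy: drop the event when full."""
        if self._size == self._capacity:
            self.dropped += 1
            return False
        tail = (self._head + self._size) % self._capacity
        self._slots[tail] = item
        self._size += 1
        return True

    def pop(self):
        """Consumer side, O(1). Only the head index moves; nothing is shuffled."""
        if self._size == 0:
            return None
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % self._capacity
        self._size -= 1
        return item


buf = RingBuffer(capacity=3)
for name in ["open", "read", "write", "close"]:
    buf.push(name)         # "close" is dropped: the buffer holds only 3 events
first = buf.pop()          # the oldest event comes out first
buf.push("execve")         # reuses the slot freed by pop(), wrapping around
drained = [buf.pop(), buf.pop(), buf.pop()]
```

Because the producer and the consumer only advance indices modulo the capacity, both operations stay constant-time no matter how many events flow through, which is the property the talk highlights for handling millions of syscalls per second.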