Detectors as Code - Building Better Detectors

Name: Detectors as Code - Building Better Detectors
Uploaded: 2019-11-11
Duration: 37 min 52 s
Description: Security BSides 2019 College of Charleston, SC November 9, 2019 @BSidesCHS Title: "Detectors as Code - Building Better Detectors" Speaker: Brandon Poole


Speakers

Brandon Poole

Tags

Category: Technical

Style: Talk

Mentioned in this talk

Tools used

Caldera, Carbon Black, QRadar, Snort, Suricata, Terraform

Platforms

Jenkins

Frameworks

Atomic Red Team


Transcript

All right, I guess we're about to get started here. Welcome, everybody. My name is Brandon Poole, and I'm going to be presenting Detectors as Code: Building Better Detectors. So, who am I? I work with Soteria. I am a detection engineer; that's my official title. Really and truly, though, I'm more like a person who does a lot; probably the biggest bullet point in my job description is "other tasks as necessary." I've been with Soteria for a while. My role is detection engineer in the DART, and what I do in the DART is it's my job to go through and build detections in

order to detect advanced adversaries. We do this by a number of means: signature-based rules, behavior-based rules, and we really try to build out a solid product. This is where this talk actually came from: some of the methodologies we've actually implemented. So, in the beginning, man created the code and the task, and the task was tedious and error-prone, and no one liked it. And then man said, "Let me automate it," and he automated it. This, really and truly, is the whole power of computing. If you think about it, this kind of happened with the way we do security. But first, let's talk a little bit about the history of InfoSec and

detection. Back in 1972 there was a paper published by James P. Anderson. He was a general in the Air Force, or a lieutenant in the Air Force, and the paper noted there were a growing number of security issues. They noticed more and more viruses and weaknesses in systems; the resiliency wasn't really there. So he documented this stuff and said that he really saw cybersecurity becoming the issue in the future. He followed up with a theory about automating intrusion detection in 1980 in another paper. That paper ultimately led to the first intrusion detection system being created in 1984 by Dorothy Denning and Peter Neumann. This isn't the IDS that we know today.

This was actually what they call an intrusion detection expert system. It was a precursor to the AI, deep learning, and everything that we hear now, like a lot of the marketing we see now. It didn't have a whole lot of potential, and so it ultimately kind of fell apart, but this was really the first intrusion detection system. What they tried to do is take what an analyst would look for, weight these things out, and try to mimic how a human would think about things in order to detect these malicious events. Like I said, it wasn't really all that successful. They continued to tinker with it and work out the technology.

Then in 1987 the first antivirus was released, and then we had a nice detection winter, I guess you could say, until 1999, when e-Security released the first SIEM. A few years later they were snapped up by Novell, and then in the mid-2000s the first EDR, Carbon Black, came onto the market. If you think about it, all these tools in a way required detections to be made, and these detections that we make are code, really, in and of themselves. Your EDR tool is like an interpreter of sorts, or a compiler: you write a rule for it, it interprets it, compiles it, and does what

it needs to do with it. It might not be a Turing-complete language. Same thing, really, with your IDS: you're writing some type of code. You can view your IDS as a compiler or interpreter; you're passing it code and it's executing some kind of functionality. So when you think about it, we've actually been doing this detectors-as-code stuff for a while. So yeah, I don't need to tell you; you've pretty much already been doing detectors as code. It's kind of surprising, right? Yep, thanks for coming to the talk. No, I'm just kidding. There are a lot of problems with our current detection strategy. Yeah, we're doing it as code, but the thing is

that we have a ton of issues still. These are actual quotes that I've heard during my time as a security person. "Hey Brandon, you know that rule I just made? It completely blew up the SIEM, like 20,000 alerts in five minutes. Probably want to clear that out." It's not really something you want to go through as an analyst, trying to clear out 20,000 alerts, especially when these types of rules often create a backlog effect: you kill the alert and you still get another 20,000 alerts over the next two or three minutes. Another thing that I've actually heard is: you guys noticed we haven't got any

IDS alerts since we upgraded the hobbyist client six months ago? Oh, should have caught that one. Or: you know the EDR rule that I created to detect this? We got pwned again. Yeah, I didn't really quite test that rule logic the way I should have, and by the way, the guys are still evidently in the network on about 20 or 30 assets. Not really a talk you want to have with your C-suite. And probably the most disappointing one is: you know that noisy rule I tried to tune? I broke it. You don't happen to remember what the old rule logic was, do you? I didn't write it down. It'd be nice to somehow or another get it back. I mean, really and truly,

the way that we implement detection as a whole is just completely and totally broken. We kind of realized that here at Soteria. When we went into it, we really wanted to solve this problem, and there are a couple of other organizations that have also thought about this problem. This talk really is about our solution. So let's really touch on what the problems are. Are your detectors reliable? If I upgrade an operating system, are you sure it's still going to work? Are you sure that some part of that EDR system's not going to break? Are you sure

that that IDS is still going to function the same way? Are those IDS logs and alerts still going to have the same fields you can parse? Are my detectors resilient? If I make a detector that looks for Mimikatz somewhere in there, and that's changed to Mimidogs, is it still going to fire? Do I provide all the coverage I need with my plethora of detections in order to make sure that my organization is secure? And, as we've kind of harped on, are my detectors bug-free? The answer to that question will always be no. That's essentially what false positives are; that's essentially what false negatives are. False positives overwhelm analysts. When they're overwhelmed, they kind of get into this routine, like, oh, that's

always been a false positive, and kind of ignore it. The false negative issue is you can't fix what you don't know. If someone's in there and it's a false negative and it never fires, that's essentially a bug. You want perfection, but unfortunately, like just about any other software product out there, there's no such thing as perfection. The solution to the bug you currently have in place is ultimately going to cause the next bug you have to fix. That being said, there's always an acceptable number of bugs that you can push into production and that you can really just handle. So really the ultimate question is: do your detectors work as expected?
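The false positive and false negative framing above can be made concrete with a small tally. What follows is a minimal Python sketch and not something shown in the talk: the rule, the sample command lines, and the names `Tally`, `naive_rule`, and `tally` are all hypothetical, and it assumes you already have a set of events labeled malicious or benign.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Tally:
    true_positives: int = 0
    false_positives: int = 0
    false_negatives: int = 0
    true_negatives: int = 0


def naive_rule(command_line: str) -> bool:
    """Hypothetical detector: fire on any command line mentioning 'mimikatz'."""
    return "mimikatz" in command_line.lower()


def tally(rule: Callable[[str], bool],
          labeled: Iterable[Tuple[str, bool]]) -> Tally:
    """Count how a rule behaves on (command_line, is_malicious) pairs."""
    result = Tally()
    for command_line, is_malicious in labeled:
        fired = rule(command_line)
        if fired and is_malicious:
            result.true_positives += 1
        elif fired:
            result.false_positives += 1   # noise an analyst has to clear
        elif is_malicious:
            result.false_negatives += 1   # the miss nobody knows about
        else:
            result.true_negatives += 1
    return result


# Hypothetical labeled events, invented for illustration only.
LABELED = [
    ("mimikatz.exe sekurlsa::logonpasswords", True),   # caught
    ("mimidogs.exe sekurlsa::logonpasswords", True),   # renamed binary, missed
    ("notepad.exe mimikatz_writeup.txt", False),       # benign, but fires
    ("ping.exe 10.0.0.1", False),                      # benign, quiet
]

summary = tally(naive_rule, LABELED)
print(summary)
```

With these four invented events the naive rule lands one event in each bucket, which is the point: a rule can look like it works while still producing both kinds of bug.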

Yep, you can see the panic on some people's faces. Not really, but it's kind of a scary thing; it's really kind of a dark thing if you think about it. The tools and technology, even if you're not writing your own detectors: if you're using a Palo Alto and you're using the IDS function or the IPS function built into it, do you know that it's doing what it's supposed to do? It's kind of scary. You're pretty much putting all your eggs in one basket. We've got a solution to that: DevOps. The reason it's the solution is because it's a really cool, popular buzzword and, to be honest, it actually works.

So we've talked about detectors as code, but what actually is detectors as code? Well, if you think about it, software developers have solved these problems, these bug problems, these resiliency problems: does my code work the way I expect it to work? Really and truly, there's no reason to reinvent the wheel. We can just take from their 30 or 40 years of practices; they've developed an actual lifecycle for this type of stuff. We can adopt it, we can apply it to our own code, if you will, our detectors, and use the same strategy. Now, DevOps is kind of like

one of these newer coding strategies, so it would probably be pretty helpful to break it down for those in the room who are unfamiliar. Really and truly, DevOps at its core is a set of software development practices where what you're trying to do is bring the dev part and the operations part together. The problem is there's usually a disconnect between the people developing the rules and the people maintaining the appliances. They don't get along, they point fingers, they don't work together tightly enough, and because of that there are communication problems, all these things that seep in. That's ultimately what causes a lot of the issues you see in

production. So the objective of DevOps is to merge those almost into one type of organization. They have very specified communication channels; developers are trained in operations, and operations folks understand development, so they both understand each side of the equation and they help each other out. It's not too novel if you think about it, but it's really been kind of earth-shattering in the way we do things. Also, a big function of DevOps is trying to minimize what we call WIP, work in progress. The idea is, if you as a company are paying ten million dollars for some software application to be developed, you don't

want to wait three or four years for that ten million dollars to be realized. You would much rather shorten that work-in-progress cycle and get something out that maybe isn't as feature-rich, but get it out in production so you have something, so you can start earning back some of that technical debt you've accrued, start making a dent, and justify this ten-million-dollar investment. DevOps really helps with this because, first of all, having the operations folks, the IT folks, and the developers on the same page, understanding stuff, communicating better, really understanding the technology stack, I mean, that's one thing that helps push that out. But there's a whole

different piece to that, and that's where the continuous integration and continuous deployment pipelines come in. The idea is that way back in the early '90s, even the mid-2000s, a lot of the testing that we do for software, to make sure that, you know, there's a user acceptance test, that there are going to be as few bugs as possible, that users aren't going to just rage quit, was all done manually. Eventually developers wised up and decided, we can automate these tests. A lot of these are very simple tests. We already automate unit testing, so we can probably automate user

acceptance testing, regression testing, all these things. Really and truly, this is how we came up with continuous integration and continuous deployment. Continuous integration is the idea of pushing pieces of code that actually function as quickly as possible, getting them out there so they can be used. Continuous deployment is the piece that's actually doing the testing. What happens is, as you deploy code, the code is automatically tested. If it passes the tests, it moves to the next stage, so from the dev environment to the QA environment. Maybe in the QA environment there are some different tests that are done; maybe you're testing for scalability. Now the code in QA that

passes those scalability tests, that stress test, then moves over to prod automatically. And really and truly, if you think about it (and we'll touch a little bit more on this), change control is already built into this tooling and technology. That's one of the reasons you can automate it: that change control process, which is usually a foundational piece, is actually built in. And it's very easy to revert bad code that you thought was acceptable, that maybe made it to production and that you didn't have a test for. It's really easy to move that back, especially with the use of source code management tools like GitHub or GitLab, because they actually have version

control built in. If something doesn't work, I click back, it reverts, it goes right back through that same CI/CD pipeline, and then boom, right back to the same state I was in before I pushed that bad code. So if you really think about it, our solution, or our definition of detectors as code, is really adopting this strategy: storing the detector logic in some type of source code repository like GitHub or GitLab or even Bitbucket, or really anything out there that most of your software developers use; building a CI/CD pipeline to run automated tests against this detector logic; and deploying the detector logic automatically if it passes all the tests, with no human interaction involved whatsoever.
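The three pieces just listed (logic kept as code, automated tests, deployment only on a pass) can be sketched as a simple gate. This is a minimal Python sketch under stated assumptions, not the speaker's pipeline: the two rules, the sample events, and the names `RuleCase`, `failing_cases`, and `pipeline` are hypothetical, and `deploy` stands in for whatever would push logic to a real platform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, str]
Rule = Callable[[Event], bool]


@dataclass
class RuleCase:
    description: str
    event: Event
    should_fire: bool


def naive_rule(event: Event) -> bool:
    """Hypothetical logic: mshta.exe launched against a URL, case-sensitive."""
    return (event.get("process_name") == "mshta.exe"
            and "http" in event.get("command_line", ""))


def hardened_rule(event: Event) -> bool:
    """Same idea, but normalizes case so a trivial change does not evade it."""
    name = event.get("process_name", "").lower()
    command_line = event.get("command_line", "").lower()
    return name == "mshta.exe" and "http" in command_line


def failing_cases(rule: Rule, cases: List[RuleCase]) -> List[str]:
    """Return the descriptions of the cases the rule gets wrong."""
    return [c.description for c in cases if rule(c.event) != c.should_fire]


def pipeline(rule: Rule, cases: List[RuleCase],
             deploy: Callable[[Rule], None]) -> bool:
    """Deploy only when every case passes; return whether it deployed."""
    if failing_cases(rule, cases):
        return False
    deploy(rule)
    return True


# Hypothetical test events, invented for illustration only.
CASES = [
    RuleCase("fires on mshta with a URL",
             {"process_name": "mshta.exe",
              "command_line": "mshta.exe http://example.test/a.hta"}, True),
    RuleCase("quiet on mshta with a local file",
             {"process_name": "mshta.exe",
              "command_line": "mshta.exe C:\\apps\\local.hta"}, False),
    RuleCase("fires on upper-case invocation",
             {"process_name": "MSHTA.EXE",
              "command_line": "MSHTA.EXE HTTP://EXAMPLE.TEST/A.HTA"}, True),
]

deployed: List[Rule] = []
naive_ok = pipeline(naive_rule, CASES, deployed.append)
hardened_ok = pipeline(hardened_rule, CASES, deployed.append)
print(naive_ok, hardened_ok, len(deployed))
```

In this sketch the case-sensitive rule is held back by the upper-case test and only the hardened rule reaches `deployed`. A real pipeline would run the tests against live sensors rather than in-process functions.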

But it is important to note that this is not DevSecOps. We're not going to touch on that. I'm not going to try to teach you a whole crash course on DevOps here. DevSecOps is the process of actually building security into these pipelines, so please don't confuse this with that. So, the benefits. We've already touched on them, but let's really hone in. Change control is built in as a core feature. I put it in code and I check it into the repository. For it to go to the staging or master branch, which is like

another place that the code is moved to, it has to be peer-reviewed. So my peer David, sitting up here in the front, looks it over and says, "Yeah, this looks pretty solid, it meets our standards, I'm going to approve it," and it gets approved. You can have two, three, four, however many steps of peer approval you want in there, and you can even break it up: at the front, at the end, throughout the whole process. As it goes through this, it's almost like your change control: it's audited right there in your source control management system. Version control is a core feature of these source control

management systems, so reverting back to old code, that whole "Oops, forgot the SIEM logic, you guys remember what it is?" Easy: just a button click, right back. We can also improve detector resiliency through automated testing. Frameworks like Atomic Red Team and CALDERA by MITRE, all this stuff you can actually build into this CI/CD pipeline and test with, and you know exactly if your rules are going to fire the way you expect them to fire. And the big thing is we're reducing the time to deploy detectors this way. I'm not having to wait to go through some long, rigorous change management thing where some SOC, in order to keep the SIEM from breaking,

it's like, "All right, well, we're going to deploy all the rules Wednesday at 9:00 a.m." If a threat comes in on Thursday and you write a rule, I don't think your threat actor is going to wait till next Wednesday to breach your company. If they do, that's really nice of them. But you don't have to worry about stuff like this anymore. You don't have to worry about going through a process that's going to slow you down, and that lack of visibility because the process of getting new stuff deployed is so slow. So let's talk about the process overview. This here is the process that we've developed,

and I'm going to touch on a couple of different things here and some customizations and tweaks you can do. As you can see, stage one is that new detectors are submitted to our source control management platform. But who is building these detectors? Well, primarily our detection engineers, but the thing is that our detection engineers don't necessarily know everything. So what we've done is use the issue tracker in the source control management system, which is usually used for tracking bugs. What we actually do is open it up to the whole company. So our detection engineers can say, hey, you know, I read this blog article, or working, you

know, this alert, I noticed that there was this other activity that didn't fire; we should make a detector for it. Our threat intelligence team: you know, we got some threat intel through some sharing programs, this threat actor's using this method, I don't know if we have coverage or not, but I'm going to put it in there and open up an issue. Now our reverse engineers, if you're lucky enough to have them: maybe they break down a new way that malware is executing, or maybe escalating privileges on the endpoint. They can put it in and say, hey, I noticed this new, unique way they're escalating privileges, and put it in there. Incident responders, I

mean, those are the guys, boots on the ground. If they see something and they're concerned that we don't have coverage, they can put in a request. Security analysts, really anyone. We even open it up to the C-suite. I don't know that the C-suite is going to put a detector in, but hey, if they have an idea, they're usually in the know, because it's their business to know about the rest of the company, and they can put an issue in. Our detection engineers take it and build detectors from those issues. Now the question is, how do you build this? Your SIEM's not going to use the same language as your IDS. Like, you

know, QRadar doesn't take the same rule logic that Suricata does, Suricata doesn't take the same rule logic that Bro does, and Bro's not going to take the same rule that Carbon Black would. So really and truly, what we've done is develop a YAML file. The YAML is really, really simple; we'll take a look at it here in a minute. What we're able to do is put the code logic for each platform in the YAML file and say what platform it's specified for. If you want, you're even able to build out an actual playbook for how to handle that in the YAML file, and you can actually use this YAML file as a playbook, kind of

like your one-stop shop for everything you need to know about this detector. And you can push this YAML file through your CI/CD pipeline. We use Jenkins, which is a traditional developer tool. Jenkins is very nice: it's open source, it's free, and there's plenty of training out there on Pluralsight or just about any training platform, and you can go buy books on it. There are other ones out there too. CircleCI is a cloud one that integrates natively with GitHub, and so is Travis CI. To be completely and totally honest, for a lot of the stuff that we're talking about doing, if you have a SOAR platform like Demisto or Phantom, they have connectors right into GitHub. You

don't have to go and try to figure out Jenkins and learn Jenkins and stuff like that. You can just use the plugins built right into Demisto, right into Phantom. You don't even need to pay for it; you can use the free version and do the same thing, because they have the API integrations into your SIEM to push new rules and update rules. So it's really that simple. You don't necessarily need to have a full-time dev on your staff to implement this kind of stuff. Now, one of the things you want is some environment to test this in. You can use Terraform to build up an

environment. Now, what's interesting, and this is the whole reason we jumped on this thought process, is that Terraform is infrastructure as code. This is something that everyone thought had to be a wrench-turning, hands-on process; there's no way you can automate it. And pretty much HashiCorp came out with Terraform and proved that you could actually just codify all your infrastructure and all your configurations, set it up on the fly, and tear it down. That was also one of the reasons we really thought this model would apply to our detection scope. So I can build an environment from scratch, test in it, push it through,

bada-bing, bada-boom. Once my environment's set up, I can use my own internal tests, I can use MITRE's CALDERA, Atomic Red Team, really and truly any of these frameworks that are designed to test your security, or AttackIQ if you want to go on the paid side. There are a lot of frameworks out there that you can use to test your detections. And then step six is that all this stuff gets fed back into Jenkins. If it passes the testing phase, Jenkins will go through and tear down that whole Terraform architecture, and it'll push it out based off the YAML

file. It'll push it out to your EDR sensor or whatever your network security platform is, so Suricata, Snort, Security Onion, Zeek (formerly known as Bro), or your SIEM, and you're pretty much up and running just like that. Yeah, I know it's a lot to take in, so we're going to break it down a little bit further. Phase one, we've touched on it: it's new detector submission. The way we decided to do it is, we are a very use-case-driven organization. We don't want to make detectors just for the sheer reason of having detectors. So when a C-suite person comes up and they're like, "You know, are we covered for this?" you're like,

oh yeah, sure, we have coverage for this, I'm sure, we've got like 20,000 rules. And then we come to find out it's the same 20,000 rules covering the same thing. So we build out use cases: what do we care about, what do we want to detect? The folks put in these issues, and what we do is ask them to put in a summary, which you can see here on the screen: summarize what I'm looking for. The threat model: why am I actually concerned about making this rule? We've got some additional details here. We like to map our stuff over to MITRE so we

can actually map out what the coverage is. "ART test exists": that's Atomic Red Team, so is there an Atomic Red Team test that I can utilize, or am I going to have to build my own test? Then there are the additional notes: if I want to learn more about this, what references did you use to come up with it? So I've built out this whole template, and the nice thing is, if you use something like GitHub or GitLab, you can build these templates in so they're applied automatically. Someone goes and clicks "new issue" and it's just a blank template: they have the summary, threat model, and additional details, and they've just got to fill it out.

Another thing is, over here on the right-hand side, you can see what we call labels. Labels are kind of like tags, and we can apply these tags to the issue. So I can apply: does this rule work for Windows, does it work for Mac? I can talk about the OS it applies to. Is it a network-layer type of detection, or is it more host-based or application-based? I can tag whether or not a test already exists for it. I can tag whether it's in the MITRE ATT&CK framework. A lot of people think the MITRE ATT&CK framework is comprehensive. It's not, really; it's a work in progress. It makes for

very, very, very easy organization and categorization. It really helps us break things down. If you have a customer coming in and saying, "You know, I'm in a Mac environment, are you really going to be able to help us?" I can pretty much use these issues and what's closed, because if the issue is closed, we've actually developed it and pushed it into prod. I can go through and say, yeah, we have, you know, 20 detections for Mac (which might be pretty on the low side, but I'm just throwing something out there), and they cover these seven different use cases and these MITRE techniques, and I can give you details

just based off this issue tracker. And the nice thing is that in most of these source control management systems you can group a lot of these issues together into what we call milestones. It gives you a way to manage this like a project. So I can actually go and say, this week I want to work on these twenty detectors; Joe's going to take these five, Jill's going to take these five, and you break it up and measure progress.
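The coverage answer described above (count what is closed, grouped by label) can be sketched in a few lines. This is a minimal Python illustration, not the speaker's tooling: the `Issue` record, the `os:` and `layer:` label prefixes, and the sample issues are all hypothetical, and a real setup would pull issues from the tracker's API instead of a hard-coded list.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Issue:
    title: str
    state: str          # "open" or "closed"
    labels: List[str]


def coverage_by_label(issues: List[Issue], prefix: str) -> Dict[str, int]:
    """Count closed issues per label value for one label family.

    Closed is treated as "developed and pushed to prod", as in the talk.
    """
    counts: Counter = Counter()
    for issue in issues:
        if issue.state != "closed":
            continue
        for label in issue.labels:
            if label.startswith(prefix):
                counts[label[len(prefix):]] += 1
    return dict(counts)


# Hypothetical issues, invented for illustration only.
ISSUES = [
    Issue("MSHTA executing a URL", "closed",
          ["os:windows", "layer:host", "test:exists"]),
    Issue("Suspicious launch agent", "closed",
          ["os:mac", "layer:host"]),
    Issue("Beaconing over DNS", "closed",
          ["os:windows", "os:mac", "layer:network"]),
    Issue("New privilege escalation trick", "open",
          ["os:mac", "layer:host"]),
]

print(coverage_by_label(ISSUES, "os:"))
print(coverage_by_label(ISSUES, "layer:"))
```

The open issue is ignored on purpose, so the numbers only reflect detectors that have actually shipped.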

Oh no, it locked up there.

Yeah, hang with me here just a second. So, is everybody in the room IT folks, technology folks?

Let's try this again. There we go. All right, so phase two: we're actually developing the detector. We're taking these issues and now we're actually trying to make something of them. This goes into that YAML format we were talking about. There are a couple of fields we usually require in the YAML, and that we recommend everyone use. One is a universally unique identifier. This makes it easy for us to count the number of detections we have, and it also makes sure that a detector is not going to overlap in name with a separate detection altogether. The other thing is, you know, UUIDs are great,

kind of like IP addresses, though: they're hard to memorize. If I say, "Alert 86 fired, do you know what that is?" I don't. So you really want to have a detector name, or some way to actually understand what's firing. This is our human-readable name; it's like our DNS name for these rules. We're popping that in there, and that's what we're pushing to the IDS or the SIEM, what we actually read. Then status. This is another thing: maybe you write this really good rule, or you think it's a really good rule at the time. You go through, you deploy it, and let's say

it's not as good as what you thought it was. Maybe you don't need it anymore because you realize that you already had overlapping detection. You can have a status field. I can have a status of active, so it's actually going to deploy to prod. Archived: if I realize it's no good anymore, I don't have to delete it, so I can still say, hey, we tried this rule at one point in time and it didn't work, or for some reason we've archived it. You can even do something like testing. Let's just say that you're just not into all this new-age DevOps stuff, this auto-deploy to prod, and you still want to keep

some of that old-school mentality there. You can put in a testing tag and deploy it to testing systems, or maybe you just want to push to some non-critical assets, like IDSes on some parts of the network you don't really care about, because you want to get a better feel for the performance impact. Really, the last one you need, and the essential thing, is the detection logic. What is the thing that's going to go into the SIEM that's actually going to compute and do the hard work? What is the actual IDS rule that we're going to push to Snort? You can just store all that in the YAML field too. Now,

there are plenty of other things you can put in these YAML fields. A great idea is playbooks. Once again, you know, you've got a URL reference in your Snort rule that says, hey, go to this URL for more information. Think about it: you put in the URL to that detector logic in GitHub, so when it fires, you click that link and it takes you to the GitHub page. You can see the rule logic, you can see who created it and when, the UUID. You can pop a playbook into that YAML file. It wouldn't necessarily get pushed to your IDS, but it's right there, so you can read the playbook, you can read about known false

positives and how to work this stuff. So really and truly, this becomes your knowledge base at this point, if you want to do it that way. Really, the world is your oyster. With YAML you can put whatever you want in there; it's almost like JSON, but it's a lot easier to read than JSON. Phase three is the automated testing. We're going to spin up that infrastructure with Terraform. Once again, if you don't want to be all new-age and you don't feel comfortable with Terraform, you can just use physical hardware or VMs that are persistent and deploy to them. In your YAML file you'll specify the tests to run against it, so Jenkins knows what kind of tests

to actually run against it. Deploy the rule to that EDR sensor on that box, or that IDS sensor on the segment of the network. Even for IDS stuff, you don't have to deploy it to an actual, real IDS sensor. You can deploy it to a VM you just spun up, use tcpreplay, have some packet captures you want to replay against it, replay them, and if it generates the alerts you expect, move on. Now, we touched on it before: there are a lot of these frameworks out there, and they're free. Atomic Red Team's a big one; there's a whole community around it. MITRE's CALDERA we

still have on there; they've actually just recently moved over to utilizing the Atomic Red Team conventions and methodology. Endgame's Red Team Automation. Uber's got one, but the name evades me here. There are also paid solutions: you've got ThreatCare, you've got AttackIQ, you've got all these solutions you can use with simple button clicks. And to be honest with you, I'd recommend using a combination of all these, plus maybe even building your own. Atomic Red Team's tests are built a certain way to test a certain technique; Endgame's Red Team Automation is built another way. So the thing is, your rule

could fire on the Atomic Red Team test and maybe wouldn't fire for the test that Endgame's Red Team Automation framework has. So maybe you can look at mixing and intermingling the two to get your coverage up. Really and truly, what it boils down to is you need good internal testing techniques. Does that detection logic actually detect the intended behavior? Does the detection logic detect any unintended behavior? Are you generating too many false positives? Like I said, you're always going to have false positives. The question is, is the number of false positives that you are creating acceptable for the number of staff that you have, and are you willing to accept the risk of minimizing

those false positives and possibly introducing a false negative? Are my detectors resilient enough to withstand common evasion techniques? By changing Mimikatz to Mimidogs, is it still going to work? I can build that into the tests really quickly, really easily. If I know I'm going to upgrade my IDS sensor, I can upgrade the IDS sensor with Terraform, deploy the code there, have packet captures of known bad, replay them, and see if the new version's going to break something. So I can determine whether or not I actually need to or want to upgrade those sensors, or if I need to wait a little bit and possibly reevaluate the way I'm doing my

detection, or maybe reach out and put in a bug request. Phase four: this is pretty much where the gold is, if you ask me. It's not just automated testing, it's automated deployment. Over here at the top you can see we have our YAML file that's stored in Git. We have some of the fields that we talked about. We have the UUID; our UUID for this one is leet. Oh, I thought I was going to get more laughs on that. The status is active. In this case we're looking for MSHTA executing on a URL, and you can see here our detector. We stated the

detection platform is the LimaCharlie EDR, and our detector code is just a dummy field at this point. That's going to go to Jenkins. Jenkins is going to go and check and say: is the detector active? Yep. Deploy to my Terraform infrastructure and run my tests. Did it pass all the tests? Yep. Deploy the detector logic to the detector platform, which in this case is LimaCharlie EDR. You're up and running; it's that simple. Now let's talk about tying this all together. Really and truly, if you take this image right here and compare it to the software development lifecycle, it's almost a perfect overlap. At this point we are identifying our issues, whether that

be a tuning request (it doesn't have to be a new detector; you can put an issue in for a tuning request), a coverage gap, or a new technique. I'm identifying this stuff, I'm writing it up, I'm doing my planning. Then, while I'm doing my planning, I'm actually going through and building it: I'm researching it, I'm developing the YAML, I'm determining what false positive and false negative rate I'm willing to accept for my testing. Now I'm pushing it off and performing my testing. The tests are all run against my new detectors, and we see if they do what we expect them to do. And really, like I touched on earlier, you can

do all kinds of testing. You can do your resiliency tests, you can push it to sensors and just stress test it to the core and see if it's going to affect how that endpoint EDR tool runs. There's so much you can do there. And then ultimately the last stage is to actually deploy to production. That's the most important part; if it's not in production, who really cares? The thing is that with this DevOps, detection-as-code model, we're pushing detectors straight to production in an automated fashion when they pass the tests and when the proper reviews have been done, and then we're able to kind of

monitor after it's out there, and we're able to measure that false positive and true positive rate and see if it behaves in production the way we expect it to. If not, that gives us ideas of things to go back and build new tests for, for this detector and others. And then ultimately, like I said, at some point you deploy a new technology or something like that, and it's probably going to create a lot of false positives. You've got to go back, retune it, and start the whole process over again.
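The detector file from phase two and the decision Jenkins makes in phase four can be sketched together. The talk stores these fields in YAML; this minimal Python sketch models the same fields as a dataclass so it runs with the standard library alone. The field names, the `TARGETS` routing table, the example UUID, and the `decide` function are assumptions for illustration, not the speaker's schema or pipeline code.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional

# Status values named in the talk; where each one routes is an assumption.
TARGETS: Dict[str, Optional[str]] = {
    "active": "production",
    "testing": "test sensors",
    "archived": None,
}


@dataclass
class Detector:
    detector_uuid: str
    name: str           # human-readable name, the "DNS name" for the rule
    status: str
    platform: str
    logic: str          # platform-specific rule text
    playbook: str = ""  # optional notes for analysts, never pushed to sensors

    def problems(self) -> List[str]:
        """Return a list of validation problems; empty means the file is usable."""
        found = []
        try:
            uuid.UUID(self.detector_uuid)
        except ValueError:
            found.append("uuid is not a valid UUID")
        if not self.name:
            found.append("name is empty")
        if self.status not in TARGETS:
            found.append(f"unknown status {self.status!r}")
        if not self.logic:
            found.append("logic is empty")
        return found


def decide(detector: Detector, tests_passed: bool) -> str:
    """Mirror the checks described for Jenkins: valid, active, tests passed."""
    issues = detector.problems()
    if issues:
        return "rejected: " + "; ".join(issues)
    target = TARGETS[detector.status]
    if target is None:
        return "skipped: detector is archived"
    if not tests_passed:
        return "blocked: tests failed"
    return f"deploy to {target} on {detector.platform}"


# Hypothetical detector record, loosely following the slide described above.
example = Detector(
    detector_uuid="12345678-1234-5678-1234-567812345678",
    name="MSHTA executing a URL",
    status="active",
    platform="LimaCharlie EDR",
    logic="placeholder rule text",
)

print(decide(example, tests_passed=True))
```

Keeping `archived` as a status instead of deleting the record matches the talk's point that a retired rule should still be visible in history.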

All right, so: questions, concerns, comments, any of the above?

All right, seeing none, I'll stick around if you have any or you want to talk. Another thing to note: we are hiring and looking for interns. So if you guys are out there looking for full-time jobs or internships, maybe next semester, let us know. Part of the reason we couldn't really do a demo today is because we sacrificed all the other interns to the demo gods in a presentation we did a while ago. So, like I said, we're always looking for new interns to sacrifice to the demo gods. But thank you for coming out.
