
Compliance meets SIEM automation

BSides Newcastle · 2020 · 33:38 · 47 views · Published 2020-11
About this talk
Bridging security operations and governance, this talk demonstrates how SIEM platforms like Splunk can automate compliance measurements and quantify security posture against standards like PCI. Through practical examples using event correlation and multi-source log analysis, the speaker shows how to track user actions across systems, automate provisioning decisions, and turn audit requirements into measurable, data-driven security outcomes.
Original YouTube description
SIEM automation is driving a lot of SOC roles but the end-users of products like Splunk tend to be engineers or "techies". Approaching it from a GRC perspective gives us the ability to "prove" compliance. Industry standards such as PCI give focus to traditional receptors for logging: security incident and response teams. However, when you try and quantify compliance - e.g. what percentage compliant are we - we get a measure we can use ahead of audits. Plus other cool stuff like event correlation for tracking a hacker compromising a machine. Captured using OBS: Open Broadcaster Software®️ obsproject.com Edited using OpenShot Video Editor | Free, Open, and Award-Winning ...www.openshot.org
Transcript [en]

Right, so: compliance meets SIEM automation. Essentially, the reason I wanted to do this talk is that I've been an application security specialist for a while now, but before that I used to be in security risk and compliance. During that time I was working with a lot of security guys who were very much about audits and PCI compliance, so I was there as the techie person. The one thing I noticed was that, whilst these guys are great at conducting audits and going through their various assessments, what would be even cooler is

if some of the results they were producing, whether that's compliant or non-compliant, were actually backed up by some of the logs. That kicked off a whole big thing where I was working a lot with Splunk, and a lot with SIEM and automation. So let me introduce myself first. As I already mentioned, I'm an application security specialist, but I'm moving jobs, so I'm going to be an application security specialist for a whole two more days before starting as a DevOps security architect for AI, which might be the coolest job title I could possibly dream of. Again, security risk and compliance: I did that for an entire year,

just essentially building dashboards and applications in Splunk. I got quite a way into it, producing various reports for CISOs and department heads, and again just loving it and finding it quite interesting, while also doing some of the remediation work on various applications and systems as a whole. There is a big focus at the moment on PCI compliance, where financial services are taking off, and with processing card payments there are obviously certain criteria that need to be fulfilled, so a lot of my job was doing that. In my free time I usually develop applications. I guess what started me off in tech

is web development, so JavaScript, Node, React. Recently I've also been dabbling in Java with Android apps, so I'm kind of a developer in my free time, which is quite fun. I have a special interest in quantum computing. I was applying for a role with IBM a while back, and, I don't know if you guys have heard, they've essentially got their own quantum computer. The cool thing is that they've got a tool called Qiskit, and they've got a tutorial on YouTube for how you can run simulations against this quantum computer, and I think if you pay some

amount of money you can actually run real simulations, so I thought that was quite cool. One of the other things I'm interested in is getting girls into tech, so I've been an instructor in web development and a speaker on one of the panels for Code First Girls. That was quite a bit of fun. So that's been my journey over the past few years. And now, why SIEM? As mentioned before, there's a shift towards cloud computing, big data and containerized platforms on various cloud platforms. I guess the big thing with that is

that all of this new tech is generating so much data, and I think there should be a bigger emphasis on actually understanding the type of data that's being generated and making the most of it. What I've found is that a lot of the time in industry people have the requirement to have audit logs, or just to have logs going into a SIEM engine, but what they don't know is that there might be some overlap between, say, IT and security, or data governance and security, in terms of the data that's being used. Or maybe one team did some analysis that

could prove useful for some other team. So that's why I'm doing this talk. I think it would be really helpful to get down to the lowest levels of the logs and build up from there: aggregating logs, obviously, things like that, but also putting them into the context of the wider business. So not just having, I don't know, security logs that are monitored by the SOC team, but actually seeing whether those security logs are useful for things like compliance, or finance, or HR, as sometimes they are. For example, I guess, if a user,

like user onboarding and deprovisioning: obviously there are system permissions that need to be revoked at the same time as HR gives notice, so emphasizing things like that. So what can we learn from log data? This is essentially a list of the different logs that I've come across, things I've found useful and have used in the past to do things like tracking down users, building meaningful data, and showing results for various remediation projects, to show that a system actually has improved in its security. The biggest thing for me is the Active Directory server logs. Obviously, when a user signs in

it generates a log event, and that's just that. What I started doing at one point was actually using the Active Directory groups that we had a list of. The idea is that you have a list of Active Directory groups and the users that belong to those groups, and then you also have logs generated when those users access certain systems. For example, let's say there is a database whose access is governed by an Active Directory group, but you look at the past 30 days and you find out that the user,

even though they're part of that Active Directory group, hasn't actually used that database in over 30 days. That's where things like Splunk come in useful: you can essentially generate a ticket off the back of that, using the two lists for comparison, and have anyone who hasn't accessed a particular system automatically deprovisioned. That's something that's quite easy to do, but in practice, done manually, it's just a huge headache for the provisioning team, so automating things like that helps. Then there are obviously OS system logs, for tracking down IPs and correlating IP addresses with host names.
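The group-versus-usage comparison described above can be sketched outside Splunk as a set difference. This is a minimal illustration in plain Python, not the speaker's query: the function name `stale_members`, the user names and the dates are all made up for the example, and the 30-day window is taken from the talk.

```python
from datetime import datetime, timedelta


def stale_members(group_members, access_events, now, max_idle_days=30):
    """Return group members with no access event inside the idle window.

    group_members: iterable of user names in the directory group.
    access_events: iterable of (user, timestamp) pairs from the access logs.
    """
    cutoff = now - timedelta(days=max_idle_days)
    recently_active = {user for user, ts in access_events if ts >= cutoff}
    return sorted(set(group_members) - recently_active)


# Hypothetical data: one list from the directory, one from the database logs.
now = datetime(2020, 11, 1)
group_members = ["alice", "bob", "carol"]
access_events = [
    ("alice", datetime(2020, 10, 25)),  # active inside the window
    ("bob", datetime(2020, 9, 1)),      # last access is older than 30 days
]

candidates = stale_members(group_members, access_events, now)
print(candidates)  # users to raise a deprovisioning ticket for
```

Users who never appear in the access log at all (here `carol`) fall out of the same comparison, which is usually the case an access review cares about most.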

And then one of the most useful ones, I think, is the security tool logs, because the tools do a lot of the analysis themselves. So with things like BigFix and CrowdStrike, they essentially analyze the endpoint and can tell you how it's configured, what kind of users are signing in, and whether or not it's a machine that's got a lot of vulnerabilities, for instance if you get that from Qualys. So you can use all of those different tools to make sure that the configurations on a particular machine are correct, for

example the version number, or whether or not it's got automatic patching. And in terms of, again, tracking down users and tracking down service owners, things like that, network logs are just so useful, because you can essentially type in an IP address and see what kind of traffic that IP address is getting, and then you can build out a whole map of your system that's, I guess, almost live if you run the query. And then obviously application logs, for things like: is that user actually supposed to be doing CRUD operations? Are they allowed to be deleting databases,

things like that? Why are they deleting databases if we didn't think they had permissions to delete databases? And then obviously DB logs. So, as you can probably tell, there are lots of different sources and lots of different streams of data. So if we actually look at Splunk: you've probably seen this before, but what I've essentially done is load in the tutorial data, which is quite easy to get hold of. It's just on the Splunk website; you download a literal zip file, and then what you've got is a store that's essentially a demo version of a real shop, with different

sources, with different users, including... OK, let's see.

Right, so, tutorial data. What have we got here? We've got [Music] sources: we've got the web access logs, we've got mail server logs, and sales data. Fine. Again, vendor sales. What I'm actually interested in for now is these things. So this is, again, the raw source. What you can do is look up the actual URL that users are accessing, and then it gives you things like the query they used, what browser, things like that. Pretty simple stuff, really. The shop is called Buttercup Games. So yeah, that's the data. Let's see, some of the actions here: someone's clearly done some

analysis into what the queries look like when someone actually does these certain actions. So let's try action equals purchase, just so we can do things like seeing which IP addresses have been making the most purchases. Cool. So that's single log events at the moment. Where it actually does get interesting is where you combine events to get relationships. This is the definition I pulled up online from some security tool. It says that event correlation takes data from either application logs or host logs and then analyzes the data to identify relationships. Application logs or host logs; I'd say just logs in general. I think the key bit is making sure

that you're viewing the events as part of a trend and as part of a pattern, rather than as individual events. So things like using timestamps, users and server IPs to trace a user's actions over time. I don't know if you guys are familiar with security forensics, but if a machine has been compromised, they obviously take a sample of the logs and see at which point the machine was compromised, by looking at which IP address has been known to go into a system and then seeing which account credentials it's been using. So for example, if an external IP has

been generating a lot of traffic and ended up getting a root password after generating tons and tons of traffic, you can probably tell it's a brute force attack, things like that. You can also trace who's been accessing a server to find product owners. That's something I've done in the past, again using network logs and security tools. I think the key bit for me was being able to find information in one log and then use that information in a different log, if that makes sense. So finding out, for example, that the hostname correlates to some IP address, and then finding that IP address via,

like, SMB or HTTP traffic in the network logs. So that's been really good. And then things like making sure that the security tools you guys are using are actually accurate, that they're scanning everything. Is there anything missing from the security tool? Why is it missing? And then also, is it doing its job well? At this point you'd be looking into how much CPU usage the scans are taking, and whether or not it's too much. Essentially, proving the worth of the security tool and seeing if there are any improvements to be made from looking at the logs. So let's look into doing some

event correlation on a user. Here you can see I did an example earlier, but we can use a different IP address for this. What I essentially want to do is track an IP address to see what kind of actions it's been taking over time. So let's do an external one. That one seems quite interesting, and that is the, I believe, client IP. Yep, let's do...

So let's take away the action equals purchase.
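Tracing one client IP across events, as set up above, amounts to filtering by that IP and ordering by timestamp. The following is a rough sketch in plain Python rather than a Splunk search. The `clientip` and `action` keys mirror the field names mentioned in the demo; the `addtocart` action value, the sample events and both function names are assumptions made for illustration.

```python
from datetime import datetime


def trace_client(events, client_ip):
    """Return (time, action) pairs for one client IP, oldest first."""
    hits = [e for e in events if e["clientip"] == client_ip]
    return [(e["time"], e["action"]) for e in sorted(hits, key=lambda e: e["time"])]


def purchase_without_cart(timeline):
    """True if a purchase appears before any add-to-cart action in the timeline."""
    seen_cart = False
    for _, action in timeline:
        if action == "addtocart":  # assumed action value
            seen_cart = True
        elif action == "purchase" and not seen_cart:
            return True
    return False


# Invented events; IPs come from the reserved documentation ranges.
events = [
    {"time": datetime(2020, 8, 23, 10, 0), "clientip": "203.0.113.7", "action": "view"},
    {"time": datetime(2020, 8, 23, 10, 1), "clientip": "198.51.100.9", "action": "purchase"},
    {"time": datetime(2020, 8, 23, 10, 5), "clientip": "203.0.113.7", "action": "purchase"},
    {"time": datetime(2020, 8, 23, 10, 2), "clientip": "203.0.113.7", "action": "view"},
]

timeline = trace_client(events, "203.0.113.7")
for when, action in timeline:
    print(when.isoformat(), action)
print("purchase without cart:", purchase_without_cart(timeline))
```

Treating the events as an ordered timeline, rather than as isolated rows, is what makes an oddity such as a purchase with no preceding add-to-cart visible.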

Right, so looking over it, let's see, what date is that? The 23rd of August. Let's look at what they've been doing on the 23rd of August; I think that might be when I downloaded this data. Right, so you can see that they've been browsing this product, went to the sports section and then made a purchase from that, and then they made another purchase. The weird bit is that they didn't actually add anything to the cart before they made a purchase, but fine. So, tracing a user: that's done. The difficulty comes when there isn't enough information in the log. In this example we can see: Received

disconnect from... let's see if I can make this bigger. Cool: Received disconnect from 10.3.10.46, disconnected by user. However, we can't actually trace this back, and that's quite significant if we're looking into doing any kind of compliance. Obviously, if we're looking to do an audit, we'd like to be able to know who's accessing the internal systems. So let's do that search by 10 dot anything and look at the internal accesses. Received disconnect from 10.0, so same thing. However, that doesn't tell us much; 'user' isn't very helpful by itself. So what we're going to do is,

if it works... Received disconnect, fine. [Music] Cool, so how are we going to get around this? Well, let's see, I need to get that query.

probably should have planned for this

Let's just extract some fields. So, tutorial data, looking for the 'password for' field, looking for the internal IP, and whether it's passed or failed, over all time. Cool. So here we can see... I'll just show you some of the fields we're pulling up. Essentially, these rex bits are classifying the password... what are they doing? They're essentially pulling out everything after the phrase 'password for' and classifying it as the user. So you can see password for nsharpe, password for myuan, fine. And then we've got the internal IP, and whether it's passed or failed, just by pulling that up. Cool. So how are we going to track down

that user? We're going to do stats values internal IP by user. So we've classified the fields, fine. The IP address we had earlier that we didn't know, from here, was 10.3.10.46, and we can now tell that djohnson has disconnected. If you wanted to add that logic in, all you'd have to do is save the search and have the map ready, or even build it into Splunk as an additional extracted field. Cool, so we can trace a user. The next bit from that is classifying, for every single user, how many failed and accepted passwords there are, so just taking it that one step further.
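The extraction and lookup steps just described (pull the user out of the text after "password for", pull out the internal IP, then map IPs to users so an anonymous disconnect line can be attributed) can be approximated in plain Python. This is a sketch under assumptions, not the query used in the demo: the log lines are invented in an sshd-like style, and the regular expressions and function names are illustrative only.

```python
import re
from collections import defaultdict

# Invented sshd-style lines for illustration; not copied from the tutorial data.
LOG_LINES = [
    "Aug 23 10:00:01 host1 sshd[101]: Failed password for djohnson from 10.3.10.46 port 4000 ssh2",
    "Aug 23 10:00:05 host1 sshd[101]: Accepted password for djohnson from 10.3.10.46 port 4000 ssh2",
    "Aug 23 10:01:00 host1 sshd[102]: Failed password for alice from 10.2.10.163 port 4001 ssh2",
    "Aug 23 10:02:00 host1 sshd[101]: Received disconnect from 10.3.10.46 11: disconnected by user",
]

# Everything after "password for" becomes the user; only 10.x.x.x addresses count as internal.
LOGIN_RE = re.compile(
    r"(?P<status>Failed|Accepted) password for (?P<user>\S+) "
    r"from (?P<internal_ip>10\.\d{1,3}\.\d{1,3}\.\d{1,3})"
)
DISCONNECT_RE = re.compile(r"Received disconnect from (?P<ip>\d{1,3}(?:\.\d{1,3}){3})")


def build_ip_to_users(lines):
    """Map each internal IP to the set of users seen logging in from it."""
    ip_to_users = defaultdict(set)
    for line in lines:
        match = LOGIN_RE.search(line)
        if match:
            ip_to_users[match.group("internal_ip")].add(match.group("user"))
    return ip_to_users


def who_disconnected(lines):
    """Attribute each disconnect line to the users known for its source IP."""
    ip_to_users = build_ip_to_users(lines)
    results = []
    for line in lines:
        match = DISCONNECT_RE.search(line)
        if match:
            ip = match.group("ip")
            results.append((ip, sorted(ip_to_users.get(ip, ()))))
    return results


print(who_disconnected(LOG_LINES))
```

An IP shared by several users would resolve to more than one name, so a mapping like this narrows the candidates rather than proving who acted.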

So for that we can do something like [Music] a count of success by user

and then, I believe... oh, user. So now we can see that djohnson, the guy who disconnected earlier, has had 121 failed passwords, although admittedly I think that's over a few days. Still a suspicious number, and we'll get to that in a little bit. But you can tell that for the majority of users it's mostly passwords accepted. However, for when there aren't any passwords accepted, or for alerting when there is a suspicious number of failed passwords, what we can do is, again, count success by user and success. Let's see what that produces. So djohnson has had 16 accepted passwords in one second, which is a little bit weird, but again,

tutorial data, why not. What we then do is add a field that says 'alert' if success is failed, so if the password has failed and it's had more than five failures, essentially, and then it will produce a field called alert saying 'alert'. There you go: djohnson has had seven failed alerts in that time. What you could do afterwards is something like sendemail. This is the command, with my email address; it will send an email to this address with whatever subject, whatever message, and the results as a PDF. So then if you have an ops team it's quite good to

have them on the lookout. Cool. So next we'll be going into the SIEM... being conscious of time, I might skip this bit. Essentially it's about having the NIST compliance bit, having the NIST standard, I guess, fulfilled by the fact that information system endpoints, which is essentially what entry and exit points are, are being scanned for malicious code protection mechanisms. So essentially having security tools on your endpoints. The way you do that is what I mentioned before, which is essentially compiling a list of your endpoints from the network logs, correlating it with other logs to make sure that

you have host names as well as IP addresses, and then again having another list from whatever security tool you're using and making sure that everything's being scanned the way it's meant to be. So, PCI DSS, which is the bit that I mentioned I was working on earlier. We're going to use a sample standard, because the logs essentially have a lot of login information, so that's the primary security event log that we have. So: limit repeated access attempts by locking out the user ID after not more than six attempts. Fine, makes sense. So if we do that, and I no longer have to write out my

Splunk queries: it's essentially looking at internal traffic, filtering out the user fields, the internal IP fields, and whether the password has been accepted or failed, and then counting, let's do that, counting the number of successes by user and by time. So how are we going to check that? If there are more than five failures, raise the alert. It's essentially building on what we did last time, so stats count should be familiar. Same as before, we've got some alerts where users have been failing passwords more than five times. So, as per the PCI requirement, what we're going to add to that is essentially a percentage.
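The threshold check described above (count failed passwords per user and flag anyone over five) looks roughly like this in plain Python. It is a sketch, not the SPL from the demo: the event tuples are invented, and `failed_login_alerts` and `max_failures` are hypothetical names. The talk alerts on more than five failures, while the PCI DSS wording is lockout after not more than six attempts, so the threshold is left as a parameter.

```python
from collections import Counter


def failed_login_alerts(events, max_failures=5):
    """Return {user: failure_count} for users with more than max_failures failures.

    events: iterable of (user, status) pairs, status being "Failed" or "Accepted".
    """
    failures = Counter(user for user, status in events if status == "Failed")
    return {user: n for user, n in failures.items() if n > max_failures}


# Invented events: djohnson fails seven times, alice twice.
events = (
    [("djohnson", "Failed")] * 7
    + [("djohnson", "Accepted")] * 2
    + [("alice", "Failed")] * 2
    + [("alice", "Accepted")]
)

alerts = failed_login_alerts(events)
print(alerts)
```

In practice the count would be bucketed by time window as well as by user, as the talk mentions, so that failures spread over several days are not treated like a burst.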

That's not good. Let's try that again.

Stats... let's see about a field. What did I do?

Let's do something like [Music]... it'd be the worst bit if this doesn't work, because... oh no, it's fine. So let's just remove that. What we have is essentially an aggregation of the number of times that the word 'alert' has come up in the alert field, and that's this bit, and we're counting that as the number of alerts. We're counting the total number of events as the total, and then dividing the number of alerts by the total, and then we should get a, hopefully... oh well, that didn't work for some reason. But anyway, the idea is: count the number of events that have had an alert and divide by the total

number of events, giving you a percentage for the PCI DSS compliance, where users have been limited by the six-attempts rule, although clearly it's a setting that hasn't been switched on for this demo application, which honestly I probably wouldn't expect anyway. Cool, so moving on. Conscious of time, I will be done soon. Let's do the next one now. The next one is, again, what we've been talking about before, which is the non-repudiation bit: essentially protecting systems against an individual falsely denying having performed an action. Fine. So, as we were looking at before, not having a user

be able to deny having disconnected from a certain server, for example. If the server's compromised, it's quite important to know that they're the ones getting the disconnect. So what I've done here is essentially, with the regex command from before, pulled up the user field, fine, and then we can see from this summary page that 75 percent of events have the user identified. Fine. This is a slightly lazy way of doing it, but we're counting the number of events with djohnson, myuan and nsharpe, and then counting the total number of events.
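Both compliance figures in this part of the talk are simple ratios: events carrying an alert over all events, and events with an identified user over all events. A minimal sketch in plain Python follows; the event dictionaries, the `user` and `alert` keys and the function names are assumptions chosen for the example, and the numbers are not those from the demo.

```python
def percent(part, total):
    """Percentage of part in total, rounded to 3 decimals; 0.0 for an empty total."""
    return 0.0 if total == 0 else round(100.0 * part / total, 3)


def identified_user_coverage(events):
    """Share of events where a user could be attributed (non-repudiation measure)."""
    identified = sum(1 for e in events if e.get("user"))
    return percent(identified, len(events))


def alert_rate(events):
    """Share of events that carry an alert flag (failed-login threshold measure)."""
    flagged = sum(1 for e in events if e.get("alert") == "alert")
    return percent(flagged, len(events))


# Invented events: three with a user attached, one anonymous, one flagged.
events = [
    {"user": "djohnson", "alert": "alert"},
    {"user": "djohnson"},
    {"user": "alice"},
    {"user": None},
]

coverage = identified_user_coverage(events)
rate = alert_rate(events)
print(f"user identified: {coverage}%  alerts: {rate}%")
```

A single percentage like this is what lets the result be tracked ahead of an audit, although what counts as the denominator (all events, or only authentication events) changes the figure and should be stated alongside it.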

So the number of events with users identified over the total number of events, which is somewhere in the 2000 range, should tell us that 75.137 percent of the events have the users identified, which is, yeah, it's good. Fine. Good web app. [Music] And then the last bit is for where you have a time chart. What you would do, and I'm glad I already have this up, is you have the source from the secure logons, you have the failed password events, and then what you essentially do is track it by name over time. Wow, it's slow. Anyway, the idea is

you get a pattern with the time chart and use the forecast time series model, so a Kalman filter, to be able to forecast. This is failed logons by the admin and root users, things like that. This is what we have at the moment, and this point is where the model actually generates the predicted failed passwords, which it thinks is going to be 172 later on that evening. So yeah, hope that was helpful. That's pretty much it from me. Yeah, thanks very much for that, that's pretty cool. I think there are some questions coming in on Slido; I think some other people might have

some other questions. So we have Anonymous, the infamous hacker Anonymous, who has asked, on the PCI DSS example that you showed: is there a reason why you limited it to internal IPs only? So, with the web app that we're looking at, the demo tutorial data, the purchases are by external users who aren't identified. They don't need to have an account in order to be able to make a purchase, which is a poor security decision, obviously, but if we did look at external IPs we wouldn't be able to track down the users. A lot of the time the work that I've been doing was tracking down internal users anyway, so yeah, that's why internal. Cool,

awesome. Does anyone else have any questions in the chat? No, I think we're all good.