
Breaking NBAD and UEBA Detection

BSides NoVa · 2021 · 48:30 · 35 views · Published 2021-07
About this talk
Network Behavior Anomaly Detection (NBAD) and User and Entity Behavior Analytics (UEBA) systems are widely deployed to catch advanced threats, but their data collection and processing pipelines create exploitable attack vectors. This talk examines methods for poisoning detection data, evading alerts through baseline manipulation, and masking reconnaissance activity—with proof-of-concept code—then discusses hardening strategies and needed changes to detection standards.
Original YouTube description
Network Behavior Anomaly Detection (NBAD) and User and Entity Behavior Analytics (UEBA) are heralded as machine learning fueled messiahs for finding advanced attacks. The data collection and processing methodologies of these approaches create a series of new exploitable vectors that can allow attackers to navigate network and systems undetected. In this session, methods for poisoning data, transforming calculations and preventing alerts will be examined. Proof of concept Python code will be demonstrated and made available. Approaches to harden against these attacks will also be discussed as well as outlining needed changes in detection standards.
Transcript [en]

welcome everyone i'm excited to announce the next talk breaking nbad and ueba detection by charles herring before i hand the mic over to charles let me run through a few housekeeping items you can use the chat window to the right of the screen to ask questions throughout the talk and we'll take the last three to five minutes for a question and answer session maybe even earlier at that time we'll read through a few of the questions to the speaker we at bsides nova also encourage everyone to stop by the sponsors area which can be found in the expo channel on this platform if you visit a sponsor page they should have resources and people available to chat about job

opportunities there is an open invite happy hour for those with a bsides nova ticket at punch bowl in arlington tonight saturday from 6 to 9 p.m the address is 4238 wilson boulevard suite 1180 arlington virginia three if you really need the zip code now i'm handing it off to charles hey thanks appreciate it brian um feel free to drop any questions in the chat as we go along i did drop in a couple of links that i'll be using um throughout the talk feel free to you know bring those up if i get too dry for you but uh let's dig into it um oh so my last oh what just happened powerpoint just died that's a new thing

there we go all right so a little bit about me started my career in cyber security just after 9/11 at the naval postgraduate school i was detailed there to spin up the network security group i also worked with infoworld magazine in the test center testing security products up until the test center uh shut down in oa um left active duty in 2005 did some consulting work with dod and state department around sharing cyber security information how to operationalize it that type of thing i went to work in the commercial space with lancope which is a network behavior anomaly detection tool that i did some research with um back in uh at the postgraduate

school that company was acquired by cisco and i spun up uh witfoo in 2016 where we study what i call the seven unstable conversations of cyber security operations we're going to spend most of our time today on the first one which is investigators needing to understand what their data is telling them so we're going to break the talk down into a few different uh categories first we're going to go over sort of how nbad and ueba work in general we're going to talk through data poisoning how to prevent poisoning through non-repudiation and other protection mechanisms we'll go through the attack scenarios um and attack techniques to do the poisoning and then we'll roll into some q&a so um

before i go too far i'm gonna kick off a script uh here that we'll dig into later it's it's the script it's up on pastebin but it's gonna take about five minutes to run to generate some nonsense messages so i'll kick that off as we talk through some concepts so dr dorothy denning at the naval war college in 1986 long time ago now wrote a paper called intrusion detection expert systems and it defines the ways that we detect bad things happening how do we do that with cyber security and really while marketing terms have changed and products have changed and all that stuff the the basic science behind it's pretty much been the same since

dr denning wrote it but a signature-based detection is when we inspect an object a file a packet a communication there's some object that we're looking at and we notice there's something known bad about it a file with a specific hash a packet with specific markers those types of things you're looking at the object and so signature-based detection really cool we know what it is we have cves and other types of documentation associated with it we tend to know what to do with it how to block it how to remediate it how to protect against it so intrusion prevention systems work really well in that space antivirus or content filters are all technologies that live largely in signature detection
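As a rough illustration of the signature-based approach described above, the sketch below checks an object's hash against a list of known-bad hashes. The hash list and the sample payloads are made up for the example; real products draw on maintained threat feeds and match far more than whole-file hashes.

```python
import hashlib

# Hypothetical known-bad list, built here from a made-up payload so the
# example is self-contained. A real list would come from a threat feed.
KNOWN_BAD_SHA256 = {
    hashlib.sha256(b"example malicious payload").hexdigest(),
}


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the object being inspected."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(data: bytes, known_bad=KNOWN_BAD_SHA256) -> bool:
    """Signature check: is this exact object already known to be bad?"""
    return sha256_of(data) in known_bad


if __name__ == "__main__":
    for sample in (b"example malicious payload", b"harmless document"):
        print(sample, matches_signature(sample))
```

The check inspects only the object itself, which is why it is cheap and precise for known threats and blind to anything whose bytes have changed.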

behavioral detection is when we look at not the object but the subject so what does the computer do after it eats the bad thing um things like currently uh ransomware we start seeing cryptographic key um encryption that happens across a disk or a machine being used to send out spam or machine scanning the network so these known bad behaviors we might not know what the exploit was but we know the outcome is that there's a number of things that computers in general shouldn't be doing um they're nefarious or suspicious behaviors so uh technologies use sandbox malware sandboxing is a great way of you drop the malware in there you watch what the system does after it eats the file or

executes the file you detonate it and you see what happens if the behavior is a known bad behavior you know that the file or the packet or whatever triggered that bad behavior is also bad um network behavioral anomaly detection and ueba look for these bad behaviors uh nbad on the network ueba sort of at a higher level host-based intrusion prevention systems for monitoring um for known bad behaviors like reading the entire address book or sending spam scanning the disk doing encryption for extortion ransomware that type of thing and of course siems live in that space as well and anomaly detection is still looking at behaviors but it's not looking for known bad behaviors it's

looking at really whitelisting or baselining normal behavior and alerting when there's a deviation from normal so there's really not the implication that something bad is happening it's just a non-classified or abnormal behavior and so the way that these things line up is anomaly-based detection helps us create new behavioral checks we have some new anomaly like hey if we see these behaviors occurring this behavior is something to look for it's a new type of buffer overflow or a new type of scan or dgas or one of the domain generation algorithms or something that we've realized are uh generally associated with more modern malware so that was detected through anomaly detection because why are we communicating with geographically

um anomalous locations we never talked to russia now we're talking to russia or some other locale so anomaly-based detection creates new behavioral checks behavioral checks create new signatures and signature-based detection really good for known exploits very low cost of deployment high fidelity and blocking really good for known exploits 0-day zero-day exploits best caught with um behavioral so you might not know the payload but you know it's the same old thing they're trying to steal our money transfer data trying to extort uh our information and trying to bring down services so the behaviors tell us that something's going on even though we don't know the exploit and of course when you have humans involved there's really not a

payload and you're able to have time and work at the speed of humans instead of the speed of computers credential abuse whether it's you know the credentials are stolen or the person is compromised it's an insider threat anomaly detection tends to be really the only way to catch those types of things so when we talk about baselines it's a few things to understand if you're doing anomaly detection you have to have a baseline you have to establish normal the purpose of the baseline is so one thing to look at is do we baseline each entity do we have a baseline per computer per user per relationship computer to computer computer user

or do we look at it at a set level so wireless machines to email servers you know email servers to the dns servers and so on you start baselining these sets these members of entities it takes time to build baselines and not all time is the same humans interact with machines and we tend to sleep we tend to go to the bar we tend to go on vacation we have different buying cycles prime day is coming up you know so we've got all these different things and the way that humans interact with each other that are not consistent so that means if you're building a baseline you need to build baselines that take that into account generally means it

takes a long time to make a good baseline um so what are you baselining you have to have a metric or group of metrics that you're going to uh to build around so supervised machine learning we just pick things that we're going to baseline how many bytes are consumed by this machine or this group of machines how many bytes are exhausted or sent out of the network from these machines what time of day is it used all these different things you basically pick a certain number of variables you'll assign them to sets and then you just build those baselines over time by monitoring the telemetry coming in on it unsupervised machine learning would just

be building baselines upon every possible variable and every combination of variables so you might say the type of services from these types of machines on these types of days with these protocols or whatever your whatever you want to build and you look for building sort of these nebulous or permutations and permeations of baselines across every possible variable that would be unsupervised machine learning for the purpose of building baselines uh network behavioral anomaly detection really came around uh 2002 or so it's when we started doing that as people and it works off of meta so a really cool thing about nbad is you don't need to have a ton of data to make it work uh not not

volumetrically a ton netflow is you know layer four and below type of data it's telling you who talked to who how many bytes were exchanged what were the tcp flags that were involved in those conversations and it's just a record of a network conversation so you can monitor what's going in and out the network but also more importantly you can baseline what's happening internally inside of a segment even with some of the newer technologies or segment to segment across different locations or different areas inside of a network because baselines are built off of numbers uh those the metrics the packets the bytes the connection counts the relationships um virtually all that data is in netflow
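To make the idea of building a baseline from flow metadata concrete, here is a minimal sketch under stated assumptions. The `FlowRecord` fields are a simplified, hypothetical subset of what a NetFlow or IPFIX export carries, and the standard-deviation threshold is only one of many ways to define a deviation from normal; it is not the method of any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class FlowRecord:
    """Simplified, hypothetical flow record: who talked to whom, and how much."""
    day: str          # e.g. "2021-07-01"
    src: str          # source IP address
    dst: str          # destination IP address
    dst_port: int
    byte_count: int
    tcp_flags: str    # e.g. "SA"


def bytes_per_host_per_day(flows):
    """Sum the bytes sent by each source host on each day."""
    totals = defaultdict(lambda: defaultdict(int))
    for flow in flows:
        totals[flow.src][flow.day] += flow.byte_count
    return totals


def is_anomalous(history, today, threshold=3.0):
    """Flag today's value if it sits more than `threshold` standard
    deviations away from the mean of the historical values."""
    if len(history) < 2:
        return False  # not enough history to call anything abnormal
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold


if __name__ == "__main__":
    history_mb = [500, 520, 480, 510, 490]  # made-up daily totals in megabytes
    print(is_anomalous(history_mb, 505))
    print(is_anomalous(history_mb, 2_000_000))
```

Because the baseline is computed from whatever records arrive, anything that can inject or inflate those records can also shift what counts as normal.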

and when i say netflow it can be netflow v5 v7 v9 it could be ipfix jflow cflowd there's a lot of different flavors generally coming off of routers and switches generating that or firewalls also do it visibility fabric from manufacturers like gigamon are shaping traffic and creating great application metadata from those records and that can be shipped in either ipfix or in syslog using generally common event formats um so it's just meta just counting things that are going on and are able to deal with that and generally because nbad's built on netflow it's sometimes grouped into netflow tools list of um common things that do netflow so when you're looking for building

baselines for the common ones are service traffic threshold anomalies so you look at okay from wireless to our email servers how much email traffic happens how much ssh traffic happens how much web traffic happens or such a thing so you build baselines on uh groups sets to sets and the types of services that they are using and how much how many bytes how many connections how many packets are exchanged and you build baselines on that and so if there's a deviation from from that you can alarm on it service type anomaly is pretty much the same thing if a host is on the network and it only consumes uh web and email services and all of a

sudden it decides it's going to get into tor or something else that's a service that's a net new service anomaly you inventory all of the services geographic traffic so if you only if you are doing all your business with north america all the traffic or large portions of traffic's going to north america then all of a sudden you're seeing it going to different geographic locations that's a geographic anomaly that uh what's baselined geographically is changing uh data hoarding or data staging is a really important one in data exfiltration so you count how many bytes each individual entity is collecting from internal sources how much data is being moved from the servers to this particular client

or group of clients and so that you can detect data staging so if it only consumes you know 500 megs a day and then on a new day we get um you know two terabytes of data collected from the network that's a huge anomaly which tends to indicate normally somebody's getting ready to quit they're taking some taking some records with them um data disclosure is very similar it's a client or group of clients that normally send again 500 megs out of the network to external sources and today it's much higher it's volumetrically higher than that so you can alert on data disclosure or data exfiltration in those cases didn't list it here we can also look at

beacons if you start seeing regular connections that are occurring they tend to indicate a c2 command and control it can also be detected that way so with uh in cyber security operations when we first started doing it those of us have been doing it for a while we start we started at the signal level we'd get an alert an alarm an event that would come in and we had an alarm table and we'd try to work our way through that never ending alarm table and uh investigated that level that was obviously too much that was not sustainable didn't have enough context way too much volume then we started moving to a computer-based or host-based investigation so we would

map all the alerts to the hosts that they're alerting against you'd have things like most alarming hosts and you would investigate the host and look at the alerts that are associated with the host which gave you some level of consolidation uh sort of the next layer we went to in detection was uh mapping the user which users are associated with which with which machines and which events are mapped to those machines and so you have sort of these relationships to bring human credentials into play and you're able to do user-based investigations and so when we talk about user and entity behavior analytics we're talking about not really looking at the alarm level we're looking at the characteristics of

the object or as we would call it today graph theory right we're looking at the graph relationship the nodes and the edges how they interact with each other and how things change and we do alert on known bad behaviors and anomalous behaviors this you can pull up there on things that currently do this some common user anomalies are magic carpets really common one that if a user logs in in cleveland and a few minutes later logs in from tehran it's not physically possible given the current state of uh transportation for him to get from cleveland to tehran that fast that's a magic carpet login and so you look at that type of thing you're maintaining

state on user cleveland user um somewhere else and then when those things when you look at those things together those entities compared to each other tell you you have a magic carpet type of attack the other thing to look at is baselining when do people log in if someone always logs in from you know nine to five then all of a sudden we see a 2 am log in there's a reason for concern that's a baseline against tod or time of day anomaly host access if it only access if the user normally logs in from his workstation that's the only place that he or she logs in from and all of a sudden now there's a direct login to

a critical server that's a host anomaly that that credential is not normally associated with that host same thing with data access if a user's only hit an email and um the hr application and now all of a sudden the user is accessing data like source code that they would normally not do that would be a data access anomaly same things through a service if someone's never used no one if a sales guy all of a sudden decides that they're going to start using ssh or something along those lines like that doesn't seem like something that guy would do that that person would do so that's all the background on how that stuff works let's talk about how to

break it so um a few things i want to talk about the types of data poisoning we have is what i call mass implication which is my favorite one i'll demo that to you but essentially what that means is i'm gonna do something bad but i'm gonna make it look like everybody else is doing it and so when the investigators are trying to find me they're gonna have to go through all these ghosts all these fake attackers that really weren't attacking to find me um second thing we're looking at is baseline boiling since um anomaly-based detection is based on baselines what if we just break the baselines what if we make the baseline so big that they can't be violated i'll

talk about how to do that also look at attack masking so how to do an attack and then make it look like you didn't do the attack a couple of methods we're looking at is log spoofing log spoofing is when we generate a lie a believable lie for something to consume and uh and process and make decisions based upon the lie the second is where we're going to create a lie we're going to create a an artificial behavior and have something else generate the lie the network switch or something else we're going to look at doing that inside of a machine vm container and also getting network devices such as switches access switches to do the lying for us

so this is what it looks like so if this is the attack i'm doing if i'm scanning the network i've got a command and control channel back here data staging data exfiltration here's the malware if this is what's going on and for poisoning data we want to one of two outcomes if this is what really happened we either want to make it all invisible that's masking and so there's nothing to investigate or we want to do mass implications so where you had clarity on what was happening here now it looks like everybody's guilty of everything all the time and it makes it virtually impossible uh to do an investigation so these are the two outcomes of how

what we're going to do to poison the data um i do want to talk about a little bit of protection i'll end on this note as well later just recap it really important authenticating your sources is critical you know for a long time we've treated cyber security operations like an extension of i.t it really has nothing to do with i.t it has everything to do with law enforcement we're really extending uh investigations that are about detecting crime responding to crime uh and whether that's a civil matter such as violating corporate policy or it's a legitimate you know

um so we need to treat data not as data or as logs or signals but we need to treat them as evidence which means we need to be very careful on how we collect them how we store them we need to handle chain of custody and a big part of that is authentication doing client-based authentication from the sending source so having a public key infrastructure that issues client certificates to the firewalls to the log servers so that everything's being transmitted over client authenticated uh tls is the most bulletproof way of keeping that of protecting the evidence um of course there's a ton of problems with that um your rfc 5101 defines ipfix and it makes room for

tls but i can tell you i have looked far and wide and i have not seen any solution that ships ipfix you know over tcp much less tls so that's really more of a hypothetical scenario when you're dealing with netflow ipfix and that type of thing it does it's expensive it's computationally much more expensive to do tcp tls handshake key exchange um certificate revocation checking um so you gotta have beefier boxes to do it it means you're spending more money to encrypt not just encrypt the channel but also authenticate the channel and then you do run into potentials of denial of service by reflection attacks but again that is if you can engineer that

where you can put client authenticated exchange in logging that is so important so critical to closing it off anywhere you can do that and when you're working with different vendors manufacturers it's a good feature to make important to them where you can you want to do ip spoofing protection udp you can mangle the header and say that you're whatever ip you want to say you are so you can make your sending ip address in the udp packet the firewall's ip address and that makes it very hard for incident responders to figure out what the what real logs came from the firewall and which ones that charles made up right so ipspoof protection is available in some versions of some

routers and switches this command ip verify reverse path interface interface name um that's the command in most cisco switches for routers and switches for verifying is that the ip being reported is actually coming from that ip zero trust tight acls uh controlling how information funnels into your analysis tools critical uh if you leave it open it's gonna be a problem um when you're collecting information you want to tag it with as much information as you can what port it came in on was the certificate involved what was the transport all of those things really important when you need to filter out poison data which we'll look at in a little bit and then honeypots are great catch-all

whether you're doing deception technology trapx makes a good one as well but honeypot shouldn't be there so no nothing should be talking to it so in the case of where you have poison data honeypots sort of become your last line to know where the attacker is whatever's talking to this non-existent box is a bad guy the the false positive rates on honeypots are very low and they're very difficult um very difficult to poison so the attacks we're going to talk about now are basically the entire kill chain minus initial access you have to have some access to the network for this stuff to work this isn't an exploit to get you into the network these exploits or

tactics are to break detection once you're in there so this concept is really critical whether you're a red teamer or um or just a bad guy so the pump and dump technique or a tactic rather is about generating attack telemetry from as many unique mac addresses and ip addresses as possible and the reason this is important is if they figure out the data poison data is coming from a specific ip address the investigators can filter out that ip address but if you generated the traffic from 100 different addresses that's a hundred different filters that they have to detect and implement to get the poison out so best thing is to do this pump and dump very

simple use a vm or a docker container kubernetes whatever you want to use manually set the mac address let it spin up grab the dhcp lease which gives us a new ip address do is just say perform attack not attach perform attack where whatever you're going to do a syn scan or any of these other things we're talking about ipconfig release right and then uh kill that box spin up a new one with the new mac and uh repeat so you're generating as many fake source ip addresses as possible as many fake mac addresses so then even when you're talking about doing layer two type of response you can't do it um favorite way to do this is in docker

um so docker create the network gonna use is connect to the the the nic of your machine that's running docker set the mac address tell it what uh docker image to use i named it kali one i start kali one i do stuff kill it repeat new mac gets a new ip address again just making it difficult for them to figure out which which ip address to come after okay pocket dimension i wish i had a good diagram for this apologize for not having it but the whole idea here is spin up as many uh fake containers as you want in um in a docker network and they don't have to be inside the network so they can be

an external 4.2.2.2 and 10. whatever 192 and whatever addresses you want you can create any fake ip address in that container in that docker container so that you have a fake internet and i call it a pocket dimension because what i can do there is i can create a behavior that says i'm connecting to office 365 ip addresses even though i'm not i'm connecting to a fake ip address because the route is happening inside of this docker network this lab network and then at that point i can generate telemetry out of that pocket dimension that will tell different stories and so the way this works you create a new network i call it bridge or it is a bridge and then give it

the internet as its ip space and then you can create here i created one container with 4.2.3.1 and 10.10.10.1 and anything that i'm doing between these boxes it looks like it's happening now it looks like it's happening on this bridge network and if this bridge network is connected to an access switch that is configured to generate netflow it will generate netflow saying that 10.10.10.1 talked with 4.2.3.1 in whatever way i communicated with it and so now you have the actual architect infrastructure network infrastructure generating records about things that look like they did happen but they didn't happen in the context of the real world they happened in the context of the uh of the pocket dimension so a lot

of things um um a lot of things can happen there in uh inside a docker whether you're we'll talk about some more that later now log spoofing is the most direct way instant gratification on confusing analysis tools um the hard part is finding it who do we lie to where do we send the lie so there's a few approaches first dns you know i would highly recommend not naming your log server log.acme.com or siem.acme.com or nbad or ueba or splunk or witfoo or whatever don't name them those makes it really easy you can go in with a dictionary of things someone may have named it then you're doing ns lookups to find an ip address that resolves to one of

those names if you have a compromised machine that's logging that which every machine should be logging netstat will tell you who it's connecting to on syslog so that's another approach and you've always got our good old friend nmap to do a tcp syn scan to look for 514 there's virtually no tools out there that aren't receiving syslog on 514 udp and tcp um so you can do that of course it's noisy so if you're going to do that make sure you do a pump and dump because they're going to catch you if they're any good um then but once you find it you're going to send the artificial records do the pump and dump

and uh and move on and of course the other thing to note is if you're doing udp you can do a header masking so i want to go over this specific um very simple low brow way of getting a lie into a system so this is just using netcat so here we're we're echoing this message over netcat to a server and i'll show you the the pastebin um the first link i posted in chat is right here and um there's four messages that i'm using in uh in this batch script um first i just create some time stamps because different um log formats use different date formats or yeah so generate those i'm going to

need i set the destination where i'm going to send it in this case i'm sending it to port 514 this is a symantec endpoint protection message so i just put the variable stuff in into this who i want the attacker to be and then i ship it i have an asa message does the same thing a fireeye message and a trapx message and then what i'm doing here in the script is i i want to spoof as many fake ip addresses as possible because i'm i'm basically going to do things that will trigger these types of alerts so i want it to look like everybody's doing the thing i'm doing so that we have this is a mass

implication attack but again just using netcat use a couple of switches these might change this is this w is really important for um debian ubuntu kernel but you know you can play around with that but this is basically going to send for 10 seconds the script that i ran at the beginning of our talk of our time together here um so i sent uh 47 000 messages to two different destinations that spoofed almost 12 000 ip addresses so i'm one ip address there's another 12 000 out there so then when we go back to start looking at the log so i sent i sent some to splunk and some to um to witfoo precinct and so if i go in

and i just say give me uh actual search criteria here in splunk um show me everything within the last hour this will pull back those fake records and you know there's in splunk if you're going to build different layers of analytics but you can see here's the time where i shot the messages over you can see all these different ip addresses um breakdowns of those messages how many did you see the source is more than a hundred so there's we did 11 000 unique hosts so this is a lot of poison data 41 000 poison messages if um search here if we look at that same stuff in

precinct

okay so pulling back these records in this in this mess you can't even see this graph here but uh because it's so many um but if you look here my real ip address is one of these and uh the rest of these are fake because i spoof them all so my attack is sort of hidden here and even more you know if you go into ueba kind of stuff this is what again the incident looks like this might not this might not even have my ip address in here but there's a user credential that i'm spoofing the different places i'm attacking here's 207 different um different alarms from the asa we've got a trapx we've got symantec

we've got all these things happening which is these fake messages and again this is the mass implication that how do the investigators respond if they can't figure out which one of these is really me um when we do look at uh when we do look at these records but it'll work i guess we can see from the source host uh the sending host

sender host that it all came from the same ip address and if i look at that even more simply in splunk you can see the port it came in on and then the host so if i filter out that ip address all that poison data is gone so this is why if we did a pump and dump and used a hundred different addresses i would have to do this filter over and over and over again for each one of the um for each one of the ip addresses that i picked up on the pump and dump um so yeah so the two had a question the it was uh splunk enterprise was one and then witfoo precinct
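The defender's side of this step, grouping received records by the host that actually delivered them and then filtering, can be sketched as follows. The event shape and the field names `sender_host` and `claimed_src` are hypothetical stand-ins for ingest-time tags, not the schema of Splunk or WitFoo Precinct. The sketch also assumes the recorded sender address can be trusted, which, as noted earlier in the talk, is not guaranteed for UDP without spoofing protection.

```python
from collections import defaultdict


def claimed_sources_by_sender(events):
    """Map each delivering host to the set of source IPs its messages claim."""
    by_sender = defaultdict(set)
    for event in events:
        by_sender[event["sender_host"]].add(event["claimed_src"])
    return by_sender


def unexpected_senders(events, expected_senders):
    """List delivering hosts that are not on the allowlist of known log sources."""
    seen = {event["sender_host"] for event in events}
    return sorted(seen - set(expected_senders))


def drop_senders(events, senders):
    """Return the events that were not delivered by any of the given hosts."""
    blocked = set(senders)
    return [event for event in events if event["sender_host"] not in blocked]


if __name__ == "__main__":
    # Made-up data: one unknown host delivering records for many "attackers",
    # plus one record from an expected firewall.
    events = [
        {"sender_host": "10.0.0.99", "claimed_src": f"10.1.0.{i}"}
        for i in range(5)
    ]
    events.append({"sender_host": "10.0.0.1", "claimed_src": "10.2.0.7"})

    rogue = unexpected_senders(events, expected_senders={"10.0.0.1"})
    print(rogue)
    print(len(drop_senders(events, rogue)))
```

Each additional delivering address needs its own entry in the filter, which is the cost the pump and dump tactic is designed to multiply.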

w-i-t is w-i-t-f-o-o um so that's that's netcat all right now this is a cool thing if you haven't played with samplicator it's it's been around for a long time on github i'm not associated with a project but i do love it um essentially what samplicator does is you say i'm going to listen on whatever address 10.10.10.10 in this case on port 514 so it's going to receive syslog messages and then you give it a list of destinations to send those messages to so the reason this is important let's say we know the subnet the siems are on or the nbad or the ueba tools are on but we don't know which where it is and

we don't want to do a scan what we can do is list all 254 addresses here send the message into samplicator and samplicator will send it out to each one of those destinations so it'll hit something is the idea there as long as the acls aren't in place so it's a great way of getting messages to a lot of destinations particularly if you don't know the correct one so great project only runs on linux kernels but check that out if you haven't looked at it and also just as a secondary note for those that are setting up logging architectures samplicator is a great way when you need to get some logs
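The fan-out idea described above (one incoming datagram copied to every address in a subnet) can be approximated in plain Python. This is a rough sketch and not samplicator itself: the `10.10.10.0/24` subnet is an assumed example built around the listening address mentioned in the talk, the function names are hypothetical, and the sketch makes no attempt to preserve the original sender address.

```python
import ipaddress
import socket


def subnet_hosts(cidr):
    """List every usable host address in a subnet (254 of them for a /24)."""
    return [str(host) for host in ipaddress.ip_network(cidr).hosts()]


def fan_out(payload, destinations, port=514):
    """Send one UDP datagram to each destination and return how many were sent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for dest in destinations:
            sock.sendto(payload, (dest, port))
    finally:
        sock.close()
    return len(destinations)


# Assumed example subnet; nothing is sent until fan_out() is called.
targets = subnet_hosts("10.10.10.0/24")
```

The same pattern covers the logging-architecture use the speaker mentions: replace `targets` with the handful of collectors that should each receive a copy of the log stream.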

into one of the different tools i want to send some of these logs over to splunk some to witfoo some to qradar some to whatsup gold whatever you want to do it's a great tool for working through that tls is a little bit harder than netcat but really not that hard so this is some python code this also works for tcp if you take the ssl context out but it's just essentially creating a socket then the rest of this is exactly the same stuff as the netcat whatever message you want to send and i will say a lot of my team spends most of their

days researching messages there's a ton of the stuff documented so you can go and google what's the format in this case for symantec endpoint and just look at the format swap the variable data out with what you want and ship it out so you can ship it over tls now obviously this only works if the server isn't doing client certificate validation that's why pki is really important because that would shut this down because the server would reject the socket unless you were able to generate a certificate so you'd have to have a private key to do that nprobe is another cool project check that out at ntop.org the way it works is very simple run
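The slide code is not reproduced in the transcript, so the following is only a guess at its general shape: build a syslog-style line, wrap a TCP socket in an SSL context, and send. The function names, the newline framing, the sample field values, and the default port 6514 are assumptions for illustration. A collector that requires client certificates would refuse this connection, because the sketch presents none.

```python
import socket
import ssl


def build_syslog(facility, severity, timestamp, hostname, tag, text):
    """Build a BSD-style syslog line; the priority value is facility * 8 + severity."""
    pri = facility * 8 + severity
    return f"<{pri}>{timestamp} {hostname} {tag}: {text}\n".encode("utf-8")


def send_over_tls(payload, host, port=6514, verify=True):
    """Open a TLS connection and send one message (no client certificate is presented)."""
    context = ssl.create_default_context()
    if not verify:
        # Skip server certificate checks, e.g. for a lab collector with a
        # self-signed certificate. check_hostname must be disabled first.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(payload)


# Facility 16 (local0) and severity 6 (informational) give priority 134.
message = build_syslog(16, 6, "Jul 10 12:00:00", "host1", "app", "hello")
```

Dropping the `wrap_socket` step and sending on the raw socket gives the plain TCP variant the speaker mentions.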

nprobe point it to eth0 the interface i'm going to sniff is eth0 and i'm going to send it to my ip address so this is great for generating netflow records there's other stuff you can configure here but where this is really cool is let's say you have a pocket dimension and the switch is not going to generate the netflow so you've got to generate your own netflow you can create this whole fake pocket dimension with whatever scenario that you want to see and then have nprobe sniffing the docker nic and then generating netflow records about it and sending it out and so really cool tactic for making

your own netflow records and it's really difficult to discern which netflow source is the right one and which one's not and the way these tools work in flow analytics and nbad is they have to de-duplicate netflow records netflow records are unidirectional meaning they have half the conversation and another set of records has the return server to client communication so they really all get sort of stitched together and bolted together so you're able to poison that really easily with nprobe so a couple of things we'll talk about using that in a bit api stuff this is a command to post data into elasticsearch as an example dash k on curl so curl is your
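The de-duplication and stitching described above can be illustrated with a toy model. This is a simplified sketch, not how any particular NBAD product works: the record fields, the exporter names, and the rule of keeping the largest byte count per direction are all assumptions.

```python
def conversation_key(flow):
    """Give both directions of a conversation the same key."""
    a = (flow["src"], flow["sport"])
    b = (flow["dst"], flow["dport"])
    low, high = sorted([a, b])
    return (flow["proto"], low, high)


def stitch(flows):
    """Merge unidirectional flow records from any exporter into conversations."""
    conversations = {}
    for flow in flows:
        conv = conversations.setdefault(
            conversation_key(flow), {"exporters": set(), "bytes_by_direction": {}}
        )
        conv["exporters"].add(flow["exporter"])
        direction = (flow["src"], flow["dst"])
        seen = conv["bytes_by_direction"].get(direction, 0)
        # De-duplicate: two exporters reporting the same direction count once.
        conv["bytes_by_direction"][direction] = max(seen, flow["bytes"])
    return conversations


flows = [
    {"exporter": "core-sw", "proto": "tcp", "src": "10.1.1.5", "sport": 40000,
     "dst": "10.2.2.9", "dport": 443, "bytes": 60},
    {"exporter": "edge-sw", "proto": "tcp", "src": "10.1.1.5", "sport": 40000,
     "dst": "10.2.2.9", "dport": 443, "bytes": 60},
    # The return half arrives from an exporter nobody has vetted.
    {"exporter": "unknown-probe", "proto": "tcp", "src": "10.2.2.9", "sport": 443,
     "dst": "10.1.1.5", "dport": 40000, "bytes": 5000},
]
stitched = stitch(flows)
```

Because the merge trusts every exporter equally, the record from `unknown-probe` becomes the accepted return half of the conversation, which is the weakness the speaker points at.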

friend just like netcat was your friend with udp dash k says ignore the server certificate post to whatever endpoint you need to post to dash d is the data and then here i'm posting a json object which is what elastic likes so got that so nbad mass implication is sort of what we just did but what you can also do is build a pocket dimension and in the pocket dimension you can have everything scanning everything and then whether you're using nprobe to send the netflow out or it's bridged to an access switch you can have the switch send it but what it looks like then is everyone
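For readers who prefer Python to curl, the request described above (a JSON body posted over HTTPS with server certificate checks switched off, as `-k` does) might look roughly like this using only the standard library. The URL, index name, and document fields are placeholders, not values from the talk.

```python
import json
import ssl
import urllib.request


def build_post(url, document):
    """Build a JSON POST request, the equivalent of curl's -d with a JSON body."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def insecure_context():
    """TLS context that skips server certificate checks, like curl -k."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def post(url, document, context=None):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_post(url, document), context=context, timeout=5) as resp:
        return resp.status


# Placeholder endpoint and document; nothing is sent until post() is called.
request = build_post(
    "https://elastic.example:9200/logs/_doc",
    {"user": "jdoe", "event": "login", "src_ip": "192.0.2.10"},
)
```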

everything is scanning everything you can't research anything as we saw earlier in the other mass implication attack but it's a great way if you want to do recon and you want to do a scan of everything and you don't want them to catch you one approach is let them catch you but they've got to find you among the 100 000 other hosts you just generated doing the same thing which one is you so they can't even track you down so that's one approach to covering your tracks just make everybody else guilty so they can't find you now netflow masking so to be invisible the way network tools

work in detecting scans particularly tcp scans is they look for a syn packet so host a to host b there's a syn no syn-ack came back host b syn went no syn-ack came back and so when you start seeing these ratios of syn to syn-ack skewing you're going to flag it for recon so let's say we do a syn scan and we hit the host syn packet goes out syn-ack does not come back why not generate a record that says a syn-ack did come back so you can do that through the pocket dimension or through the other tools so great way of masking your tracks altogether there now this
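A toy version of the ratio check described above shows why forged return records defeat it. The thresholds, field names, and addresses here are illustrative assumptions and are not taken from any product.

```python
def flag_scanners(flow_records, min_attempts=20, min_answer_ratio=0.2):
    """Flag sources whose SYNs are rarely answered by a SYN-ACK."""
    stats = {}
    for rec in flow_records:
        counts = stats.setdefault(rec["src"], [0, 0])
        counts[0] += 1  # SYNs sent
        if rec["syn_ack_seen"]:
            counts[1] += 1  # SYNs that appear to have been answered
    return sorted(
        src
        for src, (syns, answered) in stats.items()
        if syns >= min_attempts and answered / syns < min_answer_ratio
    )


# A SYN scan of 50 hosts where nothing answers.
scan = [
    {"src": "10.9.9.9", "dst": f"10.2.2.{i}", "syn_ack_seen": False}
    for i in range(1, 51)
]

# The detector's view once forged return records have been stitched in.
masked = [dict(rec, syn_ack_seen=True) for rec in scan]
```

`flag_scanners(scan)` reports the scanning host, while `flag_scanners(masked)` reports nothing, because the detector cannot tell a forged return record from a real one.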

is definitely a baller advanced leet kind of move here and i'm going to preface this if there are red team guys on here please be very careful if you're going to do this because baselines take a long time to build so if you use these approaches in a pen test the blue team is potentially going to be left with broken baselines for a very long period of time so i can't give a big enough disclaimer on that so if you're going to do it have backups have a restore plan for getting the old baselines back in but essentially the approach here is let's generate traffic that will start pushing the

baselines further and further and further towards infinity so a five percent daily bump is generally enough to gradually move the baselines without triggering an anomaly alarm so an example might be say the baseline is wireless network to data center so we just generate a couple of packets each day that show a few megs of data moved and then a little bit more the next day and a little bit more the next day and so you can do that really in every scenario whatever you're baselining whether it's data staging whether it's data exfiltration whether it's time of day you slowly move people so instead of logging on

at nine what if they log on at 8 58 that's a little bit earlier but not early enough to trigger the alarm and so if you start moving those login events generating a syslog message of authentication is really simple you can do that with netcat you're able to push the baseline to where everybody's logging in all the time from everywhere into everything as you start moving those baselines around so obviously it takes a long time you can't do that overnight but it's devastating to any machine learning model it devastates supervised and unsupervised models this is a boiling the frog type of approach to blowing the baseline non-repudiation just to review this one more time please where you can get
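The effect of a five percent daily bump can be checked with a small simulation. The detector below is a deliberately naive assumption (a 7-day rolling mean with an alert at 1.5 times the baseline) and does not model any specific product; the point is only that compounding growth stays under a relative threshold while the absolute value grows without bound.

```python
from collections import deque


def run_rolling_detector(series, window=7, threshold=1.5):
    """Alert on any day whose value exceeds threshold times the rolling mean."""
    history = deque(series[:window], maxlen=window)
    alerts = []
    for day, value in enumerate(series[window:], start=window):
        baseline = sum(history) / len(history)
        if value > baseline * threshold:
            alerts.append(day)
        history.append(value)  # the baseline learns from whatever it is fed
    return alerts


def drifting_series(start, daily_growth, days, warmup=7):
    """A flat warm-up period followed by compounding daily growth."""
    series = [start] * warmup
    value = start
    for _ in range(days):
        value *= 1 + daily_growth
        series.append(value)
    return series


slow = drifting_series(100.0, 0.05, 90)   # 5% per day for 90 days
sudden = [100.0] * 7 + [slow[-1]]         # the same end value in a single jump

slow_alerts = run_rolling_detector(slow)      # no alerts
sudden_alerts = run_rolling_detector(sudden)  # alerts on day 7
```

With these assumptions each day's value is never more than about 1.21 times the rolling mean, so nothing fires, yet after 90 days the value is roughly 80 times where it started. Comparing current values against a frozen, backed-up reference baseline is one way such drift could be surfaced.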

client authentication on tls please do it please please do it if you can do it it does mean budgets are going to need to take this into account that we need to protect chain of custody authentication non-repudiation with our records ip spoofing is the easy one to fix that's a low-cost one right just make sure the person doing it knows what they're doing building around zero trust controlling the network flows of data into your analysis platforms strong acls strong segmentation honeypots as the catch-all and then as many ways of doing chain of custody as you can whether it's certificates that are going with the records the source ip addresses what port you received it on the path it took all of those things
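On the collector side, the two hardening ideas mentioned above (requiring client certificates and recording chain-of-custody details with each record) might be sketched as follows. The function names and the envelope fields are assumptions for illustration, and a real deployment would pass actual certificate and CA file paths.

```python
import hashlib
import ssl


def collector_tls_context(certfile=None, keyfile=None, client_ca=None):
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # client must present a certificate
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if client_ca:
        ctx.load_verify_locations(cafile=client_ca)  # CA that signs the log sources
    return ctx


def custody_envelope(payload, peer_ip, peer_port, listen_port, received_at):
    """Record who delivered a message, where it arrived, and a hash of its bytes."""
    return {
        "sha256": hashlib.sha256(payload).hexdigest(),
        "peer_ip": peer_ip,
        "peer_port": peer_port,
        "listen_port": listen_port,
        "received_at": received_at,
    }


# Paths omitted here, so this context cannot serve connections yet.
context = collector_tls_context()
envelope = custody_envelope(b"test", "10.0.0.5", 51000, 6514, "2021-07-01T12:00:00Z")
```

Storing the envelope next to each record keeps the delivering address and receiving port available later, which is what made the single-sender filter earlier in the talk possible.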

really important last thing i'll talk about here and then i'll leave the rest for any questions you may have a lot of the research that i work on with our team is around comprehension or conversations there's an open source project i'm releasing the code at defcon this year it's an apache 2 project it's at logruber.com but essentially what it allows you to do is sort of automate a lot of these things you create the stories you want to tell you create the things that you want to emulate the firewalls you want to have an asa firewall and a crowdstrike falcon edr all these different fake things that load what we call reframes a lot of

my research is around natural language processing using a thing called semantic framing which is basically fully comprehending what every message means well the opposite of a frame i'm calling a reframe so it's basically taking a standardized language and converting it into a proprietary format so one of the harder parts of what i just showed you all is you have to understand how does an asa send a message differently than palo alto or check point and so forth so the reframes are in the library they'll be available but keep an eye out on this and then targets are just the places to send it so with that i'll take any questions you

guys have we've got about 15 minutes left for anything that comes up let's see yeah no sweat on docker docker's really powerful particularly when you're doing research it's a great way to experiment drop off the network emulate all kinds of real world scenarios in a fake world and really data poisoning is about that yeah the question on research for me the short of my story is when 9 11 happened i was a navy aviation electronics technician i fixed f-18s and that was a very mature craft we had pubs that told me exactly how to do the work i was trained there was qa

there were metrics on all those pieces i transferred out and had to figure out how to help the navy do network security or cyber security and we had no pubs and so the mission that we set out to do over the last five years at witfoo is how do we mature the craft of cyber security operations and the basic thesis is we've built on i.t where we should be building on law enforcement models and there's seven conversations that i research the first one is investigators don't understand the data that their tools are providing the second one is the managers of the investigators aren't able to meter the effectiveness of the investigators how

many units of work should each investigator do what tools do they need what training do they need third thing security practices can't communicate with the broader business so trying to make all of the board members cissp or equivalent doesn't work the security practice needs to learn how to speak business we need to convert data into profit and loss statements fourth thing is vendors lie so how do we hold accountable the people that we're entrusting to do network monitoring anti-virus zero trust how can we turn that into business metrics and hold them accountable and actually just have a healthy relationship between organizations and vendors fifth problem organizations can't share information

without inducing risk so how do you safely tell your neighbors that bad things are happening in a way that's useful to the neighbors without creating a lot of work or creating a lot of risk sixth problem law enforcement and private organizations aren't able to communicate well with each other so how do i call the cops how do i inform law enforcement that bad things are happening and there's some broken pieces on both sides of that that we actually research and the seventh piece is just law enforcement needs evidence to put cyber criminals behind bars so those seven areas are what we research and so that's looking at the craft side there's also pieces

around natural language processing how do we comprehend all these different types of messages big data problems how do i ingest terabytes or petabytes of data each day make sense of it and federate those use cases but yeah that's been it yeah no kidding about two cups two pots of coffee i've been chained to this desk doing research with the team for the better part of six years now so don't get a lot of walking around time any other questions before we wrap it up i do appreciate the time and hope to see you guys next year in virginia it'd be great to start socializing i'm hitting the road

tomorrow for my first in-person conference in myrtle beach out at the forensics conference giving the same talk so i'm doing it on stage but if anybody has any questions hit me up charles at witfoo.com i'm at charles herring on twitter find me on linkedin but i do appreciate the time you guys let me know if i can be any help to you back to you brian all right that lens cap is a powerful thing let me turn that up there everyone thank you for attending this has actually been probably one of the most interesting talks i've been to today i want to remind you of some of our housekeeping items there is an open invite happy hour

for those with a bsides nova ticket at punchbowl in arlington tonight that's from 6 to 9 pm at 4238 wilson boulevard in arlington on that note if no one has any further questions thank you charles and we'll be seeing you