Cleaning the Apple Orchard: Using Venator to Detect macOS Compromise

Name: Cleaning the Apple Orchard: Using Venator to Detect macOS Compromise
Uploaded: 2021-05-01
Duration: 32 min 29 s
Description: Richie Cyrus demonstrates Venator, a tool for threat hunting macOS systems and detecting post-compromise activity. The talk covers macOS persistence mechanisms, how to build robust detections from endpoint telemetry, and practical case studies of identifying malware and suspicious artifacts through data enrichment and external intelligence queries.

BSides Charm · 2019 · 32:29 · 14 views · Published 2021-05

Speakers

Richie Cyrus

Tags

Category: Technical

Topic: DFIR, Malware Analysis, Threat Intel

Difficulty: Intermediary

Team: Blue

Research: Case Studies and Incidents Analysis, Technical Deep-dives

Style: Demo Talk

Mentioned in this talk

Tools used

Filebeat Kibana Logstash OSQuery Splunk Venator

Platforms

Apache Kafka ELK Stack

Service

VirusTotal

About this talk

Richie Cyrus demonstrates Venator, a tool for threat hunting macOS systems and detecting post-compromise activity. The talk covers macOS persistence mechanisms, how to build robust detections from endpoint telemetry, and practical case studies of identifying malware and suspicious artifacts through data enrichment and external intelligence queries.

Original YouTube description

Richie Cyrus is a Senior Consultant at SpecterOps where he specializes in detection of advanced adversaries with a focus in MacOS and Linux environments. Richie has a background in incident response, forensics and security operations spanning across the Fortune 100 and the public sector. He currently maintains a DFIR focused blog at https://medium.com/securityneversleeps.

Transcript [en]

is using venator to detect mac os compromise my name is richie cyrus i'm on the defensive services team at specterops in a senior role but prior to my time at specterops i was actually at apple on their incident response and forensics team so as such i've come across a lot of prompts that look like this on a lot of people's computers and so we now live in the time where uh this is no longer true that macs don't get viruses even when they stated that back in the day macs did have malware on uh systems all across america the us uh like globally malware existed back then so kind of a little bit of false advertising

but even more so today apt groups are starting to get interested in mac os malware where you have the apt 28 group actually creating something like x agent and so this is the upward trend that i see with mac os malware such that you have very advanced adversaries starting to get their hands on it create it and start to use it against organizations they're interested in and so you're also starting to see the cross platform rats such that the rat works on windows and also is ported to mac os and so as i mentioned before macs are just like no longer in their own little bubble where they just don't get compromised

and so let's think about macs in our organization and so uh most macs in an organization are typically issued to those who have the most to lose and so they're usually like your security team full of macs uh your developers your uh executive teams because they like to be cool and they want a mac and they think a mac is cool um also you think about it nowadays when someone joins an organization they're given the choice of a windows computer or a mac os computer and so when you think about defenses around mac os computers there's typically none there it's non-existent or there's low visibility and so if you think about the data that's contained on this mac os system

well usually people have like a vm which has a windows system that they're able to connect to and that windows system has the ability to connect inside of a domain or the credentials used for that mac os system are also good for the windows domain and so now you have to think of these macs as a prime target for data exfil one because they're in the hands of folks who have valuable data whether it be source code whether it be financials uh data that an executive would have these are things that adversaries know and now they're starting to go after so because of this fact we need security controls in place not just on windows computers but on mac

os systems and so that brings me to threat hunting and so at some point in time you have to understand that endpoint security controls will eventually fail or be bypassed you have to understand this fact and so on the windows land we've already assumed breach we have uh the assumed breach mentality especially when we do hunting but then on the mac os side people are still hung up on the fact that macs don't get viruses they don't even consider like hey this mac could be already compromised and they're already living inside of my environment so with enough time money effort an adversary is going to get in this is not just exclusive to windows systems

so we have to we have to shift our focus to comp uh detecting post compromise activity and just not relying on all the tooling and blinky lights that we already have saying this will keep us safe that's not true so in order to carry out and actually build robust detections the number one thing you need to do that among anything else is not people could be technology but you need data and so without data itself good luck trying to find an adversary you won't be able to do that and not just data for just data's sake but are you collecting the right data the things that you're interested in do you have the data set

to represent what you're trying to detect for most organizations they have to take a second look at their data they haven't thought about this and so there's tons of tooling on the windows side a bunch of blog posts around how to do effective windows threat hunting and what type of data sources you need for the mac os side not so much and so if you think about all the products these are the top players in the mac os space how many of you in the room have at least one of these products on your mac systems in your environment today right you've really seen at least one of these and so these products by themselves produce

rich data that could be used for hunting you could leverage that data for hunting purposes and build pretty robust detections one of the issues that i've come across not really an issue per se but i want to present an alternative to you today is that all of these are agent-based and so have you thought about the data that this produces and then also the gaps between data sets between uh actual products right so you have something like osquery which is basically a pull approach where you tell it what you want to collect and it basically gives that back to you as opposed to carbon black which is always like streaming data to you when somebody

executes something on a system two minutes later a minute later depending on your pipeline you actually see that data on that system right so this is an alternative that a lot of people haven't thought about unless you're a consultant or maybe you dug into the weeds on the defensive side but what if you don't have access to an agent or you're not allowed to use an agent right and so i have a cool compromise assessment story to share with you so pretty recently i was on a compromise assessment actually two compromise assessments both were kind of related one organization wanted to make sure that other was good before they did merger and acquisition and so

in this compromise assessment just like any other they said yes please look at our windows systems good we can do that easy peasy they also said can you take a look at our mac os systems where everybody starts to freak out he's like well how do we do that the caveat with that with the windows system was like yeah you can't install agent cool we had a platform that kind of helped us to do like pulled back the data that we needed for analysis for mac os we did not right so he said i can't install any any of these agents i can't get the data from any of these agents and you want me as an external entity to

pretty much get that data from your environment and bring it back to my environment and start to do analysis on it well as as you can imagine this was a difficult task so in the first compromise assessment we worked with a vendor and because of the status that that vendor had at this company they wanted to basically leverage that vendor's tool in order for us to extract out that data and so we actually reached out to the vendor and said yes can you please give us this data set uh this is the data that would help us to determine if macs in the environment have been compromised that didn't go over too well they gave us back uh

pretty much like half or three-fourths of what we actually wanted and so when we started to hunt against this data it was okay but it was less than desirable and so in the second phase of that uh the second company had the same vendor but they didn't even really know how to leverage it so they said hey you guys figure it out feel free to do whatever you want a lot of you might be familiar with patrick wardle he has a bunch of tools out there on objective-see.com and so we utilize this tool knock knock and so recently knock knock had a command line argument in it such that you can actually pull back some json from uh

you know a scan and then take that json and then ingest it into some centralized location well the initial version that he put out actually uh it was an application and so when you ran this in cli the knock knock icon continued to bounce while it was scanning and so they freaked out and said we don't want this bouncing icon while you're trying to do your assessment across the entire environment but we're like it's a security tool like people should be and they're like no we just get rid of it cool so we took a second approach uh we used osx collector which is created by the yelp team we tried to leverage that and so that

worked out pretty okay but let me tell you how that went first scan we got all the scans back the resulting file from os x collector is actually a tarball that is gzipped and so what we had to do was collect all of those gzip files get them to a centralized location actually uh unzip all of those and then extract out the json file that contained the data we were interested in and then take that json file and actually ingest it into our like uh centralized location for hunting and so that step that like long process was just like agonizing and we didn't want to go through that again so we kind of ran into a scenario where

uh one tool was kind of like too hot it gave us too much data we had to work through and actually pull out something useful out of that and one was too cold we barely got anything and so uh we started we started to think of what would be something that would be just right and so i have a story about how that came to be so uh i was also on a trip with uh jared atkinson the creator of get-injectedthread and also powerforensics pretty well known in defensive community and then also on the same trip i was with roberto rodriguez the creator of helk and the threat hunter playbook and then

you have me and so as you could imagine uh as they're there working on some pretty cool projects they were coming out with i was twiddling my thumbs and i started to think well what if i started to create this perfect solution what would those requirements be and so i wanted to share some of those requirements with you before i move on to the next thing and so one if i was to create something out of the box i want it to be compatible with any mac ever right so basically if you're running high sierra or mojave even if you just took it out of the wrapper open it up no applications installed this code

should be able to work right to extract the data that i want so that was a requirement so to leverage that i use python i use native python for mac os which is still two seven i know i know but uh it's moving toward three hopefully and then i'll rewrite the tool to go to three right another thing was the output from that i just wanted a json file i don't want anything else just give me json and so every siem solution that you have can work with json files and once you get that json on there you're able to do some relationships and parse out the fields and build some robust detections based on that data

hopefully and also from the data that we have existed in that json can we build uh kind of like external enrichments such that if you have a file maybe like a hash you could take that hash and query against virustotal so those are some of the requirements also something that was a pain with osx collector it's designed to be run on a single system when that system is suspected to be compromised good luck trying to find uh if you're using it in a hunting scenario if you run it across a lot of hosts and you send that data back in those json files that you get back don't actually have the host name in it it

just has the name of the file the json file not necessarily the host name it went back to because the intent is to be run on a single system so i wanted a tool that could basically once you find something interesting you can map it back to a particular host so i want to give you a demo of the data that venator provides to you and then we'll take the next step so here we see we have uh index ingested data from uh venator and so the first thing that we want to look at is a common persistence mechanism on macos systems called launch daemons this is similar to like your windows services in a way
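The requirement described above, one JSON record per persistence item that carries the host name, can be sketched roughly as follows. This is a minimal illustration and not Venator's actual code: the function name, the field names, and the "launch_daemons" module string are assumptions made for the example, and it is written for Python 3 even though the talk notes the tool targets the system Python 2.7.

```python
import hashlib
import json
import plistlib
import socket


def launch_daemon_record(plist_bytes, plist_path, hostname=None):
    """Turn one launch daemon plist into a flat, JSON-friendly record."""
    plist = plistlib.loads(plist_bytes)
    # ProgramArguments is a list; Program is a single path. Either may be absent.
    args = plist.get("ProgramArguments") or []
    executable = plist.get("Program") or (args[0] if args else None)
    record = {
        "hostname": hostname or socket.gethostname(),  # lets a hit map back to a host
        "module": "launch_daemons",  # hypothetical module name
        "path": plist_path,
        "label": plist.get("Label"),
        "program_arguments": args,
        "executable": executable,
        "executable_sha256": None,
    }
    if executable:
        try:
            with open(executable, "rb") as handle:
                record["executable_sha256"] = hashlib.sha256(handle.read()).hexdigest()
        except OSError:
            pass  # binary missing or unreadable; leave the hash empty
    return record


# Made-up plist so the sketch runs on any platform.
sample = plistlib.dumps({
    "Label": "com.example.agent",
    "ProgramArguments": ["/nonexistent/example-agent", "--daemon"],
})
record = launch_daemon_record(
    sample, "/Library/LaunchDaemons/com.example.agent.plist", hostname="example-mac"
)
print(json.dumps(record))
```

Keeping the host name inside every record is what makes the later pivot possible: once a label or hash looks odd, the same record already says which machine it came from.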

and so if you look at this data set you see that you know you have a few things in there you have the hash associated with the binary that's associated with the launch daemon you also have the osx name uh the host name module the path of the actual launch daemon and then basically the program arguments what's going to actually run when this launch daemon is uh started on boot so we want to take a deeper dive at this if you do a lot of detection hunting work um you'll see here that we're about to do long tail analysis so basically uh least frequency of occurrence right and so one of the ways that we could do this is

actually take a look at all of the labels these are like the unique identifiers for every launch daemon and so if we pull those back those that are uh basically least frequent in their environment are more interesting or you could just even take a look at the labels and see things that kind of don't make sense so you see most of them start with com and then the product name i know most of these products you should probably too microsoft facebook whatever it might be but then we see this com celas trade pro it's like i don't necessarily know what that means it could be bad it could be good let's take a deeper dive with that so

with that i want to actually take a look at the term and the term that i want to look for is the signing info i want to know is this binary associated with the launch daemon signed and coming back we take a look at celas trade pro and see that it's unsigned and so as a defender this kind of raises a red flag because things that are unsigned typically don't map back to some entity that's trusted and so i want to see what host is this launch daemon on so i could take further actions and send that off to my incident response team or actually take care of that myself and so here you see we're going to

actually pull back the host name associated with uh this launch daemon and so we see we have pedro's mac and so now that we have this information we could go back into kibana in our second screen where we can see all the parsed data and query for pedro's mac and see okay what else is on that system and so we scroll we see we have like several other launch daemons on that system but then shortly here you'll see that we'll get to celas trade pro and so we identified it now we can pull back the hash associated with that weird launch daemon and so this is part of the enrichment process i talked about where you could

do this automatically or you could do this manually like i i'm about to do you can take this hash associated with the executable and query something like virustotal to get an indication of what type of file you're dealing with if it's been exposed to uh the outside world right someone uploaded it to virustotal so shortly here we're going to take that hash copy it query virustotal if i could type all right so now we're at virustotal we put in that hash and it comes back that this thing could be malicious you think maybe and so uh this malware is actually associated with the lazarus apt group and so that's something that if i come across in a compromise

assessment i want to take care of right away all right and so you're probably wondering how can i get this data what does this look like when venator actually executes right what kind of data sources are you giving me in this tool well this is a list of all of the modules within venator and so you might be also wondering well why the heck did you pick these like what's so special about these well when you're doing point in time analysis when you're doing a scan of an environment at some specific point in time not on a reoccurring basis like a carbon black would do the assumption that you're making is that an adversary is persisting

in the environment such that the things that you want to pull back are areas that uh indicate persistence on a system so if you were to in a perfect world go after all the persistent items for mac os and the likelihood of those items being used these modules represent the ways that your mac will probably be compromised if an adversary chooses to persist right survive reboot and so that's kind of the reasoning behind just these modules today but in the future more will be added but i started out with these just to be uh you know just for completeness and so we get all these modules but then what does that resulting file look like

you just get one json file and so with that one json file now whatever tool you have to ingest data into your centralized location whether that be like filebeat or some like splunk indexer or whatever it might be that can now just consume that json file and you never have to worry about it so let's go into a demo about how that looks from soup to nuts all right here uh first i want to show you uh this is actually we're leveraging helk the entire time roberto rodriguez's uh project and so with that filebeat actually has to send some information to kafka and then kafka is going to basically ingest that data into logstash

and that's how you're able to see it in kibana but for my logstash config you'll see that for filebeat what we're going to do is we're going to have the path such that any dot json file is automatically going to be consumed and sent off for ingestion right so now we want to run venator all you need is python you don't need anything else no external dependencies we see here that we run it oh we need root well yeah because i need to parse some artifacts that are uh you know kind of privileged in nature so that's why we run our help on this again with sudo privileges and we see that there's a directory that you can specify

so now that we see that we have uh all json files in the temporary directory is going to be consumed let's put our output to the temporary directory because we know we're going to get back a json file so venator is doing its thing it tells you which stage it is in the collection process and then at the end here shortly it'll tell you how long it took and it'll tell you how many records and it also tells you the location of that json file so basically if you forgot where you put it honestly or you didn't know what you named it here's a second reminder all right so now this is automatically being sent

to that temporary location and filebeat is doing its thing behind closed doors and it's already shipping that data into uh logstash into kibana and now we're about to view it here so last 15 minutes we see that we get some data back from venator and so with this data what can we look at what are some of the modules that we've seen previously that we could start to pull apart to identify uh anomalies and maybe an indicator of suspicious activity here you see again we are exposed to that rich data set that allows you to build robust detections but let's take a look at the modules that we have available to us let's not go for launch daemons that

one was pretty easy let's look at event taps so basically every mac os key logger out there in the wild is going to leverage event taps and so these are the core graphics event taps and so if i was to build a key logger it would be event taps you can also install a keylogger via like a kernel extension that's less likely due to the protections that apple currently has in place and so let's take a look at our module event taps and actually do the long tail analysis on that data set all right so let's look at the tapping process name so this is the actual process that is registered as a process to do

the event tap right so this is your external process actually doing the event tapping right tapping into the system resources to get this rich data for key logging we see here that we have something called blue blood everything else is in the system uh you know folder or usr bin like blue blood looks weird here so let's take a second look at blue blood and we see that the tapped process id basically uh if you're using an event tap and you're tapping a specific process then it'll show you that specific process id anything that comes back as zero means that it's actually tapping all processes on the system so i mean

that's something that a key logger would be interested in no matter what process you're in you're going to be able to key log that information if you're event tapping all the processes on the system so that's why you come back with that uh process id of zero this isn't the actual process id of zero you're not mapping that back to like a pid name it's just saying hey we're tapping all processes and so with blue blood this is weird we will kick this over to our incident response team and figure out what the heck that is now i want to show you basically a day in the life of venator um so with venator i had in mind that as a

consultant doing compromise assessments what would be the best solution for me so this is the perspective of uh doing a compromise assessment and then using that data to extract out useful information that might indicate yes or no you might have a problem in your environment so let's take a tour so we see we ingested some data from different uh hosts and we have 701 events and so let's take a look at the module launch agents right this is similar to launch daemons but this is more at the you don't need root permissions to actually leverage launch agents so here we see we have a lot of the same information we had for launch daemons we have the

host name we have the executable and the hash associated with the executable so these are all things that could probably be enriched we can use that information to you know query things externally so like we've done in the last two examples let's do some long tail analysis on the module launch agents we see we have 24 hits and then from that let's take a look at the labels again because that served a pretty good purpose last time and so with this we see that we get a few back and this is the only one that doesn't have like com in front of it or it doesn't have anything in front of it or after it at all so this

media managers yeah it's kind of suspicious i don't know if it's bad or not yet but i'm hunting trying to find it out i'm trying to get some more information so let me take a look at the program arguments just just to make sure so with this you see that this is actually going to spawn java and it's going to execute a jar file it's like already that looks kind of sketchy typically when you have a launch agent it executes a particular thing right media managers if you just take a look at that name and you query for it on google you see that it's actually associated with like malware so that might be an

indication that hey maybe things have gone awry i need to take a second look at what media managers is maybe you'll use like some tool to actually take a deeper dive into that jar file so let's move on to the next thing so we have this rich data set here what we're going to take a look at we see we have some similar information here this is actually login items this is another way for folks to persist when you actually open your mac and you look at the users if you look at the next tab next to your users you'll see all the applications that are designed to start up when the user logs in
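The long tail analysis applied to launch daemon and launch agent labels in the demos above amounts to counting how often each value appears across all hosts and reading the list from the rare end. A minimal sketch, assuming the records are plain dictionaries with a "label" field (the field name and sample data are illustrative, not Venator's schema):

```python
from collections import Counter


def least_frequent(records, field, limit=5):
    """Return (value, count) pairs for `field`, rarest first."""
    counts = Counter(r[field] for r in records if r.get(field) is not None)
    # Sort by count ascending, then by value so ties come out in a stable order.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))[:limit]


# Made-up fleet: two common labels and one that appears on a single host.
records = (
    [{"hostname": f"host-{i}", "label": "com.example.updater"} for i in range(40)]
    + [{"hostname": f"host-{i}", "label": "com.example.helper"} for i in range(25)]
    + [{"hostname": "host-7", "label": "oddlabel"}]
)
rare = least_frequent(records, "label", limit=2)
print(rare)
```

A rare value is only a lead, not a verdict. As in the talk, the next steps are checking signing info, pulling the hash, and finding the host.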

so that's something that would be interesting to us let's take a look at basically the applications associated with our login items so we see some of the things that we're familiar with 1password for password management bettersnaptool to kind of snap things in place for mac os but then we see this final presentation.app why is a final presentation an application that's weird to me so let's take a look at some additional details around that let's take a look at the hash of that actual application so now that we have the hash back for the application we can now query external sources using that hash just to make sure that we have it come across something

suspicious and these are like extremely contrived examples like you could build the detections off of this data but not as simple as this and so we see that hey like this thing might be bad we need to kick it off to incident response to start further actions right and so the next step you might actually take is to take a look at what host this actually happened on so let's move on to the last module i'll show you here today chrome extensions this is something that people haven't come across all too often but with most chrome extensions you'll see that the url the update url associated with that chrome extension actually points back to google right and this indicates

that the application itself came from that google app store right the extension came from the chrome app store anything outside of that means that someone installed the extension that wasn't in the chrome official app store so let's take a look at that some of that data using venator
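The update URL check described above can be sketched as a simple host comparison. This is an illustration under assumptions: the record layout and names are hypothetical, and treating any google.com host as "came from the store" is a simplification of what the talk describes.

```python
from urllib.parse import urlsplit


def update_url_is_google(update_url):
    """True when the update URL's host is google.com or a subdomain of it."""
    host = (urlsplit(update_url or "").hostname or "").lower()
    return host == "google.com" or host.endswith(".google.com")


# Made-up extension records; only the first uses the store's update URL.
extensions = [
    {"name": "Example Store Extension",
     "update_url": "https://clients2.google.com/service/update2/crx"},
    {"name": "Example Sideloaded Extension",
     "update_url": "http://updates.example.net/extension.xml"},
    {"name": "Example Lookalike",
     "update_url": "https://google.com.example.net/update.xml"},
]
suspicious = [e["name"] for e in extensions if not update_url_is_google(e["update_url"])]
print(suspicious)
```

Comparing the parsed host rather than searching for the substring "google" avoids being fooled by lookalike domains such as the third entry.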

all right so we're going to pull back the extension names here and we have a few that look familiar chrome media router isn't suspicious that actually exists on pretty much everyone's instance of chrome but you see we have this youtube downloader at the bottom that's typically weird like why do i need to download videos off of youtube i could just watch them youtube is free uh so let's go into the extensions and actually look at the update url and so with that you see all of them kind of point back to google in some shape or form and then you see this one is like totally not youtube downloader dot xml that's not sketchy at all uh let me just

continue to have that on my systems and so if i identify this i might actually tip this over back to my incident responders or me personally if you're on a small team you might take care of this yourself but this is just some of the examples of how you can use venator to actually pull back this rich data and then start to build detections off of and start to do long tail analysis and start to find compromise within your environment so future updates uh one of the things that we ran into as consultants which was a pain when we did these two compromise assessments was once you collect all of the data in another person's environment how do

you get that data back to like outside of that environment so you can actually you utilize like your own splunk or your own uh elk stack to do analysis on this well most clients basically like give us like access to the share and then we can access the share pull it down to our local machine plug in the usb that usb uh holds all the data that we're interested in we take that usb back to our system then we actually take the files off to our system and then ingest that data that's a pain and so one of the things that we're actually interested in doing is being able to ship those logs from each endpoint

to a s3 bucket that we control that s3 bucket be controlled by us kind of locked down secured and then from that s3 bucket do whatever we please with that so that s3 data can actually be sent directly back into something like splunk or an elk stack or if we just wanted to pull it down and do the analysis locally we could do that as well so that's something that we're going to support in a future update additional modules folks would probably be interested in things that are being downloaded to temp because if you're dealing with mac os malware they're typically going to utilize temp in some shape or form um and then also downloaded files how

many times have like grandma grandpa downloaded like adobe flash update on their mac os system to the downloads folder and they downloaded a number of times because the first time they try to run it nothing happened but now they're laced with a bunch of like different malware variants that are adobe flash updater right so downloaded files is something that we're interested in then something like uh bashrc just things around bash in general that could be weaponized and a whole bunch more basically anything that you want to see this is going to be free and open source it's already out there on github if you want to see something in there

pull requests happy to do it and so where can i get this uh currently this is out there on github i released a blog post uh probably a couple days ago it was wednesday i believe such that i described why there's a need for this type of tooling and also a rundown of some of the considerations i went over today about why this tool came to be and so an alternative way to look at this is like maybe you have carbon black data or real-time data from any of various tools and you don't have that additional data that will help you say okay well where's where's all the launch agents or give me all the launch agents or launch daemons

in my environment good luck trying to do that with carbon black because you can't um it's just real-time data right so if i wanted to look at the launch agent and then maybe the binary associated with that launch agent can't happen but you could get that type of data from osquery but that involves you installing osquery on all of your endpoints and that's another agent you then have to take control of someone has to manage it someone has to careful you know about the deployment it's a it's a headache right probably for that especially if you don't have many resources to to begin with and so you could utilize something like venator to give you that additional

extra data set all native to the system without any additional uh overhead right and so that's kind of the idea behind that as well it's just json that you get back what you do with that json is completely up to you but that data could be valuable to you and so at this point in time i will take any questions my twitter is rrcyrus uh my company that i work for specterops their twitter is out there as well you can catch some other things around red teaming blue teaming purple teaming in general and so if you have any questions at this time i'll take them [Music] does venator have to be run against a

system that's up and running or can you run it against the image of a dead box that's a good question so the question was can you run venator against basically like a dead box like image that you've collected um i haven't tested it against a dead box image but the intent was for it to be uh for a a live system um so yeah any other questions uh it looks like this is intended to be run locally manually how would you do it for ten thousand macs ah so the question was this seems like it's intended to be run pretty manually on a local system how would you leverage it to run it

across the fleet of macs well when we do compromise assessments people typically most mac environments i've come across have jamf and so jamf has the ability to execute stuff and so if you give it this python file it could execute this across the entire environment to some centralized location given the parameters that could be ingested also you could use something like ansible to possibly help with that as well so any other questions concerns all right if not thank you for your time you can catch me i'll answer some more questions

offline
