
2015 - Kuba Sendor - Squashing Rotten Apples: Automated forensics & analysis for Mac OS X

BSides Manchester, 43:28, 391 views, published 2015-09
About this talk
OSXCollector (https://github.com/Yelp/osxcollector) is an open source forensic evidence collection and analysis toolkit for Mac OS X. It automates the forensic evidence collection and analysis that Yelp's team of responders previously did manually. We use Macs a lot at Yelp, which means that we see our fair share of Mac-specific malware alerts. Host-based detectors like antivirus software will tell us about known malware infestations or weird new startup items. Network-based detectors see potential CnC callouts or DNS requests to resolve suspicious domains. Sometimes our awesome employees just let us know, “Hey, I think I have like Stuxnet or Conficker or something on my laptop.” When alerts fire, our incident response team’s first goal is to “stop the bleeding” – to contain and then eradicate the threat. Next, we move to “root cause the alert” – figuring out exactly what happened and how we’ll prevent it in the future. One of our primary tools for root causing OS X alerts is OSXCollector. It was developed in-house at Yelp to automate digital forensics and incident response (DFIR), based on our past experience of dealing with malware infections and other threats haunting Yelp's corporate network.

All right everybody, we're into the final stretch, the last two talks of the day. Is everyone glad that they made it to BSides today? Cool. Think of feedback: things that we can improve and things that you'd like to see done differently for next year, and please do pass that on at the end of the day. Feedback is really, really important for any conference, and particularly for a BSides, which is really all about you, the community out there, because without that we don't know whether everything was really good or bad. So please just fill in the feedback when you get an email, or say it in person to

anyone over here as well, and we'll make sure we incorporate that into future events. So without further ado, we've got some OS X forensics, automating it and doing all that good stuff, so I'll hand it over to Kuba Sendor. Thanks all for coming here. So, I'm Kuba and I work at Yelp. Just a short disclaimer at the very beginning: the title might be a bit confusing. This talk is not going to be about the latest advancements in cider production, or about some spirit technology. Sorry to disappoint you: we're going to talk about Macs. So

just a quick show of hands: how many of you are actually using Macs? I see one guy over there, awesome. So, a few words about me. I joined Yelp last year, in July. I'm mostly involved in incident response, and apart from that my responsibilities are in creating all kinds of software for the automation of our security processes, so everything related to security and incident management. Before that I worked for about three and a half years at SAP in the south

of France, where I was part of the Security and Trust research group. I graduated in 2011 from AGH University of Science and Technology in Kraków, Poland, and I also did a double degree with Telecom ParisTech and Institut Eurecom in Sophia Antipolis. So, Yelp. I'm not sure how many of you know and use Yelp; I'm using it every day. It was founded 11 years ago, I think it was early August 2004, by Jeremy Stoppelman, our current CEO, together with another colleague of his from PayPal. And what's really interesting is how it has grown to where it is right now:

we support 29 locales, and I'm not sure how accurate this slide still is, because we're also launching in new countries; over this year we introduced Yelp in new countries. That adds up to around 33 million mobile users and around 139 million visitors on our website. What that means is that we have around 3,000 employees right now, and most of them are using Macs. The problem comes when something like phishing or some other source of malware or viruses is triggered by these employees: they download something bad, or some employees click on phishing emails. Well, you may be thinking that Macs are quite secure. That was the case

previously, when they were maybe better off than Windows machines, but it's no longer the case. They are sort of victims of their own popularity: because more and more people are using MacBooks right now, the attackers, the people who create malware and all these phishing campaigns, are also targeting more and more Mac users. For instance, if you go to sites like CNET, cnet.com, they're actually full of malware; most of the applications there are bundled with malware. If you're running any sort of corporate firewall for blocking websites, block download.com, because what you get there almost always comes packaged with malware. So any kind of application advertising itself as a free media player,

a free media converter for MacBook, is pretty much most of the time some sort of malware. We see that more and more nowadays, and for the most part in the past we didn't really have a proper response for that. Whenever an employee came to the help desk with an infected machine, most of the time we just said: okay, sorry about that, we're going to wipe your machine and give you a new one with a fresh system. Otherwise it was just too much work: we would be looking at the machine for a few hours to investigate what went wrong and where the malware came from. And so my manager, Ivan, sat down over the

weekend and created this solution, OSXCollector, which is really cool for forensics collection and analysis; we'll go through both of these components. The project is open source, it's available on GitHub, you can just go and get it. It's very simple: it's just a single Python file. So whenever we have some infected machine, our help desk person will just go to the user and run it. What we want to do first of all is prevent any further progress of the malware, so usually the help desk person will directly disconnect the machine from the network. So the problem then was: hey, how do we actually get the tool there? So,

in a maybe not very secure manner, we usually hand over this file on a USB stick, which isn't a very secure practice. But thanks to it being so simple, it just runs on the Python that ships on every MacBook, so it doesn't have any other dependencies and you can usually run it very easily. What it does is collect all these different forensics from the machine, and what it eventually produces is JSON output. It's very simple: it's human readable, and it's also greppable, in the sense that it's easy to process. Each forensic source produces one JSON line that we can analyze later on. What's also particular about OS X

is that it stores most of this data in its own particular formats, so it was very important for us to collect as much as possible of what the machine was already giving us. Part of the data stored on the MacBook lives in SQLite databases, and it's very easy in Python to just dump the content of a SQLite database, so in just a few lines of code we are able to grab some pieces of forensic information from the machine. What's not in SQLite databases is in a very particular format, similar to the Windows registry, which in Mac OS is called plists, property lists. Sometimes it's in a binary format, sometimes it's just plain text formatted as some sort of XML, like you see here

on the right-hand side. And there are multiple other pieces of forensic information stored in different parts of the system that OSXCollector collects and puts into one uniform format. To deal with these plists we are actually using the Foundation library, which is a sort of wrapper around Objective-C functions, so they keep their Objective-C names. That's maybe going away from all the Python conventions; it's more of an Objective-C wrapper, but it made it possible for us to, for instance, read the plists in the OSXCollector code. So what kind of information is OSXCollector recovering from the operating system,

from Mac OS? Apart from some general system information, like the system version and which accounts exist on the system, it also collects kernel extensions, downloads, applications, and also the web browser information, so browser history, which is very important in forensic analysis: we see the user's activity in the browser and we see which files were downloaded. We are also able to grab some email information, like email attachments from the system mail application, and all the different startup items, which I'll also mention.
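The SQLite part of the collection described above can be illustrated with a minimal sketch. This is not OSXCollector's actual code: the function name `dump_sqlite_as_json_lines` and the `source`, `table` and `row` field names are assumptions made up for this example. It only shows the general idea of turning every row of a SQLite file into one JSON line.

```python
import json
import sqlite3


def dump_sqlite_as_json_lines(db_path, source_name):
    """Return one JSON string per row of every table in a SQLite file.

    Hypothetical helper for illustration; the output field names are
    assumptions, not OSXCollector's real schema.
    """
    lines = []
    connection = sqlite3.connect(db_path)
    connection.row_factory = sqlite3.Row  # lets us read columns by name
    try:
        tables = [
            row["name"]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        for table in tables:
            # Table names cannot be bound as parameters, so quote them.
            for row in connection.execute('SELECT * FROM "{}"'.format(table)):
                columns = {}
                for key in row.keys():
                    value = row[key]
                    if isinstance(value, bytes):
                        value = value.hex()  # JSON cannot carry raw bytes
                    columns[key] = value
                record = {"source": source_name, "table": table, "row": columns}
                lines.append(json.dumps(record, sort_keys=True))
    finally:
        connection.close()
    return lines
```

Because each row becomes a single line, the result can be searched with plain text tools, which matches the "one JSON line per forensic source" idea from the talk.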

So as mentioned before, this is stored in one uniform JSON blob which contains unified fields for us. For instance, whenever there is a file involved, like a browser download, we store MD5 checksums for the file, timestamps, also for instance the signature chain, and some fields that say what the actual source of this information was. OSXCollector does a pretty neat job of unifying all the timestamps. I've shown previously all the different sources of forensic information, and it is very cumbersome when you're trying to gather them manually from all these different sources, because each of the different web browsers stores timestamp information in its own respective format. So you can have

milliseconds since 1970, or seconds counted from a different year, and different browsers each pick a different, particular point as what they consider the start of the epoch. So actually one of my first tasks was to figure out which one is using which format to store this information, and to convert it to one single format for all types of information, because this is very important when you do the analysis; I'll show some examples later on. I talked about the hashes. You may hear different stories about hashes nowadays, something like: yeah, the viruses are evolving, you can't really rely on just a simple hash comparison, it's not effective, you should be going beyond this sort

of stuff, because malware can be obfuscated. Well, I mean, that is definitely true, but hashes are still very important: if you're using services like VirusTotal, a hash might be at least the first item to try to look up, so they're pretty useful. Another thing is the quarantines. Whenever you download something from the internet and you open it, there is this window displayed asking: hey, are you sure you want to open this application, it was downloaded from the internet, and so on. What's really cool about the quarantines is that this information actually stays forever on your disk in a plist. So

once you're running the forensics collection on the infected system, you'll actually see all the different quarantines that were displayed to the user, and we can already grasp some information about what the source of the infection is. Startup items are another thing that OSXCollector collects. On OS X there are around seven different places where startup items can pop up, like your own directory, the main application startup directory, things like that, and OSXCollector covers all the different places. These are pretty important, because if you have some malware there, it's pretty much game over. For us, with information like that, what we'll do for this machine is probably just

wipe the machine right away rather than try to clean it. One thing about signature chains: OS X doesn't really care so much about whether the binaries are signed or not, but for you it might be really important information, so this might also be useful for your forensic analysis. So these few slides were showing all the different things that OSXCollector is able to collect. But the collection is sort of the hard labor, the part you don't want to do manually because it's a lot of stuff. What's really cool, and sort of between art and science, is the part that's,

let's say, more interesting: the analysis. As mentioned previously, timestamps are very helpful when it comes to figuring out what's going on with the machine. So you have your infection alert; for instance you have some information from the antivirus like: oh yeah, this machine was infected at this time. You can already start with a very simple grep around that timestamp: a few seconds forward and back will already give you a first hint of where this came from. You can chain it with tools like jq, which is a really cool way of making JSON even more readable and, for instance, of issuing something like SQL queries on JSON content, and in very few

steps list all URLs ordered by visit time. It can show you the activities of a given user of the machine. Another indication for us was when there were a couple of accounts on the same machine, where machines were shared among different employees: some of the malware was already there before we started doing all this forensic analysis and it was just staying there, and it was easy for us to figure out that, yeah, this is coming from the other user who's no longer on this machine, so let's wipe it right away; there is no way we should continue using this machine without a fresh installation of the system and things like that. So this is really neat,

but when it comes to doing it every day in real life, it might be a bit cumbersome. So why don't we actually automate that? And this is what we started doing. We looked at all the manual steps that we noticed we were doing again and again for every incident, and, yeah, there are some small things we do every time: for example, starting from a grep around the timestamp of the infection, then consulting some credible sources like VirusTotal or OpenDNS with the hashes or the domains the user visited, to see what's wrong with them. So we just said, let's automate that, and we created a sort of chain of

filters that we can pass the output of OSXCollector through, and they will enrich all this information one by one. They're very simple, and the way they work is that each filter adds more and more information to the output generated by OSXCollector, and at the very end what it outputs is a sort of recommendation: the output filters can suggest blocking a certain domain or adding certain files to our internal blacklist. So for instance, the find domains filter will take a URL, for instance one that has a subdomain, and try to dissect it into a list of possible domains that it could be related to.
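As a rough sketch of what "dissecting a URL into possible domains" can look like, here is a small standard-library example. It is not the actual filter from the talk: the function name `candidate_domains` is made up, and the "last two labels" guess for the root domain is a deliberate simplification (it is wrong for suffixes such as `co.uk`; a real implementation would want a public suffix list).

```python
from urllib.parse import parse_qsl, urlsplit


def candidate_domains(url):
    """Return hostnames found in a URL plus a naive root-domain guess.

    Hypothetical helper for illustration only.
    """
    found = []

    def add(hostname):
        if not hostname:
            return
        hostname = hostname.lower().rstrip(".")
        if hostname.replace(".", "").isdigit():
            candidates = [hostname]  # looks like an IPv4 address, keep as is
        else:
            labels = hostname.split(".")
            # Naive assumption: the root domain is the last two labels.
            candidates = [hostname, ".".join(labels[-2:])]
        for candidate in candidates:
            if "." in candidate and candidate not in found:
                found.append(candidate)

    parts = urlsplit(url)
    add(parts.hostname)
    # Query parameters sometimes carry whole URLs (redirects, trackers);
    # parse_qsl already percent-decodes the values.
    for _, value in parse_qsl(parts.query):
        if "://" in value:
            add(urlsplit(value).hostname)
    return found
```

For a URL whose host is `cdn.files.example.com`, this yields both the full hostname and `example.com`, so either one can later be matched against a blacklist.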

So given a subdomain, it will also resolve the root domain; given some URL, it tries to get all the different domains that appear in this URL. For instance, if some domains are in the query parameters, it also tries to identify them, take them out, and put them in a separate field. Why this is very useful is that later on we can directly check these against our internal blacklists. Throughout this continuous incident response process, we gather a sort of internal blacklist with all the different suspicious URLs and suspicious domains, which we check directly, right away, before digging deeper into anything else. So if the user visited some

of these, it's already potentially a sign that, yeah, there is something wrong with the machine. So things like football streaming sites, or the previously mentioned download.com: if we see that there was a visit to that kind of website, we're pretty sure that the user has something on the machine; it was already an indication for us of something suspicious. So this filter is just checking the domains that were extracted from the initial OSXCollector output to see whether they are among the entries of the blacklist. What's next: we have all these different sources; Jim was mentioning in the previous talk some of them as a useful source to

see whether there's something wrong with the domain. There is also ShadowServer for hashes, and there is VirusTotal, which we're using. For any hash in the output of OSXCollector we can just try to look up the VirusTotal information for this hash, and it will give us an indication of how many hits there were. What VirusTotal is doing is, it's not, let's say, an antivirus or any scanning engine on its own; it is more like gathering responses from all the different antiviruses. As far as I know there are around 60 different sources that it is checking, so it does all the heavy lifting for you. You don't

have to go to any of these websites on your own; you can just submit a file or a file hash to VirusTotal and get the results back. What we've done is wrap the VirusTotal API in this output filter, and it just goes for us through all the different hashes gathered in the forensics collection and augments the initial OSXCollector output with the information from the VirusTotal API. There are also some things included in this lookup filter that make it a bit more speedy: instead of going one by one, multiple requests are issued at the same time, so it speeds things up quite a bit. There is also a sort of internal caching. We are using a premium subscription

program, so we have a limit of, I think, 25 requests, if I remember correctly, something like that; so somewhat limited capabilities when it comes to request rate, and I think the free version comes with an even lower limit. So this is not very good, because it holds you back a little bit from doing this massive analysis of all the domains, knowing that if you run OSXCollector on a machine that has been running for a couple of months, there are probably thousands and thousands of domains that the person visited, so running through all of this for every machine you have takes really long. So what we're doing is caching the responses. If you have the same website visited by several people, then when

running the same forensic analysis of the OSXCollector output, instead of calling the VirusTotal API again we will just query the internal cache and grab this information for you. Of course, this comes with a bit of a drawback: what if there was a change in the domain reputation, for instance? For hashes I doubt it will change so much, but with all these new threats being discovered, it might be worth trying to call it again. So this cache is right now very simple, but I think we may think of improving it. Could you pre-populate your cache with the public databases, like the NIST hash databases? So the question is whether we pre-populate it from the public

hash databases. We are not necessarily doing any pre-population of the database; this is more like, when something was visited, we will just cache the response. We're not doing something like, let's download everything at the very beginning and populate the cache. But this is a very good idea that maybe I should also investigate.
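The response caching described above can be sketched as a thin wrapper around any lookup function. This is only an illustration under stated assumptions: the class name `CachedLookup`, the JSON file layout and the `lookup` callable are hypothetical and are not taken from the OSXCollector code, and the sketch never expires entries, which is exactly the reputation-staleness trade-off mentioned in the talk.

```python
import json
import os


class CachedLookup:
    """Wrap a lookup function with a simple JSON-file cache.

    Hypothetical sketch: `lookup` stands in for a call to a rate-limited
    threat intelligence API and must return JSON-serializable data.
    """

    def __init__(self, lookup, cache_path):
        self._lookup = lookup
        self._cache_path = cache_path
        self._cache = {}
        if os.path.exists(cache_path):
            with open(cache_path) as cache_file:
                self._cache = json.load(cache_file)

    def get(self, key):
        """Return the cached response, calling the API only on a miss."""
        if key not in self._cache:
            self._cache[key] = self._lookup(key)
            self._save()
        return self._cache[key]

    def _save(self):
        # Persist after every miss so later runs can reuse the answer.
        with open(self._cache_path, "w") as cache_file:
            json.dump(self._cache, cache_file)
```

With a wrapper like this, a domain or hash that shows up on several machines costs one API request in total instead of one per analysis run.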

Similar to VirusTotal, another source you can use is ShadowServer. ShadowServer does something opposite: rather than checking which files are bad, it will actually tell you which files are good. So all the files that are standard items of operating systems will be whitelisted; it's sort of acting as a whitelist for all the hashes that are known to be good. And the same can be said about OpenDNS for domains. VirusTotal can check hashes, domains and IP addresses for us, whereas OpenDNS checks domains and IP addresses. So we can have information like the domains related to the domain we

are looking at. This is very useful because you will see that maybe the domain itself is not looking very suspicious, but it's related to a couple of suspicious domains, like it sits on the same IP address as some bad stuff. That can already be a further indication that we should maybe look further into this domain and see whether it's just disguising itself as a harmless domain or it's really malicious. So that's another filter. The OpenDNS filter uses around seven or eight different API endpoints; on the slides I think there are only two mentioned, related domains and domain reputation. There is also a categorization endpoint, so it can tell you, at a very basic level, whether something is

coming from the malware category, so then it's rather something that you should consider suspicious. Then there is, for instance, the security endpoint. What all the services like VirusTotal, OpenDNS and ShadowServer offer is a sort of collective cyber threat intelligence. What they do is gather all the different pieces of information from their users; VirusTotal, for example, has all this knowledge from every sample that its users are submitting, so it can leverage this, and by subscribing to these resources you can already, on your own,

leverage this sort of power of collective threat intelligence. So this security API will actually give you information like, for instance, the domain generation algorithm score, which will tell you how likely it is that the domain was just generated by some machine that was trying to register some domains, do some bad stuff from these domains, and then just shut them down. There are also things like, for instance, domain age, which can be very useful when it comes to analysis of potentially malicious sources. Usually, if you see something downloaded from a domain that was created a couple of hours ago, or minutes ago, this is rather

malicious: it was just generated on the fly, something that was set up for the attack. So all these different filters are chained together. It's very easy to run a filter: as I was showing in all the examples, each one just takes the output of the previous one as its input. You can think of it like, hey, I want to do something else with the related domains; I can just write my own filter, chain them together, put some other filters in between. You can run them separately, or you can do what we do, which is combine them in one analyze filter as well. So we have all this information gathered. It can take some time, as

it's collecting all this information from these APIs like VirusTotal and OpenDNS; if the machine was running for a while, there is a lot of information to check. So with things like caching, or ideas like, for instance, pre-populated databases, it might become a bit more speedy. We're also thinking about running something like this as a service, so that this analysis could run automatically once the collection from the machine is done. And the most important part is the recommended next steps, because we're able to sort through all this information for you,

like, this hash had 25 vendors flag it as malicious, and we'll suggest next steps. In our case it might be like, yeah, let's block this website, or let's blacklist this hash, or, yeah, you should probably look at this one because it's suspicious. So running it as a service, as I mentioned, is for later on; for now you follow up on your own with the suggested next steps. What we have done with all this API functionality is that, once we had all this code in OSXCollector, we actually took it out and put it in a separate library, which is called threat_intel. It's also open source on GitHub; just go check it out.
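To make the "filters enrich the output and suggest next steps" idea concrete, here is a minimal sketch of one such filter working on JSON lines. It is an assumption-laden illustration, not code from OSXCollector: the function name `blacklist_filter` and the `domains`, `blacklist_hits` and `recommendation` field names are invented for this example.

```python
import json


def blacklist_filter(json_lines, blacklisted_domains):
    """Yield each JSON line, annotated when it mentions a blacklisted domain.

    Hypothetical filter for illustration; field names are assumptions.
    """
    blacklist = set(blacklisted_domains)
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        hits = sorted(set(record.get("domains", [])) & blacklist)
        if hits:
            record["blacklist_hits"] = hits
            record["recommendation"] = "block or investigate: " + ", ".join(hits)
        # Records without hits pass through unchanged, so filters can be chained.
        yield json.dumps(record, sort_keys=True)
```

Because the function consumes and produces JSON lines, it can be fed from `sys.stdin` and printed to standard output, or its output can be handed directly to another generator written in the same style.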

It has this sort of programmatic way of calling all these APIs I was mentioning before, so VirusTotal, OpenDNS and ShadowServer. You can just use it, call them, and include it in your own projects where you're doing automation; you no longer have to go manually to the VirusTotal website and look things up. You can sort of chain it with other tools that you're using in your incident response process. What's really cool is that internally we try to use it in several of our projects. For instance, one other open source project that is up on

our GitHub page is ElastAlert, a sort of alerting on top of Elasticsearch data. What we are doing is, if we have some alerts coming from ElastAlert, we can add some more enhancements to these alerts and add, for instance, information from OpenDNS about particular domains or IP addresses; we can have information from the others as well. So it's very easy to combine. All you need to do is supply the API key. There is also this option to cache the output in some sort of file; this is a very simple cache. If you have some better idea about how to do it, we're open to contributions. So please let me know if

you're using it or if you're interested in using it. If you have some ideas, you can post them on the GitHub page or send pull requests; we're open to conversation. So that's pretty much it. If you have any questions, ask now, or I'll also be at the party later on. That's really impressive, cool, but it must have been a lot of work for an internal tool. When you started it, how did the other alternatives compare? Like, what's the commercial alternative that came closest to what you wanted to do? Do you mean the alternative for the forensics collection in general, or rather for the analysis part? Because they're sort of two separate

against more than asked our analysis seems like a like Mallya section and kind of surprised already so i think it was well liked around when I was young like from year ago that we started doing all this process at the point of time to repay projects like I mentioned is also based and some others project we have quanta in class X monitor which was doing sort of like smart kind of thing like just pulling information from salsa so it wasn't a celebrity when it comes to like this x down or like different sort of formatted without putting two goes by without food for us what was really important as the very beginning before the sole information would be also

filters, what we were doing was just grepping all this information manually, and since it was a large output, it was very important to have a unified way of doing that. Right now we also have another way of looking at what's going on at the endpoints with osquery, released by Facebook a couple of months ago. It follows the same sort of principles: gathering forensic information about what's going on on the system. Really, thank you. The tool as you describe it is reactive: you bring in and analyze the actual machines where you already suspect there is a problem. What you're very close to with

your analysis back end is being able to start doing proactive intelligence, proactively finding problems before you know about them. Because you have: we suspect these sites, based on this laptop; who else has visited these sites, whose laptops have got those sites in their history? And you start stitching together a more complete picture of your organization. Yes. So I think what you've described is the reactive part of incident response. OSXCollector was sort of designed for, at least, the forensics collection in that case: we've disconnected the machine from the network, we want to figure out what's wrong with it, and, yeah, we can do

several things, like block this domain so that other users don't fall for the same thing, the same phishing campaign, that sort of thing. But then when you follow it up and pull it back, you'll see more laptops. So I think this is more what we are trying right now to achieve with osquery, which gives us a more proactive sort of information about what's going on in a particular time frame on all the corporate machines. OSXCollector can give us insights like, yeah, we have these suspicious files, these suspicious domains; we built this blacklist on our own thanks to this analysis, and we can right now use it, for instance, combined with

those light or aspiring to take out more proactive it up like yeah so from this another this week we are this machine and then right now thanks to us where you can see that the absolute ultimate and I said there is definitely a benefit for not using this tools in isolation of like combining all this different possibility that's cool possible q so doesn't like maybe we're not as a sex line right now as we've tried to combine all this source of information with either the tax collector extra like this and I see that we're right right now for a couple months with y squared which is already started doing that so thanks to helpful to speed up a bit like

yes so uploading the binaries to virus also this is a bit more Pro blahnik year because this is sort of what our analysis when the floods forensics collections have I think serving our essence is like yeah we should solve supple bleeding as father we could we potentially any way to have something like a virus or we just take much that we don't know what's going on there so the pond this slide when this is happening what was expected we don't really have network connection so this is like other problems maybe lately during the rules sometimes well if we don't see by lunch in water so low prices the prophet saw some certain titles it's not right now anyway

so we don't do it on a regular basis. Yeah, and I just wondered what your thoughts were on the news of the rootless security that's coming out for OS X, and how it might affect what a tool could get from the file system going forward, and things like that. Yeah, so I think this is very interesting; they have so much going on there. So yeah, it will be very interesting to see what it

brings. Have you done any work with analysis of RAM, with the Volatility framework? Not really, this is more for the disk. So, unless I'm completely mistaken, this tool runs on a live OS X machine. Is it not then susceptible to advanced malware targeting the program itself, maybe hooking into it and manipulating the data that comes back? Nothing that we've seen in particular, but it could be possible. What is also possible is, instead of running it directly on the machine that's infected, just to take an image of the machine and

just mount the image and run it from a separate machine, so it's easy to circumvent the problem. That's a very nice idea. And because it's open source, you give everybody the code. Yeah, but all the sources of information it uses are public knowledge; we're not doing anything magical there, though potentially there could be a way to work around it. If there are no more questions, thanks a lot.

you