Open Source Malware Lab

Name: Open Source Malware Lab
Uploaded: 2016-07-04
Duration: 49 min 41 s

BSides London · 2016 · 49:41 · 14K views · Published 2016-07

Speakers

Robert Simmons

Tags

Category: Technical

Topic: Detection Engineering, DFIR, Malware Analysis

Difficulty: Intermediary

Team: Blue

Research: Technical Deep-dives

Style: Talk

Mentioned in this talk

Tools used

Cuckoo Sandbox, Snort, Thug, Volatility, Zeek

Platforms

VirtualBox

Protocols

HPFeeds

About this talk

This talk examines how to build a comprehensive automated malware analysis lab using open source tools, chaining together entry points for file, URL, network capture, and memory image analysis. It covers Cuckoo Sandbox for file execution, Thug for URL analysis, Bro Network Security Monitor for packet capture inspection, and Volatility Framework for memory forensics, demonstrating how their outputs can be integrated into an automated workflow.

Original YouTube description

The landscape of open source malware analysis tools improves every day. A malware analysis lab can be thought of as a set of entry points into a tool chain. The main entry points are a file, a URL, a network traffic capture, and a memory image. This talk is an examination of the major open source tools that satisfy the analysis requirements for each of these entry points. Each tool’s output can potentially feed into another tool for further analysis. The linking of one tool to the next in a tool chain allows one to build a comprehensive automated malware analysis lab using open source software. For file analysis, the three major versions of Cuckoo Sandbox will be examined. To analyze a potentially malicious URL, the low-interaction honeyclient, Thug, will be covered. Next, if one has a network capture (PCAP) to analyze, the Bro Network Security Monitor is a great option, and will be covered. Finally, if the analysis target is a memory image, the Volatility Framework will be examined. Each of the inputs and outputs of the tools will be reviewed to expose ways that they can be chained together for the purpose of automation.

Transcript [en]

Thank you. All right, so my name is Rob. My handle is Utkonos, which means duck-billed platypus in Russian, and today I'm going to be talking about open source malware labs: how to automate them, how to connect different components together, and also how to use some of the components that I'm going to talk about. So, who am I? Next slide. There we go. All right, I'm the Director of Research Innovation at ThreatConnect and I run the innovation component of the research team. We're responsible for finding new ways to analyze malware and analyze threats, and for coming up with innovative ways to protect

organizations. So why do you need a malware analysis lab? There are a number of different reasons why you might need one. What I want to show you on the right side is a graphic from Lenny Zeltser's Forensics 610 class. It describes the different stages of malware analysis in increasing difficulty. Fully automated malware analysis is at the bottom, then you have static analysis, the next step up is interactive malware analysis, and the top, most difficult part is full reverse engineering, to know the capabilities by looking at the code itself. So what we're going to talk about today is the

bottom two. These are the easiest of the malware analysis stages, but they are also the ones that are ripe for automation. One of the main reasons you would need an automated malware analysis lab, and this is the thing that I do, is malware research: consuming files from a lot of different locations, consuming different feeds, and then analyzing them in an automated way, looking for C2 and other features of malware in volume. It also works if you have a one-off and need to research one particular file. You can also enhance your threat intelligence. So if you have a

threat intelligence program, you can analyze malware from incident response teams, and you can also analyze malware that's found by hunt teams. Then one of the basic uses of automated malware analysis is network defense. You need to have an automated malware analysis system that's pulling files from network intrusion detection systems such as Bro, and we'll actually talk about Bro in a little bit. Besides network traffic you also have email attachments, and some host-based intrusion detection systems will pick a file that has been identified as suspicious or hasn't been seen before, and then they will send that into the central location,

either a SIEM or somewhere else, and from there that file would need to be analyzed, and an automated malware analysis (AMA) system is the way you would analyze it. And then finally, and this is the main reason I like it, it's a lot of fun. Malware analysis is really awesome. So I'm going to focus on four entry points into the malware analysis process. There are a lot more than this; there are many places you can begin a malware analysis process, but what I wanted to do is narrow it down to the four major ways that you can enter one. First you have a file. A file can be any

executable file. It could be a Flash file, a Java JAR, a Word document, or a PDF: basically anything that can execute code, or through an exploit execute code, on a workstation or server. Those files are the first entry point that I'm going to talk about. The second entry point is a URL. When I talk about URLs, there are two basic categories that we're going to talk about with regard to malware analysis. There are a lot more if you're going to talk about phishing and credential drops and

things like that, but what we're going to talk about right now is basically malware download URLs and drive-bys. A drive-by is where there is some form of exploit: the user visits the URL and a payload is delivered without their knowledge. Some of the exploits are a little bit better and won't crash the browser, so the user won't notice, but many drive-by downloads might crash your browser. Either way, malicious code has been delivered without any user interaction. The user hasn't said yes to downloading something and hasn't run something; they've just visited a URL. Typically it's buried in an email, you

know, a phishing email, and they may click a URL that looks like a LinkedIn request or something like that. So these are another entry point to the malware analysis process that we're going to talk about. The last two may be a little bit less obvious, but they are really rich ways to get more information about malware. The first of them is a PCAP. A PCAP is a packet capture: basically a recording of network traffic between two points in time. From that network traffic you can learn a lot about the malware: what it's saying, what it's doing. And finally, a

memory image. A memory image is exactly that: an image of all of the volatile RAM captured at one point in time from a workstation or a virtual machine. From that memory image you can also learn a lot. I like to say the memory image is where you learn what the malware is thinking, and the PCAP is where you learn what the malware is saying. To analyze these four types of data we're going to look at four open source tools. All of these are publicly available, and each one you can download and install. Also, contact

me after this if you need help installing these or if you need to learn more about them; I can point you in the right direction. So Cuckoo Sandbox is an automated malware analysis system. I know many of you who have used Cuckoo are going to say, well, Cuckoo includes Volatility, but there's a nuance with Volatility and Cuckoo that I'll talk about in a little bit. The second one is Thug. Thug is a low-interaction honeyclient, so this is what you would use to visit a malware download URL to capture that piece of malware, or to visit a drive-by and hopefully trigger the drive-by exploit and then capture the

payload. And then Bro. Bro is a network security monitoring tool, and many of you may be familiar with it, but we are not going to use Bro in the way that you typically use it. You probably use Bro as a network security monitoring tool; we're going to use it as a malware analysis tool. These two things are slightly different, and you'll see why soon. Volatility is a memory analyzer. It's built in Python and it analyzes memory looking for a variety of indicators. You can find things that are different, and you can also extract information from memory; we'll talk about that in a moment as well. So the first thing we're going to

talk about is Cuckoo Sandbox. Cuckoo Sandbox provides static and dynamic malware analysis, and this is just a screen capture of one of my Cuckoo Sandbox instances. If you're not familiar with a sandbox, it offers you a controlled environment where you can detonate malware safely. I would recommend that you put your malware sandbox off of your own network and isolate it completely. I know some people who, when they began using Cuckoo, downloaded it and loaded it on their laptop, and then all of a sudden they've got C2 callbacks coming from their home IP address. So make sure that you separate your

malware sandbox from your own network. A good way to begin is actually to have a closed network and just observe the malware traffic without having it reach out to the Internet. This is a place where you can also get static analysis techniques. Static analysis basically comes for free because it does not take as many resources as dynamic analysis does. With dynamic analysis you're actually running the malware in a VM for 5 minutes, 10 minutes, or some amount of time, and running an entire operating system, even if you're emulating it, is a resource-intensive process. So to look at

certain static features in malware takes a very short period of time. There's no additional time for looking for strings and things like that, so you basically get it for free with Cuckoo. I know everyone wanted to see something along the lines of dynamic analysis, but you know what? It's very obvious what you use dynamic analysis for: you run the malware, it calls back to an IP address or a domain name or a URL, it communicates with its command and control structure, and you capture those indicators of where it's calling back to. That's pretty obvious if

you ask me. So I wanted to focus more on static analysis techniques that you can use to get things from Cuckoo that can shortcut your analysis process. These are things that can assist you later in your analysis process. One of them: this is a sample of Mac OS ransomware called KeRanger. I ran it through Cuckoo Sandbox and got back some information. One of the things you need to do in those later, more difficult stages of malware analysis is reverse engineer the malware and get the assembly code from it. To do that you have to run it

through the unpacker for whatever packer it was packed with. There are ways to detect the packer. As you see here, I ran strings and it shows which packer was used and also which version. This can actually be forged; I mean, I could go in there with a hex editor and change it. But malware authors are typically lazy, so this is probably the true version of UPX. Now I know which UPX version to use to unpack it, and I don't need to go figure out which one to use. Here you have AV detection. For every file

that you submit, you can also submit it to an AV scanner bank, and these are the AV detections that you get back from it. Sometimes you'll see something that's a very generic detection, such as the Kaspersky detection, but ESET and Trend Micro have actually got a very specific signature for KeRanger.A, and so they tell you this is KeRanger.A. Because I saw two AVs that said the same thing, it's more likely that this is correct. The other possibility is that they share signatures, so they share the same bad signature, but at least I have some idea of what this is.

So instead of just fumbling around looking at the malware, I can actually go to Google and get a shortcut: see if anyone else has begun reverse engineering this malware and has information that I can start with. If you've been hit with something and you're on an incident response team, time is of the essence, and if that means cheating by going to Google, finding someone else's work, and using it, then by all means please do so. The next part is actually a different malware sample. As you can see, these are PE-specific, that is portable executable, Microsoft Windows specific, components of a file.

I've run this through Cuckoo again and gotten back a few interesting things. Take these sections, and this is the .rsrc section, and take the MD5 of it. When you take the hash values of the sections in a piece of malware, you can find related malware that uses that same section, and so you can set up families of malware that are potentially related. This is actually the MD5 that I used to search in VirusTotal for other files that are related to this particular PE file, and I found 52 other files. So now I have, you

know, a group of files, and I can send all of them through my automated malware analysis system. Then I can take features and say, well, this one's a false positive, I don't think this one is actually related, but this set is related. From there I can gather maybe not just one C2 address: now I have a set of C2 addresses, and I have a set of domain names that I can watch for IP changes. I can also look for other patterns that give me more information about the adversary that might be running this particular attack. The same approach as for sections works for resources. I apologize, this one ran over. This

is a, oops, sorry, this is a SHA-256 under RT_VERSION, and it ran to two lines because it is very long. This is a resource in the PE file, RT_VERSION, and by running it through the VirusTotal database I found 99 other files that are related and use this same RT_VERSION resource. I can take those 99 and combine them with the 52; there was actually a lot of overlap between those two groups of files. Now I have an even larger set of files that I can analyze to find more information about what's happening to me with this particular file that I

found. Now this last one is a really awesome technique, by the way. Every PE file has a field, and it's not always filled out: some malware authors will just zero it out and it'll say January 1st, 1970. This is the compile and link time of the file. If they didn't adjust it, and if they have their clock set correctly, you can compare it to the time that the file was submitted to VirusTotal, or Malwr, or whatever public sandbox or other place you've got that's collecting the data.

As you can see, this was compiled and linked at 9:41 and 34 seconds, and then it was submitted to VirusTotal at 9:42 and 47 seconds. So my theory is that this did not go out in an email, infect someone, and then get picked up and reported to VirusTotal. This is the malware author. VirusTotal gives you a hash of the submitter, so in the future, as this same individual submits other payloads to VirusTotal for testing, I can see those payloads and know about them immediately, without them actually going out into the world and starting to infect people. So this is a very interesting technique.

I hope you take it with you. So let's look at the different flavors of Cuckoo Sandbox. We have plain vanilla: the stable version is still 1.2 right now. About a month and a half or two months ago they released 2.0 Release Candidate 1, so this is the next generation, and they're probably going to release the full 2.0 at some point this year. Then there is Cuckoo Modified. This is a fork that was created by an individual who worked for Accuvant, and it basically fixed a lot of the things that he found wrong with the analysis in Cuckoo. A lot of the changes, the

modifications that he made, have now been upstreamed into the 2.0 version. So what does Cuckoo Modified get you over the 1.2 version? Normalization of file and registry paths, which is actually very important for analysis, and I'll show you why in a moment. You also get 64-bit analysis, service monitoring, and an extended API; you can use Tor for the outbound connections; and it has Malheur integration for looking at heuristics. So why is normalization important? Normalization of paths is very important because those two paths at the top are essentially the same piece of malware dropping this, you know, in the

bonso directory that it created, and then the aid vf. jpeg, which is a file that was dropped by the malware. These are just two versions of the Windows operating system. So if you're collecting data from a variety of sandboxes, you now have a problem doing data analysis across these different flavors of sandbox. When you normalize, you replace the unique sections of the path with the official, recognized environment variable for that path location. Now you have one path for both of these sandbox flavors, and you can do analysis across a lot of

different sandboxes. Cuckoo Next Generation has a lot of new features. I'm actually going to fast forward through this; you can see these slides later. They'll be posted online, and I've already given this talk at a few other BSides, so you can get the slides there. But this is Paranoid Fish. Paranoid Fish is a way of discovering anti-analysis techniques that you have not corrected for in a sandbox. Many malware authors will put things into the malware that look for certain registry keys or for certain features inside of a VM, to know that this is a VM, or that this is a sandbox,

or that this is a debugger, and this sort of thing. All of these anti-analysis techniques have been rolled into an executable file called Paranoid Fish, or Pafish. What it does is try each one of these anti-analysis techniques while it's running in your sandbox, and at the end it reports back. It basically just displays the results on the screen, and if you have a screen capture of it you can see that you're okay on all of these things, you know, Wine detection, okay with VirtualBox and VMware, but it traced this one registry key, which has detected that this is running in VirtualBox, and

so this is the point where malware would probably say, all right, I'm not going to do anything, I'm just going to sit here. What you know now is what you need to correct. This raises the question: how do I correct for all of these things that malware is going to detect? There's another open source project called VMCloak, and VMCloak gets you two things. It dynamically builds your malware VM, installs software for you, and installs the Windows registration key, but it also obfuscates all of these things that malware looks for as an anti-debugging and

anti-analysis technique. So you can build your VMs in an automated way, similar to Vagrant: when you're using Vagrant you just type vagrant up, it builds your VM, and after a while it's done. This is the same sort of thing. You run the VMCloak command, you go get coffee, and you come back and you've got a fully operational Windows VM which is obfuscated. So what are the Cuckoo outputs? There are a number of different Cuckoo outputs, but I want to focus on a few that are important for this talk specifically. It's going to give you a PCAP file, and it's also going

to give you a memory image. It's also going to give you dropped files. Dropped files you're going to want to send back to Cuckoo and have it analyze any of them that are executable. You also want to have a gut check in your loop there, because you don't want a piece of malware that writes a differently padded version of itself as a dropped file to get you into a nasty infinite loop of what is actually the same malware. I typically say go about six deep, six children down, and then stop, but also alert, you know, have a little alert for

yourself to tell you, hey, this thing hit six, maybe you should look and see if this is actually just dropping different copies of itself, or if this is some sort of strange new malware that actually drops six children before it hits a payload. Now the PCAP and memory image. Like I said earlier, the memory image from Cuckoo is perfectly fine for Cuckoo's own analysis of the memory image. However, if you're using some Volatility techniques, there's a catch: Cuckoo uses some of the same techniques that malware does to observe malware, and so the cuckoomon component of Cuckoo actually pollutes the memory image in certain ways. So it's better,

if you want to have a clean memory image, to extract your memory image separately rather than running it through Cuckoo. So run it through Cuckoo, and allow Cuckoo to look at its own memory image and the data that it can extract out of it, but if you want to do more analysis using Volatility, you want to collect your own separate memory image from a clean VM that has nothing else in it, or from a piece of bare metal hardware. So let's talk about Thug. Thug is a low-interaction honeyclient. This particular one is written in Python, and it pretends to be a browser,

pretends to have certain versions of Java, certain versions of PDF, and you can configure all of these. It's basically pretending so that it can trigger a drive-by by being the vulnerable thing that the drive-by is looking for. The goal is to capture the malware payload from that drive-by. I like to say that Thug is a wolf in sheep's clothing. One of the things that you can configure is the user agent. You can change the user agent to be almost anything you want, and these are two sites that I use when I want to configure my user agent and be a little bit stealthy, or

try to find one that malware is actually looking for. The top one is User Agent String, which is just a library of almost every known user agent string. Then there is browser-info.net user agents. This one is an individual who is collecting the user agent strings that visit his website, and he ranks them by frequency of being observed. So for at least his sample size of visitors on the Internet, it can tell you what the most popular user agent string is. And then again, Thug simulates a variety of versions of vulnerable software that you can have as plugins.

This is just a list of the variety of user agent and operating system combinations that are provided with it. Firefox on Windows 7, I've found, is a pretty good one, and IE8 on Windows 7 is another very good one. So what are the outputs of Thug? It gives you payload files: if it gets the payload, it gives you the payload. It also collects all of the different HTML and JavaScript files that get you there, so you can analyze those. One of the neat things about it is that it arranges them in time, so you have the URL that you visited and then

children and the time that it visits them. You see a sort of family tree of what begat what, and which loaded which URLs. There are a lot of other types of output, but the main thing that you're going to get from Thug is the payload files, which you're then going to send, automated, back to Cuckoo. I wanted to mention one of these. I know that we're not going to be talking about this specifically in my presentation today, but hpfeeds is really cool. hpfeeds is a publish-subscribe protocol. It's run by and published by, you know,

the source code and everything is from the Honeynet Project. One of the individuals in the Honeynet Project also runs a website called hpfriends. hpfriends is basically a social media site for malware researchers, so you can sit there and chat with each other, and you can also publish and subscribe to each other's Thug and Cuckoo and other instances that speak hpfeeds, and you could subscribe to each other's feeds of malware and malware data. It's pretty neat. So this is one of my favorite parts of the talk: I'm going to talk about Bro. It's a network

analysis framework, but we're not really going to use it for its typical use case. The typical use case is network security monitoring: you would have it doing live capture and looking for unusual things that occur on your network. We are not going to use it for that, because I'm going to run something that I know is malicious and I'm going to listen to it using Bro. Bro is a series of scripts. These scripts generate logs, and the logs are text files. It's a little bit like a comma-separated or space-separated text file, but there's a little bit more to it than

that. There's a very rich community that produces open source scripts, so all over GitHub there are many other Bro scripts out there. For this talk I'm only using the plain vanilla Bro scripts that come with Bro itself when you download the original source. What I wanted to do here is just focus on Bro for a moment and not think about any other tools at all. So I'm tying one hand behind my back and only using Bro. My analysis target is this Excel document. I collected it from a free, open, public malware sandbox called Hybrid Analysis, so if you wanted to

later follow along with my analysis here, you can download this file; the SHA-1 is listed. I want to answer the question: what can I learn from the PCAP only? The first thing I do is go to the connection log. The first and most important log that you want to look at in Bro is conn.log. conn.log shows you each and every granular connection that has been found in that particular packet capture. The first thing you want to do is get rid of the garbage. Most operating systems are going to have typical traffic that they naturally, you

know, just create as their housekeeping, typical traffic that's all benign, so you want to remove all of that. This is the benign operating system traffic. As you can see, there are some ICMP pings that occurred on IPv6, a little bit of IPv6 local name resolution, and some broadcast traffic. I'm not worrying about this, so I want to just ignore it. This is important: this is the typical DNS traffic, and it's the malware trying to resolve two, excuse me, two host names, so

this is important and we'll come back to it. This is also important: it made two HTTP connections outbound on port 80 to these two IP addresses. And then there is this traffic. Bro being a network security monitoring tool, the output and the way you look at it is different from Wireshark, but the idea is the same: it is configured to understand and identify many, many protocols that are in use, legitimate protocols. So here, if the service has a dash, this means Bro doesn't know what that protocol is. I usually call this the WTF traffic, and that is almost always C2 traffic from

the malware. So let's look at the DNS log. The first thing you see here, whoops, sorry, the first thing you see here is that it has resolved a query to s01.yapfiles.ru, and as you can see it looks like this is either round-robin DNS or a fast-flux botnet. But the thing is, I went and visited YapFiles and I can see this is a file sharing site, so I'm going to guess this is a benign site being leveraged by the malware. The second one, rmansys.ru, is interesting, so we're going to take a little bit of a look at that in a moment.
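Editor's sketch, not from the talk: the conn.log triage described above (keep the rows whose service column is a dash) can be reproduced with a few lines of Python, because Bro's default ASCII logs are tab-separated and name their columns on a `#fields` header line. The function names are hypothetical, and the sample rows are invented with documentation-range IP addresses and an arbitrary port; a real conn.log has many more columns than the trimmed set used here.

```python
from typing import Dict, Iterable, Iterator, List


def read_bro_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row of a tab-separated Bro ASCII log."""
    fields: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Column names follow the "#fields" tag, separated by tabs.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))


def unknown_service(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep connections whose protocol Bro could not identify (service is '-').

    Benign housekeeping traffic should already have been filtered out,
    as the speaker does before looking at what remains.
    """
    return [row for row in rows if row.get("service") == "-"]


# Invented sample: trimmed column set, documentation IPs, arbitrary port.
SAMPLE = "\n".join([
    "#separator \\x09",
    "#path\tconn",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto\tservice",
    "1467600000.000000\tCaaa1\t192.0.2.10\t49160\t198.51.100.53\t53\tudp\tdns",
    "1467600001.000000\tCbbb2\t192.0.2.10\t49161\t198.51.100.80\t80\ttcp\thttp",
    "1467600002.000000\tCccc3\t192.0.2.10\t49162\t203.0.113.7\t4444\ttcp\t-",
    "#close\t2016-07-04-00-00-00",
])

if __name__ == "__main__":
    for row in unknown_service(read_bro_log(SAMPLE.splitlines())):
        print(row["uid"], row["id.resp_h"], row["id.resp_p"])
```

Because every row carries a `uid`, the same reader can be pointed at dns.log or http.log and the results joined on that column, which is the cross-log pivot the speaker relies on later.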

But first I went and looked at yapfiles.ru, and I found that this is a Russian-language image and video sharing site, and you can upload a file. So of course I'm thinking, all right, this is a little bit suspicious. So let's look at rmansys. I go to rmansys, and, I mean, I speak fluent Russian so I can read the rest of this, but you can see "Remote Manipulator". That is very suspicious. This is in the same vein of what I would still consider malware as the HawkEye keylogger and things like this, that they're

selling as legitimate software. This is actually an office with a phone and people and printers, and the police are not arresting all of them, which they should. The target audience of this software is people who want to put a keylogger on their children's machine, or who want to spy on their wife, or something like that. There's a gray area about that, but I consider it all malware. Now remember, we're only listening to the malware. I have not looked at the malware file itself yet, but I now know

perhaps how it's communicating: it's communicating with that YapFiles site, and I have a pretty good guess as to the identity of the malware, what I'm working with here. The next thing I want to do is look at the HTTP log. As you can see up here, the malware does a GET to s01.yapfiles.ru for a JPEG, but because Bro is smarter than your average computer program, it knows that this has a MIME type of application/x-dosexec. So I'm going to guess that this is not a JPEG. Even though I said I'm going to guess, I'm actually going to go out and check it. So I

tried to load that JPEG in a browser to see if it actually loads, and as you can see, "the image s01 blah blah blah cannot be displayed because it contains errors." Really? What kind of errors does it contain? Does it start with the characters MZ, maybe? I don't know. All right, the next thing I want to look at is this, the file ID. Bro is really awesome because each granular thing it finds, each row in its log output, is assigned a UID, so that you can take that one row and relate it via that UID to other Bro logs,
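Editor's sketch, not from the talk: the "does it start with MZ" remark above amounts to comparing a file's leading magic bytes with what its extension claims. The snippet below is a much-reduced stand-in for what Bro's MIME sniffing or the `file` command does; the signature table covers only four well-known formats, and the function and file names are hypothetical.

```python
import os

# Leading byte signatures for a few well-known formats.
SIGNATURES = [
    (b"MZ", "pe"),              # DOS/PE executables
    (b"\xff\xd8\xff", "jpeg"),  # JPEG images
    (b"%PDF", "pdf"),           # PDF documents
    (b"PK\x03\x04", "zip"),     # ZIP containers (also OOXML, JAR)
]

# What a file extension claims the content to be.
EXTENSIONS = {
    ".jpg": "jpeg", ".jpeg": "jpeg",
    ".exe": "pe", ".dll": "pe",
    ".pdf": "pdf", ".zip": "zip",
}


def sniff(data: bytes) -> str:
    """Return a coarse type label based on the first bytes of the content."""
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    return "unknown"


def is_disguised(name: str, data: bytes) -> bool:
    """True when a known extension disagrees with a recognized magic number."""
    claimed = EXTENSIONS.get(os.path.splitext(name)[1].lower())
    actual = sniff(data)
    return claimed is not None and actual != "unknown" and claimed != actual


if __name__ == "__main__":
    # A payload served with a .jpeg name but an MZ header, as in the talk.
    print(sniff(b"MZ\x90\x00"), is_disguised("picture.jpeg", b"MZ\x90\x00"))
```

Run against each file in Bro's extracted files directory together with the name taken from the HTTP request, a check of this kind would flag the executable-served-as-JPEG case without anyone having to open the image in a browser.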

and you can follow that connection across different protocols, or, if it downloads a file as in this case, I can go to the extracted files directory, where Bro has carved all of the files out and dropped them, and see which file it was that was dropped there. I then go to extracted files and run the file command to look at the magic string, and the magic string says that this is a PE executable. So this is definitely not a JPEG. All right, now I want to gather a few things to collect for later.
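The magic-string check described above can be approximated in a few lines of Python 3. This is only a rough sketch of the idea, far cruder than the real file command: it looks at the leading bytes alone, and the function names and sample bytes are hypothetical, not taken from the talk's payload.

```python
# Leading bytes ("magic numbers") for the two formats in question.
MAGIC = {
    b"MZ": "PE/DOS executable (MZ header)",
    b"\xff\xd8\xff": "JPEG image",
}


def sniff(data: bytes) -> str:
    """Return a coarse file type based only on the leading bytes."""
    for magic, label in MAGIC.items():
        if data.startswith(magic):
            return label
    return "unknown"


def sniff_path(path: str) -> str:
    """Read just enough of a file on disk to classify it."""
    with open(path, "rb") as handle:
        return sniff(handle.read(8))


# Hypothetical sample bytes for illustration only.
fake_jpeg_payload = b"MZ\x90\x00\x03\x00\x00\x00"
real_jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF"
print(sniff(fake_jpeg_payload))
print(sniff(real_jpeg_header))
```

In a lab, `sniff_path` would be pointed at each file Bro has carved out, and anything whose name says image but whose header says MZ would be queued for further analysis.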

I look at files.log, and files.log gives you the byte size of the file and then goes ahead and creates the three hashes: MD5, SHA-1, and SHA-256. Now I want to look at the WTF traffic: what's going on here? This is one part where I'm going to cheat. I want to replace this part of the talk with tshark rather than Wireshark, but Bro doesn't give you a very good look at what the packet itself is, so what I've done is open that one connection in Wireshark, and lo and behold, we've got XML C2 traffic

going back and forth between these two locations. I know this looks a little bit like angry fruit salad, so I'm going to spell it out for you when I review what we've learned. So what have we learned just listening to the malware and not actually looking at the file itself? We know that the adversary is likely Russophone. They use an Office document, and the Office document is generating network traffic. If you have an Office document generating network traffic, that's bad, very bad. The payload is remote manipulator. We even have a TTP here, you know, tactics, techniques, and procedures. So we've

got this: they hide a payload on a public image sharing site as a JPEG. Then this stuff at the bottom I extracted from that XML we saw just a moment ago, and this is very unique. I've got the license number for the C2 server that they bought from remote manipulator, its internal ID, the Trojan version, and the server versions. So now I can write a very accurate Snort signature that will find this adversary's C2s when they reuse them in a different attack. If they put another payload out there and reuse their C2, I'll be able to connect that C2 back to this one. So this is a set of indicators that

I've gathered from this, and I've given them a couple of different ratings. Yap files I rated one skull, which is just suspicious rather than actually malicious, because it is a benign website. I did contact them and said, hey, you might want to check that things are actually JPEGs before you allow them to be uploaded, but I didn't hear back. And this is actually on GitHub right now, on my GitHub page: this is my Bro configuration, my local.bro config file. So please take this and use it. I spent quite a lot of time going through each and every config and

making sure I know what it does, how to use it, and how to configure it properly. So Bro gives you a number of different logs, and then it extracts files. Those extracted files, like that payload, I would want to send to Cuckoo and do further analysis there. So let's move on to Volatility. Volatility is a memory analysis framework. It's a very, very good way to extract artifacts from memory while the malware is running, so it gives you a really good view into the malware. It's also a very good pentest tool, because you can see logins after reboot and fun stuff like that. I encourage you to

look into Volatility and use it all the time. It's really fun. There's also a tool in the Rekall project, which is similar to Volatility. The Rekall project has a tool to dump running memory, so you can actually pipe memory into Volatility while you're in the VM, so you're looking at live memory pages, which is very useful. There's a version of that for Linux, for OS X, and for Windows. What we're going to do again is the same exercise I did just a moment ago: I'm going to analyze one piece of malware

but without looking at the malware file itself. I'm just going to look at what the malware is thinking in memory. And by the way, if you see an executable file that's named something like b.exe or 2.exe or f.exe, that's almost always malware. So again, I encourage you to follow along with my analysis here. You can go download a copy of this malware yourself, same place, and this is the SHA-1 for this piece of malware. You can look this up when I publish the slides later, but I beat my head against this problem a number of times. I'm like, okay, I dumped the memory image from

VirtualBox, Volatility doesn't like it, what's going on? I finally figured out that first you dump the memory image from VirtualBox, and then you have to use Volatility to convert it to a format that Volatility understands. You don't need to bump your head against the same thing that I did, so just follow this process and you'll have a perfectly good memory image that you can analyze using Volatility. One of the first things that you want to do is take a memory image of the VM or the workstation when it's clean, so that you can

compare some of the output from Volatility. One of the things that you would do is pslist, which shows running processes. As you can see, I ran pslist against the clean memory image and against the malware memory image, and now I've taken a diff of the two outputs. Search protocol, search filter, and service host are all clean; these are part of the clean image. But now you can see explorer.exe is running in this VM, and by the way, I did not run explorer.exe. I clicked on a malware executable. So this is very suspicious, so let's take a deeper look into what's

happening there and run it through malfind. malfind looks for process injection and anything that has been injected, and you can see explorer.exe. Again, you'll see a theme when you're using things like Volatility: you run one tool and you see, huh, explorer.exe is running, then you run malfind and, oh, explorer.exe was actually injected. Clearly it was injected: you've got MZ, which is a PE executable, so a PE executable has actually been shoved into Explorer. Then I dumped those processes and ran the dumped files through VirusTotal, and I found that Avira and Qihoo 360 have

flagged this memory address as malicious, and Qihoo 360 also flagged the second one, so both of those are found to be malware. Now what I want to do is look at the connections. Again, I can find what outbound network connections were made in that memory image. In the clean image, no network connections, and then all of a sudden, when I clicked on this malware executable, I've got these two connections. So 216 17126 105: now we have a C2 IP address.

So what have we learned using Volatility? We know that the sample we observed uses process injection. We know that it injects specifically into explorer.exe. And we've also gathered a command and control IP address that we can then use to look, maybe through passive DNS, for further domains and host names associated with this IP address, which we can use to pivot out to previous attacks and get an idea of who it is, what organization, or what malware family, or connect this particular attack with previous attacks. Again, the organizer said we're running late, so I'm going to skip over this. These are just tools that I find very interesting and useful in Volatility, so you can come back to

this from my slide deck. With Volatility, there's a lot of things that it will extract, but mainly you're going to be pulling DLLs and processes. Sometimes if you pull a process that's in memory, you may need to reconstruct the import table, but there are techniques to do that. You basically want to take any of those executables and run them through Cuckoo. You may also get URLs. One of the things that's cool is that Volatility can look at your Internet Explorer visit history, which is kind of scary, but you can also pull URLs that have been visited there. So you

might be able to, if you had a memory image from an incident response where a user clicked on a drive-by, forensically collect that drive-by URL from that workstation, and then you can visit it using Thug. So I want to tie all of this together. I hope this is not too much spaghetti to look at, but I'll explain it here. You start with a file. You upload the file to Cuckoo Sandbox; it performs static analysis, then dynamic analysis. At this point, the output from Cuckoo you want to upload to a

threat intelligence platform. Then, if it collects a URL or a PCAP, you want to send those on: you send the URL to Thug and the PCAP to Bro, and then start their analysis process. So you process the URL, you process the PCAP. You want to have some logic in your queuing system so that if you have seen a file before, you don't need to send it over here again. If it has been seen before, you send the output of that to your threat intelligence platform, and if you have not seen it before, you analyze it using Cuckoo Sandbox. I know that Volatility is not in this

set of swim lanes. That's because if you add Volatility, it all just looks like complete spaghetti. What I've done is separate Volatility out, and you can logically combine these two on your own, but visually it just doesn't work very well. So with Volatility, you upload the memory image and Volatility does memory analysis. If you've seen a URL, and this is basically the same logic over and over, if you've seen it, then take that analysis and add it to your threat intelligence platform; if you haven't seen it before, then analyze it with the next open source tool. So when you're orchestrating and automating this, there's a number of

different ways that you can have your message queue set up: RabbitMQ and Redis (Redis is an in-memory key-value store database), and then ZeroMQ is a distributed, very quick message queue. I prefer ZeroMQ; it's very clean, very fast. Again, I've beaten my head against certain things in the past and I don't want you to do that yourself. When you're transferring files among different systems, you want to use the message queue to tell that next tool, "I have something ready for you," but you don't want to use the message queue to

transmit the file itself. You want to use nginx or Apache or something like that: have the tool drop the file into nginx, then the message queue says "pick up file XYZ at this URL," and the next tool collects the file using HTTP or FTP or whichever you want to use. Those protocols are built to transfer files, so it's cleaner, easier, and less headache. Then I use Elasticsearch to pull all of the data together and collect it. Cuckoo Modified has an Elasticsearch output, and Thug has native Elasticsearch export. With Bro, you can export logs in JSON format.
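The hand-off described above, where the queue carries only a pointer and the file itself travels over HTTP, might look roughly like this in Python 3. This is a minimal sketch under stated assumptions, not the speaker's implementation: `queue.Queue` stands in for RabbitMQ, Redis, or ZeroMQ, and the function names, message fields, and the `files.lab.example` URL are all hypothetical. It also folds in the "have I seen this before" check from the workflow.

```python
import hashlib
import json
import queue
import urllib.request


def make_notice(file_bytes: bytes, url: str) -> str:
    """Build the small JSON message that goes on the queue (no file content)."""
    return json.dumps({
        "url": url,
        "sha256": hashlib.sha256(file_bytes).hexdigest(),
        "size": len(file_bytes),
    })


def http_fetch(url: str) -> bytes:
    """Collect the file over HTTP from the web server it was dropped on."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def handle_notice(message: str, seen: set, fetch=http_fetch):
    """Return the file bytes for a new file, or None if it was seen before."""
    notice = json.loads(message)
    if notice["sha256"] in seen:
        return None  # already analyzed; nothing to fetch
    data = fetch(notice["url"])
    if hashlib.sha256(data).hexdigest() != notice["sha256"]:
        raise ValueError("downloaded file does not match the announced hash")
    seen.add(notice["sha256"])
    return data


# Demo: an in-process queue and a dict lookup replace the real broker and web server.
payload = b"MZ fake payload for illustration"
drop_url = "http://files.lab.example/drop/xyz"  # hypothetical URL
store = {drop_url: payload}
bus = queue.Queue()
bus.put(make_notice(payload, drop_url))
bus.put(make_notice(payload, drop_url))

seen_hashes = set()
results = [handle_notice(bus.get(), seen_hashes, fetch=store.__getitem__) for _ in range(2)]
print([r is not None for r in results])  # first notice is fetched, the duplicate is skipped
```

Keeping the hash in the message lets the receiving tool both skip files it has already processed and confirm that what it downloaded is what was announced.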

There's a line in your local.bro which tells Bro to create JSON-format logs. Volatility as well: when you run Volatility, there's a switch that says you would like the output to be in JSON format. And then I prefer to glue everything together with Python 3. Python 2 is dead, I hate it, it's gone, especially the way that it handles UTF-8 and character encoding. Stick to Python 3. So, do you have any questions?
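As an illustration of gluing JSON-format logs together with Python 3, here is a small sketch that flags the kind of mismatch seen earlier in the talk: a request whose URI ends in an image extension but whose response Bro identified as something else. The field names (`uid`, `host`, `uri`, `resp_mime_types`, `resp_fuids`) are assumed to match Bro's default http.log and may differ between versions; the sample records and host names are made up.

```python
import json
import os
from urllib.parse import urlparse

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def mismatched_images(lines):
    """Yield http.log records whose URI looks like an image but whose sniffed MIME type is not."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        path = urlparse(record.get("uri", "")).path
        extension = os.path.splitext(path)[1].lower()
        mime_types = record.get("resp_mime_types") or []
        if extension in IMAGE_EXTENSIONS and any(
            not mime.startswith("image/") for mime in mime_types
        ):
            yield record


# Hypothetical JSON-lines records standing in for a real http.log.
sample_log = [
    json.dumps({"uid": "CExample1", "host": "images.example", "uri": "/files/a.jpg",
                "resp_mime_types": ["application/x-dosexec"], "resp_fuids": ["FExample1"]}),
    json.dumps({"uid": "CExample2", "host": "images.example", "uri": "/files/b.jpg",
                "resp_mime_types": ["image/jpeg"], "resp_fuids": ["FExample2"]}),
]

for hit in mismatched_images(sample_log):
    print(hit["uid"], hit["host"], hit["uri"], hit["resp_mime_types"])
```

The same loop works on a real log by passing an open file handle instead of the list, since JSON-format Bro logs are written one record per line.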
