Malware Behavior Catalog

Name: Malware Behavior Catalog
Uploaded: 2019-10-26
Duration: 54 min 45 s

BSides DC · 2019 · 54:45 · 141 views · Published 2019-10

Speakers

Desiree Beck

Tags

Category: Technical

Research: Technical Deep-dives

Style: Talk

Mentioned in this talk

Service

Joe Sandbox

Frameworks

MITRE ATT&CK Framework

About this talk

The Malware Behavior Catalog (MBC) is a publicly available framework defining behaviors and code characteristics to support malware analysis-oriented use cases, such as tagging, provenance and similarity analysis, and standardized reporting. As a malware-centric extension of the MITRE ATT&CK™ knowledge base, MBC draws upon ATT&CK’s success by applying its philosophy and methodology to malware. Namely, MBC maintains a malware, code-oriented perspective and focuses on real-world use of behaviors through empirical malware examples (note there is no formal relationship between ATT&CK and MBC). The presentation discusses real-world applications of MBC and will show how behavior indicators identified through static and dynamic analysis can be mapped to MBC, illustrating the depth and precision MBC provides.

Desiree Beck (Principal Cybersecurity Engineer at MITRE): Dr. Desiree Beck joined the MITRE Corporation in 2001 and is a principal cybersecurity engineer in the Cyber Operations and Effects Technical Center. Her work focuses on the research and development of malware analysis tools and techniques.

Transcript [en]

That's my officemate Rosie, and we normally don't do photo shoots together, but my brother is a photographer. He never leaves his equipment in the car, so it's often in the house, and in this case it was a family get-together and this was my mom's backyard, so that's where that came from. Hi, my name is Haley. [unclear] for a number of years, went back to grad school, got a master's degree, and went to work at MITRE, I think about three years now, as a malware analyst and researcher. Aside from that, I'm interested in systems: systems of machines, systems of software, systems of people. I love to

observe the components of, say, systems and see how they interact with each other. I mean, clearly the system of people is the most interesting. And for more normal hobbies, I am into quilting. All right, so here's the outline. I'll be talking about the things on top of the dotted line: basically the overview of MBC, some of the use cases, its relationship to ATT&CK, and then how you actually would use it to do basic labeling or mapping. And then Haley will give some case studies after that. So, I would guess many people are familiar with ATT&CK, but in case not, you need to know something about it to understand what we've done. So it's

a curated knowledge base and model for cyber adversary behavior, and it captures the TTPs and the things that adversaries do when they make their intrusions. There are various pieces of ATT&CK, but ATT&CK for Enterprise is where we've kind of spun off from. Listed there, there are 12 of what they call tactics, which are the high-level objectives of the adversary, and that's kind of where we started from with MBC. And I should say, ATT&CK was developed at MITRE and we're from MITRE, but we in fact are not connected to the ATT&CK project and we're not on that team. So they're very separate, independent efforts, although related. So, you

know, ATT&CK is widely used, and it is used currently to capture malware behaviors. The best examples are probably some of the automated sandboxes, like Joe Sandbox and Falcon: they have their reports and they map to ATT&CK. And actually it works really well in many cases, because adversaries use malware, so the objectives of both are obviously overlapping and similar. For example, if there's a behavior where malware is using cmd.exe for command execution, that would be something a sandbox might report, and it maps really nicely onto an ATT&CK technique, which is Command-Line Interface. But there are some cases when you're looking at malware and you

find a behavior that ATT&CK doesn't work for, or maybe it's not quite malware-oriented enough. The best behaviors in that category are probably the anti-analysis behaviors, such as the examples at the bottom of the slide there: queries kernel debugger information, sleeps, things like that. So that was kind of the motivation for developing or defining MBC. Given that ATT&CK doesn't have everything you'd want when you're analyzing malware, we've extended ATT&CK: to date we've added 37 new behaviors that are not defined in ATT&CK. And as much as that, we've also looked at ATT&CK and we've kind of pulled out the things that

pertain to malware, because ATT&CK is oriented toward the adversary and MBC is oriented toward malware. So we've looked at ATT&CK, pulled out what applies to malware, and in that way kind of reduced it so that MBC is more focused on malware. And if you are familiar with ATT&CK, those bulleted items in blue at the bottom of the slide are kind of the malware-oriented analog of what ATT&CK has done. ATT&CK has been really successful, so we tried to emulate and borrow as we could, trying to make MBC as useful. So the first thing is that we do maintain, or we try to, the

malware code-oriented perspective. That's how we're defining things and that's where we're coming from. It focuses on the real-world behaviors actually identified in malware. What that means for us is that we looked at reports that people were putting out, analysis results from malware analysis tools, and we're looking at those exact behaviors and we want to be able to capture those. That's something that ATT&CK has done; one of the main reasons I think ATT&CK has been so successful is that it's really tapped into the real world. That's made it very useful, because it pertains to what you're actually seeing. So we've tried to do the same with MBC. And then the third

thing is trying to maintain a level of abstraction that works for the malware analysis use cases that we're looking at. In terms of the use cases, the most obvious is, first on the list, standardized reporting. You find your behaviors, that's what you want to find out when you're looking at malware, and if you have some consistent, standard way of capturing those, it's going to make it easier to actually use the content of the report for detection, mitigation, or whatever. On a related note, you've got the correlation of results. You may

have three different analysis tools that are giving you reports, and you look at them and you want to be able to say: are these two or three tools saying the same thing, are they conflicting, or what? If MBC is what they're mapping to, so the same behaviors are talked about in the same way, you'll know that, okay, we've got some correlation here that validates the results. Or maybe it's going to point out that you don't have agreement, in which case it's an area for further research or further study. It can also be used for creating labeled datasets for malware research, where you

get a bunch of malware and you tag it with MBC. If you have a repository of malware that's tagged in terms of its behaviors, of course that would be very useful in numerous research efforts. And then finally, MBC supports actual malware analysis. That's because, as we'll get into, there's a structure involved: there are the higher-level objectives and lower-level behaviors, and having those in place and well defined helps the analyst, because it helps them know what they are looking for, maybe how they know when they've found it, and how it can be captured. So it kind of guides analysis in that

way. So these are those high-level tactics that I listed on the slide about ATT&CK, and that's where we start with MBC. They call them tactics; we've moved toward "objectives", which seemed more natural to us in talking about malware. For the most part we've used those that they've defined, with one exception: Initial Access pertains to how the malware got onto the system, and we're focused on the code in the malware. So unless the malware itself got itself there, and that's not what Initial Access generally means for ATT&CK, that's kind of outside our scope. So we've grayed it out there; that's not one

that we map to. But the ones in blue are two new objectives that we define in MBC, and they're both related to anti-analysis behaviors, whether they're behavioral or static. So objectives are the high level, what malware is trying to do, and then the layer down is the behavior, or as ATT&CK calls them, techniques: the behaviors are how the malware is doing that. And there are sort of four ways, relative to ATT&CK, that we define things in MBC. All right, so the first is just as a reference, and that's probably the most common thing that happens in MBC. We look to ATT&CK, it's

something that does apply to malware, and so we're basically just creating a wrapper pointing to that ATT&CK technique. We don't want to replicate ATT&CK's content; we assume that a user of MBC is going to use ATT&CK, that's where they're going to go for the information that pertains. So that's the simplest way that we create a behavior in MBC. Then, also at the bottom, the one called "enhance": it may be that there is an ATT&CK technique but it's not quite focused on malware. In that case we provide additional content that is malware-focused, and then there's that content in addition to the ATT&CK content that the user of MBC

would use. For example, Execution Guardrails is a technique that ATT&CK defines under Defense Evasion. We added more to the definition and some examples, and when we were done it also applied to the Anti-Behavioral Analysis objective. So we've kind of extended ATT&CK in that way. The third way is that we've refined some techniques. Sometimes ATT&CK will define a technique and it's a little broad for what we need for malware analysis, so we'll break it into two pieces (I don't think we've ever done more than two pieces), where we define two separate things that we think support malware analysis better. For example, with Software

Packing we've kind of broken it into two, where we have a separate behavior for compression and then another behavior for obfuscation, which includes encoding, encryption, and more. And then the fourth way that we define behaviors is those things that ATT&CK does not have. For example, sandbox detection was something that we defined in MBC as a new behavior. And this gives an overall view of the numbers of behaviors relative to ATT&CK, just to know what's what in MBC. Each of the objectives is listed and then there are three numbers shown. The first number is the number of behaviors in MBC for that

objective, the second number is the number of techniques in ATT&CK for what they would call that tactic, and the last number, after the dash, is the number of new things that MBC has defined. For example, with Execution you can see that there are 21 behaviors in MBC and ATT&CK had 33, so you can see we've reduced it, but of the 21, 6 are new. You can also see that Anti-Behavioral Analysis has 12 behaviors; ATT&CK has zero, because they don't define that tactic. But only 10 of the 12 are new, because, as in the last example with Execution Guardrails, we borrowed that from ATT&CK, and that's one that is kind

of repurposed. All of MBC right now is just on GitHub in Markdown documents, so this is an example of a behavior page, Dynamic Analysis Evasion. You can see we've got our ID, which we use similarly to how they use them in ATT&CK, and the objectives that it pertains to. There's a short description, and we also have methods, where a method is a specific implementation of a behavior. So it's refining further the higher-level behavior; it's kind of like the third level down. And then, not shown, we'll typically have a table of malware samples or examples that exhibit the behavior, and then references and

that. As far as labeling and mapping, it's flexible whether you map to an objective only, a behavior only, or maybe a combination objective-behavior pair, just because you're not always going to know the higher-level why, or maybe you won't know how it's being done. For example, with a code snippet, this example at the top, it's pretty explicit. You've got the actual code that you're looking at; you know it's Anti-Behavioral Analysis; you can see what it's doing, that it's trying to detect a debugger; and then there's even a method, because you can see that it's working with the Process Environment Block. So that's the most specific you would map to.
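The three mapping levels described here (objective only, behavior only, or objective plus behavior plus method) can be sketched as a small data structure. This is a minimal illustration, not part of MBC itself: the `MbcLabel` class, its field names, and the `::` separator are assumptions made for this example, and the behavior name `Debugger Detection` is inferred from the speaker's description rather than quoted from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MbcLabel:
    """One mapping to MBC; any level that is not known stays None (hypothetical helper)."""
    objective: Optional[str] = None
    behavior: Optional[str] = None
    method: Optional[str] = None

    def __post_init__(self):
        # A label must say at least why (objective) or how (behavior).
        if self.objective is None and self.behavior is None:
            raise ValueError("a label needs at least an objective or a behavior")
        # A method refines a behavior, so it cannot stand alone.
        if self.method is not None and self.behavior is None:
            raise ValueError("a method requires a behavior")

    def render(self) -> str:
        """Join the known levels, most general first."""
        parts = [p for p in (self.objective, self.behavior, self.method) if p]
        return "::".join(parts)


# Code snippet case: objective, behavior, and method are all visible in the code.
snippet_label = MbcLabel(
    objective="Anti-Behavioral Analysis",
    behavior="Debugger Detection",        # assumed name for "trying to detect a debugger"
    method="Process Environment Block",
)

# Signature case: the author knows the why but not the lower-level details.
signature_label = MbcLabel(objective="Command and Control")

print(snippet_label.render())
print(signature_label.render())
```

Keeping every level optional mirrors the point made in the talk: an analyst or signature author records only what they actually know, and a more specific label can replace a coarse one later.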

And then some of the automated tools write signatures that identify behaviors, and so in the middle tier there, ideally the person writing the signature will do the mapping to MBC. For whatever reason, whatever they're finding in their analysis, they know it does command and control, for example. So they know why it's doing something, but they might not know the lower-level details, or maybe it's from a combination of things, so they may just map to an objective and nothing more. Or, in the other example, they might know the objective and the behavior. And then it's kind of funny, the last example up there

for how something is mapped. Ideally the person writing the signature is going to map to MBC, but one of the groups I work with is kind of a second party: they're getting reports from tools and looking at the signatures, or the behavior indicators, that are being reported, and the vendor has not provided the MBC mapping, but they're trying to do it themselves. So they'll see something like "download" and it's not really clear exactly what that means, so it may be that you end up mapping to two different MBC objectives or behaviors. And again, it's not great, but it has

reduced it down quite a bit, so you know much more narrowly what might be happening. And this is, I think, a good example of how MBC helps if you're looking at malware. This is a report that McAfee wrote on WebCobra. It's an analysis report, a really nice one, well written, and it's great because they've identified behaviors and mapped them to ATT&CK. So what I did is I looked at the report to see, if they had used MBC instead, how would things have changed? The behaviors shown in black would just be the same: they found them,

mapped them to ATT&CK, and it made sense. But there were some cases where, because the descriptions were missing a malware-specific focus or whatnot, they had mapped to things that I think weren't mapped as accurately as they could have been. But again, MBC adds the content and makes it more malware-focused, and so those things in purple are the behaviors that, with the additional MBC enhancements, seemed to fall out as a good way to map. And then finally, in blue are the behaviors where in the report they were really explicit, they found these things that were going on, but there wasn't anything in ATT&CK

to map them to, but MBC had those, and so those are in blue. I think now Haley is going to talk about some case studies. Okay, so as Des mentioned, MBC has four uses, and I'm here to present case studies on the bottom two: how to use MBC to label malware and how MBC can be used to inform an analysis process, and hopefully to demonstrate why MBC is needed and how to use it. But before I get into that, let me talk about my research a little bit, to say why we need MBC in the first place. In my research, my partner (not Des) and I do research on evaluating various

malware similarity detection methods, and we want to answer questions like: given a method and given two samples, why did this method group or not group the two samples together, and do we want them to group together or not? For that we need to understand the malware very deeply, and none of the published datasets enable us to do that. So we went about creating our own ground truth via manual analysis, which as many of you know is tons of work; it takes days if not weeks to analyze one sample. So then we want to maximize all that investment by capturing our knowledge, and it ends up in a very long, verbose report

but we also need to be able to query them in a meaningful way. So we figured we have to tag the malware, and hence the report, with something that is descriptive of the malware itself. Des and I both work at MITRE, so naturally we got connected and I learned of MBC. To me, MBC provides a language to describe malware behavior: essentially, what does the malware do, and it can even give you details of how it does it. To me, when you think of a malware, that's the first thing you think of. And I found the MBC language to be succinct and high-level, just short and good. It contains a

variety of malware objectives and behaviors, which means it can describe a lot of things. So those are the good things. Now, as some of you know, ATT&CK is also a MITRE effort, and people, as Des mentioned, also use ATT&CK to tag malware. What really pushed my team to use MBC as opposed to ATT&CK is that MBC provides tagging for anti-analysis. Anti-analysis techniques are a brain drain: they are designed to consume analyst time. For those who haven't done deep-dive analysis or don't know what anti-analysis means, here's just sort of an imprecise description. They could range from something like code obfuscation, which just basically makes the

code hard to understand, so you can't figure out what the malware does (there are tools, such as LLVM-based ones, that are designed to create code obfuscation), to features in the malware whose sole purpose is to make some analysis tools not work, or not work properly. If your tool doesn't work, it hinders your understanding of that malware. Now, there's currently no coherent strategy for dealing with the variety of anti-analysis techniques, so it's all dealt with on a case-by-case basis, up to the skills and experience of the analysts themselves. I believe that we need to improve our analysis tools to automate the defeat of anti-

analysis techniques to make real progress toward dealing with malware. But in order to do that we need data, data labeled by anti-analysis technique, and right now we don't have that. Hence having a language to describe anti-analysis is very important. All right, so anti-analysis techniques are evil, and tagging them is also not easy. Thankfully MBC actually has a rich language to describe them. You have two objectives, Anti-Static Analysis and Anti-Behavioral Analysis; under each you have a long list of behaviors; and under the behaviors you've got the methods. Now, the methods differ considerably in code and in observable indicators, and by that I mean, if you

were to go about devising a way to detect them or defeat them, that way is going to be different for each of the methods. Some of the techniques are hard to capture. They're hard to capture because they involve complex code logic: it's hard to understand, and then it's hard to explain. If you were to label something and then send it out to the world, you need to be able to explain it and convince another person that this is what it is, this is what happened. And it's definitely easy to miss if you don't expect to see it. Not every technique is going to defeat everything, so let's say an analyst came across a technique but it didn't

defeat anything in their environment: they wouldn't notice it. So I might stare at a piece of code, but it seems to have nothing to do with the function of the malware, so I'm just not going to capture it, because I don't know what it is. And on the long list of things that you produce from analyzing malware, figuring out the anti-analysis techniques, what they are and how they work, is pretty low on the list. So it's very hard to capture. We pretty much expect, or rather accept, that we won't be able to capture every anti-analysis technique there is, but we aim to capture

as much information as we can. So in our research we decided to tag anti-analysis down to the method, the most granular level of detail MBC allows, which is something that we haven't done for the other objectives. Okay, so here I'm just going to go through some of the samples that I and our team have analyzed, just because a language is only useful if everyone agrees on how to use it. So, just some examples. First we have a RAT; basically it's a custom remote access tool. Because it is a RAT, it is tagged with Impact: Remote Access. Impact, in all caps, is the objective; Remote Access is the behavior. Now,

as part of beaconing, this RAT goes and finds a list of installed AVs on a victim host and then sends it to a controller. We tag that with Discovery: Security Software Discovery. Again, Discovery is the objective and Security Software Discovery is the behavior. Now, note that what some malware does is check: is there any AV installed here? If there is, quit. This one sends it to a controller. By the controller I mean a control server that someone else operates, and the person behind it has the option of saying, hey, there's antivirus here, let's just kill it. But their intention is outside the scope of the malware, so we only tag it with

Discovery. If the malware decided to act on that information, we would have tagged it with Defense Evasion. The RAT supports many commands, and two of them are listed here. One is called the "run F" command; as you'd expect, it runs a command, that's what it is. Upon receiving the "run F" directive, the malware calls CreateProcess to run the command, so we tag it with Execution: Execution through API. The Execution through API indicator is the use of an API like CreateProcess. It's also tagged with Execution: Remote Commands, because it received a remote command to execute. In gray there is actually a method, Execute; we don't actually tag it with Execute, but you can. And then

there's also a "del F" command, which is tagged with Execution: Remote Commands, Delete File. This RAT has a big global array which contains runtime configuration: things like error messages, encryption keys, and so on. This global array is decrypted at the beginning of its run, so this is an anti-static analysis technique. It falls under the Executable Code Obfuscation behavior, with Encryption being the method. It also constructs literal string values by moving characters into a buffer out of order. For those who have written code, you could define a string and save it in a variable; what the malware does is create an array of characters, moving in one character at a time, but they also move

them out of order. So if you just look at the code, there's no way you see the resulting strings. MBC provides a method to capture that, called Stack Strings. Now, the RAT also contains several core functions that are also encrypted, and they can only be decrypted by a key sent from the control server, if the controller or the operator chooses to send it. If it receives a key, it decrypts the code and runs the code in memory; the code is never saved to disk, ever. So clearly the code remains encrypted on disk, which is covered by the first objective, Anti-Static Analysis. But because at times it also remains encrypted in memory,

this is where I found, by perusing MBC trying to figure out what would describe this, that having code encrypted in memory is a way to evade memory dumps. So this is an example where, by using MBC, you learn something in return. This RAT also has an anti-analysis surprise, and this is an example of why it is hard to capture anti-analysis. There's a piece of code there, and the code is split in three places. It calls the select function and the WSAFDIsSet function, which both do the same thing, but one of them belongs to the

Winsock library and the other belongs to the Winsock 2 library. Normally you use one or the other, but why would you ever use two? The malware calls both, checks that they agree with each other, and if they don't, it sleeps and then tries calling both again. And I wrote down: well, what is this? Is it even anti-static analysis? Is it just some developer being overly cautious? I don't know. It's simply not a problem on a standard installation of Windows: chances are you would call both, they agree with each other, and it works all the time. And then obviously, if for whatever reason the two APIs don't agree with each other, they're not going to

agree with each other the next time you test for it. So what is this? It also doesn't help that the code is split in three places that are kind of far from each other, so as an analyst you have to scroll back and forth and follow the connections; it's just hard to see. And there's a dense tree right in the middle of them that actually contains the core functionality of the malware. So as an analyst, when you're pressed for time and trying to figure out what this malware does, you pay all your attention to that, and by the end you've probably forgotten about this little piece of code you saw in the first

place. Because this is sort of my research, I spent time digging into it. So, just more of the picture here: this is the test, the two calls, and if they don't agree with each other, the malware essentially calls clock, then calls sleep, then calls clock again. What it's trying to do is measure the number of CPU cycles that have passed through a known amount of sleep. My deduction, and this is entirely subjective, is that the calculation can be used to see what the CPU rate is. If the value is sufficiently high, the malware restarts; I mean it restarts the main function, it calls the beginning again and goes through everything. If the value is

too low, it just repeats the test again, and if the value is somewhere in between, it sends the value to the control server and then repeats the test again. So I asked myself, again perusing MBC, what can describe this? Now, MBC didn't really give something explicit that pertains to every situation in this case, but I did find an emulator evasion technique that involves making the code loop so much that the emulator quits. And I did use it. This is a case that's only problematic on an emulator: if you run it through a debugger, essentially you run on a real machine or VMs, and you wouldn't have

a repeated loop; it's only when an emulator somehow fails to emulate one of the functions that you get that. And that's how it drives the infinite loop to kill the emulator. So again, I've learned something, and I chose to tag it with Anti-Behavioral Analysis: Emulator Evasion via Extra Loops, Time Locks. Okay, so this is a different malware, and this is going to be a lot shorter this time. This is a malware where there's an original sample, then it drops another sample and executes that sample. Now, we know that malware authors can basically sell malware as a service; there could be three different people who write the code, and we want to

tag them [unclear] wrapping, like forwarding input, output, and error to a control server. One of the cases is the reverse shell: there's a reverse shell that just wraps some forwarding pipes to and from the RAT and uses the command prompt. It's also tagged with Execution: Command-Line Interface, because it uses the command prompt. But the victim program doesn't have to be a shell, and obviously the behavior of this malware is going to include all the functions and limitations of that victim program. So what do we tag this with? And again, we want to tag it so that the sandbox has the same story as the other analysis processes. So anyway, in

summary, most of the time I've found using MBC is straightforward. They have a lot of behaviors, and obviously a lot more methods, but the behaviors are grouped under different objectives, and sometimes you can get a sense of what the objectives are, so that helps you narrow down which behaviors you look at. I think for reference there are something like two hundred and sixty plus behaviors. It clearly guides analysts to methods they don't know about; I sometimes have free time and just peruse through the catalog. And it helps standardize the language analysts use to talk about malware to each other. This is important because malware

essentially is code, and you can write pages and pages about a piece of code. You want to talk about it and transfer information efficiently, so you want to standardize your language. And the last one might be a personal bias of mine, but I think having methods for the anti-analysis objectives is great. Anyway, Des is going to talk about the forward direction.
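The tags from the RAT case study can be collected into a small labeled dataset and queried by objective, which is the kind of meaningful query the speaker says the verbose reports need to support. What follows is a minimal sketch under stated assumptions: the tuple layout, the sample names `custom_rat` and `dropped_shell`, and the `index_by_objective` helper are hypothetical, and placing Stack Strings under Executable Code Obfuscation is an assumption, since the talk does not name its parent behavior.

```python
from collections import defaultdict

# (objective, behavior, method-or-None) tuples, taken from the tags named in the talk.
RAT_LABELS = [
    ("Impact", "Remote Access", None),
    ("Discovery", "Security Software Discovery", None),
    ("Execution", "Execution through API", None),
    ("Execution", "Remote Commands", None),
    ("Anti-Static Analysis", "Executable Code Obfuscation", "Encryption"),
    # Parent behavior assumed; the talk only names the method.
    ("Anti-Static Analysis", "Executable Code Obfuscation", "Stack Strings"),
    ("Anti-Behavioral Analysis", "Emulator Evasion", "Extra Loops"),
]

# Hypothetical sample names; the second sample is the one tagged for using the command prompt.
SAMPLES = {
    "custom_rat": RAT_LABELS,
    "dropped_shell": [("Execution", "Command-Line Interface", None)],
}


def index_by_objective(samples):
    """Map each objective to the set of sample names tagged with it."""
    index = defaultdict(set)
    for name, labels in samples.items():
        for objective, _behavior, _method in labels:
            index[objective].add(name)
    return index


index = index_by_objective(SAMPLES)
print(sorted(index["Execution"]))
print(sorted(index["Anti-Behavioral Analysis"]))
```

The same index could be keyed by behavior or by method instead; keying by objective is used here only because it is the coarsest level every label in this example carries.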

Right, so going forward, what are the next plans or steps for MBC? Recently we converted, or translated, the Markdown, so now MBC is available (I don't know if it's available publicly yet) in the STIX 2 JSON format, which will help with using it in an automated way. One of the things that enabled us to do was to use the ATT&CK Navigator, and that's the picture of it right there. It's basically an interactive matrix that ATT&CK uses; we are now using it for MBC, and it's nice because it gives you an overview of all the behaviors, in their columns according to objective. By the end of the calendar year we're hoping to get,

or we should have, an MBC website up, which will be much nicer than the Markdown that it's in currently. And then, longer-term, what we'd really like to do, though there'll be some manual effort involved, is create a repository of code snippets, so that for every behavior in the catalog there'll be one or more examples of how that behavior is implemented in code. Otherwise, there are a few organizations that are using MBC now, and of course we'd love for more people to be using it. We'd like to expand the community; the feedback and refinement that we've gotten from people actually using it has

been really good, and I think it's the thing that's making MBC useful. So if you are interested in contributing or just finding out more, we have a discussion list that you could join; just send an email to mbc@mitre.org. For now, though, all the content is on GitHub; there's the link there, under MBCProject. We've got the Markdown documents for the behaviors, we've got READMEs for all the main sections, and we actually have a really long FAQ which covers things like the basics of MBC, how behaviors were defined, its relationship to ATT&CK, different use cases, and how you use it. So

there's that. And then if you do have content or suggestions or whatever, use GitHub to let us know, and that'd be great. So, any questions? I could not say either way. I mean, at this point, we did talk to them, of course, a couple of years ago, asking: how about defining this and this and this, to help with the malware? They have really taken off, a ton of people are using ATT&CK, but they can't expand for all the use cases. We weren't the only ones asking them to do something more, and something more specific. So at this point there is no plan, but who

knows. I don't know what might happen in the future.

We're not, as far as I could... We would like to be able to represent the behaviors of all malware out there, but we're not really like ATT&CK. They look at the stages of the kill chain, and malware doesn't quite work like that. Ours is really code-focused, and so I don't think it has that same kind of chronology, or the linear aspects of what you would see in an intrusion, and that's what ATT&CK covers. But the why, why malware does something, does still apply: it does that because it's trying to be persistent, or it does that because it's trying to make lateral movement. So it

still applies, and what we want to do is, from a code perspective, look to see what the malware does: what is the behavior, how is it doing things. Those are the behaviors, and then the why is that higher-level structure that ATT&CK has, but it doesn't quite flow in the same way. I don't know if that answers it. I have no idea. No, I mean, we've looked at reports like that WebCobra report I showed you; we've looked at lots and lots of reports, trying to make sure that everything in those reports was covered. It's not a huge database now; on GitHub we've got maybe 30
