Day Two: Malware Reverse Engineering

Name: Day Two: Malware Reverse Engineering
Uploaded: 2021-02-13
Duration: 4 h 37 min 44 s
Description: 0:00 Day Start 0:41 Start of Workshop and Outline 2:18 Topic 1: Malware Traffic Interception/Decryption and Controlling a Backdoor 16:32 Walkthrough Start 21:13 UDP response setup 28:20 SSH setup for tunneling 39:09 Traffic interception/decryption 54:14 Controlling the Backdoor 1:05:20 Topic 2: Bina

BSides Islamabad · 2021 · 4:37:44 · Published 2021-02

Speakers

Umair Irshad

Tags

Category: Technical

Difficulty: Advanced

Style: Workshop

Mentioned in this talk

Tools used

CyberChef, Ghidra, IDA Pro, Immunity Debugger, Process Explorer, Snort, UPX, Volatility, WinDbg, Wireshark, x64dbg, YARA

Service

VirusTotal

About this talk

0:00 Day Start
0:41 Start of Workshop and Outline
2:18 Topic 1: Malware Traffic Interception/Decryption and Controlling a Backdoor
16:32 Walkthrough Start
21:13 UDP response setup
28:20 SSH setup for tunneling
39:09 Traffic interception/decryption
54:14 Controlling the Backdoor
1:05:20 Topic 2: Binary Unpacking
1:14:42 Demo: Mapped vs Unmapped PE
1:24:42 Walkthrough Start: UPX manual unpacking
1:37:12 Removing ASLR and demo of .reloc section's effect in case of self injection
1:48:05 Walkthrough Start: Custom packer unpacking and manual dumping
1:55:15 Types of breakpoints
2:04:45 Fixing mapped PE for static analysis
2:11:47 Identification of Crypto/Compression
2:15:48 Topic 3: Binary Patching
2:17:54 Demo: Using static patching on a flareon binary
2:27:53 Walkthrough Start: Hot patching cmd.exe
2:36:45 Hot patching "dir" command in cmd.exe
2:50:28 Static patching on a custom binary
3:05:25 Topic 4: Binary Emulation
3:11:10 Walkthrough Start: Router malware emulation
3:20:19 String decryption with emulation
3:40:57 Final Words (on the advanced topics)
3:50:38 Topic 5: Shellcode Analysis
3:56:05 Demo start
4:11:03 Topic 6: Detection Signatures (yara/snort)

------------------------------------------------
Corrections/Additions by Umair
------------------------------------------------
1:49:30 - In the case of malware that, after multiple layers of unpacking, injects shellcode into external processes, the final payload can be obtained relatively easily directly from the injected process, rather than by following the whole path.
2:49:58 - CryptEncrypt*
3:09:28 - The requirements stated (minimum 2 VMs etc.) are true only in case of non-Windows hosts
3:10:12 - Petya resides* in master boot record
3:38:48 - Content-Type*

Umair has been in the cyber industry for over 7 years, working in the areas of OS internals, reverse engineering and malware analysis. He started his career focusing on Windows Internals R&D for FireEye's detection technologies. He later joined FireEye Labs and switched focus to APT analysis, detection engineering and hunting. Now he is part of Kaspersky's Global Research and Analysis Team (GReAT), a team of globally distributed researchers that has discovered and reverse engineered some of the most notorious and sophisticated APTs to date.

Irshad has more than five years of hands-on experience in Threat Research/Intelligence, Malware Analysis, Reverse Engineering, and Detection. After completing his bachelor's in Electrical Engineering from UET Lahore in 2015, he joined Ebryx Pvt Ltd, where he provided detection capability for ordinary and APT malware for FireEye NX/EX products. In 2017, he moved to FireEye Labs Singapore, where he mainly focuses on detailed analysis and detection of APT malware

Transcript

[Music] [Applause] [Music] Thank you everyone, we are back in our workshop, malware reverse engineering day two. I will hand over to Umair; he will give you an overview of what we did yesterday and what we will do today. So, over to you. Thank you so much. Assalamualaikum everyone, and welcome to the second day of the malware analysis workshop. Let's go over an overview of what you went through yesterday. Yesterday Irshad went over some basic static analysis, some assembly language, and a debugger overview. Today we are planning to... so the thing is

that we are quite far behind our schedule, so we will have to shift gears a bit and try to speed things up. I'm going to try to cover all of my topics today: we're going to do some traffic interception and try to control a backdoor, we're going to look at some binary unpacking, we're also going to look at some binary patching, and then finally we'll look at binary emulation and some of its use cases. If we're able to, Irshad will try to get to some YARA and Snort in the detection engineering part. So I think we should just dig in. The way I've

set these topics up is that I've removed most of the slides, so for the most part my part is going to be just demo, demo and demo. I think only the first topic has a few extra slides. Other than that, we're going to try to do some fun stuff and go over some scenarios and some cool tools really quickly. So let's start. The first topic for today is traffic interception and controlling a backdoor. So what exactly will we be doing? By the way, you'll notice that I'm constantly looking to the side; that's because my screen is there, so

please bear with me. The first thing we're going to do is set up a network environment so we can interact with the backdoor. We'll look at what the backdoor's network prerequisites are, what it's looking to do and what it's expecting, and we'll try to emulate all of that in our environment. The second thing we'll do is look at what the backdoor requires to get triggered and start its communication with what it thinks is the C2. We'll do a man-in-the-middle to intercept that traffic, and because it's going to be encrypted, we'll try to decrypt it on the fly. And finally we'll try

to exchange some commands with it and see if we can control the backdoor. So why will we be doing this? First, I've structured this exercise so that it's strictly network stuff; we're not going to be looking at any disassembly or any debugger, because that would just get us off track. The goal of this exercise is to give you a taste of the kind of scenarios you might face when you're dealing with malware, especially backdoors, because the way they communicate with C2s can sometimes be really complex, and this is just one of

those scenarios. So you cannot treat it as a sequence of steps that you can apply to every malware; you have to look at each specific malware's requirements and build your setup according to that particular variant or family. This is just one of countless potential scenarios that you can come across. We'll also be using some tools, so you'll get an introduction to them if you're not already aware of them. And finally, you'll be able to set this all up yourself as well, so you learn by doing; that's always the

best way to go about it. So if a malware analyst is working on a backdoor, why exactly would he want to interact with it or pass commands to it? The thing is that when you're writing a malware analysis or reverse engineering report, you cannot say, "okay, it looks like this is doing this or trying to do that". You always have to be 100% sure about what you're writing in your malware report. And sometimes it can be hard to be a hundred percent sure, just by static analysis, of

what exactly this code is doing, because there can be a bunch of obfuscations that the malware author has put in there that make static analysis somewhat harder. One of my team leads, who had been doing reverse engineering for about 15 years, constantly used to tell us that you live and die by your analysis. What he meant was that even if you receive someone else's report, if you have to put your stamp or your name on that report and send it forward, you had better replicate it and make sure that every word written in there is correct. So you

cannot just rely on someone else. Whenever you're pushing something out with your name on it, you have to be 100% sure that it's correct, otherwise it will mess up your reputation, so to speak, because RE is hard stuff. And what better way to confirm what a command or a particular sort of interaction does than to actually reproduce it and see it happening on your screen? That's the reason why we might want to interact with the backdoor. So how exactly are you going to find out all the commands that you can use with that backdoor, or

that it can parse? That part is out of scope for this exercise, and that's where I come in. I've already reverse engineered this malware; I have all the commands, and I already know what it expects and what it wants to do. By the way, I will share a document with the in-depth analysis of this particular malware that I did myself, so you can go through it later and learn more about the internal workings of the malware. So let's discuss the particulars of this specific malware. This is a banking trojan, basically a Brazilian banking trojan. The way it works is that

it tries to detect any sort of online banking activity on the victim's system. Let's take a local example and say it was targeting Pakistan: if you were the victim and you opened Standard Chartered's or Habib Bank's website in your browser, it would detect that you are trying to open this website, and then it would spawn fake forms that correspond to that particular bank, so it can trick you into sending the C2 some confidential information, like your passwords, your PIN codes and so on. Since this is a Brazilian trojan, it mainly targets the Latin

America region, Mexico and Brazil, and that's why you're going to see a lot of Mexican language in the commands. The screenshot on the screen is of the bank that we used to trigger the malware activity. How does this particular malware get loaded? By the way, don't follow along when I do the walkthrough, because it can get you confused. These are the three files that will be in the archive that we distributed for this malware. The first file, ctfmon.exe, is actually a legitimate binary from Microsoft. The other two, msctfmonitor.dll and msctfmonitor,

those two are the actual malware binaries and malware code. It uses a technique called DLL side-loading, which means it uses a legitimate piece of software to get itself loaded. The way this works is that whenever a binary executes and there is a DLL that's been dynamically linked to it, in this case msctfmonitor.dll, which would usually be in the System32 folder, Windows searches a sequence of directories for it. The first directory that it searches is actually not the system directory; first it will try to

look in the local directory where the binary itself exists. That way you can force a legitimate binary to load, instead of the actual DLL from System32, a fake DLL that you've placed in its directory. So when you load up ctfmon.exe, it's going to search for msctfmonitor.dll, which should be in System32, but since it's also in the current directory, it will search for it there first, and that's how it ends up getting loaded. The second file, msctfmonitor, is actually the encrypted final payload, which the loader decodes, and then the backdoor starts its activity. Okay, let's quickly glance over some tools

for traffic interception. There are a bunch of tools available. The most famous one is Burp Suite, you have Fiddler, and then there is mitmproxy. To be honest, I personally prefer mitmproxy for malware analysis. I know that Burp Suite has functionality that other proxies don't, but that is specific to pen testing, and when it comes to malware analysis mitmproxy is much better. It's also a command-line tool, so it's easier to deal with. By the way, we won't be using these tools; I just wanted to mention mitmproxy so that anyone who doesn't know it can go and look it up. It's a really

good tool. For network simulation we have INetSim, which is a bit old, but for this exercise we'll be using FakeNet-NG, which was developed by FireEye. Then we have netcat; if you're not aware of it, people call it the network Swiss Army knife. You can run it as a TCP server or a UDP server, or you can connect to a TCP or UDP server with it, and do a bunch of other stuff. And then we are going to be using IO Ninja; you'll see that when we start the exercise. Some things to note before

we get to the exercise, about FakeNet. The default behavior of FakeNet is that for any port that you've not told it to handle in a specific way, it will just send back an echo response. For example, in the case of HTTP traffic it will send back an HTML page, but if you send a packet to a random TCP port, whatever the packet contains, it will just send it back to the original sender. The second important thing here is that by default FakeNet hijacks all the traffic, so regardless

of which port the traffic is headed to, it's going to hijack it. For example, if you have legitimate programs running on your server, say you have SSH running, and you run FakeNet, it's not going to let any traffic pass through to SSH. Instead it's going to redirect it to itself and deal with it depending on how you've configured it. Okay, so I'll quickly go over the communication flow of the backdoor, so you can get an overview of what we're going to try to accomplish during the walkthrough. The first thing the backdoor does is that it tries to

connect to two UDP ports. I've mentioned the UDP ports here, but that's not really important. So that's the first communication that it sends out: it sends a message to UDP ports on its C2 server, and from there it expects to receive a configuration to create an SSH tunnel. What that basically means is that the C2 will send back a port that it can connect to for SSH tunneling, and then another port that will be the termination point of the tunnel. Secondly, when it receives that, it will go ahead and actually create that tunnel with the C2. Now, if you notice on the slides, let

me just go into full screen. If you notice on the slides, there are some lines: the first two rows of lines are solid, but the ones below are dotted. What that means is that the communication that takes place in phases one and two is not inside the SSH tunnel, but after phase two any communication that takes place, the dotted lines, goes through that SSH tunnel. Once the tunnel is created, it will try to negotiate an encryption key with the C2. This will be a custom encryption that the backdoor has logic for inside it, and once the encryption key has been

negotiated, it will send out a beacon, which will be encrypted. By beacon I mean it will just send out some basic information about the host that it's running on. Once that's done, the backdoor is ready to exchange encrypted commands and their responses through that tunnel. So these are basically the five phases of the network setup that we are going to try to replicate in our lab. Some of the information that I've shared here is the malware's SSH credentials: this user and this password are the ones that the malware uses to log in to the C2. Then there's the format of the SSH config command that it gets, and some

other stuff, and we'll go over it in the labs. One more thing, though: at the top it mentions legacy KEX and ciphers. Basically, the key exchange algorithms and the cipher that this backdoor uses for SSH tunneling are no longer the default in servers, so you have to specify them explicitly in your SSH server's config, otherwise it won't be able to create an SSH connection or do pretty much anything. These are a bit outdated, but that's the requirement for it. Okay, so now let's dig into the lab.

So we're going to start with baby steps here; we're really going to start from scratch.

okay

Also, please don't follow along with me; just watch what I do and try to understand it. This whole walkthrough has also been written down in the lab manuals that you have, and you can take a look at them later and try to reproduce it. Okay, so let me just try to look up something. You should have the VMs set up the way it was explained in the lab setup guide, so the REMnux VM should be your default gateway at this point. We don't have a DNS server running on the REMnux VM at this point, so when I did an nslookup for Google, I basically got no response.

Let's see what happens when I start FakeNet. Right, so I'm going to look up google.com, and this is the IP that was returned to me by FakeNet. But obviously we have to make changes to this, so I'll go ahead and stop FakeNet. Right, and

if you followed the lab document, there will be two config files for FakeNet, because the first installation of FakeNet wasn't working properly. The second one, in the /usr/local directory, is the one that you're supposed to be using. Let's go to this directory, and the first thing we'll do is create a copy of the config file.

Right, and now we'll try to make some modifications to it. Let's find the DNS server settings.

Okay, so this is the chunk for the DNS server settings. I'm going to set this to the IP of my REMnux VM, but when you do this, just set it to whatever static IP you wrote down during the configuration of your VM. 172.16.218.203, that should be correct, I think. Yeah. So let's try starting FakeNet again.

Okay, so now we're getting back the correct IP. What this basically means is that no matter what domain the malware is trying to connect to, when it resolves it, it's going to end up connecting to this machine, which is our REMnux VM. We'll just quickly check if it's working.
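The check described above (every domain should now resolve to the lab gateway) can be sketched in a few lines of Python. This is only an illustrative helper, not part of the workshop material: the function name `find_unhijacked` is made up here, and the IP is the one mentioned in the walkthrough, so substitute your own VM's static address.

```python
import socket
from typing import Callable, Dict, Iterable

# Address of the REMnux VM as mentioned in the walkthrough; replace with your own.
LAB_GATEWAY_IP = "172.16.218.203"


def find_unhijacked(expected_ip: str,
                    domains: Iterable[str],
                    resolver: Callable[[str], str] = socket.gethostbyname) -> Dict[str, str]:
    """Return {domain: result} for every domain that does NOT resolve to expected_ip.

    An empty dict means the fake DNS server answered every lookup with the lab IP.
    The resolver is injectable so the logic can be exercised without a network.
    """
    mismatches = {}
    for domain in domains:
        try:
            resolved = resolver(domain)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            resolved = f"error: {exc}"
        if resolved != expected_ip:
            mismatches[domain] = resolved
    return mismatches
```

Inside the lab VM, calling `find_unhijacked(LAB_GATEWAY_IP, ["google.com", "example.org"])` should return an empty dict once the DNS setting has been changed; anything else lists the domains that still resolve elsewhere.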

Actually, I don't have to do this because...

Right, so we were able to get a connection from here, which means this is working. So we've done the DNS setup part. The second thing we are going to do now is set up the UDP response that the malware requires.

Actually, let me show you something. Just run IO Ninja.

We'll try a UDP flow monitor. This is kind of like Wireshark functionality, but I'm just going to be listening for UDP. I want you to see what the malware tries to do when we execute it. We'll shut this down so we have more space, and start listening here. These are just some DNS queries that are coming through and some other traffic on UDP ports. We'll try to execute the malware now, just so you can see the first sort of activity that it spawns. Going back, though: like I said, this malware only triggers when certain bank activity is noticed on the victim's machine. So what

we're going to do to fool the malware is open this page up in Chrome, so that when the malware runs it can detect this and start trying to communicate with the C2. We'll minimize it here, and let's start this. This will load up the malware now, and then we wait and see what happens.

Okay, so you can see a little bit of activity here. Let me see if I can filter this.

Right, so this was the message that was sent out by the malware; this is what it's sending out to the C2 to get the SSH tunnel config. What's happening right now is that the malware is expecting the response in a certain format, but since the default response of the FakeNet application is an echo message, it's just sending back the same thing to it over and over again. I think I can show you here as well. Let me do a UDP connection. No matter what domain you put in right here, it's just going to go to our VM. Right, so no matter what I type here, I

get the same message back. So that's what's happening right now. And if I put in 777 here, you will be able to see this message too. These are the messages that I just sent, and I'm getting back an echo response. So we need to fix this. Okay, let's just minimize this, and we'll shut down FakeNet again. FakeNet provides the capability to specify custom config files in which you can set up your custom responses, or custom scripts, to respond on certain ports based on certain messages. There is a template right here that we can use; we'll just make a copy of it

the sample, and we'll rename it to c2_response.ini.

We'll open it up. Okay, so we only want to set one fake UDP response, so we're going to remove everything that's not really needed here and just let this last part stay. You can rename it to something like c2 udp response. You can see that the second parameter here has a script provided to it, so if we wanted to, we could actually script the responses. That's not what's required here, but you can look that up later; it's pretty cool functionality. I'm just going to remove this and set a static string instead, and the string would be four new one two three four and twenty two. What this is

basically telling the backdoor is that it has to create the SSH tunnel on port number 22, and the tunnel will terminate on local port 1234. Okay, let's close it. Now we need to put the same information in our primary config file, so let's search for

this. This is the chunk that we're interested in this time. We'll add another entry to it, and we will just write the name of the file that we just created. Okay. You'll notice that this config says that this is for port 137, but the thing is that in default mode FakeNet-NG takes care of any traffic at all; it redirects all the traffic. You would have to specifically disable that for this setting to matter, so it doesn't really matter in this case. Now we start up FakeNet again, and let's see what happens this time around. Okay, I'll create the connection again and send it something random.

And you can see that now I'm getting back the string that is written there. So that sets up the first phase of the network for us. Let's go and look at the slide. Yeah, so phase one is completed. Now we want to take care of the second phase, which is basically that we need to set up the SSH server so that it can actually accept connections from the malware. The first thing we need to do is add these credentials: we need to add a user to our system, which is... let me just copy-paste that. [Music]
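To make the phase-one behaviour concrete, here is a minimal Python sketch of a UDP responder that answers every datagram with the same fixed payload, which is the effect the static custom response has in the walkthrough. It is a stand-in for illustration, not FakeNet-NG's implementation. `STATIC_RESPONSE` is a placeholder, since the exact tunnel-config string is given in the lab manual, and the demo binds to an ephemeral loopback port rather than the real C2 ports.

```python
import socket
import threading

# Placeholder only: the real tunnel-config string comes from the lab manual.
STATIC_RESPONSE = b"<tunnel-config-from-lab-manual>"


def serve_static_udp(sock: socket.socket, response: bytes, max_packets: int) -> None:
    """Answer the next max_packets datagrams with the same fixed payload."""
    for _ in range(max_packets):
        _data, addr = sock.recvfrom(4096)  # the request content is ignored
        sock.sendto(response, addr)


def demo_static_udp() -> bytes:
    """Start the responder on loopback, send one random message, return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    port = server.getsockname()[1]
    worker = threading.Thread(
        target=serve_static_udp, args=(server, STATIC_RESPONSE, 1), daemon=True
    )
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    client.sendto(b"something random", ("127.0.0.1", port))
    reply, _ = client.recvfrom(4096)

    worker.join(timeout=2.0)
    client.close()
    server.close()
    return reply
```

Swapping `sock.sendto(response, addr)` for `sock.sendto(_data, addr)` gives the echo behaviour seen before the custom response was configured.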

[Music] Let me copy the password. Okay, paste the password, and you can leave everything else empty. Right, so that deals with the user. The second thing that I talked about, I think I might have already set that up, but let's just take a look and see. Yes, I've added them already. These key exchange algorithms and ciphers will not be there by default, so you have to specifically add them so the connection can actually be established. So that's the second thing. What else? You have the user... yeah, there's one more thing. Okay, so let's start the server and see

what happens

Right, let's try to SSH to it. It's not working. There's a problem, and I'll explain what it is: the connection is stuck. I've actually already explained it. I told you that FakeNet-NG by default redirects all the traffic, so it hijacks traffic even if it's intended for legitimate services or servers. So what we need to do, rather than disabling that setting completely, because it's a useful setting, is just add port number 22, for SSH, to a blacklist. Okay.

Right here, the TCP blacklist. I'll add 22 here, and we'll start FakeNet again and see what happens now. Now you can get the connection; there's no need to log in, you can see that the connection goes through. Okay, one more thing. What I'll do right here is run a tail on auth.log, so that any connection that is made to the SSH server will now be displayed here. For example, let me try to SSH again. So you can see that... oh, the malware is already running. I need to revert my VM so you don't get

confused. Because the malware was already running, I'm going to revert the VM and then we're going to execute it again. One very important point: whenever you do this exercise, or pretty much any exercise that has to do with malware, please make sure that you have taken base snapshots where everything has been set up and everything is in a stable state, because any sort of malware is going to mess up your system. You need to be able to revert quickly so that you don't end up wasting your time, and if you make an error or a mistake you can

go back to square one and start from there. Okay, so actually it's already executed. If I may stop you for a second: we had a question, I think from YouTube, asking why it is that whatever input we provide through FakeNet, the response is always the same. Oh yeah, that's because I've set it up that way. The raw UDP listener chunk that we were dealing with is kind of a catch-all. There is a capability in FakeNet where you can specify different responses for different ports, but at the end of the day, if something has not been handled specifically, it's going to fall back to the raw UDP

listener or the raw TCP listener. In the scenario that we have here, because it's the only UDP connection that the malware is going to send out, there is no need for me to create a config that is specific; even if I put it in the catch-all config, that's going to work in my case. That's why it's sending the same response to every port or every sort of connection that it's getting. All right, awesome, thank you. Okay, so this is actually what I wanted you to see about the malware already. Well, let's do this one more time.
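The routing behaviour described in the answer above (blacklisted ports are left alone, specifically configured ports get their own listener, everything else falls through to the catch-all raw listener) can be modelled with a few lines of Python. This is a conceptual model of the behaviour as explained in the talk, not FakeNet-NG code; the function name, listener names and return strings are invented for illustration.

```python
from typing import Dict, Iterable


def route(port: int,
          blacklist: Iterable[int],
          listeners: Dict[int, str],
          default_listener: str = "raw-listener") -> str:
    """Decide what happens to traffic for a destination port.

    - Ports on the blacklist are not hijacked and reach the real service.
    - Ports with a dedicated listener are handled by that listener.
    - Anything else falls back to the catch-all default listener.
    """
    if port in set(blacklist):
        return "pass-through"
    return listeners.get(port, default_listener)
```

With `blacklist=[22]` and `listeners={80: "http-listener"}`, port 22 reaches the real SSH server, port 80 gets the HTTP listener, and a random port such as 777 ends up at the raw listener, which is why every unhandled UDP message in the demo received the same response.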

We will open up the bank page again so we can trigger the malware, and we'll go to ctfmon. Just keep an eye here, because the malware will try to... let me see on IO Ninja what's going on there.

Change the filter so you can keep an eye here as well. Right, and let's execute.

It should show up here when the activity starts. Right, so we got a new session from the user sigmund of era, which is basically our malware.

Okay, so we know that we've been able to at least let the malware attempt to create an SSH tunnel, but there is one thing that is missing from this. Let me...

I'm not sure if I'll be able to show you this in the log tail, but

no, it's not going to be in the log. What's going to happen is that once an SSH connection has been created, you remember the port number 1234 that we mentioned in the command? That's the tunnel termination point, and it's going to be a local port. So even though our SSH is set up, we don't have anything running on port number 1234, and that's the problem. What we need to do first is shut down FakeNet again, so we can add port 1234 to the blacklist and handle it independently.

Okay, go to the blacklist. Actually, I'm going to add another port here as well, so we don't have to keep coming back again and again; I'll explain the other port later. So we've added port number 1234. First I'll revert my VM so we can again do this from scratch. By the way, that's by design: to make you understand, we have to go back to square one every time. Now let's start FakeNet again, and this time, before we execute our malware, we will create a new session in IO Ninja. This time it will be a TCP server. Make sure to check "run as sudo" whenever you do this. The TCP server will listen on port

1234, and switch the interface to whatever local IP you're actively using for the REMnux VM. We'll start the server. Now let's go back to the malware and execute it again. So we know that we have the fake UDP response set up, we know that we have the SSH server set up, and now we're taking care of the local port where the tunnel should terminate. Let's see what happens now.
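The tunnel termination point is just a plain TCP listener on the local port. As a rough stand-in for the IO Ninja TCP server session used in the demo, a minimal Python listener that accepts one connection and captures the first message could look like the sketch below. The function names are illustrative, and the demo binds to an ephemeral loopback port so it can run anywhere; in the lab you would bind to port 1234 on the VM's interface instead.

```python
import socket
import threading


def accept_one_message(server: socket.socket, received: list) -> None:
    """Accept a single connection and store the first chunk of data it sends."""
    conn, _addr = server.accept()
    with conn:
        received.append(conn.recv(4096))


def demo_tcp_listener(first_message: bytes = b"hello-from-client") -> bytes:
    """Run the listener on loopback, connect to it once, return what it captured."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))  # the lab setup would use port 1234 here
    server.listen(1)
    port = server.getsockname()[1]

    received: list = []
    worker = threading.Thread(
        target=accept_one_message, args=(server, received), daemon=True
    )
    worker.start()

    # Stand-in for the client side of the tunnel delivering its first message.
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
        client.sendall(first_message)

    worker.join(timeout=2.0)
    server.close()
    return received[0]
```

Without such a listener the forwarded connection has nowhere to land, which is the missing piece identified in the walkthrough before port 1234 was handled.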

All right, so we got an SSH connection here, we got the port here, and now we've got a message from the malware. This is the first message that it sends out, and its purpose is to request a key negotiation for the encryption. So now we have a tunnel set up, and this message has come over the tunnel, so if anyone tries to listen at any point along this SSH traffic, he won't be able to see the plaintext coming through. Now, I already know what the format of the message

that it's expecting is, because I've already analyzed it. Let's try sending it something.

Right. Okay, so this command tells the backdoor (shape or shower, however you want to pronounce it) that this test key is the key that it's supposed to use for encryption whenever it sends any messages to me, and this is also the same key that I'm going to encrypt my messages with before sending them to it. Now we see that we've gotten a response, but it doesn't make any sense; it's not really readable. I'll try sending another command. Nothing is happening. It's a legitimate command, ping, for the malware, but it's not responding. The reason is that it expects the commands to be encrypted now. So what I'm going to do is, let's see

what this beacon really is.

okay so

Okay, so I've already reverse engineered the malware. I know what the encryption looks like, and I've written scripts for it. In the directory that we've shared with you there will be two scripts, encode.py and decode.py. Let's take a look at them. It's a simple encryption, not too complicated; it's just a XOR-based cipher, and you can see that we've hard-coded the test key here. So these scripts in this state only work for the test key; if you change the encryption key, they're going to fail. And similarly for decode.py; this is the decode algorithm. What I'll try to do is run this decode script and decode the beacon

that we just received. Okay, so this is the beacon when you decrypt it. PONG is the command, santa max is the bank that it detected, test name is the hostname of my machine, the victim machine (I can show you; right, so this is the hostname), 16 I don't remember what this was, and wmplayer is the process in which the malware is currently running. You can verify that if you go here; Process Hacker should be down below somewhere. Right, this. I don't have a Windows Media Player instance running, you can check, and the path is legitimate too. That's because what it does is execute Windows Media Player in

suspended mode: it creates the process, then hollows it out and injects its own code into it. So now wmplayer is no longer wmplayer; it contains the code for the malware, the final payload. Okay, so now the thing is that I want to interact with the malware, but it's obviously not responding to me. I already tried sending it PING, which is a legitimate command; that's because it's expecting the command to be encrypted as well. So what I'm going to do is go ahead and encrypt PING with the script that we have, encode.py. Right, and this is the encrypted form of PING. What we'll do is

we'll send it the encrypted version. Now you can see I actually got a response back from it. I can try to decode it again, but it's going to be the same thing: PING, PONG, that's kind of the echo message for the malware. Let's try to decode it; it's the same thing. Okay, so now we know how the malware works. So what have we done up until now? We've handled the SSH config request on the UDP port, we've created the SSH tunnel, we've created a termination point for it, and we've also handled the key negotiation part,

and we're able to manually decode it as well. But the problem is that if you want to interact with the malware and debug it at the same time, it's going to be too much of a hassle to encrypt your commands manually every time and then decrypt the responses manually; that's just going to waste a lot of your time. What we want is for this to be done on the fly, so that we don't have to do the same thing over and over again. So what I'll do is shut this down now.
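
A minimal sketch of what scripts like encode.py and decode.py could look like, assuming a repeating-key XOR with a hex wrapper for the wire format. The talk only says that the cipher is simple and that the key is hard-coded; the key spelling `testkey`, the hex encoding and the function names here are assumptions for illustration, not the workshop's actual scripts.

```python
from itertools import cycle

# Assumption: the negotiated test key, hard-coded the way the shared scripts hard-code theirs.
KEY = b"testkey"


def xor_bytes(data: bytes, key: bytes = KEY) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encode(plaintext: str, key: bytes = KEY) -> str:
    """Encrypt a command; the hex text wrapper is an assumption about the wire format."""
    return xor_bytes(plaintext.encode("latin-1"), key).hex().upper()


def decode(ciphertext_hex: str, key: bytes = KEY) -> str:
    """Reverse encode(): XOR is its own inverse when the same key is used."""
    return xor_bytes(bytes.fromhex(ciphertext_hex), key).decode("latin-1")


if __name__ == "__main__":
    wire = encode("PING")
    print(wire, "->", decode(wire))
```

Because the key is baked in, a round trip only works when both sides use the same key, which is the limitation mentioned in the walkthrough.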

Let me revert the VM one more time as well. We'll terminate the server that we had running on port 1234; we don't need it anymore. I'll show you a script that we have: there is a TCP proxy in the files that we shared. I didn't write this proxy, it's by NCC Group, but it's a really simple script and it works very well. We can't use mitmproxy or Burp Suite for this; we want a raw TCP proxy, not an HTTP proxy. We just want to specify a port to receive data on, we want to make modifications to that

data, and then we want to send that data forward, and we want to be able to do this in both directions. A cool thing about this TCP proxy is that you can give it custom scripts that it uses to transform the traffic it's sending out or receiving. I've written such a script (what's the name? encode, decode, right, this one). So this is the filter script that I've written for the proxy. It handles both the decoding and the encoding of the traffic, and if you come down here you can see that there is a specific check

for the commands PRINCIPAL and CHAVE: only if the message being sent or received does not match these commands will it attempt to encode or decode the traffic. The reason for this is kind of obvious, but I'll explain anyway. Oh, I've shut down the proxy. If you remember, these were the commands the malware used for key negotiation, so they were not encrypted. These were the only commands that were sent unencrypted, and after that the encryption took place. So that's why we have the explicit... Sorry, we had a question. So if you're analyzing this malware for the very first time, how can we actually identify the

response that we have to send to the malware in return for the port number and the port information it sends us, I guess PRINCIPAL and 3928? Yeah, so that's the hard part, right? That's where you'll have to do the actual reverse engineering. All the fundamentals that you went over yesterday, you have to really polish them; there's no other way around it. I'll get to that at the end: I'll show you my analysis report and I'll tell you why I included this in it. So this activity that we are doing right here builds on top of, I think I wrote this in my slides as well, that

this builds on top of proper old-school, hardcore reverse engineering, right? You cannot just jump to this stage; there is quite a bit of analysis required. The first thing is that, as you saw, the malware was being sideloaded. The second thing was that the final payload of the malware was actually encrypted as a blob on the disk. So you'll have to go through all those phases: you'll have to be able to decrypt the malware, then if there is any sort of packing you'll have to be able to unpack it, and if it is being injected, like it was in this case,

it was injected into wmplayer, that's the actual process where the final version of the malware is, the one you need to analyze. So that's where you'll dump it from, and then you'll open it up in IDA Pro, and at the same time you'll attach a debugger to wmplayer.exe and analyze it side by side, with a combination of static analysis and dynamic analysis inside a debugger. That's where your skills in assembly and debugging, and your experience, come in. So this is a later stage, but the reason I've set up the exercises this way is that, well, I wanted to go through this at the

end, but I'll say a few words about it. There is no way you can teach anyone malware analysis, let alone the concepts of reverse engineering, in seven or eight hours, or even seven days, or maybe even seven weeks. What we can do in such a workshop is take you through the fundamentals, which ishad did a very good job on yesterday. The other thing on my mind was that, because the fundamentals tend to be somewhat drier, people usually have more interest in stuff like this: interacting with the backdoor, doing encryption and decryption on the fly, and some other topics that we're going to get into later on, packing, binary

emulation, patching, all of that stuff. So my goal is to give you the baseline information so that at least you can understand what's going on here. But if you want to be able to do this with zero information in hand, start from scratch, reverse engineer the whole thing and then get to this point, you need to build up those skills first. That doesn't mean you can't do all of this in parallel, though: you can learn the tools and at the same time start to learn the fundamentals, assembly, debugging and all of that good stuff.

So that's... your response was actually great, amazing. We just have one side question. Let's say you responded to the malware using the command C-H-A-V-E, that particular segment. Now I know a lot of reversing goes into it, but could you possibly point to maybe one source on how someone who is just starting with malware analysis can identify command execution? That is, identify that yes, the malware goes through this command, it gets the encryption key or the decoding key, whatever that is, and then it proceeds with its usual operations. I hope that... Yeah, so the thing is that if you've managed to go through all the phases

that I explained and you're at the final payload, what you're essentially going to be looking for is this. For example, in the case of this malware all of the commands were encrypted, so the first thing that was needed was to figure out what the encryption is. Then you write a script for that encryption and decrypt all the strings that are in your malware. Once those strings have been decrypted, and you'd already have done quite a lot by that point, it will be quite easy to figure out the commands just from the strings. And in the case of backdoors there's usually a switch case, right? In

the switch (if you've ever written C you'll know how switch cases work), what happens is that there will be a socket API, for example recv, receiving something, and whatever buffer it gets is going to be compared against a bunch of strings. Whichever string it matches, it goes to that case of the switch statement and from there does whatever activity it was meant to do. So for example in the case of CHAVE, the malware will receive this string, decrypt it first, then check it against the list of strings, and whichever string matches, the case that will be

against it, it will go ahead and try to perform that activity. I hope that clears it up. Awesome, thank you. Okay, so let's get back to what we were doing. Since I've shown you the script, what I'm going to do now is run the proxy real quick. It should be -l, and we want to listen on port 1234 and redirect to port number 8080. If you remember, this is the port that I added to the blacklist earlier; this was the reason, because we were going to redirect the decrypted traffic to this port. And the other flag was -m and encode

encode, the encode/decode filter, right; you don't need to put the .py in this case. So we have this running now, and we'll also run FakeNet. And now we'll fire up IO Ninja, and this time we'll create a TCP server, but the port we'll use is 8080, and that will receive the decrypted traffic.
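
The filter logic described above can be sketched roughly as follows. This is not the actual filter script or the proxy's plugin interface; the function names, the command spellings `PRINCIPAL` and `CHAVE`, the key `testkey` and the plain repeating-key XOR are all assumptions made for illustration.

```python
from itertools import cycle

KEY = b"testkey"  # assumption: must match the key negotiated with the backdoor
# Key-negotiation messages travel unencrypted, so the filter passes them through untouched.
PLAINTEXT_COMMANDS = (b"PRINCIPAL", b"CHAVE")


def xor_bytes(data: bytes, key: bytes = KEY) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def is_negotiation(data: bytes) -> bool:
    """True when the message starts with one of the unencrypted negotiation commands."""
    return data.strip().upper().startswith(PLAINTEXT_COMMANDS)


def to_malware(data: bytes) -> bytes:
    """Analyst to malware: encrypt everything except negotiation messages."""
    return data if is_negotiation(data) else xor_bytes(data)


def from_malware(data: bytes) -> bytes:
    """Malware to analyst: decrypt everything except negotiation messages."""
    return data if is_negotiation(data) else xor_bytes(data)


if __name__ == "__main__":
    on_the_wire = to_malware(b"PING")
    print(on_the_wire, "->", from_malware(on_the_wire))
```

Keeping the two directions as separate functions mirrors the idea of a proxy that transforms traffic both ways, even though with XOR the two transformations happen to be identical.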

Okay, so now let's go back and execute the malware. Let's start the batch file again. I hope I've not skipped anything. Let's see, let's just do it.

now

Okay, yeah, we got it. So this PRINCIPAL message that you see is not coming in directly; this is not port 1234, this is port 8080, which means it's coming through the TCP proxy that we set up. So let's try to negotiate our encryption key.

Right, okay. So you see, the last time we did this the response was encrypted, but this time around our TCP proxy is decrypting it on the fly. And like I said, this is only going to... let's try another command first. I'll send PING, which was not working before. Nice, it's also accepting commands in plain text: our proxy encrypts them and forwards them, we get encrypted traffic back, and the proxy decrypts it and sends it back to us. This will not work with any other key, obviously, but let me just show you. Let's try abab as a key. Right, so now we're getting back garbage, because our script assumes that we're only supposed to use the test key. So we

will negotiate the same key with it again. Okay, so this part is done. We've done everything; now we're ready to try some commands that we know work with the malware, just for fun, and then we can move on to something else. Let me see which commands.
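
Before trying the commands, it may help to picture the receiving side as it was described in the Q&A: the buffer is decrypted, compared against a list of known strings, and the matching case runs. A minimal sketch of that pattern, not the malware's actual code. The handler table, the `#` separator (mentioned later in the Q&A) and the function names are assumptions; only the PING/PONG pair comes from the walkthrough.

```python
def ping(params):
    """Echo-style command: the walkthrough shows PING answered with PONG."""
    return "PONG"


# Hypothetical handler table standing in for the backdoor's switch statement.
HANDLERS = {"PING": ping}


def handle_command(raw: bytes, decrypt, handlers=HANDLERS):
    """Decrypt a received buffer, split it into command and parameters, and dispatch."""
    text = decrypt(raw).decode("latin-1").strip()
    name, *params = text.split("#")
    handler = handlers.get(name.upper())
    if handler is None:
        # Unknown commands are simply ignored rather than crashing the process.
        return None
    return handler(params)


if __name__ == "__main__":
    identity = lambda data: data  # stand-in for the real decryption routine
    print(handle_command(b"PING", identity))
    print(handle_command(b"NOT_A_COMMAND#1#2", identity))
```

When reversing, finding this comparison chain in the disassembly (after the strings are decrypted) is what reveals the command set.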

Okay, let's try to set up a fake form for the bank. These commands are listed in the lab manual. Let's see what happens. Okay, so you see it has now spawned a fake form on the victim's machine which says Santander, and it says something in Mexican, I think. It's basically trying to fool the user into thinking that the bank's application is loading, because these banks actually have standalone applications that can be installed on the system. Now let's try another one. There are a bunch of commands; I'm only going to go through a few, and then you can try the rest on your own. Right, so now it's asking the victim to send

his secret PIN or whatever. Keep in mind these messages are going to be in Mexican, I think, or Brazilian, whatever this is, but you get the gist of it. So I sent back the key: I typed some random numbers into it, I hit continue, and it received the token that I entered. So that's how it tries to steal it. And we can send... what was it, one second... hide fake. So hide fake will hide the form, and we can use show fake to show it again. Okay, we can try a few other commands. One is you can use color text to

Right, so with this command, whatever text you send, it pastes at the current cursor location. So I'm going to make the current cursor location the address bar of Chrome and send this, and you can see what happens. Right, so I sent this and it ended up being pasted in the address bar. There is one more I think we can try: send message. Send message does something similar, but instead it pops up a message box.

Okay, so it pops up the message box in the context of the current process; if you look at the name, it says Google Chrome, so you can basically use this to try to fool the user. And we'll try one last thing. There's a command, atv key, that you can use to enable keylogging. Okay, send. Now once I start typing, first we'll get some junk maybe, and then this is just

this. Right, so the thing is that we are trying to look at these commands in a raw sort of terminal, but they're supposed to be parsed by an actual client for the backdoor. So the keylogging it's doing is actually being sent back, but there is other stuff sent along with it that we don't really need, though the actual backdoor client can use or parse it. And we can also shut down the keylogging: with close key we can shut down the keylogging, and with kill we can kill the connection as well as the

process. Okay, let's just revert the VM, and that finishes the walkthrough. Now let's get to what you guys are supposed to do. For today we're going to skip any exercise time because we have a lot to catch up on, but you don't have to worry: everything that I've done is listed as a walkthrough in the lab manual. You're supposed to replicate this walkthrough, but you're going to change a few things: you're going to use workshop 2020 as the encryption key, you're going to change the port to 999 instead of 1234, and you're also going to try out all the commands that are listed here.

Make sure that you change these two things in all the places where it's needed, otherwise your connection is going to get corrupted at some point and you're not going to be able to do any of this. That's the whole point of the exercise. The other thing is mitmproxy: we usually use mitmproxy when we are dealing with HTTPS traffic. So this is another exercise. I know I haven't gone through mitmproxy, but it is pretty simple; you can go to the documentation. What you will do, at your own place, on your own time, is try to set up

a man-in-the-middle for HTTPS traffic and then redirect it to a FakeNet instance running on that same machine. It's very simple and you'll be able to do it. The last thing that I've written is a bit difficult, but let me see if I can open the document and show it to you. Okay, so this is the analysis document that I've shared in the Google Drive. I did this analysis myself; you won't find it anywhere on the internet. It includes all the details, even the ones we haven't discussed, all the technical details about the backdoor, and it will help you if you try to go back and reverse

engineer it yourself. There are a lot more commands than the ones we discussed, and those are all listed here. So there are two points to sharing this document with you. One is that you can actually take a look at what a malware analysis report is supposed to look like, because these are mostly internal, and blogs are different from this type of report, so you'll know how to write one and what information you're supposed to put in. The second point is that if someone is really interested in doing malware analysis, all the details that are listed here and some of the

stuff that I've mentioned make for a really good exercise: try to reproduce and verify all of the information in the report. I know you're obviously not going to be at this level yet, but you can try. It can take weeks or even longer, but it will be really fruitful. And what else? Okay, let me just bring it back here.

Yeah, so I've also written one more thing. This malware is written in Delphi. Delphi is a rapid application development language; you can say it's based on Pascal. So this tool that I've mentioned at the bottom of the slide, IDR: if you actually want to reverse engineer this, this tool is going to come in really handy, and you'll find all the details about how to use it and how to decompile Delphi with it. So this finishes up our first topic. Our next topic is going to be binary unpacking. I'm not going to give you any time for exercises, but if there are any questions we can take a

few minutes to answer those; otherwise I'll start. Yes, just before we begin the next phase, there was one question. Is there any possibility that maybe we can brute force the response that we have to send back, for example CHAVE test key? Of course, yeah, there is a possibility, but it's not feasible. I mean, you have no idea how long the command is going to be. It's possible; it's just like brute forcing a password. The backdoor is stable enough that if you send it a wrong command it's not going to crash, so

you can set up a script and maybe try. The thing is that the command is not just one keyword; there are going to be subcommands and parameters to those commands. So for example there is one command, then a hash, then a subcommand, then a hash, then parameter one, parameter two, parameter three, parameter four. So it is possible, but unless you're working on quantum computers or something like that, it's not feasible. There is no really easy way; you'll have to reverse engineer it. All right, I think that's it for the questions on this part. Okay.
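
To put a rough number on "possible but not feasible", here is a small back-of-the-envelope calculation. The alphabet size, maximum length and guess rate are arbitrary assumptions chosen for illustration; the real command grammar (keyword, `#`, subcommand, parameters) would make the space larger still.

```python
def candidates(alphabet_size: int, max_len: int) -> int:
    """Number of strings of length 1..max_len over the given alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


def years_to_try(total: int, guesses_per_second: float) -> float:
    """Time to exhaust the space at a fixed guess rate, in years."""
    return total / guesses_per_second / (365 * 24 * 3600)


if __name__ == "__main__":
    # Assumed: uppercase letters only, keywords up to 10 characters, 100 guesses per second.
    total = candidates(26, 10)
    print(f"{total:,} candidate keywords")
    print(f"about {years_to_try(total, 100):,.0f} years at 100 guesses per second")
```

Even with these generous assumptions the search covers a single keyword only, which is why reading the decrypted strings out of the binary is the practical route.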

Okay, so the next topic is binary unpacking, and we only have three or four slides for it, so I'll go through it quickly. So, how to make a malware analyst's job harder. I've written down a few things. There's packing, obviously; everyone's heard about packing, and we'll come to it at the end. There's virtualization, which is an additional layer underneath whatever packing might have been used. This virtualization is not identical, but in concept it's similar to what you see with your normal virtual machines, for example the JVM, the Java Virtual Machine, or something like .NET. So you have an intermediate layer of code

and then you have some data that gets interpreted and executed. The thing with virtualization is that there is no point in time where you will be able to get all of the code of the malware intact and unpacked in memory, because that's not how it executes. These packers convert the code into their own intermediate language, and then the VM acts as an interpreter and executes the code for that malware. It's really hard to get through these or reverse them. There are some common ones; VMProtect and Themida are the most infamous ones.

I haven't seen any malware analyst who comes across a Themida sample and says, okay, I'll reverse engineer this, because it's too time-consuming and the ROI just isn't there. It is doable, people have done it, but it's mostly for research purposes because it takes too much time. There are advanced techniques that people are now using: there is something called binary lifting, which basically lifts the code up to another intermediate language and then recompiles it back into x86. That's kind of the solution to this; it's still not mature enough, but people are working on it. Other stuff

that can make a malware analyst's job harder is obfuscation, anti-disassembly and anti-debugging. So what is obfuscation? Obfuscation is when you don't do something in the shortest way possible, so to speak. For example, you want to go from point A to B, but instead of going directly you go through small streets and take different turns, just to confuse whoever is tailing you as to what your intention is. That's a good analogy for obfuscation. With obfuscated code you'll not be able to judge what it's doing just by looking at it; you will have to either debug it and try to connect the points, or the other

option, and the better option, is to write something to remove that obfuscation. The other technique is anti-disassembly. Malware authors can use certain techniques so that when you try to open the binary up in a disassembler, for example IDA Pro, Binary Ninja, Ghidra or any of the other disassemblers out there, the disassembler is sort of going to error out; it's not going to be able to display the disassembly properly. And if you're not able to see the disassembly, you obviously won't know what's going on. Then anti-debugging: there's a bunch of techniques that malware authors use to detect debuggers and then either crash

the malware or try to fool you in other ways, so that you're not able to proceed further with the analysis. And then there are obviously different encryption layers, for example the encrypted strings that I mentioned with the previous backdoor. So these are a bunch of things that a malware analyst can come across and that can make his or her job quite difficult. Now we come to packing. Apart from the virtualization ones, there are some generic packers: UPX is the most common one, then ASPack, and I've written down a bunch of names. These are somewhat easier to unpack;

I mean, it takes time, but they're still relatively easy, especially UPX and ASPack, they're really easy, and there are also standalone tools available to unpack these. For example, if you search for UPX unpacker or ASPack unpacker you'll come across a bunch of tools, and some of them just unpack it statically; you don't even have to run it. And then we have custom packers. Custom packers are the ones that are specific to a particular family or a particular malware, and they'll vary from one malware to another; in some cases you can even detect the family just by the type of packer that's been used. And at least the custom packers that

I've come across, most of them are not so difficult to unpack, but some can be quite complicated. Okay, so when dealing with these common packers (I'm not talking about the Themida or Enigma sort of packers), when the malware is unpacking itself these are some of the APIs that it will be using: VirtualAlloc, HeapAlloc, VirtualProtect. The point of the alloc APIs is that once the malware unpacks itself, it obviously has to write the unpacked code somewhere, so it has to allocate memory, and it has to use some sort of API for that; that's where these come in. And then VirtualProtect:

obviously, once the code has been unpacked it also has to execute, so you have to change the permissions of those sections where the code is being copied so that they are executable, because if you don't change the permission to executable and try to execute the code, the malware is just going to crash. So when you're looking for a generic technique to unpack the common packers, these are a good set of APIs to look for. So what are the issues that we come across after unpacking? One is memory mapping; I'll get to this on

the next slide. The second is not being marked as a module. We have a few tools available that we use to dump unpacked binaries; this is once the malware has been unpacked and you want to dump the unpacked code. There's Scylla, there's OllyDumpEx, and there are a few others. Scylla, for example, only dumps unpacked malware if it's marked as a module, that is, if it's showing up as a DLL or an executable; if it's just in a chunk or blob of memory, it's not going to be able to dump that. OllyDumpEx can dump it, but OllyDumpEx is not good at

rebuilding the import table. So both of these tools have something missing. Then sometimes what the malware authors do is that after the code has been unpacked they strip or destroy the PE headers. That's another problem: even if you dump the binary, you won't have the headers intact, so you'll have to find a way to fix that. Then you can have the import address table relocated, so it's not at the location where it would normally be in the PE, and you'll again have to either set the address for it or rebuild it. And a distributed IAT means the import address table is broken down into different chunks. So then I've put

in a screenshot on the right side of the screen; this is just showcasing the import address table in the debugger. The import address table is basically the region where the binary, after resolving all of the APIs that it's going to use, lists all their addresses. So whenever the binary is executing and has to call an API, for example printf or CloseHandle or whatever, it goes back and looks into that table to know where the code for that API actually resides. That's one of the most crucial things in a PE. As I said, we'll talk about

memory mapped versus raw. I think shark went over some of this, but there are some things I wanted to show you that are not so obvious. When you look at the section headers you'll see that there are two types of addresses: one is the raw address and one is the virtual address. The raw address is where that particular section starts when the file is on the disk. But when that same binary gets loaded into memory, this address can change, and it almost always does; the section doesn't get loaded at that same address. And so the thing is

that this causes a problem when you have to dump a binary from memory manually, because it has been mapped and the addresses have changed. If you dump it now, then even after the dump the section offsets are not going to correspond to what the raw addresses say; they're going to look more like what the virtual addresses say. Actually, let me try and see if I can show you an example.
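
The two kinds of addresses can also be read straight out of the section table with a few lines of Python. This is a minimal sketch using only the standard library; the field offsets follow the published PE/COFF layout, while the function name and the output format are just choices made for this example.

```python
import struct
import sys


def parse_sections(pe: bytes):
    """Return (name, virtual_address, virtual_size, raw_address, raw_size) per section."""
    if pe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]  # offset of the PE signature
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF file header starts right after the 4-byte signature.
    num_sections = struct.unpack_from("<H", pe, e_lfanew + 6)[0]
    opt_size = struct.unpack_from("<H", pe, e_lfanew + 20)[0]
    table = e_lfanew + 24 + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        # Each entry is 40 bytes: Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData, ...
        name, vsize, vaddr, rsize, raddr = struct.unpack_from("<8sIIII", pe, table + i * 40)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), vaddr, vsize, raddr, rsize))
    return sections


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        for name, vaddr, vsize, raddr, rsize in parse_sections(f.read()):
            print(f"{name:<8} virtual={vaddr:#x} (size {vsize:#x})  raw={raddr:#x} (size {rsize:#x})")
```

Run against a file on disk, the raw and virtual columns will usually differ, which is the mismatch discussed above.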

i think i can close this vm for now

okay let's actually go to

i should have this order name

okay so this is showing the let me also open this up in

So this is what I was showing you: the raw address and the virtual address for the text section.

Okay, so this part right here is actually the PE headers, and I can show you if I go to the disassembly.

Okay, so from this 4000 address, all of this part is the PE headers, and it occupies about 400 hex bytes, until right here, and then our next section starts. Now if you look at this on the disk, if you open it up in a normal hex editor, this is exactly what it's going to look like: there will be no gap, and it will be a contiguous chunk of data. But look at the addresses here. This is showing the base address, but if you start from offset zero, you can see that right at this point, 400, instead of going on to 3e1, or actually 401, it

starts from 1000, right, so it skips a whole chunk of indexes there.
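
The jump from 400 to 1000 (both hex) is what you would expect if everything in memory is rounded up to a 0x1000 boundary. A small worked example of that rounding; the 0x1000 alignment is the usual page-sized value and is assumed here rather than read from the binary on screen.

```python
PAGE_SIZE = 0x1000  # 4 KB = 4096 bytes


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


if __name__ == "__main__":
    headers_end_on_disk = 0x400
    # The headers end at 0x400 in the file, but the first section is mapped at the next page boundary.
    print(hex(align_up(headers_end_on_disk, PAGE_SIZE)))  # 0x1000
```

The same rounding applies to every section, so even a section holding a few bytes occupies at least one full 0x1000 chunk once mapped.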

Let me do one more thing: I'll show you what it looks like once it's been mapped. Don't worry about what I'm going to do here; I just need to dump the mapped version of the binary so I can show you the difference. I can't do this in Olly or x32dbg in one go, because the sections are divided up into different chunks, so you'd have to do it a little differently. With WinDbg I can do it in one go.

it's okay if you're not able to read the text i'm just gonna dump it real quick and show you

so this is what

it should be done copy this back one second

And let me open the... so this is the mapped version of the same binary that I've dumped. Let's go back. Now when I switch between these two tabs you can see that the upper part is identical, right, the section part. But the difference is that in the file that is on disk, right after the PE headers, it starts with the next section (this is the text section, actually), whereas if you look at the PE file that I dumped from memory, the code does not start from here; there are just a bunch of

zeros and this will continue on

and it will only start... if you look here, you see that this is identical, but there is a gap between those two. So that's what happens: the virtual address is, in pretty much all cases, different from the raw address, and once a binary gets mapped into memory there is a gap in between the sections. So if you dump the binary as it is from memory, it's not going to correspond to the addresses that are written in the PE headers. I'll explain why that is; let me first close this.
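
One common way to make such a memory dump usable in static tools is to rewrite the section headers so that the raw address and size match the virtual ones, since the dump is laid out the way memory is. The sketch below shows that idea only; it is an assumption that this matches the fix-up demonstrated later in the workshop, and it deliberately skips other repairs (import table, alignment fields, image base).

```python
import struct


def fix_mapped_headers(dump: bytes) -> bytes:
    """Point each section's raw address and size at its virtual address and size."""
    out = bytearray(dump)
    e_lfanew = struct.unpack_from("<I", out, 0x3C)[0]
    num_sections = struct.unpack_from("<H", out, e_lfanew + 6)[0]
    opt_size = struct.unpack_from("<H", out, e_lfanew + 20)[0]
    table = e_lfanew + 24 + opt_size
    for i in range(num_sections):
        entry = table + i * 40
        # VirtualSize and VirtualAddress live at offsets 8 and 12 of the 40-byte entry.
        vsize, vaddr = struct.unpack_from("<II", out, entry + 8)
        # SizeOfRawData and PointerToRawData follow at offsets 16 and 20.
        struct.pack_into("<II", out, entry + 16, vsize, vaddr)
    return bytes(out)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 2:
        with open(sys.argv[1], "rb") as src, open(sys.argv[2], "wb") as dst:
            dst.write(fix_mapped_headers(src.read()))
```

After this change a disassembler reading the file will look for each section where it actually sits in the dump, instead of at the on-disk offsets the original headers describe.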

So the reason this happens has to do with how Windows memory paging works. A memory page in Windows is the smallest unit of memory that you can allocate, and the size of a page on x86 is 4 KB. That's the minimum size, and it's true for x64 as well, though I think you can have pages bigger than 4 KB on x64. So the thing is that memory for all these sections has to be allocated individually, because these sections are different entities, so that the Windows loader, or Windows, can

handle the memory for them individually. So they cannot all be in one page; they have to be in different pages. And we just said that the minimum size for a page on Windows is 4 KB, and if you convert 4 KB to hex it becomes this 1000: 4 KB is 4096 bytes, and converted to hex that's 0x1000 bytes. That's why, even when you have data that is less than 4 KB, it still has to occupy a whole page. That has to be the alignment, basically: even if it's less than 4 KB, it has to be in chunks of 4 KB. So when it's on the disk

there is no such restriction, but when it gets mapped into memory you start to have this restriction of 4 KB, and that is why the sections, when they are in RAM, are at different addresses from when they are on your disk. I hope I have been able to explain it, because I've known people who have been analyzing malware for quite a while and are still quite confused about this part. Okay, let me close this, save, and go back to the slides. Now we'll do a walkthrough of unpacking: first we'll pack something with UPX, and then we'll try to unpack it, and

we'll try to do that manually. The way UPX works is that it does self-injection, and I'll explain what that is. So let's just do something first: we'll copy cmd.exe to the desktop here, and then we'll pack it with UPX: upx, cmd.exe, -o, and the output name. Right, so we've got a packed version right here. Let's open both

and let's take a look at the sections of both. So this is going to be the unpacked version, everything is how it's supposed to be, and then we'll take a look at the packed version. You can see that the sections have completely changed: it's UPX, and it's going to be compressed, encrypted, all of that stuff. It's actually quite easy to unpack UPX, you can just do a upx -d and it will unpack it back, but that's not the point of the exercise. What we're going to do is try to unpack this manually. Okay, let's open this up in x32dbg.

Oh yeah, there is one more thing that I wanted to show you. I said that UPX does self-injection. There are two ways that malware can unpack itself. One is that it allocates a chunk of memory that's not in its image space, so it will be on the heap somewhere; then it unpacks itself, copies the unpacked code to that chunk, changes its permissions, and transfers execution to that chunk of memory. The other way, which is what UPX does, is that instead of allocating a separate chunk of memory on the heap, it writes the code that has been unpacked to its own memory,

to its own image memory. The way that works is, if you look at the section UPX0, you can see that the raw address is 400-whatever but the raw size is zero, and if you look at the virtual size, that is 3D000, so it's quite large. What that means is that when this file is on the disk there's going to be nothing in this section; it's not going to take any space because there's nothing there. But when the Windows loader loads this binary, it's going to create 3D000 bytes worth of space in memory for it, and it's going to be filled with

zeros. So it's going to be an empty region, filled with zeros, but it will be allocated. So the malware will now have an empty slate, a region that it can unpack itself to and copy itself to, and it will be in its own memory. So that's how UPX works. So what we need to do now is, let's see.
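The UPX0 pattern described here (raw size of zero, non-zero virtual size) can be looked for programmatically. The following is a minimal sketch using only the Python standard library; it reads the section table straight from the PE headers. The function names are my own, and the parser assumes a well-formed file and does no bounds checking.

```python
import struct


def parse_sections(data: bytes):
    """Return (name, virtual_size, virtual_address, raw_size, raw_address) tuples."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        off = table + 40 * i  # each section header is 40 bytes
        name = data[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        vsize, vaddr, rsize, raddr = struct.unpack_from("<4I", data, off + 8)
        sections.append((name, vsize, vaddr, rsize, raddr))
    return sections


def empty_on_disk(sections):
    """Names of sections that occupy memory but store nothing in the file."""
    return [name for name, vsize, _va, rsize, _ra in sections
            if rsize == 0 and vsize > 0]
```

A section flagged by `empty_on_disk` is not proof of packing on its own (an ordinary `.bss`-style section looks the same), but combined with section names like UPX0 and UPX1 it is a strong hint that the region is a landing area for unpacked code.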

Let me load up the packed version. Okay, I hope you've gone over some of the command shortcuts that I shared with you for the debugger. I'll try to use this in the beginning, and then I can just use the shortcuts, so you don't get confused. Once it's loaded we'll go to the entry point; we have to press F9 for this. So this is the entry point of the binary. Let's see what address this is loaded at: it's loaded at 498-something. Right, and what was the

so the image base 480 okay and we know that

Okay, so we know that the virtual address for this section is 1000. This is relative addressing, so what this means is that whatever the base address of the binary is, this section is going to start 0x1000 bytes down from it. We just took a look here and we saw that the base address for this is 498000; let's just write it down somewhere so we don't forget. Right, so this is the base address, and we know that the section it has allocated for itself, where it's going to copy the unpacked code into, starts at 1000 plus that base address. What

this means is that we will have 498 plus 1000, and so this is the address where that empty region of memory should start, and we can try to verify this. At the bottom left we have the hex dump; we'll do Ctrl+G and try to go to this address.
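The arithmetic in this step (image base plus the section's relative virtual address, with the size rounded up to whole 4 KB pages) can be sketched as below. The base address in the usage line is a made-up example, not the address seen in the debugger during the demo, and the helper names are hypothetical.

```python
PAGE_SIZE = 0x1000  # 4 KB, i.e. 4096 bytes


def align_up(value: int, alignment: int = PAGE_SIZE) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def mapped_range(image_base: int, section_rva: int, virtual_size: int):
    """Return (start, end) of a section in memory; end is exclusive."""
    start = image_base + section_rva
    return start, start + align_up(virtual_size)


# Hypothetical numbers: base 0x400000, section at RVA 0x1000, 0x3D000 bytes.
start, end = mapped_range(0x400000, 0x1000, 0x3D000)
print(hex(start), hex(end))
```

Going to `start` in the dump view before the unpacker has run should show the zero-filled region the loader reserved for the section.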

Sorry, one extra zero. Right, so you see this is all empty. If you scroll down it's going to be empty, empty, empty, because the binary has not been unpacked yet but the memory for it has been allocated. UPX is the simplest packer to unpack, and this technique is not going to work on other packers, so please be aware, but in the case of UPX we can try a very simple trick. We don't want to get bogged down in the details of how it is being unpacked, because it's going to be decompression algorithms, maybe some sort

of decryption as well. We're not really interested in it, and we're not going to be able to understand it; it's going to be complex mathematical stuff or something like that. So we just want to skip ahead. We know that once all of the code has been unpacked, the UPX unpacker is going to copy it into this particular region of memory, the one that we just calculated here. So what we're going to do is try to look for a jump. Once the code has been copied and once the permissions have been fixed, which I think are already fixed in this

case, it will have to jump to some place in this region of memory to start executing the original entry point. So inevitably it has to jump into this section. What we can try to do is look for a jump that comes after this section, and its target has to be close to the start of the section, which is 498-something. We'll just scroll down and read all the addresses. One thing is that it will be an unconditional jump: it will be jmp, it will not be a conditional jump like jne or je, based on

some condition. So it has to be jmp, and the target has to start with 4981 and onwards from that. So this is too large; these are all 49C, that's too far. We'll just keep scrolling down, down, down. Okay, this one looks quite similar. What address did we have? We had 4981000, and the section goes up to plus 3D000 bytes. So 498829A is past the start of this section and within its boundary, and there is nothing after this jump. That's quite a good indicator, in the case of UPX, that this is going to jump into

our actual binary, to the original entry point. So we're going to put a breakpoint here: we press F2 and it turns red, which means that we have now set a breakpoint. Then I'm going to resume execution so that it's running again, and you can look at this section here while I do that. It's going to get filled up, which means the packer is going to unpack this binary and copy it here, so this is no longer going to be zeros. I'm going to press F9; take a look at what happens.
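The manual search for the tail jump boils down to a simple filter: keep unconditional `jmp` instructions whose target falls inside the region where the unpacked code will land. Here is a small sketch of that filter, assuming the disassembly has already been exported as `(address, mnemonic, target)` tuples; that tuple format and all addresses below are hypothetical, and this is not a feature of any particular debugger.

```python
def find_tail_jumps(instructions, region_start, region_end):
    """Return (address, target) for unconditional jumps landing in the region.

    instructions: iterable of (address, mnemonic, target) tuples, where target
    is None for instructions without a direct branch target.
    """
    return [
        (addr, target)
        for addr, mnemonic, target in instructions
        if mnemonic == "jmp"
        and target is not None
        and region_start <= target < region_end
    ]


# Hypothetical listing from an unpacking stub.
listing = [
    (0x45F000, "pushad", None),
    (0x45F120, "jne", 0x45F100),   # conditional: ignored
    (0x45F1A0, "jmp", 0x45F180),   # unconditional but stays inside the stub
    (0x45F1F0, "popad", None),
    (0x45F1FB, "jmp", 0x40829A),   # jumps into the unpacked region
]
print(find_tail_jumps(listing, 0x401000, 0x43E000))
```

As the talk notes, this heuristic suits UPX because its stub ends with a single far jump into the unpacked section; other packers may reach the original entry point by different means.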

Right, it got filled up. So this is the unpacked code, and this is now going to jump to our original entry point. I'm going to press F8. Right, so this is actually where the unpacked code resides and where the actual execution starts. This is the perfect point to dump this binary and have ourselves an unpacked version of cmd.exe. We're going to use Scylla for it, and like I told you, UPX does self-injection, so you don't have to look into any other process or any other chunk of memory; we'll just look into the current process that we're running, which is the packed cmd.exe. The other thing we need to do is, let me

just show the address. So I'll do an IAT autosearch; it tries to find the import address table here. The important thing to note is that you have to change this original entry point, OEP, to match this address, otherwise the dumped binary is not going to work. I'm not sure why, but sometimes Scylla is able to properly fetch the OEP and sometimes it doesn't work, so you have to change it yourself. So it's going to become this address, 498-something. Okay, IAT Autosearch, and we get the imports. Scylla has the capability to rebuild the imports, so we do Get Imports

so that it can do that as well. First we'll do a basic dump of it; right here should be fine. So now it has been dumped and it has been unmapped. The thing that I was showing you, the difference between raw and virtual, and mapped PE and unmapped PE, Scylla already fixes that. But it still hasn't fixed the import address table, so we'll do Fix Dump and give it the binary that we've just dumped, and see: import rebuild success. It has rebuilt the imports, so it should work now, right? We've dumped it, we've rebuilt the imports. Let's run it.

This is the one. Oh, it doesn't look like it's working; it's actually not going to work, it's going to crash, and there is a reason for that. Ideally it should have been showing the prompt, but instead it spawned the debugger, so that's a problem, and now we have to identify the problem and fix it. I'll quickly tell you what the issue is: first I'm going to fix it, and then I'm going to tell you why exactly I did that. Let me copy this. This is a neat utility, setdllcharacteristics, that you can use to turn off ASLR on any binary. So if

you don't know what ASLR is: ASLR is address space layout randomization, which is a fancy term. What it means is that for a binary on which ASLR is enabled, every time you run it its base address is going to change. For example, the first time it loads at 100000, the second time it might load at 300000 or something, and it's going to keep changing every time. So we want to disable ASLR on this particular binary. I'm going to delete the dumps that we just made so we don't get confused, and run setdllcharacteristics, and

we're going to disable it. Oh, I already have it open in my debugger, so I have to close that. Okay. So it's showing that in the original characteristics dynamic base was one; dynamic base is basically ASLR, and it has now turned it off, so dynamic base is now zero. And now I'm going to quickly run through everything that I did earlier on.
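What the utility changes is a single bit in the PE optional header: the `IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE` flag (0x0040) in the `DllCharacteristics` field. The following Python sketch clears that bit on an in-memory copy of the file. It is an illustration of the idea, not the tool's actual implementation; the function names are my own, and it does not recompute the header checksum.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040
# DllCharacteristics sits 70 bytes into the optional header (PE32 and PE32+),
# and the optional header starts 24 bytes after the "PE\0\0" signature.
_DLLCHAR_OFFSET_FROM_PE = 24 + 70


def dll_characteristics_offset(data) -> int:
    """Locate the DllCharacteristics field inside a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return e_lfanew + _DLLCHAR_OFFSET_FROM_PE


def clear_dynamic_base(data: bytes) -> bytes:
    """Return a copy of the PE image with the ASLR (dynamic base) bit cleared."""
    buf = bytearray(data)
    off = dll_characteristics_offset(buf)
    (flags,) = struct.unpack_from("<H", buf, off)
    struct.pack_into("<H", buf, off, flags & ~IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
    return bytes(buf)
```

With the bit cleared, the loader will normally try the preferred `ImageBase` from the headers, which is what makes the hard-coded addresses in the dumped file valid again.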

let's

thank you

So this is the final one. Let's try running it now. And now it's running fine, which means we've finally been able to unpack it successfully. Now I'll explain what the problem was and why we had to turn off ASLR. Let's close everything and open this one, and the second one that we want to open is the actual packed one. Okay, so let's take a look at the sections for the packed one. The section names are UPX0, UPX1, because we packed it with UPX, obviously. Now we'll take a look at the section names of the unpacked one. Let me see if I can run another instance

of the yeah it will be easier

sections and

Okay, so you can see that on the right we have the packed version and on the left we have the unpacked one. The primary difference here is that .SCY has been added; we're not going to go into the details of that. The resource section and everything else is already there. The main difference of note here is that the raw size is no longer zero, because the section on disk is no longer empty: it has the actual final code that we just unpacked. That's a good thing. Now let me show you what the original cmd.exe, without any packing, looks like.

So these are the section headers for the original cmd.exe. The names of the sections can be important, but let's not get into that for now; it's fine if the name is not .text or .data or .rsrc, the sections do contain the same data. But the original binary has one additional section, the .reloc section, and our unpacked version does not have that. So what exactly is a .reloc section? A .reloc section contains information about any patching that needs to be done within the binary's code when the base

address changes. If a binary gets loaded at its preferred base address, the base address that has been mentioned in the headers, like for example this one, there does not need to be any modification to the binary's code in memory. But if this address changes, which it does in the case of ASLR, if you have dynamic base, then this is the section that provides information as to which parts of the code need to be patched in memory for that change to not corrupt everything. And I'll give you an example of what sort of data might need to be patched.

So I've increased the font size; it looks a bit ugly, but I hope you're able to see. There are two types of call instructions. Let me see, we already have one here, but let me do another one. One type starts with E8, and the other four bytes are relative to the current position of that call instruction, so even if the base address changes this call is not going to be affected; it's going to function the same way. But there is another type of call that starts with FF 15, which was this one, but I'll write another one just so you can see it

here. It's called an indirect call, and the addresses it refers to are hard-coded. So for example, let me do this: FF 15, and I'll put in some address from bcaa2

0c. Right, so this has been converted into this, so this address is now hard-coded. The problem is if the base address changes to something else. For example, the base address in this case is something like

the base address starts from 4AC, that's why the address looks like that. But if this was starting from 5-something, for example, and you take a look at this address, it's still going to say 4A-something, but the base has changed, so that means this address needs to be patched. Our original binary, cmd.exe, had the information to patch all these addresses, but the version that we dumped after unpacking from UPX does not contain that section anymore. The reason for that is that UPX takes care of that itself: whenever it loads up the unpacked binary, it already knows

which locations to patch in the code, so it patches them itself. But once you dump it and it becomes a standalone binary, UPX is no longer there to patch it, and we also don't have the relocation section anymore, so it's going to be problematic whenever the base address changes. Whenever you run the binary it's going to be at a different base address, these addresses will become invalid, and there are similar jumps that have hard-coded addresses, so those need to be patched as well, and it's just going to crash. So in order to avoid this we have to turn ASLR off on the binaries, the

ones that we get back from dumping from UPX. This is not true for binaries that don't do self-injection, and there might be some packers for which you can dump the relocation section intact, but in the case of UPX that's not true. So that's the reason it was crashing. Okay, so that finishes this first walkthrough; we'll see what's next. Let me close down the unnecessary stuff.
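The difference between the two call encodings can be shown with a few lines of Python. This sketch assumes 32-bit x86 code, where `E8` takes a displacement relative to the next instruction and `FF 15` takes an absolute 32-bit address of the pointer it calls through (on x64 the `FF 15` form is RIP-relative, so the argument below does not carry over as-is). The function names and all addresses are hypothetical.

```python
import struct


def rel_call_target(call_address: int, code: bytes) -> int:
    """Target of an E8 rel32 call located at call_address."""
    if code[0] != 0xE8:
        raise ValueError("not an E8 call")
    (rel,) = struct.unpack_from("<i", code, 1)
    return (call_address + 5 + rel) & 0xFFFFFFFF  # 5 = instruction length


def rebase_indirect_call(code: bytes, old_base: int, new_base: int) -> bytes:
    """Patch the absolute operand of an FF 15 call for a new image base.

    This is the kind of fix-up that a .reloc entry asks the loader to apply.
    """
    if code[:2] != b"\xff\x15":
        raise ValueError("not an FF 15 call")
    (ptr,) = struct.unpack_from("<I", code, 2)
    return code[:2] + struct.pack("<I", (ptr + new_base - old_base) & 0xFFFFFFFF)


# The relative call needs no patching: its bytes are identical at both bases
# and the target simply moves along with the image.
e8 = b"\xe8\x10\x00\x00\x00"
print(hex(rel_call_target(0x4A001000, e8)), hex(rel_call_target(0x50001000, e8)))

# The indirect call embeds an absolute address, so its bytes must change.
ff15 = b"\xff\x15" + struct.pack("<I", 0x4A002000)
print(rebase_indirect_call(ff15, 0x4A000000, 0x50000000).hex())
```

Without a relocation table there is no list of where such absolute operands live, which is why the dumped binary only runs reliably at the base address it was dumped from.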

Okay, so the UPX walkthrough is done, and we're going to do one more unpacking, just a quick one. This is going to be the unpacking of an actual malware, so when you do this please be careful not to execute it outside your VM. All right, before we move on to the next malware sample, there was a question: for example, more advanced APT malware will pack the same sample maybe twice, maybe thrice, so what would be your preferred approach to dump the actual binary from that packed malware? I didn't understand that; you mean there are multiple layers of packing? Oh yeah, exactly.

Yeah, so in any case you have to get to the final payload, so you have to take the shortest way possible through all the stages of packing, obfuscation, whatever that is. Whenever you're dealing with any malware, the goal is to spend all of your time on the actual final payload. For example, it would have been really stupid of me if I had ended up looking into UPX's unpacking code, because that's just not relevant. There can be multiple layers, and actually there are multiple layers in a lot of cases, so you have to unpack all of them one by one, and then you have to get to the final payload and

dump it. So basically you have to drill down and do it; there is no shortcut. Actually, the one that I'm going to unpack now has one extra stage. We're not going to look at it, we're just going to skip it, but that will kind of be a hint to whoever asked the question.

All right, so there was one other question; I think the asker has made it specific to a particular malware, that is Dridex: how can you reconstruct imports in the case of that particular malware, if you've actually analyzed it before? There are multiple ways; even in the case of Dridex it depends on what state the malware is in. There are multiple tools that you can use. I remember one is ImpREC, and there is another one, I'm forgetting the name, Universal Import Reconstructor, I think. What that tool does is, when you have, I'm not sure if I have it on here

because it's not included in FLARE VM, that's for sure, so I'm probably not going to have it here. What that tool does is, whenever a process is running and there might be multiple chunks or modules loaded into it, you can attach that tool to that process, specify a starting address and an ending address, and tell it to scan for all the imports within that memory region. The thing is that even Scylla, the tool that we used, does the same thing: it also scans for imports and then it rebuilds

the table. But the problem with Scylla is that if the unpacked code is not marked as a module, it's not marked as a DLL or an EXE, you're not going to be able to use Scylla on it. But with UIF, or Universal Import Reconstructor, you can even scan chunks of memory that have not been marked as a module. So it doesn't matter whether it's a DLL or not as per the headers or the internal structures; it's going to scan it anyway. Then you tell it, okay, at this particular address rebuild the import table, and it rebuilds the import table, and then you can dump

it and it should work. There is actually a hack to make Scylla work also, but that's going to be a bit complicated if I start explaining, because it requires modifying the load-order module list to sort of fool Scylla into thinking that this particular chunk of memory is actually a module, which it is not. So you can do that as well, but it's quite out of the scope of this exercise. All right, thank you for the answer, let me continue. Thank you. Okay, so we'll quickly unpack the, yeah, I've already extracted it. I'm going to use Olly for this one. Like Ershad mentioned in his topic also, Olly already has a built-in

mechanism to load DLLs, x32dbg also has

Okay, so again, I know that this malware is packed, and I'm not interested in how it's packed or in looking at any of those sorts of internals. Like I mentioned, there are a few APIs, in the beginning on the slide. For common packers, even custom ones, and this is a custom packer, which are not using any anti-debugging techniques, you can follow a certain sequence of breakpoints to try to get to the final payload, and that's what we're going to do now. What I'm going to do right now is set a breakpoint on VirtualAlloc. I think the font will be

too small, but I'm basically setting a breakpoint on VirtualAlloc. So I'll set a breakpoint here and then I'll let the malware execute. Right, so my breakpoint has been hit. I'll do this from here so you can see it: there is an option to execute till return. When I do this it will execute till here and then stop, and once it reaches here I'll be able to get the address of the buffer that VirtualAlloc has allocated. So let's just do this. Right, so the buffer address will be in EAX. What I can do is

okay, so this is the newly allocated buffer that we've gotten. What I'm going to do is add a hardware breakpoint. I'm not sure if Ershad went over hardware breakpoints, so I'll just give a quick overview. There are three types of breakpoints. There are the normal execution ones that you can put anywhere in the code to stop execution, like I did. Then there are memory breakpoints. What a memory breakpoint does, like I've already talked about, the smallest unit of memory that Windows can manipulate or deal with is called the page, and the minimum size for it, or on x32 the only size for it, is

4 KB, and the permissions also apply to that whole 4 KB chunk of memory. With a memory breakpoint, what the debugger does is change the permissions of that memory page. For example, if you wanted to break whenever something was written inside that memory page, it will remove the write permission from it. Once the write permission has been removed, when anything attempts to write into that page an exception will be generated, and the debugger will handle that exception and break at that code, and then you can proceed forward. But the problem with memory breakpoints is that even if you want to

monitor just one byte, like for instance we will only be monitoring one byte here, regardless, it sets up that breakpoint for pretty much the whole page, the whole 4 KB chunk. Even though all of this happens behind the scenes, there are repercussions from that, so I don't really like memory breakpoints, but sometimes you do have to use them. Hardware breakpoints are completely different. Hardware breakpoints are actually handled by the CPU, as in the actual Intel CPU in this case. There are a bunch of debug registers, I think six or eight, I don't remember exactly, so you put, or rather

the debugger puts the address of the location that it wants to break on into one of them, based on the access type that you specified, and the CPU itself generates the exception for that breakpoint. I mean, it's a lot of mumbo jumbo, but basically, if I set a hardware breakpoint here, I'll show you, and I set write access with one byte. You can set it for byte, word or dword; a word is two bytes and a dword is four bytes, but I always go with byte, and there is a reason for that, but maybe I can explain that later. So we will set a breakpoint for one byte with write access. What that means is that when I continue

execution, whenever something attempts to write to this byte, the code is going to break. So let's see what happens; I'll continue the execution. Okay, so it broke right here, and you can see that it says E8 here. So an E8 byte was written to this chunk of memory, and that's why our breakpoint triggered. Right now I'm just going to remove this breakpoint because I don't need it anymore. Almost instantly I'm able to tell that this is not the final payload, because the final payload should have started with MZ; that's how the header for an EXE starts. E8 is actually the opcode for call, so it seems like it's

unpacking an intermediate stage. You can call it shellcode; it's not exactly shellcode, but you can call it shellcode for the purpose of this exercise. That's what the earlier question was about, what if there are multiple layers. In this case there are at the very least two layers; this is the first one, and we're not really interested in it. But let's see: if I keep on executing it, and you're looking at the bottom-left hex view, it's going to keep filling up. So it is just unpacking the intermediate stage. I'll just let it run; I'm not interested in it, like I said. So you see that another VirtualAlloc

breakpoint has been hit, but at the same time this region has been filled. Let me show you something: I'll go to this address in the disassembly view, which is 11A3-something. You can see that this is actual code; you can call it shellcode. This is the second layer, so this actually gets executed, and then this unpacks the final payload. But we're not interested in it, so we're just going to skip it. We saw that another breakpoint was hit, so we'll see what that was: another VirtualAlloc breakpoint. Again we'll run till return, and this time we don't have anything in

EAX, so nothing was allocated. We'll let it run again, and another breakpoint is hit. We'll execute till return, and now we have another address. Let's see if we're able to access it. We're not able to access it, even though it has been returned, and the reason for this, actually this will be an exercise for you: you can look up the difference between committed memory and reserved memory. This is a reserved address; that's why we're not able to access it. But let me run again: another breakpoint, and this time we have the same address, but now if we try to go to it we will be able to access it, because it has now

been committed. Okay, so what I'm going to do is repeat what I did earlier on. Let's first remove the VirtualAlloc breakpoint.

We'll set up a hardware breakpoint again, write access, and let it run. Right, so you can see that it says 4D 5A, which is the magic number for an EXE file; that's how the PE header starts. So we can be reasonably sure now that this looks like our final code. But the problem is we don't know when this whole chunk is going to be completely unpacked. We can't just keep stepping through, because that's really slow, that's going to take a lot of time. So we need a solution for that, and what we're going to do in order to overcome that is we're going

to use that other API that I talked about, VirtualProtect. Like I said, in common cases the first thing the malware is going to do is allocate memory for the payload to be unpacked into, and once it's unpacked it has to jump to execution inside that chunk. But before it can do that it has to change permissions, so it has to make that region executable. For example, if I take a look at this region, it starts from

one two E, four zeros. Oh, the font is too small, but I was basically going to show you that this section is not yet executable. The font is too small, so we'll just go over it. What I'm going to do is remove this hardware breakpoint, we don't need it anymore, and I'll set a breakpoint on VirtualProtect right here. Let's see what happens.
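When the VirtualProtect breakpoint hits, the interesting argument is the new protection value (`flNewProtect`, the third parameter). As a small illustration, this Python sketch decides whether such a value makes the region executable. The constants are the standard Windows page-protection values; the helper name is my own.

```python
# Windows memory protection constants (winnt.h values).
PAGE_READWRITE = 0x04
PAGE_EXECUTE = 0x10
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40
PAGE_EXECUTE_WRITECOPY = 0x80
PAGE_GUARD = 0x100

_EXECUTE_MASK = (PAGE_EXECUTE | PAGE_EXECUTE_READ
                 | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY)


def is_executable(protect: int) -> bool:
    """True if a flNewProtect / page-protection value grants execute access."""
    return bool(protect & _EXECUTE_MASK)
```

A call that flips a freshly written buffer from a non-executable value to one where `is_executable` returns true is the usual sign that the unpacker has finished writing and is about to transfer control.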

Okay, so the VirtualProtect breakpoint has now hit, and if we scroll down we can see that pretty much all of the chunk has been filled. So what that means is that, in all likelihood, the malware has been completely unpacked now. What we need to do now is dump it, and I'm not going to use any tool to dump it; I'm going to do it manually so that you can learn. The first thing is that I cannot use Scylla on it anyway, and the reason for that is not that I'm not using x32dbg; the reason is that this particular address

has not been marked as a module; to the operating system it's just a normal blob or chunk of memory. So I'm going to dump it manually. So one two E, four zeros; you'll not be able to see the addresses here, but just trust me that I'm dumping it. I'll right-click the region, go to Dump, and now I'm going to right-click again, because you have to open it in a separate window to be able to dump it to disk, and now we have an option: Backup, and save to file. Right, and I will name it just "unpacked". Okay, and I can close the debugger now; I don't

need it anymore. Okay, so let's take a look at it. I mean, we've already looked at it in the hex view, but anyway, let's just open it once again.
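The quick judgement made earlier from the first bytes written to each buffer (E8 meaning an intermediate code stage, 4D 5A meaning a PE file) can be written down as a tiny classifier. This is a rough sketch with hypothetical labels; a leading E8 only suggests code, and a real triage would look at much more than a few bytes.

```python
import struct


def classify_dump(buf: bytes) -> str:
    """Very rough first-bytes triage of a dumped memory region."""
    if buf[:2] == b"MZ" and len(buf) >= 0x40:
        (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
        if buf[e_lfanew:e_lfanew + 4] == b"PE\0\0":
            return "pe"        # DOS header plus a valid PE signature
        return "mz-only"       # MZ magic but no PE signature where expected
    if buf[:1] == b"\xe8":
        return "maybe-code"    # starts with a call opcode, like the first stage
    return "unknown"
```

Running this on the file that was just saved should report `"pe"`, which matches what the hex view shows, even though the section table inside it still describes the on-disk layout rather than the in-memory one.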

Let me just, I'm having issues with my touchbar. Okay. So if someone was looking at it, even if a malware analyst was looking at it, this is a perfect executable: this is the header, the DOS header, all of that stuff, these are the section headers, everything is there. But the problem is if you try to open this up, and you can even rename it to, let's say, .dll; the extensions don't matter, but you can rename it. Let's try to open it in IDA. This is the one, unpacked. Right off the bat we've started getting errors, so it's not able to properly open it, but

let's see what happens

It will create functions for it, because that's how IDA is designed: it's designed to find functions, to find code within chunks of memory, and it does have headers, so there is some indication. But if you go to the imports table there is nothing there, and this means this binary is pretty much useless. In all likelihood not all the functions have been disassembled either, and there are other errors in it as well. The reason for this is exactly what I explained to you on this slide, and when I showed you the example of the mapped PE and unmapped PE, raw address and virtual address.

So what we need to do is fix the raw addresses for this dumped binary for it to actually load up in IDA, and for us to be able to do any sort of analysis on it. So we'll go to PE-bear again.

Okay, let's go to the section headers. I think I've given you enough background, so I'm not going to go over it again; if you're still confused you can try rewinding the video of the stream, and you'll understand it once you go over it again. The way to fix this is to match up the raw addresses with the virtual addresses, because the virtual address column was the one that was accurate when the file was mapped in memory, and even though we've dumped it to disk, we have dumped it as-is, exactly as it was laid out in memory, so this column is still true.

The raw address column is no longer valid. So what we'll do is modify these raw addresses to match the virtual addresses.
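The same fix can be scripted. The sketch below walks the section table of a mapped dump and copies each section's virtual address into its raw address field. It also copies the virtual size into the raw size field, which the talk does not mention explicitly but which is commonly done alongside the address fix; treat that part as my assumption. The function name is hypothetical and the code assumes a well-formed header.

```python
import struct


def realign_sections(data: bytes) -> bytes:
    """Make raw address/size equal virtual address/size for every section.

    Intended for a PE image dumped as-is from memory, where file offsets
    and RVAs coincide.
    """
    buf = bytearray(data)
    if buf[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    if buf[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", buf, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", buf, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size
    for i in range(num_sections):
        off = table + 40 * i
        # Section header layout: +8 VirtualSize, +12 VirtualAddress,
        # +16 SizeOfRawData, +20 PointerToRawData.
        vsize, vaddr = struct.unpack_from("<II", buf, off + 8)
        struct.pack_into("<II", buf, off + 16, vsize, vaddr)
    return bytes(buf)
```

After this change a disassembler that trusts the raw columns reads each section from the offset where the dump actually stored it, which is what lets the import table resolve.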

Okay, and now we'll go here and save the executable as, what, unpacked fixed; I think it's a DLL, yeah. We'll close this and go to IDA again, and let's try to open this up now. Okay, so at least in the beginning we haven't gotten any errors; that's a good indication. And right from the beginning we're also starting to see API calls, and if you're seeing API calls that means the import table has to be correct. So we go to the imports section and we can see that all of the imports are fine now. So we've successfully unpacked custom-packed malware that had at least two stages.

Yeah, so I'll close this down, and this finishes the unpacking section. For exercises, not right now but at home, go over the stream again; I've also listed all of the steps that I did in the lab manual. What you're going to do is replicate both of these processes: pack the command prompt with UPX and then unpack it just the way I did, and then do the same thing that I did with the custom-packed malware. And the third thing you're going to do is, instead of the cmd.exe that we used this time, try to use any other

random binary and try to pack it with UPX and then try unpacking it okay so that finishes our second topic as well the next topic is going to be about binary patching and i'm not sure if there are any questions or not i'm going to take a five minute break because my throat is hurting a bit and if there are any questions you can let me know otherwise you just chill all right that's fine i guess we can take a break and continue after five minutes oh maybe let's make it 10 minutes i think five minutes will be too short right yeah 10 would be nice my throat is really bad yeah okay let's see you guys after 10

minutes

okay guys we are live okay everyone so now we're gonna start off with binary patching and we're gonna try to speed it up even more because there's still a lot of stuff to go through even with a shot okay so sorry before we begin with binary patching there was a question before the break for custom packers how can you identify the crypto or compression algorithms which malware or adversaries often use and are there any good static analysis tools which you normally use in your workflow to identify these algorithms so please do something yeah there are a few tools the plugins

actually the one that i use is FindCrypt i think it's FindCrypt2 and there was another one signsrch so for example the best way to identify crypto and hashing is via constants pretty much most of the at least the block ciphers like AES and DES and so forth use some constants in their encryption mechanism so that's usually how tools like FindCrypt2 and signsrch identify them and the other thing is that one sort of trick is that whenever you see a constant a dword or something in

disassembly just google it right and if it's an encryption you will end up finding the code for it and even FindCrypt and the other plugin that i mentioned signsrch don't have constants for all the encryptions that are available so the best way is to just pick the constants that you're seeing in the code and try googling them and in most cases you'll end up with the exact encryption algorithm and the other thing is that some cryptos you can identify just through experience for example RC4 is really easy to identify and that's probably one of the most common ciphers

that are used it's relatively simple it's a stream cipher and usually with encryption you will find somewhere that there is an xor operation taking place that's true for pretty much all the stream ciphers and all the block ciphers whereas in case of decompression there usually isn't an xor when you look at it there's gonna be a bunch of mathematical code and weird stuff that doesn't make sense but at the same time it will be taking an input and giving you an output and the size of the input will be smaller than the size of the output that's a good indication that this is a decompression algorithm and then you

i don't think that it's easy to identify compression through constants but then again it's somewhat experience based and i think signsrch does identify some compression algorithms so identifying the exact type of compression is a little bit trickier than identifying the encryption but the good thing is that if you come across compression it will be something like LZNT1 there are a very limited number of compression techniques that are used on the other hand anyone can create an encryption a custom xor encryption and there are a bunch of ciphers out there so that's how it goes awesome thank you
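The constant-hunting idea from this answer can be sketched in a few lines. This is only an illustration of the approach, not how FindCrypt2 or signsrch are implemented; the table holds a handful of well-known values, the function name is made up, and it only looks for little-endian dwords.

```python
import struct

# A few widely documented constants; real signature sets are far larger.
KNOWN_CONSTANTS = {
    "MD5/SHA-1 initial state word (0x67452301)": struct.pack("<I", 0x67452301),
    "SHA-256 round constant K[0] (0x428A2F98)": struct.pack("<I", 0x428A2F98),
    "TEA/XTEA delta (0x9E3779B9)": struct.pack("<I", 0x9E3779B9),
    "CRC-32 reflected polynomial (0xEDB88320)": struct.pack("<I", 0xEDB88320),
    "AES S-box, first 8 bytes": bytes.fromhex("637c777bf26b6fc5"),
}

def scan_for_crypto_constants(blob: bytes):
    """Return sorted (offset, description) pairs for every known constant found."""
    hits = []
    for description, pattern in KNOWN_CONSTANTS.items():
        offset = blob.find(pattern)
        while offset != -1:
            hits.append((offset, description))
            offset = blob.find(pattern, offset + 1)
    return sorted(hits)
```

Running it over the bytes of a dumped payload gives a quick list of offsets to look at in the disassembler; anything it misses is where the "just google the dword" trick comes in.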

okay so now we start with binary patching and luckily we only have two slides for it and then we jump into the walkthrough so there's basically two types of patching one is hot patching one is static patching hot patching is when you modify the behavior of a program while it is running inside a debugger static patching is when you patch something statically and that is permanent right so that gets written to the disk so that behavior modification stays permanently hot patching is useful when you're debugging a malware and malwares often have certain checks in them and instead of providing the exact conditions that the malware needs

sometimes you just bypass those checks and for that you use hot patching for example in some cases maybe the malware is sleeping for a few minutes and you don't want to let it sleep so you modify the argument change it to zero and it won't sleep anymore if it's reaching out to a url maybe you want to modify the url in some cases maybe it's making a callback over https for example using WinHTTP apis and what you can do is modify a flag and a port and the https communication will turn into http so that's some of the use cases for

hot patching static patching is not as common as hot patching but there are scenarios where static patching can come in quite handy and i'll actually go through one before the walkthrough but one of the places where static patching is quite common for malicious purposes of course is with pirated software so any patches that you get to run a pirated software so that it doesn't ask for a key or doesn't limit you or have any restrictions that's basically static patching okay so let's just jump right into before the walkthrough this is not exactly a walkthrough but i want to show

you a sort of semi-real life example of where static patching came in handy for me recently so there was a program that i was analyzing it was one of the flareon challenges and it was a linux binary so what it did was that it created a child process and after the child process was created the child process would go back and debug its parent process so it would attach a debugger to it and the problem with that is on pretty much all the operating systems you can only have one instance of a debugger attached to a program you cannot have more than one instance now it seems simple enough that you

can maybe just kill the child process and then the parent process would be free and then you can attach your own debugger to it but the problem in this case was that the logic of the program was divided between the parent and child so if you're not able to attach your debugger to the parent and you want to examine that logic you cannot kill the child because if you kill the child half of the logic of that program will go away so there were three parts of the flag that we were supposed to get because the way the flareon challenges work is that for every stage you have to

find a flag and in this particular challenge the flag was divided up into three parts so for the first part static patching came in really handy and it let me bypass a lot of complicated stuff that i would have had to do otherwise so that's what i'm going to show you quickly first and then we'll do a normal walkthrough of hot patching let me open up

this

okay i think the font must be too small let me try and fix that if i can um

hussain is the font readable now okay so i'm not gonna explain or go through the binary i'm just going to jump directly to the region that i'm looking for and then i'm gonna explain a few things to you oh by the way i'm not gonna go through an IDA overview and richard might not have time for this as well but we have a cheat sheet for IDA Pro shortcuts that we included in the google repository that all of you have or maybe i'll try and give a short overview at the end but for now let's just go through this okay so i've jumped straight to the code

that is relevant i'm gonna rename a few functions so this function is a decryption function it's i think AES if i remember correctly but i'm just going to rename it to decrypt this is the start of the data that we are interested in first part of flag

and it is encrypted okay so what was happening right here was that the flag that i wanted to get the first part of it was right here and you'll see that it is decrypting it these three are also the same flag they are just being decrypted in smaller chunks so it decrypts the flag here and then there's a memcmp which is a memory compare call and it compares the start of the input with the flag that it has just decrypted and if it's correct it moves on to the next stage otherwise it just stops here and you fail basically now this would be simple enough if the anti-debug technique

that i just described wasn't there all i needed to do was attach a debugger set a breakpoint here and once i reached here examine the memory and i would have gotten the decrypted first part of the flag it would have been easy enough but the problem was like i explained there wasn't a simple way to attach a debugger to it without compromising the program's logic so in this case memcmp is comparing this is the input s let me rename it also this is the first argument that gets pushed this is the second argument that gets pushed and these

two then get compared so an easy way to do this is if we patch this api call and change it to something like printf right and then we execute it and if we pass a %s string to it which i'll do then we'll be able to actually get the program to print the first part of the flag so that's what we're gonna do now i'll find the address of printf and let me just change it sorry i'll copy the address go back here and we go to edit patch program assemble and we do this

sorry

it has to go with 0x

okay yeah so it's gotten changed to printf now what's going to happen is the input that i gave to the program is going to be the first parameter to printf and the flag itself is going to be the second parameter you're not going to have a decompiler on your IDA versions but just to make it clear so you don't get confused this is how it will look printf and then the input as the first argument and then the decrypted flag so what i'm going to do is okay let me first try to show you without that
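To see why swapping the compare for printf leaks the flag, here is a small Python stand-in for the two calls. It is only a model of the idea: the flag value is a placeholder, the function names are made up, and `str.replace` is used as a crude imitation of how printf fills in a single %s.

```python
def original_check(user_input: str, decrypted_flag: str) -> bool:
    # Models memcmp(input, flag, len(flag)) == 0: only the start of the input is compared.
    return user_input[:len(decrypted_flag)] == decrypted_flag

def patched_check(user_input: str, decrypted_flag: str) -> str:
    # Models printf(input, flag): the input is now the format string, so a
    # "%s" in it is filled with the second argument, the decrypted flag.
    return user_input.replace("%s", decrypted_flag, 1)

flag_part = "hypothetical_flag_part"           # placeholder, not the real flag
print(original_check("guess", flag_part))      # the compare just fails
print(patched_check("guess", flag_part))       # no %s, so nothing useful is printed
print(patched_check("%s", flag_part))          # %s makes the program print the flag itself
```

The arguments never change position; only the function that receives them does, which is why a one-instruction patch is enough.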

right and if i pass anything to it it's not gonna work because it's not been patched yet even if i pass %s to it it's not gonna work but now i'll apply the patches that i've just made create backup now i'm going to copy it again

let's try running it okay %s the reason i'm passing %s is so that as i showed you in the decompiler view %s is going to come here and the decrypted flag would be here so because of this %s format it's going to print it out and this right here was the first part of the flag so that was just a quick overview of a sort of pseudo-real-world example where static patching came in handy and now we'll move to by the way i did a lot of complex patching with this program as well so the patching can be much more

complicated than just what i did here but i'm not going to get into that you can do a lot of modification just with static patching let's close this and get to our actual walkthrough so for this i'm not going to show you how to change jump conditions the parameters or all of that i think you've gone through that before as well maybe in your shots topics or exercises the point that i wanted to get across with this exercise is that when you're debugging a program you pretty much have complete control over it so you can make it do anything at all that you want and that's going to be the point of this

whole exercise this is not going to be so practical but when you repeat it yourself you will be training yourself one how to write assembly or how to understand assembly better and second you'll realize that you actually can do anything when you have a debugger attached to the program so let's open up cmd.exe from system32

okay we'll just let it run and now i'll break it and by the way all of these steps have been written in the walkthrough as well so you don't have to worry and we will execute till user code so what this option does is that right now you can see we are in the ntdll module and we want to get out of it because we are going to make some modifications and write some code here so we want to go back into cmd.exe code and this is going to do exactly that for us so now we are back in cmd.exe right so what i want to do here is

i want to print out a string of my choice in cmd here but i don't want to interact with the application itself i just want to do that with the debugger so what i'm going to do is just write some assembly okay so first what we'll do is find the default heap just remember this when you're doing the exercise you have to find the default heap this font here is too small but it's written in the lab manual we'll find the default heap and we'll search for a location where there are just zeros for example four two zero four zero zero so we'll go to four two zero four zero zero

four two zero two yeah right here so we'll edit this location and we'll write the string that we want to print out for example

right so this is a string that we want to print out on cmd.exe now we need to write some assembly instructions here to make it do that let's just start editing where our eip currently is the first thing that we need to do is push an argument and we're gonna use printf for this and we need to push the string that we just wrote in memory as an argument so what we'll do is push the address four two zero four zero zero right and now we need to call printf okay and the third thing that we need to do after printf is that the parameter that we just pushed onto the stack we need to

reclaim that space if you want to know details about this you can read the topics of calling conventions the c calling convention versus the stdcall calling convention this is the c calling convention and i think your short went over it also yes it is so to reclaim that space we have to add to the stack pointer because the stack grows downwards so we have to add to it when we want to take the space back and we will add 4 to esp because that was the number of bytes that we just pushed right now let's just try to execute this so this is our cmd and you can see that there's no string printed here let's check now so we just

successfully were able to modify the program's behavior to print this is cool here this is not really a practical example but this is just to showcase the capability and also to give you some practice of assembly now what we're going to do is we just printed this string one time what we want to do now is print it 10 times in a loop so we'll clear this up how it was earlier and we'll start editing it again this time the first thing we'll do is set up a counter we'll set up the counter in the ebx register so mov ebx 0 second thing we'll do is we'll again push the argument that we pushed before

four two zero four zero zero and we'll again call printf and we will again reclaim the space from the stack add esp four right and this will print out the string one time and now we need to increase our counter so we will add one to ebx add ebx 1 now what we're going to do is compare ebx to see whether it has reached so we want to print it 10 times so we have to compare it with 10 so what we're going to do is cmp ebx with 0a which is hex for 10

and then we're gonna use a jump condition jnz and we're going to jump back to where our original argument was being pushed if it's not equal jnz and the address would be this will be the address where i'm pushing the argument four a zero d four
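The loop typed into the debugger can also be written out as raw bytes, which is a useful check on what the debugger's assembler produces. This is a sketch using the standard 32-bit x86 encodings; the base address and the printf address in the usage line are made-up values, only the string address 0x420400 comes from the walkthrough, and the function name is hypothetical.

```python
import struct

def assemble_print_loop(base: int, string_addr: int, printf_addr: int, count: int = 10) -> bytes:
    """Encode: mov ebx,0 / push str / call printf / add esp,4 / add ebx,1 / cmp ebx,count / jnz loop."""
    code = bytearray()
    code += b"\xBB" + struct.pack("<I", 0)                  # mov ebx, 0 (counter)
    loop_start = base + len(code)                           # jnz must land here, not on the mov
    code += b"\x68" + struct.pack("<I", string_addr)        # push string_addr
    call_at = base + len(code)
    # call rel32: displacement is measured from the end of the 5-byte call
    code += b"\xE8" + struct.pack("<i", printf_addr - (call_at + 5))
    code += b"\x83\xC4\x04"                                 # add esp, 4 (cdecl: caller cleans up)
    code += b"\x83\xC3\x01"                                 # add ebx, 1
    code += b"\x83\xFB" + struct.pack("<b", count)          # cmp ebx, count (imm8)
    jnz_at = base + len(code)
    # jnz rel8: displacement is measured from the end of the 2-byte jump
    code += b"\x75" + struct.pack("<b", loop_start - (jnz_at + 2))
    return bytes(code)

stub = assemble_print_loop(base=0x401000, string_addr=0x420400, printf_addr=0x405000)
print(stub.hex(" "))
```

The counter lives in ebx because the C calling convention expects the callee to preserve it, so printf does not disturb the count. Jumping back to the push rather than to the mov is what keeps the counter from being reset on every pass.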

so let's just verify that we've made the correct jump right because we don't want to reset it back to zero again so we need to jump here so that we can push the argument again and print it again so what we're gonna do now is right click here this is the other thing that i'm gonna show here no matter where you are you can change the position of the eip and start executing code from there so what i'm gonna do is i'm gonna set the new origin to this location again so the eip changes to this so this also answers one of the questions that someone asked yesterday

another way to call exports for example if you're working in x32dbg and there is no plugin there that can aid you in calling the exports if you're able to figure out what the parameters for those exports are you can just jump your eip to that location like i did here with new origin where did it go yeah it's already there so we have to click someplace different yes new origin here x32dbg has the same option and once you've done that you need to set up the parameters according to what it is expecting and then you'll be able to call your export without

having to use any plugin the last thing we're going to do is set a breakpoint below the jump condition so that the execution stops once it's been printed 10 times so let's take a look at cmd.exe and i'll let it run and you can see that if you count it other than the first one it's going to be 10 times right so we've been able to successfully modify the behavior of command prompt and we've been able to print it 10 times and now this is going to crash the program if you let it run because we've messed with the code if you wanted the program to not crash then you would have had to be more

careful doing this sort of an exercise but the whole point of this exercise was to make you aware of the capability of a debugger and at the same time give you some practice for assembly so you will do an exercise like this but i'll get to that at the end i want to show you something else now let me restart cmd.exe now i want to make a change by hot patching that is a bit more permanent right so what i'm gonna do is first i'm going to let it run there's a breakpoint again let's set up a breakpoint on the api call WriteConsoleW so this is the api call that is used to print

any data at all on command prompt so we've set our breakpoint and let it run okay so let's run the dir command here right and it breaks into the debugger and if you see it's showing you the string volume in drive c that's basically the first string that it was going to print as a result of this command if i continue here you can see that it gets printed what i want to do is let me just go back a little i'll go to the actual call of WriteConsoleW i'm gonna set a breakpoint here

remove this breakpoint okay so let's just do the same exercise again i'm gonna write a string in memory like i did last time in the default heap let's pick the location 2e0

we will edit it again this time we're gonna put this in unicode because this api expects unicode strings and not ascii okay so first part is done let's put it here now i'm going to modify the parameters of this api call so that whenever this particular command gets called the result is always the same and wow gets printed out
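When the string has to go into memory as unicode, it helps to see the exact bytes and the character count that the write call will want. A small sketch, assuming the text is "wow" followed by a carriage return and line feed; the trailing line break is a guess, and the helper name is made up.

```python
def to_wide(text: str):
    """Return the UTF-16LE bytes for a string plus its length in 16-bit characters."""
    raw = text.encode("utf-16-le")
    return raw, len(raw) // 2

raw, char_count = to_wide("wow\r\n")
print(raw.hex(" "))   # the bytes to type into the debugger's memory editor
print(char_count)     # the character count to pass to the write call
```

WriteConsoleW takes its length in characters rather than bytes, which is why the count is half of the byte length here.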

okay so in order to be able to understand this i'm going to just do this and it's going to be recorded in the stream and you can go over it later i'm not going to explain every single instruction because there's not enough time but in order to be able to understand this you need to do two things one is that you need to go to msdn and take a look at what the parameters of WriteConsoleW are right and the second thing you need to do while you're doing that is watch this part of the stream again maybe a few times and you will be able to figure it out if not you can

reach out to me and i can help you in understanding what exactly i did okay so we just need to move the lea operation upwards because this is important pbp four okay then we need to push eax push five these are the arguments that are required by WriteConsoleW some are required by the call previous for that as well but once you take a look at it you'll understand and then we push the address of the string that we just put in memory that would be two e zero three two zero all right okay push five push address and then we need to push one and the last one is between the common knob

okay so now we've hard-coded the arguments right so ideally what should happen is that as long as this program is running in memory these hard-coded arguments should be used again and again and whenever we run that command we should see this string but let's see what happens i'll just let it run for now

we ran into an exception wait i think i'm just going to repeat it really quick i think i made a mistake there okay

f2 uh

i'm just repeating the same steps that i explained to you i'm just going to go over them a bit quickly

okay

so that's actually the string length five we need to we can write the memory

zero four seven two

let's see okay so it got printed and it should be resident in memory as long as the program is running let's try that okay so let's try running the dir command now so every time i'm running the dir command instead of displaying the directory contents it's gonna keep on printing out wow and even if i detach the debugger that's gonna stay that way yeah i'll close the debugger and the behavior is the same as long as that string does not get overwritten in the memory which it probably will not this is gonna stay the same but on the disk the file has not been modified so if i close it it all disappears and if i run command

prompt again or even a parallel instance it's gonna work okay the change is not going to be permanent okay so that was the walkthrough for hot patching and for the exercise when you do it at home i think i have the lab manual in front of me yeah so the exercise i've written it but one of the things that you're going to do for hot patching is repeat the printf walkthrough that we did but this time you're going to modify the arguments in such a way that along with the string it also prints out the index of the string so if it's just printing out so something

like this let me just so we were printing this is cool 10 times right something like this but what you have to do in the exercise at home is it has to print out something like this one and then two and three and four and so on and so forth okay now we'll try and quickly go over the static patching walkthrough and then we'll move on to the final topic all right so we had one last question so is there a list of api calls that you would normally monitor or that you'd normally look for while analyzing a malware yeah oh yeah i mean there are telltale api

calls for example in case of a downloader it's going to be URLDownloadToFile or WinHTTP something like that right and even in case of a backdoor for example the backdoors that use http or https to do communication with c2 they're always going to be there and in some cases when malware authors choose not to use WinHTTP apis they're gonna end up using something like libcurl and then you're gonna end up having a socket api instead that's a low level api but you'll have to backtrack from it and you will instead put a breakpoint on something like recv or send so these are because pretty much

any type of final payload that you have other than maybe a ransomware because the whole point is disruption there but any other sort of payload a backdoor a downloader even an infostealer has to call back so it has to do some sort of network communication so that is i think one of the first things that at least i look for because what that makes easier for the analyst is to then backtrack from those particular api calls and be in a location of the code where things are actually happening and not be bogged down or stuck on the sections of code where there's just stuff going on that might not be so relevant
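As a rough illustration of that first triage step, a list of imported function names can be checked against a small table of network-related APIs. The table below is a tiny hand-picked sample rather than a complete list, and the function name and the way the import list is obtained are assumptions left outside the sketch.

```python
# A few network-related Windows APIs of the kind mentioned above (sample only).
NETWORK_APIS = {
    "URLDownloadToFileA": "downloads a URL straight to disk",
    "URLDownloadToFileW": "downloads a URL straight to disk",
    "WinHttpOpen": "starts a WinHTTP session",
    "WinHttpConnect": "names the server and port to talk to",
    "WinHttpSendRequest": "sends the HTTP or HTTPS request",
    "WinHttpReceiveResponse": "waits for the server's reply",
    "send": "raw socket send",
    "recv": "raw socket receive",
}

def flag_network_imports(imports):
    """Return {api_name: note} for every imported name found in the table."""
    return {name: NETWORK_APIS[name] for name in imports if name in NETWORK_APIS}

sample_imports = ["CreateFileW", "WinHttpOpen", "recv", "ExitProcess"]
for name, note in flag_network_imports(sample_imports).items():
    print(f"breakpoint candidate: {name} ({note})")
```

Each hit is a place to set a breakpoint and backtrack from, which is the workflow described in this answer; a hit on its own says nothing about whether the binary is malicious.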

do you normally maybe have a cheat sheet or something of such api calls that an analyst should look for in the first instance i would like to answer this question so i would like to add if you go to appendix a of the practical malware analysis book you remember i shared a book yesterday practical malware analysis it has an appendix a that lists all of the apis that are usually used by malware so you can just go there and if you see any of those apis in malware that'll usually give you very good telltale signs and it also gives you a one to

two sentence description of what this api means and how it is used and what type of behavior you can expect if this api is present so please go to practical malware analysis appendix a there's a very good cheat sheet there awesome that was great that will be quite helpful but i mean to be honest it's like a cheat sheet it's very helpful but you cannot really determine that something is a malware just based on the api there are more things that you have to look at maybe one of the apis that is probably the most suspicious is CryptEncrypt right in case of ransomwares

because other than ransomwares very few programs require some sort of encryption i mean you have to take a look at flags also maybe it's just doing some sort of hashing because that also uses the same api but other than that you cannot just go based on the api calls you have to kind of piece together multiple things you have to look at a bit of logic a bit of api is like a sharp set and yeah okay so thank you now we'll move on to static patching and i'll try to make this quick okay let me see the exercise with this so this is the exercise that we're gonna i'm just gonna copy it outside so

make it easier right we'll try and run it and see so it says give me the flag as an argument this is a binary that i just wrote for this and if you give it an argument let's see what happens a random argument that was lame nothing really happened so we need to solve this binary via static patching so that means that we need to load it up in IDA Pro basically maybe i can just let me just load it outside

okay so this one i'm gonna do really really quickly so we're not gonna be looking at the assembly so much for this one because the point is patching not so much disassembly you've already done some assembly exercises so what we're gonna be using for identification of which path we should take the first thing is strings right so we try to find the strings that we encountered in failure conditions so one was that's sad right and one was that was lame actually we haven't yet seen this one but we saw that was lame and the other one that we saw should be right here give me the flag as

an argument so this right here argc is the argument count so it checks if the count of the arguments is not two the first argument is always the name of the binary itself so when you compare it with two it means it is expecting just one argument so if it doesn't have one argument it's gonna give you this error which we got and then it's going to exit and we got here as well which says that was lame so we know that at least this is not the path that we are supposed to take and this one says that's sad so that's probably not it as well so we need to

see which path we are supposed to take and we want to force the binary to take that path right so here this was the failure path and on the left side there seems to be something interesting happening so what we're going to do is patch these three conditions in such a way that it always goes to the left of the code and it never goes to the right so there are a couple of ways to do this but i'll just choose the easier way because we're out of time just go to options general and change this setting here when you're looking at the stream again so you can see the op

codes also alongside when you're doing the patching right so this is the location that we want to get to and these are the conditional jumps that we don't want to take so this is pretty simple what i'm going to do is replace this with nop nop is no operation which is exactly what it sounds like and it's one byte nine zero we're going to put in six nops here so that this becomes redundant or ineffective let's go to assemble now one two three four five six right so we've patched one jump and we need to patch these two as well so we'll repeat the same process
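The same nop-out can be done on the raw bytes of the file. A minimal sketch, assuming 32-bit or 64-bit x86 code where a near conditional jump is six bytes (0F 8x plus a four-byte displacement) and a short one is two bytes (7x plus a one-byte displacement); the function names are made up, and `offset` is a position in the byte buffer, not a virtual address.

```python
NOP = 0x90

def conditional_jump_length(code: bytes, offset: int) -> int:
    """Return the size of the conditional jump at offset, or raise if it is not one."""
    opcode = code[offset]
    if 0x70 <= opcode <= 0x7F:                                   # short form: 7x rel8
        return 2
    if opcode == 0x0F and 0x80 <= code[offset + 1] <= 0x8F:      # near form: 0F 8x rel32
        return 6
    raise ValueError("no conditional jump at this offset")

def nop_out_conditional_jump(code: bytes, offset: int) -> bytes:
    """Overwrite the whole jump with nops so execution always falls through."""
    patched = bytearray(code)
    length = conditional_jump_length(patched, offset)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)
```

Checking the opcode first guards against the usual mistake of writing too few or too many nops, which would leave a stray displacement byte behind or clobber the next instruction.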

actually i'll also show you how to do it in the hex view so if you want to do it through the hex view when you click on it this gets highlighted you can press f2 to edit it and then you can write nine zero six times and press f2 again and this got converted to nop i just wanted to show you now for this one you can do it again with patch program because it's more descriptive i guess two three four five six okay let's see now we've patched all the jumps and we do know that at the very least

we will hit this part of the code which looks interesting so what we'll do is write back these changes to the file patch program apply patches to input file we'll also create a backup just in case and okay so let me copy this back into the vm

replace it let's try running it again

copied the wrong file one second binary patching

sick i think it's not big

um i think there's some problem i'll just edit it inside the vm so i don't have the copy issue and let's go back here open it up that's needed

and we need to find the main function it should be

this

yeah this is it so we'll rename it

we'll do the patching really quick

so these were the jumps that we were patching options

faster

this um

okay so it should but now let's try and apply patches to input file and create backup okay let's try and execute right so we've now reached that second section of the code that initially we were unable to reach so we know that at least we've bypassed some of the checks just through static patching what we want to do now is it says let's try one more patch so let's just do that exactly so we know that this doesn't look right that's obvious through the strings so we need to try to execute this part of the code and how will we get here let's see through this there's no condition here

it's just going straight up straight up straight up right

that was yeah so this is where it's going to these duplications so these two jumps are the ones that we need to patch let's see go back to this mode

jng this one this time it's only two bytes because it's a short jump which means the target is close to its own location here f2 again nop nop f2

okay

okay so now this one has no entry path left this one has one so we need to patch this out also so that we always only go to the success condition

right now there's no way to get to the failure path there's just one path and it should be the success one so we'll try and apply the input

okay let's see there you go so that was the flag uh which gets decrypted and printed out so before i finish this topic uh just real quick uh i need to find another challenge here so right now what we did was uh we changed these two jumps to nops so that the execution flow would continue as it was going but what if we wanted to change it so that it would always take the jump instead of going down straight so in that case what you can do is change this to a non-conditional jump like this sorry go to assemble and just change it to jmp right and now it will always take the jump so
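The alternative to nopping, turning a conditional jump into an unconditional `jmp`, can also be sketched at the byte level. This is a minimal sketch, not the exact bytes the assembler in the demo produced: `force_jump` is a hypothetical helper. A short `Jcc` (`7x rel8`) becomes `EB rel8`, and a near `Jcc` (`0F 8x rel32`) becomes `90 E9 rel32`, so the instruction still ends at the same address and the original displacement stays valid.

```python
def force_jump(code: bytes, offset: int) -> bytes:
    """Rewrite the conditional jump at offset so that it is always taken."""
    patched = bytearray(code)
    op = patched[offset]
    if 0x70 <= op <= 0x7F:
        # short Jcc -> JMP rel8; same length, same displacement
        patched[offset] = 0xEB
    elif op == 0x0F and offset + 1 < len(patched) and 0x80 <= patched[offset + 1] <= 0x8F:
        # near Jcc is 6 bytes, JMP rel32 is 5: pad with a leading NOP so the
        # jump still ends at the same address and rel32 needs no adjustment
        patched[offset] = 0x90
        patched[offset + 1] = 0xE9
    else:
        raise ValueError(f"no conditional jump at offset {offset:#x}")
    return bytes(patched)


# Hypothetical byte sequences used only for illustration.
short_case = force_jump(bytes.fromhex("7e05c3"), 0)
near_case = force_jump(bytes.fromhex("0f8510000000c3"), 0)
print(short_case.hex(), near_case.hex())
```

Which of the two patches to apply depends on which branch should survive: NOPs keep the fall-through path, the forced `jmp` keeps the jump target.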

it depends on which path you want to follow either you can nop it out or otherwise you can change uh the jump uh to a non-conditional one and that's how you do it okay so um later today i'll upload a binary to the google drive that you can do as an exercise so first uh you'll repeat this as an exercise for static patching and then uh you can do that other binary as well which will be somewhat similar to this one okay let's just close all of this and now there's just one topic left binary emulation okay so what is emulation emulation is when what would be the correct word for this

emulation is when you implement an architecture and for example you want to execute arm code on x86 you can program arm's logic on top of your windows os or whichever os you're running and that would be called emulating that particular architecture on top of whatever you're running so that's what emulation is for example uh this is a code snippet from QEMU uh many of you guys might be aware of QEMU it's kind of like vmware but it's an emulation engine also it has both virtualization and emulation you can run it in both modes so this is a snippet from its code and what it's doing right here is that

it's emulating the logic of the jnz instruction in c plus plus code so instead of going to the cpu which is how jnz would otherwise have been executed it's executing it in its own code so it's emulating that cpu so that's what emulation is emulation versus virtualization so virtualization is uh for example what vmware does virtualization is when the instructions get executed by the actual processor even if you're running a virtual machine but in case of emulation the instructions don't get executed by the cpu they just get executed uh the way that this screenshot shows uh by some c plus plus code like that and there are a bunch of

uh emulators available out there QEMU and Bochs are two of the most famous ones then we have vivisect and unicorn these are actually emulation frameworks that you can use as programmable apis and build on top of them so the problem with uh emulators like unicorn for example is that they are bare bone as in obviously they can emulate the cpu but the problem is that when you try to execute for example a windows binary with unicorn it's going to fail because whenever it's going to try to execute an api call those are not implemented there windows kernel is not implemented there so there is nothing there it's only able to

execute the instructions so that's a problem but now people have started developing frameworks that actually solve this problem i've listed down four of them in this slide down at the bottom Binee flare-emu Speakeasy Qiling so the way that they've been written they cater to different use cases but i'm not going to go into the details of that because of time constraints right now what you can do is you can google all of these four frameworks and their descriptions kind of hint at what they're best for uh what we're gonna do today is we are going to go through one of the frameworks uh that is mentioned at the very end Qiling uh or chilling or

however they pronounce it this is a relatively new framework uh so Speakeasy by the way has been made by the flare team and they are somewhat similar but we'll just uh go through Qiling today so one of the cool things about Qiling is that not only can you emulate windows uh binaries with it you can actually emulate windows drivers as well which is a big deal if anyone has ever debugged a windows driver they would know that it's a very tedious process at the very least on a non-windows host you need two vms but if you're using the same machine and you're using two vms the process is very slow the best way to do this is to have

two separate computers uh to do speedy debugging so being able to do that in user mode on pretty much any operating system because this is an emulator so it works on mac it works on linux it works on windows pretty much anywhere that's a big deal so that's quite a nice feature that Qiling has similarly Qiling can also emulate mbr so right here this is a screenshot from twitter in which they're emulating a ransomware Petya which is an mbr ransomware because it resides in the master boot record so let's just go through i missed some stuff i think in the beginning so this is not really important we'll

just jump down right into the uh walkthroughs so i'm gonna go i'm gonna do two quick walkthroughs one is going to be around iot iot internet of things so this is going to be a router malware that we're going to try to emulate uh the second walkthrough is going to be around string decryption which is uh going to be as a use case of emulation okay let's let's go back and let me find the directory
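Before the walkthroughs, the earlier point that an emulator runs each instruction in its own code instead of handing it to the CPU can be made concrete with a toy interpreter. This is a deliberately simplified sketch and not QEMU's implementation: `TinyCpu`, the three-instruction mini language and `run` are hypothetical names invented for illustration. The `jnz` branch decision is taken by ordinary Python code reading a software zero flag.

```python
from dataclasses import dataclass, field


@dataclass
class TinyCpu:
    pc: int = 0          # index of the next instruction
    zf: bool = False     # software copy of the zero flag
    regs: dict = field(default_factory=lambda: {"eax": 0})


def step(cpu: TinyCpu, program: list) -> bool:
    """Emulate one instruction; return False once the program halts."""
    op, arg = program[cpu.pc]
    if op == "dec":
        cpu.regs[arg] = (cpu.regs[arg] - 1) & 0xFFFFFFFF
        cpu.zf = cpu.regs[arg] == 0
        cpu.pc += 1
    elif op == "jnz":
        # the real CPU never sees this jump; the emulator decides the branch
        cpu.pc = arg if not cpu.zf else cpu.pc + 1
    elif op == "hlt":
        return False
    else:
        raise ValueError(f"unknown instruction {op!r}")
    return True


def run(program: list, eax: int) -> TinyCpu:
    cpu = TinyCpu()
    cpu.regs["eax"] = eax
    while step(cpu, program):
        pass
    return cpu


# A countdown loop: dec eax; jnz back to index 0; hlt
loop = [("dec", "eax"), ("jnz", 0), ("hlt", None)]
final = run(loop, eax=3)
print(final.regs["eax"], final.pc)
```

A real emulator does the same thing for every instruction of the guest architecture, which is why the same guest code can run on any host.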

okay let me just start with a little bit and increase the font

we need one more thing

okay so first i'm going to show you let me just show you the malware so this is going to be a mips binary mips is another architecture like arm and like intel uh it's just supposed to run on a router in a linux environment it's not supposed to run on a mac or on a linux on intel cpus

quickly show you what it does

so JEB that's kind of like ida but it's much better at decompiling uh iot architectures so workshop and actually this path should be a bit different oh

so this is what um mips assembly looks like but we're not interested in that we'll just jump to the decompilation of this so this is a very small downloader it's an actual malware so what it does is it connects to a hard-coded server which is encoded at the moment but it connects to a hard-coded server using this connect and then it calls receive on that socket and once it has been downloaded it writes it to the disk and finally it executes it it says exactly so the point of this exercise is to show you that you can emulate a completely different architecture and a completely different os using this framework so first i'm just going to so this is

this is pretty simple right now i've commented out everything so nothing is going on actually we don't even need these three so just so no one gets confused i'll remove it completely right so all it is doing is that it is initializing uh the emulation framework and we're passing it the binary that we want to emulate and it requires a dummy argument it's not relevant so you can pass it anything and then we're passing it this emulated fs directory which i'll really quickly show you uh the font might be too small but in short whenever you emulate anything you have to provide it a directory structure which contains all of the binaries like for example in

this case would be on linux so you have to provide a bare-bone linux's uh directory structure that contains the commonly present binaries that would be on on a router but it has to match the architecture so in this case it should be from a mips router okay so i'm just going to really quickly execute this

let's see what happens

one second we need to delete this

it's actually getting stuck but let me just comment this all

right so what we've done is basically i'm going to skip one step ahead because the binary was getting stuck what we're going to do is the call that i showed you right here uh in the decompile view that it was attempting to connect to a server then it was attempting to download something from there uh write it to disk and execute so i want to highlight two points one that we are actually able to run this using emulation the second is that we are able to modify it in any way we like so what we're going to do is we're going to hook the connect call and we're going

to replace the parameters whatever it was getting initially it doesn't matter we're going to put in a local ip and port so that we can interact with it so i've put in 127.0.0.1 localhost and uh force uh seven port so once i executed that you can see the log here it created a socket and then it opened a file on the disk which it meant to write the downloaded malware to because it's a downloader and then it attempted to connect to uh this port on the local host but we had nothing running here so that's why

it failed so i'll quickly run a netcat instance uh right and then i run the emulation again and now you see that we got a connection and i can try to send some random data and it's going to write it to disk so right now this malware is executing and it's executing perfectly even though it's a completely different architecture completely different os i'm gonna terminate the connection right here and it's gonna dump it to this path and then it's going to try to execute it but the execution failed i'll show you the reason the execution fails is

oh wait i sent the input to the wrong location once again delete this run this server again this again same thing send some data and i'm going to kill this connection from the netcat side okay so now what it did was after it wrote it to this path under its root uh it tried to execute it but since uh the data that we fed it was just junk it says that magic number does not match because it's not a proper binary it's just junk so it wasn't able to execute it but this kind of goes to show how powerful such a framework is what you can do even with malwares that are not intended for

this architecture or os okay so now i'll try and jump real quick to the second demo which is going to be string decryption this should be our last walkthrough for today

so as an exercise for the first part you can try and you might run into uh some issues when you're trying to uh replicate this because i haven't put in instructions in the lab manual for this particular walkthrough but you can reach out to me if you're really interested i'll let you know the exact steps to prepare for it
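For anyone attempting that exercise, the core of the connect redirection is overwriting the `sockaddr_in` structure that the malware passes to `connect` before the call goes through. The sketch below shows only that buffer rewrite in plain Python, under stated assumptions: it does not use the emulation framework's real hook API, `redirect_connect` and the other helpers are hypothetical names, the original server address is a documentation-range placeholder, and port 4444 is an arbitrary choice, not the port used in the demo.

```python
import socket
import struct

AF_INET = 2  # Linux value for IPv4 sockets


def build_sockaddr_in(ip: str, port: int, big_endian: bool = True) -> bytes:
    """Pack a 16-byte Linux sockaddr_in; sin_family uses the guest's byte order."""
    family = struct.pack(">H" if big_endian else "<H", AF_INET)
    # port and address are always stored in network (big-endian) byte order
    return family + struct.pack("!H", port) + socket.inet_aton(ip) + b"\x00" * 8


def parse_sockaddr_in(raw: bytes) -> tuple:
    """Return (ip, port) from a packed sockaddr_in."""
    port = struct.unpack("!H", raw[2:4])[0]
    return socket.inet_ntoa(raw[4:8]), port


def redirect_connect(memory: bytearray, sockaddr_ptr: int, ip: str, port: int) -> None:
    """What a connect() hook would do: replace the target in guest memory."""
    new_addr = build_sockaddr_in(ip, port)
    memory[sockaddr_ptr:sockaddr_ptr + len(new_addr)] = new_addr


# Simulated guest memory holding the hard-coded server (placeholder address).
guest_memory = bytearray(64)
guest_memory[16:32] = build_sockaddr_in("203.0.113.10", 8080)

redirect_connect(guest_memory, 16, "127.0.0.1", 4444)
print(parse_sockaddr_in(bytes(guest_memory[16:32])))
```

With the structure rewritten, a listener such as netcat on the chosen local port receives the connection, which is what the demo showed.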

okay so the final one

and just go back to so this is also going to be an actual malware it's a windows malware and we're going to emulate it and decrypt the strings here so again i'm not going to go too much into that because of the time constraints but i'll just explain the concept behind trying to do something like this so sometimes it is much more efficient to treat functions as a black box so for example i explained to you how it is pretty common for malware authors to encrypt strings and data uh within the malware apart from the packing layer um you can dig deep down and you can try to

figure out the encryption and write scripts for it which is what we do in quite a few of the cases but sometimes emulation is much faster you can just emulate and the other thing is that with emulation you can do this in bulk so for example if you've written an emulation script for a particular family uh in a company where you're getting a huge number of samples every day right uh you don't want to waste your man-hours you want to have some sort of an automation now the samples can be pretty much identical but their malware or c2 configs can be different and we want to

extract those configs from the malware so what we can do is we can write some sort of a generic config extraction script in a framework that we might have to automatically extract the new configs from the malwares that we already know about so that people don't have to waste time on it and we don't have to execute them in a whole windows vm which is going to take too many resources while on the other hand if you do it with emulation it's just going to take a few seconds and you're going to be done so that's kind of the whole point of it um now i'll load up this sample in ida so i can quickly

walk you through what we are doing

so this sample is also a downloader so i'm quickly going to jump into the imports section to find a WinHTTP call so i can just jump to the relevant section let's just find a WinHTTP call and we'll track back from that call so this is the function where we have WinHttpOpen and WinHttpConnect being called so this is a downloader so this is the function of interest and this is uh what is going to download the next stage of the malware so i'm just going to rename it as a downloader function right and i know that this function is used for decryption i'm not going to explain right now how i

know but uh again you can just reach out to me we have to kind of speed this along right now

okay so this function is being called a total of four times right uh what we want to do is let me show you what is inside if you go inside it it's doing a bunch of stuff and it's obviously going to take some time to figure out what it's doing and we don't want to do that we know that we have an emulation solution that if we use we can just skip all that and we can get the decrypted strings without even having executed it and then we can automate it further so that uh whenever this family comes along we can automatically uh decode this new config a new c2 config without having to waste

any man-hours so really quickly the first argument that gets passed to the decrypt function is actually the encrypted string so here the encrypted server name is being passed if you look at the second call here's something else that is encrypted that is being passed if you go down here's the object name and finally those are the headers the http header is most likely a user agent i think that's being passed encrypted so the first thing is that it gets passed as the first argument and the second thing is that uh it makes modifications to the same buffer so this same buffer because if you go up

you can see that once it gets passed to decrypt that same uh buffer again is being used for this WinHttpConnect call so that means that it's not allocating a new buffer for that decrypted string it's using the same buffer over again based on that

so this is kind of like debugging for example when you place breakpoints during debugging this is kind of the same concept but we just say we're hooking that particular address so whenever that address gets executed we want it to jump to the function that we have implemented here and executed right so so the first thing even before this what we're going to do is we are going to we don't want to go through all the for example if you go to win main there's a lot of stuff that's going on here and if you go inside any api call we don't want to go through all of that we just we're just interested in uh the strings that we

want to decrypt so what we can do is even in the very beginning of the malware's execution we can change the eip uh the instruction pointer to point to this function so that we skip all of the stupid stuff that we're not interested in and we just start executing from here and then we just decrypt our strings and we exit so this is what's being done here jump to downloader function right so what it does is in the very beginning we hook some place within the win main function and then we change the eip address to this address and this address is the address of the malware i'll copy it and

show you of the downloader function so for example if i paste it here it goes to the very top let me just do it again so i pasted it here and it went to the top of this downloader function so this is going to change the flow of execution and bring it directly to the function that we're interested in and that contains the encrypted strings the second hook the last one uh stop emulation this is this one so it's stop emulation so we know that uh the decrypt function is being called a total of four times right and once it's been called this very last time we don't really want to execute

we don't really want to continue execution from there onwards because it doesn't make sense we've already gotten the decrypted strings at that point so this is the second place that we hook this address and yeah right here so after hooking this when this address gets executed we just stop the emulation now we'll go uh to the remaining two hooks that we've placed one is read string address so this hook gets executed when uh the decrypt function starts so for example i copy the address from here and go it points to the top of the decrypt function so what it is basically doing is that it is reading the first parameter from the

stack that was passed uh to the decrypt function because that's where our decrypted string is going to be at the end of the function and i'm not going to explain this it's pretty simple but if you don't understand it again just reach out to me uh esp plus 4 contains the first argument and it's just reading that address and saving it to a global variable here right so the second hook read decrypted string that's at the very end of this function so if i paste it because we know that when this decrypt function is exiting that's the point where the string has been decrypted for sure so we want to stop here and read

the decrypted string from the buffer address that we had saved earlier here right so this is again uh doing uh some stuff i mean uh it's reading uh the address four bytes and then it's fixing the endianness because uh we know that uh in memory uh for x86 uh the addresses are in little endian format so they are in reverse so we will have to change it back that's the way that they've made the framework they're not fixing it themselves we have to do it ourselves so this just fixes the endianness we have the address again and then we read the string from here and

then we print it out with this statement so there's also an except statement and that's because there are two ways that the string can come out one is that if the string is only four bytes uh four characters long uh it will be right there uh in the buffer that we read but in case uh the string is longer then it will be a pointer to a pointer so we will have to read it twice but you don't have to get confused as to why it is this way that's just how the function was programmed the key takeaway from this is how you read values off of the stack how you convert addresses how you fix their

endianness and how you can read strings and convert them from utf-16 for example you don't have to get bogged down in this logic so anyway so we're now reading the decrypted string from the very end and then we print it out to the screen okay let me run this emulation just the name was long
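The takeaways listed above (reading the first argument at ESP+4, fixing little-endian byte order, and decoding a UTF-16 string from the saved buffer address) can be sketched without any emulator. This is a minimal sketch under stated assumptions: `FakeMemory` and the helper functions are hypothetical stand-ins for the framework's memory read calls, the addresses are made up, and the sample's special case for short strings is not reproduced.

```python
import struct


class FakeMemory:
    """Stand-in for emulator memory: a flat buffer mapped at a base address."""

    def __init__(self, base: int, size: int):
        self.base = base
        self.data = bytearray(size)

    def write(self, addr: int, blob: bytes) -> None:
        off = addr - self.base
        self.data[off:off + len(blob)] = blob

    def read(self, addr: int, size: int) -> bytes:
        off = addr - self.base
        return bytes(self.data[off:off + size])


def read_u32_le(mem: FakeMemory, addr: int) -> int:
    """Read 4 raw bytes and interpret them as a little-endian x86 address."""
    return struct.unpack("<I", mem.read(addr, 4))[0]


def first_stack_argument(mem: FakeMemory, esp: int) -> int:
    """At function entry [esp] is the return address, [esp+4] the first argument."""
    return read_u32_le(mem, esp + 4)


def read_utf16_string(mem: FakeMemory, addr: int, max_chars: int = 256) -> str:
    """Read UTF-16LE code units until a null terminator."""
    raw = bytearray()
    for i in range(max_chars):
        unit = mem.read(addr + 2 * i, 2)
        if unit == b"\x00\x00":
            break
        raw += unit
    return raw.decode("utf-16-le")


# Simulated state at the start of the decrypt function (made-up addresses).
mem = FakeMemory(base=0x1000, size=0x1000)
esp = 0x1800
mem.write(esp, struct.pack("<I", 0x00401234))      # return address
mem.write(esp + 4, struct.pack("<I", 0x1900))      # pointer to the string buffer
saved_buffer = first_stack_argument(mem, esp)      # the "read string address" hook

# Simulated state at the end of the function: the buffer now holds plaintext.
mem.write(0x1900, "POST".encode("utf-16-le") + b"\x00\x00")
print(read_utf16_string(mem, saved_buffer))        # the "read decrypted string" hook
```

Saving the pointer at function entry and reading it at function exit works here because, as noted in the walkthrough, the routine decrypts in place instead of allocating a new buffer.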

try a good night phone oh i have the wrong part let me fix that binary emulation

by the emulation

okay

should work so it is running okay actually let me show you another output before we do this so i'll remove this from here because this disables the debug output so i just want to show you how it's setting up everything and then we can move on so you can see it's emulating a whole windows environment it's emulating the TEB which is the thread environment block then the PEB which is the process environment block this is the PE's entry point so right now it's running on a mac machine and it's simulating a full-fledged windows environment so let me just stop it right here because we need to disable the debug output and now we'll run again

and it should print out the decrypted strings

so it was able to decrypt a total of four strings uh one was the c2 ip the other was post because it was making a post request this is the uri and this is the user agent that the malware was going to use okay so the reason that i've showed you on the terminal here is because uh in your ida versions you won't be able to run python scripts but ideally the way that we do it because the primary reason for at least us is uh to aid in static analysis so the way we do it is that we integrate this in ida python scripts and then we can have this information right here on our

ida instances for example i'll try and run the ida version of the script just so i can give you an example uh just keep looking at the decrypt function for example and let me find the script binary emulation i think this should be it okay just keep looking here and you can see how it can be helpful let's give it something

you

it should have finished by now just make sure the script is yeah it just finished so you can see that it commented the user agent right here and if you look at the other calls you can see right here the decrypted string was the c2 which was being passed to WinHttpConnect and then if you look at the second this was the post string that was being passed down here to WinHttpOpenRequest so you tell WinHttpOpenRequest whether it is going to be post or get in this case it was post and similarly the uri was also being passed to WinHttpOpenRequest and finally the user agent

so yeah that's how it can uh help you so for example uh if uh for a malware analyst who can't who has to deal with a lot of uh binaries on daily basis um if you have these sort of scripts written with you uh um all you need to do when you come across this same sample again or an identical one uh or if you write a more generic script than uh from the same family but it can be a bit different so you can just run the script and you'll have all the decrypted strings right here you won't need to run it in a vm or debug it and you can statically tell a lot about what it is

doing and these are the c2 strings but for example in the case of the backdoor that we interacted with in the first exercise pretty much all the commands were encrypted so we had to write a script for it and anytime we come across a somewhat similar variant of that backdoor we can just run that script again and we will end up with the decrypted strings and we could easily statically analyze it okay yeah so one last thing before we finish this i think i have it i'll show you the version of the script which is not emulated so for example if you were not using emulation so this is the script to

decode uh the strings for the same sample but this is not using an emulation so for this one i had to figure out what the encryption was or obfuscation in this case and then i had to write code to de-obfuscate or decrypt it and it took time obviously so that's the advantage of emulation over not treating something as a black box and trying to understand the whole algorithm and then trying to replicate it in something like python so you can decrypt it so emulation can save you a lot of time okay guys so i think that's about it uh so we weren't really planning for it to go this way i mean uh a lot of information got crammed

up uh for my topics we weren't even able to get time for exercises but i'm hoping that when you guys go over it again uh maybe someone will find it helpful uh there's a lot of useful information in here if someone is really interested and you guys do get stuck at some place please do feel free to reach out to me you already have my twitter handle so i'll be more than happy to help you given that you've done your best to figure it out and you're still stuck you're more than welcome okay so i'll stop sharing my screen

yep we are live okay guys welcome back from the break so i hope you guys are learning a lot so just wanted to say a couple of things before we move forward so if you guys don't understand a lot or if you are confused about some stuff that's pretty normal we just wanted to give you a taste of some of the advanced topics that we cover so while they don't stop stopping from the malware analysis field so just so you know you can set your expectations okay this is the level you want to achieve and you know umair did a really good job going through some of those advanced topics if you feel stuck that is normal

that was by the plan so don't feel bad and try your best and if you still stay stuck please reach out to me or um we will be really happy to help you guys and second thing i just wanted to say you might have been realizing now at this moment why malware analysts get such high salaries so stay with us there are just two more topics that should take less than 30 minutes i will try my best to cover them in 30 minutes and i will just give you an overview of the shellcode and the detection techniques snort and yara

so but i think first hussein has a couple of questions from the chat uh let's go through yes so while we were on the break we have received two questions first is uh can you go over the apis used in a ransomware i believe umair has already covered this part and possibly how can you backtrack and decrypt ransomware so first of all uh we need to understand one thing so all of the ransomware encrypt your files so there are two types of encryption if you divide in a very wide manner so one is the public key private key encryption and second is the symmetric encryption so in public and private encryption like

that's usually so you have a public key you encrypt something with the public key it can only be decrypted using the private key so and second thing symmetric key in the symmetric key so second um so you use both in order to encrypt and decrypt you use the same key okay so we need to understand these two concepts so for example if the right and now coming to your part so can we decrypt the basically ransomware encrypted files or not so if we reverse the key generation algorithm of the ransomware and it is using symmetric key and there is all of the things that used to generate the symmetric key are present in the code are on the host that

it encrypted to the file so we can basically write a decrypter to decrypt those files and the number second thing is like if you're using public and private key so i would say just forget about it it will be very very hard unless there are some i would say like uh some faults in the algorithm that it used or some i should say some technical issues with the algorithm or the implementation so i would highly recommend you to go to the hasi bhai conference uh see by talk uh how to hack her in somewhere where he actually decrypts the ransomware encrypted files and it will give you much more much more detailed overview of this and

coming to the question what are the apis so usually uh the CryptEncrypt function so wincrypt.h is the header file that provides the encryption in windows so most of the ransomware will use functions from this CryptEncrypt is one of the most famous functions CryptEncryptMessage so you can just google CryptEncrypt functions and after that you can go to docs.microsoft.com click on the link and most of the ransomware will be using these functions but keep in mind they can also use publicly available libraries and have those libraries statically embedded in the sample so in that case it will be difficult so yeah all right uh okay so we have one one

more question this is more like a scenario let's say you've received a malware sample and you do not know if it has any sensitive information in it or not now you are basically short on time and what you do is you submit it on a public online sandbox analysis tool for example VirusTotal now as an analyst what tools would you use other than strings itself to analyze if something sensitive is in the binary right so what i basically want is a list of tools that can maybe redact the sensitive information so that i can

personally review it before submitting it to the sandbox for analysis so basically uh in that case to be honest it's a it's a tough scenario to go through like it would be casey it would have to be handled on case by case basis so personal information can be present in many formats it can be present in code it can be present in form of some data it can be present in the form of strings and for each of those so most of if the you haven't written the sample yourself or you're not the author of the malware and just got it from someone else so i think the best option is to see if the

malware is packed if it is not packed then you are lucky so after that you can load the malware in IDA Pro and just simply quickly go through the code like if you see anything sensitive there second thing you can do you can look at the strings of course and third thing is you can run the malware okay i would recommend run the malware and take a snapshot basically Process Explorer provides some ways to dump the memory there are many tools available that can dump the process memory so take a snapshot of the uh memory and then run strings command on it so there might be some decrypted

strings or obfuscated strings that might be present in memory in simple form so but uh to be honest this covers the basics of it and if you really want to be sure there is no sensitive information in it you will have to reverse engineer it at that point you won't need to submit it to VirusTotal so yeah any other question i don't think so i think you should continue now okay so i'm gonna continue so one minute okay so malware analysis okay just wanted to give you guys like okay we haven't gone through ida pro because we think this software is pretty easy to learn so what i did is like

i'm providing you links to two videos so one of the malware analysis trips so let's just see it oh sorry my youtube is blocked so okay so you can just go click on the link and the channel is OALabs also follow this channel he discusses many good details so let me just open it i think i can just copy the link and open incognito mode

so do go through this so this is one hour and 38 minutes and also go through some of the other videos of this channel so he covers some of the really advanced stuff and will provide you a very good understanding of the malware analysis and also help you deal with some of the packers okay so i just wanted to share this and also the second thing i wanted to do if we are like this is this is another video that you can go through and it's one hour it's i think it's two or three hour long video but he uses the ida pro to do some of the reversing so it will be assembly heavy

but if you at least watch this and go through it you will have a good understanding of how ida works and how you can modify the instructions to do the crackme so it's more ctf based and there is a very quick introduction to ida pro in chapter 5 of practical malware analysis so whenever you see pma it means practical malware analysis so you can quickly go through it but if you go through these two videos you don't need to go through chapter five of practical malware analysis okay okay debugger demo it's already done so next topic shellcode analysis

Okay, shellcode analysis. Here I'm just going to give you a very brief overview of what shellcode is and how you can analyze it. Most of you are probably already aware that shellcode is a small piece of code that is used in an exploit to deliver the main payload. That is its whole purpose: whenever the exploit runs, it transfers control to the shellcode, and the shellcode loads the main payload. Why is it called shellcode? Because in the past it typically gave you a command shell like bash or cmd.exe. Shellcode is commonly used by malware

to inject a payload into running processes, or for reflective loading of a PE file. Both of these are advanced topics, so I won't go into details; you can just google how shellcode is used to inject a payload or for reflective loading, and you will come across a lot of videos. One thing you need to know: shellcode is position-independent code. What do I mean by position independent? It won't have any hard-coded addresses. Everything it accesses will be relative to something: either relative to some position that

it marked at the start, or relative to the instruction it is currently executing. Almost all shellcode needs access to the Windows API if it is running on Windows; if it is running on Linux, it needs access to the Linux API. So almost every Windows shellcode will need access to LoadLibrary and GetProcAddress. We discussed these two: LoadLibrary is used to load a DLL, and GetProcAddress then finds the address of the function that you need to execute. How does it do that? I won't go into detail because time is short. It involves the Thread Environment Block, but these

are pretty Windows-internal concepts. I just want to show one thing that you need to remember: these offsets. If you want to learn in more detail, I will provide you the link. Whenever you see the offsets 0x30, 0x0C, 0x1C and 0x08 in a shellcode, it is most likely trying to find the kernel32 address. That will help you do some quick analysis. And here is how it looks in assembly language: 30, 0C, 1C, 8. If you see the offsets in this order,

all of them, and you're analyzing shellcode, it is most likely finding kernel32. Here again, 19.4: if you want to read more about it, you can head over to 19.4; it discusses in great detail what these offsets mean and what is actually present there. Now the second part. This piece of code finds the address of kernel32.dll, so now we have the first parameter of GetProcAddress. GetProcAddress takes two parameters: first the module handle, and we have found

that module using this code. Now we need the proc name. The proc name we already have; now we need its address. How it does that: it simply parses the PE file. It's a very common thing. Go to 19.2 if you really want to read more, but here I just want you to remember this offset sequence: 0x18, 0x1C, 0x20, 0x24. If you see these offsets, it is most likely trying to find the function address. Just remember these offsets for this workshop; if you want to learn in more detail, you can head over to this link. Here is how it actually

looks: find the kernel32 address, and after that you see 1C, 14h and so on. This is how it all comes together. One more thing I want to tell you: most shellcode uses stack strings, because it doesn't have a separate section for data. It's just one continuous block of memory that contains both data and code. So how can it use strings? It needs strings for a lot of functionality. Stack strings are one of

the most common ways in shellcode. There are other ways, but this is the most basic. What it does is build the string on the stack: you see all of these stores with a difference of four between them, F8, FC and so on. If you know ASCII, these hexadecimal integers represent small and capital letters. Here I have converted them back to the original characters: G-e-t-P-r-o-c-A-d-d-r-e-s-s, GetProcAddress. So it is basically

preparing the GetProcAddress string on the stack. There are a couple of tools available for shellcode analysis, so I will quickly go through them due to the shortage of time. I already have a demo prepared.
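
The stack-string trick described above can be reproduced offline. The following is a minimal Python sketch, assuming the shellcode writes 32-bit immediates to consecutive stack slots four bytes apart; the function name and the sample values are illustrative and are not taken from the lab binary.

```python
import struct
from typing import Iterable


def rebuild_stack_string(dwords: Iterable[int]) -> str:
    """Join 32-bit immediates (lowest stack address first) into text."""
    # x86 is little-endian, so each immediate is stored low byte first.
    raw = b"".join(struct.pack("<I", value) for value in dwords)
    # The string ends at the first NUL byte, as a C string would.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


# Hypothetical immediates, as they might appear in "mov [ebp-X], imm32" lines.
example = [0x50746547, 0x41636F72, 0x65726464, 0x00007373]
print(rebuild_stack_string(example))  # GetProcAddress
```

This is the same conversion that pressing R in IDA performs one value at a time; doing it in a script is handy when a shellcode builds many strings.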

okay thanks

Okay, good. Is the text visible on YouTube? Actually, let me take a look. Yeah, please.

All right, I think this is good to go. Okay, awesome.

shellcode.txt, the formatted shellcode, shellcode.bin. Usually you will get shellcode as text in this form, with backslash-x escapes. You can do a simple find-and-replace on the backslash-x to convert it into this form, plain hex. Once the shellcode is in this form, it's very easy: open any hex editor, such as HxD, which is what I'm going to use.

Copy the hex text, then right-click in the hex editor and paste. That writes all of these bytes into the file. Then you just save it, and it becomes shellcode.bin. I have already done this step, so I'm not going through it again. So shellcode.bin is your binary, and now it's in a good form. The next thing we are going to do is convert it to an executable. Let me launch a shell.
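
The text-to-binary step described above (strip the backslash-x escapes, paste the bytes into a hex editor, save as shellcode.bin) can also be scripted. This is a minimal Python sketch, assuming the text file contains escapes of the form \xNN; the function names are hypothetical helpers, not part of the lab tooling.

```python
import re
from pathlib import Path


def shellcode_text_to_bytes(text: str) -> bytes:
    r"""Extract every \xNN escape from the text and return the raw bytes."""
    pairs = re.findall(r"\\x([0-9a-fA-F]{2})", text)
    return bytes(int(pair, 16) for pair in pairs)


def convert_file(src: str, dst: str) -> int:
    """Read escaped text from src, write raw bytes to dst, return byte count."""
    data = shellcode_text_to_bytes(Path(src).read_text())
    Path(dst).write_bytes(data)
    return len(data)
```

Calling convert_file("shellcode.txt", "shellcode.bin") would produce the same file as the manual hex-editor route; anything in the text that is not a \xNN escape (quotes, line breaks, variable names) is ignored.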

Okay, let's launch it. Awesome. One thing I use is shellcode2exe.bat. I asked you to download this program during your lab setup, so it should be placed there. Either you can place it in your PATH or you can go directly to its folder. There's one question you may be wondering about: 32-bit or 64-bit, how do you know which one the code is? To be honest, the easiest way is to load the file into any disassembler. That's a really hacky way, but it works most of the time. Oh sorry, cancel.

Okay: binary file, click OK. 32-bit mode or 64-bit mode? Let's try 64-bit mode: you see, it's all garbage, nothing makes sense. Okay, let's try the other one.

Oh, sorry.

Now it looks like code. That's a hacky way, but in most cases it works: load the binary into a disassembler, and if you try to decode 32-bit shellcode as 64-bit it won't make sense, but as 32-bit it will. There might be some tools available online that can do this. Otherwise, the proper way is to know the opcodes and what they correspond to, and figure it out from there. A person who is good at assembly, opcodes and shellcode writing will be

able to figure it out, but for now I think you can use this. Now we have the executable, and we need to analyze it. Okay, let's go ahead; I have a new snapshot.

During the lab setup, I asked you to download Lab 4, so that's what we are going to use. Shellcode Lab 4, most likely that was its name. Okay, we are here, and you will see this binary. The first thing you need to do is convert it into an executable; let me quickly convert it again.

It's taking longer than I wanted it to take.

shellcode2exe.bat, okay: 32, the shellcode blob shellcode.bin, and the output shellcode2.exe. It says it's generated, so now we have shellcode2.exe. One of the key things we need to find out about the shellcode's functionality is which APIs it calls. We asked you to install a tool that runs with Pin, and in the lab setup we explained how to set it up. You just click on it and it will run.

Okay, it generated the .tag file, and after that you can just read it.

If you're familiar with the networking APIs: we see LoadLibrary and GetProcAddress, which every shellcode famously calls, and then WSAStartup, bind, socket and getaddrinfo. This is also a part of your lab. bind is usually used for listening, so it's trying to listen on some port. After that you can do the rest of the analysis. The second thing you can do is load this file into IDA Pro. Let's load this shellcode.exe into IDA.

Just click OK. Okay, now it's good. You see three functions. Oh yes, I told you about stack strings: if you go here and press R, IDA converts these values into readable characters. Here it is preparing strings for all of the APIs it will use. You see that? From these APIs you can get a pretty good idea of what it is trying to do. And I told you about the offsets that I asked you to remember, in that order: when you see them here, it's most likely trying to find kernel32.dll.

After that there is a loop. Usually, if there is a loop after these offsets, this is most likely where it resolves the actual addresses of the APIs. So we can label this function by pressing N: find API address. And how do we know which APIs are called? The next thing you need to know: when you see a call in shellcode, it will often be something like call dword ptr [esi+2Ch], so you don't know what these calls are.

Usually, after resolving the API addresses, the shellcode stores them in memory and then calls them from there.

Yep, you see esi+4. So most likely esi points to the base of an array that contains all of the API addresses for the shellcode. Let me quickly load it into the debugger. For that I will go to the other exercise; I think that would be better,

because if I do it on this one, it would give away some of your lab questions. Okay, awesome, give it some time. Okay, this is the binary that I have already generated. x32dbg, File, Open, shellcode.exe. I think you need to press F9 to reach the entry point. One thing you want to do with shellcode

is set breakpoints on the calls; that will let you know which API it is trying to call. Okay, this should be call edx.

okay

Awesome, I think these are enough calls. If I am right, press F9. call eax: eax is GetProcAddress, so it's trying to get a proc address, and from here you can see which address it is trying to get. It's LoadLibrary; it's trying to resolve LoadLibrary. Okay, move on, F9 again. Now it's urlmon.dll. You see, at first it doesn't make sense: the second parameter should be pointing to a string. Oh no, sorry, this is LoadLibrary: it's trying to load urlmon.dll.

LoadLibrary takes just one parameter, so that's okay; it will load this DLL. Let's move on to the next call. call edx: it's trying to get the proc address of URLDownloadToFileA. That's pretty self-explanatory: it will download some file. Let's see what is next. F9. Okay, it's jumping into this; F7. Okay, F9.

Now we are here. It's calling edx again, trying to resolve WinExec. Previously we saw it resolve URLDownloadToFileA, and WinExec is used to execute an executable on Windows, so most likely it is trying to download a file and then execute it. Meanwhile, if you go here, this is the traffic that it generated. Let me clear it and reload, so that you can see it generating the traffic.

Side by side, I will simply press F9 multiple times.

You see the same traffic has been generated again, and the FakeNetMini program has run. FakeNet automatically serves an executable: if the program asks for an .exe, it serves a dummy program. So this FakeNetMini program confirms our suspicion that the shellcode downloads a file and runs it. That was all about shellcode. I think this is more than enough for you at this stage. There's a lab in there, so you can do it in your free time. Let's move on to our next topic. Oh my god, we are already

over time. Okay, I will quickly go through detection signatures. YARA is used for malware hunting and family detection. Whenever you need to create a static detection, it will usually be in YARA. This is how a YARA rule looks. There are three main parts. Metadata does not participate in the logic of the rule, but it provides context: what you are trying to detect, who the author is, when the rule was written. It's optional, but I always recommend using it. Strings are the strings that you're trying

to detect in the malware. And here is the condition. So, these three parts. Let's move on: a good YARA rule name. I have written some points on what a good rule name should and should not indicate; you can go through them. I have also provided some links here, and some of those rules have very good names. Then metadata: what you should include and what is optional; it's always good to include these things. Now we come to the strings part: these are the actual strings that you think can identify this

piece of malware uniquely. String types: the name starts with a dollar sign, and after that you provide the name; it can be anything, just like a variable. The text goes in double quotes. fullword means it should match as a full word: just like the whole-word option when you press Ctrl+F or do a find-and-replace, it only matches the whole word. Then ascii and wide: wide is Unicode, with two bytes per character, and ascii is a single byte per character. You should

know the difference between ASCII and wide, or Unicode. If you need to put in some hexadecimal values, you can use backslash-x. Backslash-r and backslash-n are the same as they are in programming, and if there is a character like a double quote, you need to escape it. Then there are hex strings: for example, you want to detect some hexadecimal data. You put it inside curly brackets and just write the hexadecimal bytes, but make

sure there's a space in between. If you want to skip some data, for example a byte that can change between samples, you can just put question marks. You can also wildcard only a nibble; a nibble is half of a byte, or four bits. And you can skip a number of bytes: for example, these two hexadecimal bytes appear, after that it skips four to six bytes, and after that these two appear. There's also an OR operation. There are some more advanced strings; I will leave them to you if you want

to read about them, and this is the link where you can read more about strings. So we have covered the strings part; now we come to the condition part. The condition is basically how we are going to use these strings in detection.
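
To make the hex-string wildcards described above concrete, here is a small Python sketch that translates a subset of YARA hex-string syntax (plain bytes, ?? wildcards, nibble wildcards and [n-m] jumps) into a byte-level regular expression. It only illustrates the matching semantics and is not how YARA itself is implemented; the function name and the sample patterns are assumptions made for the example, and alternatives (the OR operation) are left out.

```python
import re


def yara_hex_to_regex(hex_string: str) -> "re.Pattern[bytes]":
    """Translate space-separated YARA hex tokens into a bytes regex."""
    parts = []
    for token in hex_string.split():
        if re.fullmatch(r"[0-9A-Fa-f]{2}", token):
            # Plain byte, e.g. "F4".
            parts.append(b"\\x%02x" % int(token, 16))
        elif token == "??":
            # Full-byte wildcard: any single byte.
            parts.append(b".")
        elif re.fullmatch(r"[0-9A-Fa-f]\?", token):
            # Low nibble unknown, e.g. "4?" matches 0x40 through 0x4F.
            high = int(token[0], 16) << 4
            parts.append(b"[\\x%02x-\\x%02x]" % (high, high | 0x0F))
        elif re.fullmatch(r"\?[0-9A-Fa-f]", token):
            # High nibble unknown, e.g. "?A" matches 0x0A, 0x1A, ... 0xFA.
            low = int(token[1], 16)
            members = b"".join(b"\\x%02x" % ((h << 4) | low) for h in range(16))
            parts.append(b"[" + members + b"]")
        else:
            # Jump, e.g. "[4-6]" skips between four and six arbitrary bytes.
            jump = re.fullmatch(r"\[(\d+)-(\d+)\]", token)
            if jump is None:
                raise ValueError(f"unsupported token: {token!r}")
            parts.append(b".{%d,%d}" % (int(jump.group(1)), int(jump.group(2))))
    # DOTALL so that "." also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)


pattern = yara_hex_to_regex("F4 23 [4-6] 62 B4")
print(bool(pattern.search(bytes.fromhex("F4230102030462B4"))))  # True
```

With a three-byte gap instead of four, the same pattern no longer matches, which is the behaviour the jump range is meant to express.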

These are the different parts of the condition; I'm just showing you what they are. For example, all of them: it means both $s1 and $s2 need to be present in the malware for this rule to trigger. any of them means just one of them needs to be present. You can also use logical and and or, the same as in programming. If you have multiple strings, you can build a set, for example $s1 and $s2. If you have a lot of strings, you can also use a wildcard:

$s* makes a set of all of the strings whose names start with s, and with any of ($s*) the rule will trigger if any string from this set is present. $mz at 0 is for when you need an offset: $mz should be present at offset zero. And if you need to count occurrences, for example $s1 needs to appear more than 10 times, you write #s1 > 10. Let's move on to the next part. Usually, whenever you write a

rule, a good condition has at least three parts. That is the recommendation; there will be some cases where it doesn't work, but it should work in 90 to 95 percent of cases. The first part is the magic number of the file type that you're trying to detect. For Windows executables it is MZ, which in hexadecimal is 4D 5A. After that, make sure you have a file size limit. It's just an estimate; you don't have to be exact. What I used to do: whatever the size of the file I'm trying to detect, I double it, maybe

triple it, or multiply it by five if you have a good reason, but usually double or triple is good. That helps you skip any file that is too big and time consuming, so it's a performance improvement, and it will also save you from a lot of FPs. And the third part is any combination of string conditions: all of them, any of them, so you can combine these strings. One thing: whenever you need to check the magic number, always use this, uint16(0). It reads the unsigned 16-bit integer at

offset 0. Use this method; don't use this other method. It is highly frowned upon: if you submit your rule to an experienced malware researcher and he sees this, he will automatically know you are a noob. Don't do this. Now I'm just going to go through the structure of a YARA rule. This is the metadata; it has a very good name; this is a single-line comment. I wrote some strings, and after that the condition: checking the magic number, the file size, and then a combination of strings. Apart from that, I have given you this link. Florian Roth is one of the

most famous people in YARA rules, so you can go through his rules, and you should be able to understand all of this. If you don't, please feel free to reach out to me; we will be more than happy to help you. Some of these use advanced stuff, like this PE signature check; if you don't understand that, it's okay, but you should at least be able to understand the other conditions. I was going to go into detail about which strings to include in YARA rules, but I will just tell you one thing.
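
The three-part condition recommended earlier (magic number, file size limit, string combination) can be sketched in plain Python to show what such a condition evaluates. This is an illustration of the logic only, not YARA's engine; the function name, the sample strings and the size limit are hypothetical.

```python
import struct
from typing import Sequence


def condition_matches(data: bytes, strings: Sequence[bytes], max_size: int) -> bool:
    """Mimic: uint16(0) == 0x5A4D and filesize < max_size and any of them."""
    # Part 1: magic number. "MZ" is 4D 5A on disk, 0x5A4D as a little-endian u16.
    has_mz = len(data) >= 2 and struct.unpack_from("<H", data, 0)[0] == 0x5A4D
    # Part 2: file size limit, a cheap filter that skips large files early.
    small_enough = len(data) < max_size
    # Part 3: string combination, here the equivalent of "any of them".
    any_string = any(s in data for s in strings)
    return has_mz and small_enough and any_string


sample = b"MZ" + b"\x00" * 62 + b"connect to evil.com"
print(condition_matches(sample, [b"evil.com", b"backdoor"], max_size=200))  # True
```

Swapping any(...) for all(...) gives the "all of them" variant, and the size limit would normally be two to three times the size of the sample being detected, as suggested above.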

Include the strings that are relevant to the malware code in your YARA rule. And if you see any obscene words, include them; I have specifically written them here. Why? Because you will see a lot of these words. If you go to Twitter, this whole hashtag is about those obscene words; you can read the thread in your free time. Also include any misspelled words, networking artifacts and backdoor commands; you can always include these strings in your rule. And the tools: PEStudio, FLOSS, IDA's string search. You can google these tools

and learn about their functionality. This is lab 3.4. I was going to write a rule with you, but time is short. You can include any of these strings; that will make a good rule. This was a very basic introduction, but I want to leave you with a video. If you understood nothing, or maybe five to ten percent, it's okay, it's normal, don't worry about it. After this, just go and watch this video. It's a 76-minute video, and it's from the same team.

They will even tell you how you can detect zero-days using YARA rules, so be ready to be amazed. Okay, let's quickly cover Snort. Let's assume you are using Facebook and your friend sends you this link: click on this link and you will be able to hack someone's computer, or whatever. But actually it's malware. So you click on the link. If you copy this link and open it here, what type of traffic is generated?

This is the traffic that will be generated. Wireshark is a tool that is used to capture traffic; we don't have time to go through it, but it's pretty easy and most of you will already be familiar with it. Let's go through the different parts. GET is the HTTP method. /malware.exe is the URI. If you look at evil.com/malware.exe, this is how it gets divided: the URI plus the Host make up your URL. The HTTP version just tells which version of the protocol you support, and some of the rest is technical

data related to the protocol. User-Agent is one of the most important fields when writing a rule. It gives information about the client system. Why is it important? Most malware has a hard-coded user agent, and you can use it in network detection. Other than that, Accept-Language is pretty obvious, Accept-Encoding is the encoding, and Connection is keep-alive. If you don't understand any of these, you can google them. For our purpose, the method, the URI, the HTTP version, the Host and the User-Agent are important. Let's go on to our Snort rules. Basically, this is how the

Snort rule looks. It's a very basic rule: alert, tcp, any, any, any, msg, content. For readability I have highlighted the parts. Let's go to the first part: this is the rule action. There are multiple actions, but alert is the one you will always be using; it sends an alert to your detection engine, to your SIEM or whatever user interface you are using. tcp is the protocol; you can also use udp. Then the source IP: you can put an actual IP here, and if you want to know the syntax, please go to this link. This is the source port. Then there's the

direction operator: traffic is going from source to destination. Then the destination IP and destination port; you can specify different destination IPs and ports, and the format is at this link. Then there is a space and a bracket, and the actual rule starts. Every rule has two main parts: the first is the message, and the second is the actual rule strings that you are trying to detect in a packet. The message is the string that will appear in your alert. Now we have the strings, the content. If you are writing a rule for this traffic: after every

highlighted part I also wrote some keywords. These are the keywords you can use in Snort. If you're detecting something from the URI, you use http_uri. If you're detecting the User-Agent, you use http_header; for anything from the blue box, you use http_header. GET is the HTTP method. content specifies the string, or basically the bytes, that we are trying to detect: content GET. We are trying to detect it in the HTTP method, so we write after it,

after a semicolon, http_method. Then we are trying to detect malware.exe. In which buffer is it present? It is in the URI, so we use http_uri after it. Then fast_pattern: it's a performance optimization. It instructs Snort to look for this string, malware.exe, first, before checking anything else. The rest is metadata. We covered metadata in YARA rules, so it's much the same. The only thing that is important is the sid: it's the identification number, and every rule needs to have one. The rest is optional. If you are confused, you can just google

it, and it's pretty easy to understand. If time permits, I can go through each of them. So, the content modifiers: I think this is pretty clear. First the message, then you specify the string, then you specify the buffer in which you are trying to detect it. That's pretty easy. Next, the distance keyword: how many bytes Snort should ignore before starting to search. It's relative. For example: write a Snort rule that alerts if Host is found 30 or more bytes after User-Agent.

It's relative: you get a match, and then you look for the next string relative to the previous match. Let me go through it with the pointer. You search for the content User-Agent, and after that you have Host: evil.com with distance 30. So it will find User-Agent, skip 30 bytes, and then start trying to find Host: evil.com. Since we are detecting this string in the HTTP header buffer, we write http_header. fast_pattern we

have already been through, and the metadata too. Let's move on to the next keyword, within. It's basically the opposite of distance: distance skips those bytes, but within tries to find the string only in that number of bytes. For example, you need to write a rule where conf is found within 10 bytes after data. We find the string data, after that there is some random data, and then conf should appear within 10 bytes. So: content data, then content conf with within 10.

Once it has matched data, conf should appear within the next 10 bytes; otherwise the rule will not trigger. One important thing you need to notice here: one, two, three, four... the 10 bytes finish here, but conf only becomes complete at the 11th byte. So the rule will not trigger on this traffic, because conf is not fully present in the 10 bytes immediately after data. This is an important distinction. You can also combine within and distance. I have given you this one: write a rule that alerts

if conf is found within 10 bytes after skipping four bytes from data. How can we cover this scenario? We ask Snort to find data, then distance 4 skips four bytes, and within the next 10 bytes, five, six, seven, eight and so on, it tries to find conf. If that is present in the traffic, the rule will trigger. Let's move on to the next ones, offset and depth. Distance and within are both relative to the previous match, but offset and depth are absolute.

They always start from the start of the packet. Offset specifies where to start searching within a packet, and depth specifies how far into the packet Snort should look for the specified pattern. If you specify an offset, Snort skips that many bytes and starts its search after them. Let's cover depth first. Say there is a malware that sends its name, evilmal, in the first 20 bytes of the packet. How will you cover that? The

traffic looks like this: there is some random data, one, two, three, after that there's evilmal, and then some more random data, or maybe actual data. What you do is content evilmal with depth 20. This option forces Snort to look only in the first 20 bytes of the packet to find evilmal. Now let's assume there is a new malware out that adds five random bytes at the start and then appends the name of the malware. If you reverse the malware, you see there are always five random bytes at the start,

and after that there is evilmal. We don't need to look at those five random bytes, so we can ask our rule to skip them. That's offset 5: it skips the first five bytes. Then depth 7: used together with offset, it tries to find evilmal in the next seven bytes. One thing you need to notice: evilmal is itself seven bytes, four and three, seven. So the rule will only trigger if evilmal is present immediately after the first five bytes.
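
The four content modifiers discussed above all come down to restricting the byte window in which a pattern may be found. The following Python sketch models that windowing under simplifying assumptions; it is not Snort's implementation, and the function names and payloads are made up for illustration. The key detail it reproduces is that the whole pattern has to fit inside the window.

```python
from typing import Optional


def match_absolute(payload: bytes, pattern: bytes,
                   offset: int = 0, depth: Optional[int] = None) -> Optional[int]:
    """offset/depth: search payload[offset : offset + depth] from packet start.

    Returns the index just past the match, or None if there is no match.
    """
    end = len(payload) if depth is None else min(len(payload), offset + depth)
    index = payload.find(pattern, offset, end)
    return None if index == -1 else index + len(pattern)


def match_relative(payload: bytes, pattern: bytes, prev_end: int,
                   distance: int = 0, within: Optional[int] = None) -> Optional[int]:
    """distance/within: same idea, but measured from the end of the last match."""
    start = prev_end + distance
    end = len(payload) if within is None else min(len(payload), start + within)
    index = payload.find(pattern, start, end)
    return None if index == -1 else index + len(pattern)


# offset 5, depth 7: "evilmal" must sit exactly after five leading bytes.
print(match_absolute(b"ABCDEevilmalXYZ", b"evilmal", offset=5, depth=7))  # 12
print(match_absolute(b"ABCDEFevilmal", b"evilmal", offset=5, depth=7))    # None

# within 10: "conf" must be complete inside the 10 bytes after "data".
packet = b"data" + b"1234567" + b"conf"
data_end = match_absolute(packet, b"data")
print(match_relative(packet, b"conf", data_end, within=10))  # None
```

The last call returns None because conf would only be complete at the 11th byte after data, which is the distinction pointed out above.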

other important variables so basically you will see external and http ports flow to server established flow to server it means the tc connection should have been established so float you're detecting the flow to server okay and home net basically usually the ip range that you have so basically present in the snot.com you can read more about in the slot manual but that is how usually most of the rules will look let's go on to the next slide okay i was gonna write a rule but instead now what i will do i'm going to refer you to this course due to the shortage of time come on yeah okay so this is from my actually from my team

My team is Talos at Cisco, and they will take you through some pretty good examples. If you also want to learn Snort 3, we have some resources on Snort 3 as well. That version is still not widely used in production, but we are getting there. I highly recommend you start with these three videos. They give a very good introduction, they are from my team, and you will learn a lot. So that is all; this finishes the workshop.

Again, I just want to repeat a couple of things that we have been repeating from the start. The purpose of this conference was to give you a head start. It wasn't to make you a malware analyst, and to be honest we can't make you malware analysts in about seven hours. It was to provide you resources and give you a head start. Second thing: you need to work on your own. Please practice, practice, practice. If you just keep going through courses and don't practice yourself, the moment somebody hands you a malware to analyze you will be lost. I'm saying this because I have done that myself when I was trying to learn malware analysis. Third thing:

We touched some really advanced topics, and if you understood only 30 to 40 percent, or even 20 to 30 percent, of this workshop, please don't feel bad. That's perfectly normal. Start learning and going through the material. If you are still stuck, please reach out to me or Hassan on Twitter or LinkedIn; you are more than welcome.

The last thing I want to repeat: on average, malware researchers' salaries are somewhere from six to seven lakh in Pakistan, and in the outside world, in terms of US dollars, from about 6,000 to 15,000 or 20,000 US dollars per month. The only reason companies are willing to give you this high a salary is that malware analysis is not an easy skill to learn. Many people will run away the moment they encounter assembly language. That's the first barrier: about 90 percent of the people who want to learn malware analysis will run the moment they encounter assembly language. They will be like, okay, we need to head out of here. So please go through the OpenSecurityTraining material I have given you. It is really good. Start watching it and give yourself at least two months to understand the assembly language basics. If after two or three months of working hard you still don't understand assembly language, reach out to me, and only after that make your decision about running from malware analysis. That is all from my side, and I will transfer control back to Ali or Fod, whoever is online.

Okay, I'd like to say a few words as well before we end this. The salary is a very good incentive, like Hisham mentioned, but please don't come into malware analysis for the salary. Some of the stuff that we showed you was structured to give you the fundamentals, and some of it was

structured to show you some of the cool things that are possible. That was the whole reason for putting on a show here today: so that you know that if you are able to get past that learning curve, there is a lot of cool stuff you can do with this skill. Reverse engineering is not just used for malware analysis. Whenever someone is doing hardcore low-level security research, they have to be able to reverse engineer stuff. It's not limited to software; it can also include hardware and so much other stuff. All of the guys who come up with zero days, Project Zero, all of that, you can't even come close to that without having reverse engineering as a skill.

So if this is something that attracts you because of the sort of puzzle that it is, and a puzzle is exactly what it is: every single day on the job you are going to be solving a puzzle, challenging yourself, and learning something new. If that is what attracts you, then please do try your hand at it. There is not very good representation from Pakistan in the field of reverse engineering, or even in vulnerability research when we talk about thick clients. There are a lot of guys doing web bug bounties, but I don't think there's anyone at all doing thick client vulnerability research. So we do need people here, and there needs to be better representation from Pakistan. Even if, let's say, three or four of the people who watch this talk actually get interested in malware analysis and come this way, I think our job would have been done.

So please do your due diligence. Don't just come to us with questions without having tried your best yourself. But if you're still stuck and you need guidance, feel free to reach out to me, or reach out to Hassan and Ali, pretty much anyone here. So yeah, over to you.
