
Hello guys, thank you for joining us today. My name is Filipi Pires, I am talking from Brazil, and today we are going to cover a topic that I like a lot: malware analysis. I chose to talk about malicious PDFs. So let's start the presentation. Let me share my screen with you. We are going to talk about discovering C&C (it's an acronym, it means command and control) in a malicious PDF using obfuscation, encoding, and other techniques.

Here are my contacts on social media, if you would like to follow me or send me questions. I use LinkedIn a lot to share papers and articles, as well as Twitter, and I have some projects on GitHub where I talk about security and threat hunting. If you have questions or doubts, you can send me a message.

Let me introduce myself. I am a security researcher and security developer advocate at Zup Innovation, a Brazilian company. I think we have a good definition here from one of our founders: at Zup Innovation, our main purpose is to create an exponential growth environment that provides a lot of opportunities to evolve your career. We are responsible for creating many open source products, such as Ritchie, Charles, and Beagle. All of those products are open to the community, because they are open source. I am an advocate of Horusec, which basically performs static code analysis to search for possible vulnerabilities in your code and in many other things you have in your project.

I am also an advocate at Hacking Is Not a Crime, another very interesting project that I participate in. Here is a simple definition of the project: contrary to popular misconception, being a hacker is a lifestyle and a mindset. The media often uses the expression "hacker" or "hacking" the wrong way around, to mean a bad guy. That is not correct. You can say "threat actor" or "attacker", but never "hacker", because, as I mentioned, it is a lifestyle and a mindset, not a fashion statement or a movie character. Hackers are simply curious, outside-the-box thinkers who create unorthodox solutions for everyday problems. The actions and methods by which these problems are solved are called hacking. That is a simple and correct definition, and I am an advocate of this amazing project.

I am part of the staff team of the DEF CON Group São Paulo. It is a very good group, part of the DEF CON Groups; I think you know the DEF CON event. We have a Discord, and if you would like to participate, we would be happy to talk with you, because there we share knowledge, videos, demos, content, and many other things you can imagine. It is a community. I am also a security researcher and instructor at Hacker Security, a Brazilian company, where I teach the malware analysis fundamentals course; we also have penetration testing courses and so on. And I am a writer and reviewer for these magazines: Pentest Magazine, Hakin9, and eForensics Magazine. So that is the information about me.

Now our agenda. First we will understand what a threat is, because I would like to put everyone on the same page; I think that is very important. After that we are going to talk about malware analysis,
and we need to understand the structure of the PDF. I will do the demo at the end of this explanation, and if you have any questions at the end, I hope to answer them.

So what is a threat? This is very important, because in security we have many different explanations of this simple word. I chose just one of them, according to ISO: a threat is defined as a potential cause of an incident that may cause harm to a system or organization. And here we have an interesting point.
A threat can be a software attack, theft of intellectual property, identity theft, sabotage, or information extortion; these are examples of information security threats. As a result, many organizations choose active threat hunting as a practice to defend themselves from unknown network threats. So maybe you are thinking now: "Filipi, so a threat is almost everything?" Maybe, mostly related to software or attacks. Of course we have physical attacks as well, which are a bit different, but at the end of the day the concept is almost the same: the attacker will attack something, software or people. That is an important thing here.

When I execute an analysis, I start from a sample, and I don't know whether it is malicious or not. So the first step is the identification step: I have a sample, and I don't know if it is malware or if it is a malicious document. These are simple definitions: malware is malicious software, and a maldoc is a malicious document. Once I have identified the sample, I can choose the best method to apply to this file: static analysis or dynamic analysis.

After that I prepare a report. This is a very important step, because it is part of our job as an analyst, threat hunter, or something like that. I can present the report to my manager, to my tech lead, or to whoever, and explain all the steps that I executed during the analysis. "But Filipi, if I have a report, what can I do with this information?" The next step is very important: you can improve your defense mechanisms, because when you do this analysis you discover what kind of steps or paths the attacker can use in your environment. If you see that the attacker used a bypass technique against your firewall, for example, you can improve the configuration, apply best practices, and review the settings of the tools you use or the products you buy.

After that you can build cyber threat intelligence. If you have a small company, maybe it seems impossible to create cyber threat intelligence, but today we have many open source products that can help you build it. And why would you create cyber threat intelligence? Because if you learn how the attacker behaves, you can prepare your defense mechanisms better. This is very important. We also need cyber resilience, because during our presentation right now there is probably another attacker or threat actor creating a new attack using different techniques. That is why this is a cycle that follows the phases of the attacker.

Let's talk about static analysis. Usually this is the first step the analyst uses, because static analysis describes the process of analyzing program code.
It means you have code, and you can analyze all the parts of this code to see whether any part is malicious. You can analyze its structure, for example whether it calls some function: you can find a DLL and which functions that DLL calls inside the operating system. The program itself does not run at this time. It depends on the program, but usually this method is safer, because, as I mentioned, nothing runs; you use manual commands to try to understand the code.

Dynamic analysis is a different method. It is based solely on behavior, meaning the interactions the malware, or the maldoc in this case, has when it is executed. This is also known as runtime analysis: you execute the malware inside a controlled environment. You can use tools or products for this: you put the malware inside the tool and analyze its behavior inside the controlled environment. It can be easily automated, and there are many sites today that already perform analysis of malicious artifacts using a concept called a sandbox. It is similar to having a virtual machine: you put the malware inside the virtual machine and then observe all the behavior the malware has there.

Before talking about the physical and logical structure, I would like to share something about the basics, because it is important. If you remember, at the beginning of this presentation I talked about the identification step, so the idea here is to understand that point. Here I have many artifacts: an amazon file, a view invoice file, a linux file, a pdf (this one is a folder, not a file), and a doc file. So I have many different files with different extensions. What command can I use here to find information about these files? I think many people know the file command. I can run file to identify what each sample is. Here the amazon sample is a Microsoft Word 2007 document. Let me look at another one, view: it is a PDF file. Now let me look at linux.txt; maybe it is text?
In this case it is not text, it is an ELF, so the interpretation is different. Let me see another file, the resume PDF: it is a PDF, perfect. Let me see if I have another one from Windows; here I have a sample file, and it is a PE32 executable. As you can see, the file command gives us an identification process. But here is the interesting point: I have a file called linux.txt, but it is actually not a text file, it is an ELF. How is that possible?

We need to understand the basics; that's my point. When you execute the file command, file determines the file type. But what information is the command actually using when it runs in your environment? Here is the manual. I know you probably don't like to read the manual, not only on Linux but any manual. I don't like it either, by the way. But here we have important information for understanding the basics. The magic tests are used to check for files with data in particular fixed formats. So there is a specific format that tells us what kind of file it is, and of course it is not based on the extension. And here is the explanation: these files have a magic number. This is the key. The magic number is stored in a particular place near the beginning of the file and tells the UNIX operating system that the file is a binary executable. So all of those files have a magic number, and all those magic numbers are stored in a particular place near the beginning.

But how can we understand this better? I did something interesting to explain the magic number. I downloaded, not the file command exactly, but the database responsible for providing this information. On a UNIX platform, as on my machine, this database is compiled into the binary that ships with the operating system. In my case I downloaded the source code and took the database of the file command, right here.
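To make the magic-number idea concrete, here is a minimal Python sketch of identification by leading bytes rather than by extension. The signature table and the function names are hypothetical and hand-picked for the file types seen in this demo; the real database behind the file command is far larger and supports much richer rules.

```python
# Hypothetical, hand-picked signature table; not the real magic database.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF executable"),
    (b"MZ", "PE/DOS executable"),
    (b"PK\x03\x04", "ZIP container (for example a Word 2007+ document)"),
]


def identify(data: bytes) -> str:
    """Guess a file type from its leading bytes, ignoring any extension."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown (possibly plain text)"


def identify_path(path: str) -> str:
    """Read only the first bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

Calling identify_path on a file named linux.txt that really starts with the ELF magic would report an ELF executable, which is the same surprise the file command produced in the demo.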
So let's see some very interesting information here. Let me look inside the javascript entry. Here we have the magic file for JavaScript, and it holds some important definitions: when I execute the file command, it looks inside this database to find the magic number of a JavaScript file, for example. Let me copy this information, and I will create a simple file here, pdf.txt, and write into it. Actually, let's do it differently: I will put just the node string at the top. I save it and look at the file. It is text, as you can see. But now I change the information at the beginning, because I read in the manual of file that all those files have a magic number in a particular place at the beginning of the file. I save it, and when I execute file, take a look at what happens: this is of course text, but because of the magic number it is now identified as a Node.js script.

Take a look, I will change another thing. I put just a simple string here, save again, and let's see what happens. Ooh, it's a Python script. Very interesting. Let's change something else: let me rename the file, change its permissions, and look again. Now let me call python3 on the file: we get a syntax error, because it is not Python. It is very interesting, because it is not Python, yet file says so because of what is at the beginning. Let's look at the Python entry in the database. Take a look, there is a regex here. That is why: I put the magic value at the beginning. As you can see, the entries hold different kinds of information, such as strings and regexes. For example, let me copy this one and change the file one more time. It is Python again. You may ask whether this case is different: here file is probably applying just one of the rules inside the database.

Let me change it one last time. Here I will put %PDF, because this is our challenge today. I save, and let's see what happens. Whoa, it is a PDF document, but by the extension it is Python.

I did this during the presentation because you need to understand the basics. As you saw, the identification step is very, very important, because you need to understand the inside of the file. You probably know the file command, but the idea here is to explain to you how it's
important to understand all those things. For example, let me share my screen one more time. Let's look inside the manual again; some people may not know this. In the man page we have another very interesting piece of information. The magic tests are used to check for files with data in particular fixed formats, as I already read. The canonical example of this is a binary executable (compiled program) a.out file, whose format is defined in elf.h, a.out.h, and possibly exec.h in the standard include directory.

So let me copy this name and locate the file. It is in this path: /usr/include/elf.h. Take a look at what is important here: this file defines standard ELF types, structures, and macros. So here we can find the correct structure of the ELF file. If you read below, you can see the number of bits used by each type, the words and so on, and then the interesting information: the ELF file header. I know this presentation is about PDF, but look at the information we found starting from the man page. The ELF file header appears at the start of every ELF file, that is, at the beginning. And what is at the beginning? In this case we have 16 bytes: the first field of this structure is an array of 16 bytes called e_ident, and it holds the magic number and other information. What kind of information? Let's see: these are the fields in the e_ident array. You find the file identification bytes: byte index 0, then the second byte is E, the third is L, and the fourth is F.

Let me show you another thing. Let me open the linux32 sample. I used file on it, and now I use xxd on it. No, that's not it.
Yes, that's correct. Take a look at the first four bytes: the first, at index 0, is 0x7f, the second is E, as you can see here, the third is L, and the fourth is F. You see?
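The same check that xxd makes visible can be sketched in a few lines of Python. This is only an illustration of the first fields of e_ident as laid out in elf.h (magic, class, data encoding, version); the function name and the returned dictionary keys are assumptions of this sketch, and the remaining bytes of the array are ignored.

```python
EI_NIDENT = 16  # size of the e_ident array in elf.h


def parse_e_ident(header: bytes) -> dict:
    """Decode the first fields of the ELF identification array."""
    if len(header) < EI_NIDENT:
        raise ValueError("need at least 16 bytes")
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    classes = {1: "32-bit", 2: "64-bit"}
    encodings = {1: "little-endian", 2: "big-endian"}
    return {
        "magic": header[:4],                           # 0x7f 'E' 'L' 'F'
        "class": classes.get(header[4], "invalid"),    # EI_CLASS
        "data": encodings.get(header[5], "invalid"),   # EI_DATA
        "version": header[6],                          # EI_VERSION
    }
```

Feeding it the first 16 bytes of a sample, for example from open(path, "rb").read(16), gives the same identification bytes that the hex dump shows.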
That is important, and you need to understand it. I read the manual of file and found a lot of information about the structure of the file. If I had not read the manual, how would I understand the basics? Of course I need to study and read about all those basics. This is the main point when you talk about malware analysis, red teams, pentesting, defense teams like the blue team, threat hunting teams, SOC teams, or anything else: you need to understand the basics, people.

Let me return to my presentation, to the structures. Let's talk about the physical and logical structure of the PDF. We have four parts: the header, which is very common in any binary; the body; the cross-reference table; and the trailer. These are the main parts of the PDF. At the beginning of the file we have the version number. In the body you have objects, images, and so on. The cross-reference table gives the location of the objects inside the file, because access is random. The trailer gives the location of certain objects inside the body. So you have a main body with many objects inside it, and those objects reference one another.

Let me show you the demo. I use some tools to help us here. First, pdfid, a tool created by Didier Stevens, which I executed on Linux. The first piece of information we find, as you can see, is the header, as I mentioned in the structure.
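The four-part layout described above (header, body, cross-reference table, trailer) can be seen in a tiny hand-written PDF. The sketch below is only an illustration: SAMPLE is a made-up, minimal document, describe_layout is a hypothetical helper, and a real parser has to cope with many variations this one ignores.

```python
import re

# A made-up minimal PDF: header, two body objects, xref table, trailer.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
    b"xref\n0 3\n"
    b"0000000000 65535 f \n"
    b"0000000009 00000 n \n"
    b"0000000058 00000 n \n"
    b"trailer\n<< /Size 3 /Root 1 0 R >>\n"
    b"startxref\n110\n%%EOF\n"
)


def describe_layout(data: bytes) -> dict:
    """Report the version, object count, and where the xref table sits."""
    header = re.match(rb"%PDF-(\d\.\d)", data)
    startxref = re.search(rb"startxref\s+(\d+)\s+%%EOF", data)
    offset = int(startxref.group(1)) if startxref else None
    return {
        "version": header.group(1).decode() if header else None,
        "objects": len(re.findall(rb"\d+ \d+ obj\b", data)),
        "xref_offset": offset,
        # The trailer's startxref value should point at the "xref" keyword.
        "xref_found": offset is not None and data[offset:offset + 4] == b"xref",
        "has_trailer": b"trailer" in data,
    }
```

The startxref value at the end is what lets a reader jump straight to the cross-reference table, and from there to any object, without scanning the whole file.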
The second piece of information is that we found 15 objects: 15 obj and the same 15 endobj. We have two stream and two endstream, because there are two streams, and here we have a cross-reference table. There is other important information too; all of it is explained on Didier Stevens's blog. But here, in /Page, /Encrypt, /ObjStm, /JS, and /JavaScript, we will find interesting information. All of this lives inside the main part of the PDF.

Here is an important tip. When Didier created this tool, he probably studied the PDF structure a lot, analyzed samples for many hours, and found a lot of information inside them. I'm supposing, of course, because I'm thinking about the basics. All those slash names are information inside the main object, and the tool was only possible to create because Didier Stevens, its creator, knew how PDF works. That is the main point.

Let's continue. We have one page, we found five JavaScript entries inside the PDF, and we have an OpenAction. OpenAction is a very interesting point. If you read about it, you will see that with an OpenAction the user does not need to click many times in the PDF file. The only action the user needs to take is to download and open the file, for example from an email, on the victim machine. Because the file has an OpenAction, it will execute that action, which in this case executes something inside the PDF. So I can suppose: it has an OpenAction, it has JavaScript inside, it has only one page, and some information is encrypted. It is probably malicious, but I don't know yet exactly what is malicious.

After that I executed pdf-parser, another tool created by Didier Stevens. I ran it because I would like to see all the information inside this PDF.
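The kind of triage described above, counting structural keywords and suspicious names, can be approximated with a short Python sketch. This is not how pdfid is implemented; the keyword list and the count_keywords helper are assumptions made for illustration, and the sketch does not handle obfuscated names.

```python
import re

# Keywords mentioned in the demo; an illustrative subset only.
KEYWORDS = [
    b"obj", b"endobj", b"stream", b"endstream", b"xref", b"trailer",
    b"startxref", b"/Page", b"/Encrypt", b"/ObjStm", b"/JS",
    b"/JavaScript", b"/OpenAction",
]


def count_keywords(data: bytes) -> dict:
    """Count whole-keyword occurrences in raw PDF bytes."""
    counts = {}
    for keyword in KEYWORDS:
        # Bare keywords must not be preceded by a letter, so that "obj"
        # does not also count "endobj". Names start with "/" and may
        # directly follow another name, so they get no such restriction.
        before = b"" if keyword.startswith(b"/") else rb"(?<![A-Za-z])"
        # No letter may follow, so that "/Page" does not count "/Pages".
        pattern = before + re.escape(keyword) + rb"(?![A-Za-z])"
        counts[keyword.decode()] = len(re.findall(pattern, data))
    return counts
```

A file with one page, an /OpenAction, and several /JS or /JavaScript entries is the combination the talk treats as a reason to dig deeper.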
Here I will explain something to you. You can see the header, and after that object 1. Object 1 is the catalog, and here we can see the references. This is a very interesting point: object 1 refers to objects 2, 3, 4, 5, 6, and 7. What does this mean? Do you remember when I explained the structure? Inside the body we have objects located inside the file, and those objects reference one another, as I will show you. Take a look: in object 1 we have JavaScript and an OpenAction. So the first step is the OpenAction, and after this action is executed on the victim machine, the JavaScript will be executed. This is evidence that the file is malicious: after the user downloads the file, the OpenAction is executed, and after that the JavaScript is executed.

Let's continue. We have objects 2, 3, 4, and 5. Let me go back, because there is another important piece of information here. Object 4 references objects 8 and 9. Before, we had only objects 1 to 7, but now we have two more objects, 8 and 9. Let's continue: in object 6 we have another reference, to object 7, then another reference, to object 10, and inside object 10 we have JavaScript. In object 9, look what happens: we have a reference to object 4, because the objects reference one another, and another reference, to object 11. So we started with one object, and now we have objects 1 to 11. But when we executed pdfid, do you remember how many objects we have in this file? Fifteen. Don't forget this.

Take a look: inside object 10 we have another reference, to object 12, and probably JavaScript here. In object 11 we have something different: it contains a stream, so probably there is something inside it. We have the FlateDecode filter, so we need to decode the information, and we have a length, which is the size of the stream. Here the size is small. Inside object 12 we have another reference, to object 13, and inside object 13 we have JavaScript. Object 13 contains a stream with the same FlateDecode filter, but here the size is big, so I need to look more deeply. We also have object 14 and the last one, object 15, because we have 15 objects in this file. So we need to look more deeply at object 13.

Next step: I executed peepdf. It is another very interesting tool; this one is not from Didier Stevens. With this command I collect the output of the sample, with all the information decompressed, into a text file.
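The two things observed in this walk, objects pointing at other objects and streams compressed with FlateDecode, can be reproduced with a small Python sketch. It is a simplified illustration, not how pdf-parser or peepdf work internally: walk_objects is a hypothetical helper, it relies on regular expressions instead of a real PDF parser, and it only handles a single FlateDecode filter.

```python
import re
import zlib

OBJECT_RE = re.compile(rb"(\d+)\s+\d+\s+obj\b(.*?)\bendobj", re.DOTALL)
REFERENCE_RE = re.compile(rb"(\d+)\s+\d+\s+R\b")
STREAM_START_RE = re.compile(rb"stream\r?\n")


def walk_objects(data: bytes) -> dict:
    """Map each object number to its references and decoded stream."""
    objects = {}
    for match in OBJECT_RE.finditer(data):
        number, body = int(match.group(1)), match.group(2)
        start = STREAM_START_RE.search(body)
        # Everything before the stream keyword is the object's dictionary.
        dictionary = body[:start.start()] if start else body
        decoded = None
        if start and b"/FlateDecode" in dictionary:
            try:
                # decompressobj stops at the end of the zlib data, so the
                # trailing "endstream" text is simply left unused.
                decoded = zlib.decompressobj().decompress(body[start.end():])
            except zlib.error:
                decoded = None
        objects[number] = {
            "refs": [int(n) for n in REFERENCE_RE.findall(dictionary)],
            "decoded": decoded,
        }
    return objects
```

Following the "refs" lists from the catalog outward gives the same chain the demo traced by hand, and the "decoded" field of the large stream is where the JavaScript would show up.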
OK, I executed this. Now we need to see what information we have inside this output. Take a look: here we find the first technique used by the attacker, obfuscated JavaScript. So what is the next step? I see some parameters that the attacker used in this case, so I copy the code, and of course I need to deobfuscate it to find some information. I save it as an HTML file. Why? Simple: web applications usually use JavaScript, HTML, CSS, and so on. I found the eval call, and I will use it to deobfuscate the code. Basically, I look at this call and change it to document.write, because the idea here is to read what information is hidden inside the obfuscated JavaScript. Take a look at what happens here; it's very nice.
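The eval-to-document.write swap described above can be scripted. The Python sketch below only rewrites text and wraps it in a page; neutralize_eval and wrap_in_html are hypothetical helper names, and the regular expression is a naive assumption that will miss eval calls hidden behind aliases or string tricks.

```python
import re


def neutralize_eval(javascript: str, sink: str = "document.write") -> str:
    """Replace eval( with a sink that displays its argument instead."""
    return re.sub(r"\beval\s*\(", sink + "(", javascript)


def wrap_in_html(javascript: str) -> str:
    """Wrap the rewritten script in a minimal HTML page."""
    return "<html><body><script>\n" + javascript + "\n</script></body></html>\n"


def build_payload_page(javascript: str, path: str = "payload.html") -> None:
    """Write the page to disk; open it only inside an isolated analysis VM."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(wrap_in_html(neutralize_eval(javascript)))
```

The design choice is the same as in the demo: the final decoding stage is left to run, but its result is printed rather than executed, so the next layer becomes readable.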
After that I gave the file access permissions, and I execute payload.html. Take a look at what happens in our demo. Wow. I will show you: we found a variable, var payload. As you can see, we found a payload. What does this mean? We found a malicious payload. So first we have a PDF file; inside the PDF file we have an OpenAction; this OpenAction executes the JavaScript; the JavaScript was obfuscated; and inside the JavaScript there is a payload. The payload is responsible for loading, or actually downloading, content onto the victim machine, and this code is responsible for calling back to the C&C of the attacker or threat actor.

I could finish my analysis here, but there is an interesting point: if I have the payload, I can try to find the IP of the attacker. So let's see what kind of information I can find. I see things like percent signs and some numbers. As the next step I created another file, which I called real payload, copied and pasted all the information into it, and used sed to cut the percent signs, because my idea is to clean the file. Take a look: I cut the var, the parameter, and many other characters such as slashes and percent signs, copied the result, and arrived at the Unicode code here.
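The cleanup done with sed, followed by turning the remaining hex into bytes, can be sketched in Python. This assumes the payload uses the common %uXXXX escape form, which the talk does not spell out; decode_percent_u is a hypothetical helper, and each 16-bit value is written low byte first, as it would sit in memory on a little-endian machine.

```python
import re


def decode_percent_u(text: str) -> bytes:
    """Turn %uXXXX escapes into raw bytes, low byte first."""
    out = bytearray()
    for hex_word in re.findall(r"%u([0-9A-Fa-f]{4})", text):
        # Each escape is one 16-bit code unit; emit it little-endian.
        out += int(hex_word, 16).to_bytes(2, "little")
    return bytes(out)
```

Everything that is not a %uXXXX escape, such as the var name, quotes, or plus signs, is simply skipped, which mirrors the manual cutting of slashes and percent signs in the demo.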
So now I have this Unicode code. It is another technique used by the attacker in this case. It is not new, it is old: Unicode encoding based on UCS-2. It is basically two bytes per character, and it is very common. I work a lot on Linux, but now I will show you another tool, Malzilla, on a Windows machine. Here we have the same code, with all those percent signs that I need to cut, as I did on the Linux machine. I cut them, and here I have the pure Unicode.

So remember: I had obfuscated JavaScript; I deobfuscated that code and got a payload; in the payload I found Unicode encoded in UCS-2; and from it I generated an extra binary file using Malzilla. After that, take a look at what I did: I used XORSearch, another tool provided by Didier Stevens. I gave it the file I generated as input and searched for the string http.
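The idea behind searching an encoded binary for a known string such as http can be sketched as a single-byte XOR brute force. This is only an illustration of the concept and not the implementation of the tool used in the demo; xor_search is a hypothetical helper that tries every one-byte key and reports printable text starting at each hit.

```python
def xor_search(data: bytes, needle: bytes = b"http") -> list:
    """Try every single-byte XOR key and collect strings starting with needle."""
    hits = []
    for key in range(256):
        decoded = bytes(b ^ key for b in data)
        offset = decoded.find(needle)
        while offset != -1:
            # Extend the hit while the bytes stay printable ASCII.
            end = offset
            while end < len(decoded) and 0x20 <= decoded[end] < 0x7F:
                end += 1
            hits.append((key, offset, decoded[offset:end].decode("ascii")))
            offset = decoded.find(needle, offset + 1)
    return hits
```

Key 0 covers the case where the string is stored in the clear, so the same call works whether or not the shellcode hides its URL.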
And take a look at what I found: basically, I found the C&C of the attacker. You see, in this case the attacker was in Estonia, in Europe.

So we finish our presentation, and I will show you my social media contacts again. Just to finish, let me recap. It was fast and simple: we had an infected file. We found JavaScript inside the PDF file and an OpenAction; the OpenAction called the JavaScript; the JavaScript was obfuscated; inside the obfuscated JavaScript we had a payload; this payload would be downloaded onto the victim machine; and the payload was Unicode encoded, which is old, of course, but is another technique. After that I generated an extra binary, and I used XORSearch to search for the HTTP protocol and find the C&C of the attacker, the command and control responsible for sending this attack to the victim machine.

So we finish our presentation. If you have any questions, please let me know; I am available to you. Thank you one more time for staying with me during this presentation. Again, if you have any questions, I am available to you.