
Good morning, Hungary. My name is Piotr Głaska, and it's a true honor for me to start the first session at BSides Budapest 2021. I'm based in Poland, and for the last four years I've been working for Infoblox, which is a secure DNS vendor. During these four years I've been analyzing different malware, phishing campaigns, and other malicious activities from just one angle: how the attackers were using the DNS system, and the DNS protocol itself, in their malicious activities. Today I would like to present a short summary of this research.

So let's start. Why are we talking about DNS in cybersecurity? Because, just like most legitimate programs, malware almost always uses DNS somehow. If we
look at the Cyber Kill Chain model, there are several phases of an infection. Everything starts with reconnaissance, where the attacker tries to get as much information as possible about the victim: the victim's infrastructure, the IP addresses they are using, the systems and applications they are using, everything that makes the attack easier later. At this stage the attacker usually uses DNS to gather information, for example from passive DNS, about host names, which can sometimes reveal the applications used by the victim. They also gather information such as the IP addressing used by the victim company. Then they go into the weaponization phase, where they prepare the malware and the tools they will be using. In terms of DNS, usually they need
to register some domain names, maybe set up C2 servers which will be using DNS, and prepare the malware itself. Then there is the delivery phase, when the malware is actually delivered to the victim. Usually it comes in the form of an email, which can contain an attachment or a link. It doesn't have to be an email; it could be an advertisement on social media with a malicious URL. Usually, when there is a link in the email or in the advertisement, it's given as a fully qualified domain name, not as an IP address. At the moment the user clicks this link, there is a first
possibility for the DNS system to protect the user, because malicious domain names can be blocked. That's the simplest way people think of DNS protecting users. Then there is the exploitation phase, where some vulnerabilities are exploited on the victim's systems. In terms of DNS, that could be, for example, a DNS protocol anomaly, so a specially crafted packet which would cause some actions on the server side or on the client side. It could also be domain hijacking, which is made easier because sometimes people forget to set good passwords at DNS registrars, sometimes they don't use multi-factor authentication, and so on. Then we have the installation phase, where the malware is actually installed on the
victim's computer; DNS is not used much at this phase. The most interesting phase, specifically from the DNS protocol perspective, is the command and control phase, C2, because DNS is used at this phase in different ways. First, the C2 address is usually fixed in the malware as a fully qualified domain name. About 80 percent of malware uses DNS names as C2 server addresses; the other 20 percent just uses plain IP addresses. The problem for an attacker using just IP addresses is that when someone notices that the IP address is malicious, it will be easily blocked on the firewall, and then the whole
work which was done by the attacker will be lost: the time spent on reconnaissance, on preparing weapons, and so on. So usually they prefer to use a domain name, which allows them to easily change IP addresses in case one gets blocked. It can happen that the domain name gets blocked as well, so there are several techniques to overcome this possibility, such as DGAs, domain generation algorithms. The whole command and control communication can also be hidden inside DNS queries and DNS answers; we will go through real examples later, and I will show you how this is done. Then the last phase is actions on
objectives, where the attacker tries to get what he actually carried out the attack for. Maybe he wanted some sensitive data, maybe he wanted to cause a denial of service, maybe he wants to get a ransom, maybe he just wants to disturb some activities of the victim. In the case of data leakage, if the attacker wanted to get some sensitive information, DNS could potentially be used to transfer this data out of the victim's network in a way that is very hard to detect. Also in the case of DoS, denial of service, DNS is unfortunately a great protocol to use to cause such unavailability of the system,
because if DNS doesn't work, basically nothing works.

Let's look at this last phase, and at how data can actually be stolen with the help of the DNS protocol. Let's assume the attacker is already in the network: he has communication between the C2 server and the infected endpoint, and he has some data to be exfiltrated, or stolen, from the victim's network. To do that, before he even started the attack, he needs to register one or more domain names and set up authoritative servers for these domain names. This means that whenever there is a DNS query for any host in such a domain, this query will finally
reach the attacker's DNS server. Then, when he is in the last phase of the infection, the phase where he is already in, he takes the data he wants to steal, and he might optionally encrypt it before sending. Then he needs to encode the data. He needs to encode it because the data will be sent inside the DNS query itself, stored somewhere in the fully qualified domain name. It could be stored as a host name or as a subdomain name, but it's always in the FQDN, which means the encoded data needs to comply with the RFCs which describe how a DNS
query is built and how a fully qualified domain name is built. There are some limits: the FQDN altogether can have 255 characters, dots included; there can be only letters, digits, a dash, and an underscore; and it's case-insensitive. So some kind of encoding needs to take place before we build such queries. It could be hexadecimal encoding, because that's just 10 digits and six letters, A to F, so it complies with the RFC. It could be Base32 encoding. It could be Base64, usually with some modifications. After the attacker has encoded the data, he needs to cut it into chunks, because the limit between the dots in
the name is 63 characters, so it needs to be cut into chunks of at most 63 characters; usually they are actually much shorter. After the query is built, it is sent from the infected endpoint to the local DNS resolver of the victim company. It is usually not sent to the C2 server directly; no, the attacker sends this query to the local DNS resolver. The DNS resolver looks at the query, sees it's completely RFC compliant, and so it sends the query out to the internet, where it will finally reach the attacker's DNS server, authoritative for his domain name, for example here company.pw. So here the attacker is also using this characteristic of DNS, that
basically it's a kind of proxy communication. Whenever the endpoint sends a query to a DNS resolver, the resolver does not just forward the query, passing it through; it builds a new query, and as the source IP of this query it uses its own IP address. So the IP address of the endpoint is actually replaced with the IP address of the DNS server. So imagine: even if the infected endpoint does not have internet connectivity, as long as it can send a DNS query to this local resolver, the query can possibly reach the internet, even if the endpoint does not have a direct internet connection. So DNS here is actually working
as a multi-hop proxy, because there can be even more DNS servers in between: for example, the local DNS resolver will send the query to the ISP of this company, or maybe to some open resolver such as Google, Cloudflare, whatever, and then finally it will reach the attacker's DNS server.

What does it look like in reality? Here we have an example of Alina POS malware, which is malware affecting point-of-sale systems, and it tries to steal credit card data. Here are example queries used by Alina POS. They are quite long, as you see, because it's quite simple malware; the attackers are not really afraid of being detected, and they target companies with quite low cyber
security protections. Let's look at this second query, cut out the hostname and subdomain part, and try to decode the data. I will do that in the CyberChef tool. Here, as you see, I have copied the DNS query without the final domain name, so let's try to decode it. Because we see upper and lower case here, we can assume this is probably Base64 encoding. Standard Base64 cannot be used, because it uses the plus and slash signs, so attackers usually use a URL-safe version of Base64, which replaces those characters with a dash and an underscore. So let's use this one. We still don't see human-readable data.
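
The steps of this demo (XOR with a single-byte key, URL-safe Base64, cutting into labels, and then reversing it all with a brute-forced key) can be sketched in Python. This is only an illustration under stated assumptions: all function names, the 30-character chunk size, and the sample payload are hypothetical, the key AA from the talk is read as the byte 0xAA, and `company.pw` is the example domain from the slide. It is not the code of Alina POS or of any real malware.

```python
import base64
import string

MAX_LABEL = 63                      # per-label limit mentioned in the talk
PRINTABLE = set(string.printable.encode("ascii"))


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same single-byte key."""
    return bytes(b ^ key for b in data)


def build_query_name(data: bytes, domain: str, key: int = 0xAA,
                     chunk: int = 30) -> str:
    """XOR the data, encode it as URL-safe Base64, and split it into labels."""
    if not data or not 1 <= chunk <= MAX_LABEL:
        raise ValueError("need some data and a chunk size of 1..63")
    encoded = base64.urlsafe_b64encode(xor_bytes(data, key)).decode("ascii")
    # '=' padding is stripped because it is not a valid hostname character.
    encoded = encoded.rstrip("=")
    labels = [encoded[i:i + chunk] for i in range(0, len(encoded), chunk)]
    return ".".join(labels + [domain])


def extract_payload(name: str, domain: str) -> bytes:
    """Drop the attacker's domain, join the labels, undo the Base64 step."""
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError("query is not under the expected domain")
    encoded = name[:-len(suffix)].replace(".", "")
    encoded += "=" * (-len(encoded) % 4)    # restore the stripped padding
    return base64.urlsafe_b64decode(encoded)


def brute_force_xor(blob: bytes) -> list[tuple[int, bytes]]:
    """Try all 256 single-byte keys; keep results that are fully printable."""
    hits = []
    for key in range(256):
        candidate = xor_bytes(blob, key)
        if all(b in PRINTABLE for b in candidate):
            hits.append((key, candidate))
    return hits


if __name__ == "__main__":
    sample = b"id=0001;host=POS-01;note=hypothetical sample"
    query = build_query_name(sample, "company.pw")
    print(query)
    for key, text in brute_force_xor(extract_payload(query, "company.pw")):
        print(f"{key:02x}", text)
```

More than one key can pass the printable-only filter, so the analyst still has to pick the candidate that actually reads as text, much like scanning the brute-force output in CyberChef. Note also that Base64 only survives if upper and lower case are preserved end to end, which is one reason hexadecimal or Base32 are the safer fit for the case-insensitivity rule mentioned above.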
So let's try something simple, because, as I said, it's quite simple malware: for example, XOR. Since we don't know which key is used here, let's use the brute-force capability of CyberChef and check the results for different key values. Let me move it a little bit, maybe make it a bit smaller... a little bit better. And here, for the key AA, we can actually decode the data. We see here a unique identifier of the infected endpoint. It is needed because, as I said, the endpoint's IP address is replaced, so it's lost in transmission. So if the attacker wants to know from which host he gets the queries, he gets the
data, he needs to somehow set a unique identifier for this host and put this ID inside the DNS query itself. Then we have the computer name of this infected host. Here we have a process name, the process which is handling credit card data, and here we have a credit card number and the expiry date of this credit card. So the data is actually encoded in a very simple way. Even funnier is that this AA key has been used by Alina POS since 2013. This malware is quite old; originally it used HTTP to exfiltrate data, and in recent years they switched to DNS, but they still use the same XOR
key. So this is how it looks in simple malware. How about more advanced cases, for example TrickBot? TrickBot is a group, and actually a malware family, which is modular, so it can do different things. It can, for example, load malware which will later carry out DDoS attacks; it could be a ransomware module; or, if they have no customer for the computer they infected, maybe they will just mine cryptocurrency. So there are several options. One quite interesting variant of TrickBot is named Anchor DNS. It was prepared for a kind of premium customers of the TrickBot group, and it was installed when TrickBot
had successfully infected computers in the financial sector, or high-impact servers such as Active Directory controllers. Normally TrickBot uses HTTPS as its C2 protocol. But imagine you have an Active Directory controller and you notice that this computer is connecting to some unknown addresses on the internet over HTTPS: it would be suspicious. On the other hand, an Active Directory controller usually has Microsoft DNS on it, and it is set as the default resolver for all the computers in the company. So it is processing a huge number of DNS queries to different domain names: receiving the queries, sending them to the internet, getting the replies, and so on. So when they do C2
over DNS, this traffic will actually be hidden in this huge number of DNS packets, so it will be a lot more complicated to detect this C2 communication. This is why they decided to go with C2 over DNS in this case. Here you have an example of a DNS query with encoded data, and here you can see the decoded data: for example a campaign name, hostname, client ID, and then the content, so the data itself.

A very advanced use of C2 over DNS could be seen in the actions of the DarkHydrus group, because the communication they use over DNS is highly randomized. For example, imagine the hostname part contains some
encoded data. Normally most malware will use fixed-length data. In their case, they randomized the hostname part for every query, choosing between 30 and 43 characters, so every time it was a little bit different, and a little bit more difficult to detect with simple signatures. Also, please note that this is not very long, just 30 to 43 characters, so it's completely different from Alina POS. Alina POS had very long queries and they were not afraid of detection. Here it's an advanced group attacking government agencies, and they try to build queries which are average in size. Usually the size of a DNS query is
somewhere around 60 to 100 characters, so here they get 30 to 43 plus the domain name, and altogether it is more or less an average DNS query. Also, the delay between the queries was quite long, around three seconds, and it was not a fixed three seconds: it was randomized, usually between 2.4 and 3.6 seconds. They were also randomizing query types, so not always asking for one query type (although this malware could work like that), but randomly selecting a query type for each query. The same goes for the domain names which were used: it was not just one domain name, but each query randomly chose a domain
name from a set of domain names. So it was not like you have a host which sends a lot of queries to a single domain name, which is quite suspicious; here there was a bunch of different domain names, and they kept changing. So quite an interesting approach.

An even more interesting case uses the badWPAD attack. WPAD means Web Proxy Auto-Discovery; basically it's a protocol used, for example, by Windows systems when they need to automatically detect which web proxy to use. This detection is partially based on the DNS system: the endpoint asks for a host called wpad,
and then the domain name is added from the search suffix list in the Windows network configuration, so usually it's something like wpad.company.hu, for example. If this name is not found, then depending on the Windows configuration, Windows might skip the company part and ask for the wpad.hu domain name. So what is possible is that someone registers such domain names, such as wpad.hu or, I don't know, wpad.top, or whatever the victim is using as a top-level domain name. Windows then actually asks for a file which should contain the web proxy configuration, so an answer to which proxy to use for a specific URL. So imagine someone registers such a
domain name, and if the victim company does not have this wpad entry, then Windows could possibly reach the attacker's server and ask for the file with the web proxy configuration. The attacker will then be glad to help and provide this file with the configuration, but the file will direct the victim's endpoint to a malicious web proxy. For example, Adam Ziaja, who wrote the blog post which you can see at the bottom, did some research and checked whether the wpad entry exists under different top-level domain names. He found, for example, a very interesting case of the wpad.software domain name, which was serving a web proxy file which was actually instructing the
browser to encode the URL which the user is browsing in Base64, and then exfiltrate this URL to the attacker in a DNS query. So the attacker was not redirecting all the traffic to the malicious web proxy. He was actually redirecting HTTP and FTP traffic, but encrypted traffic was not redirected, because of the certificate problems; only the URL was exfiltrated over DNS. So a very interesting case of the badWPAD attack.

The most advanced attack I have seen, and probably most of us have seen, is Sunburst. In the Sunburst attack, the one where supply chain problems were exploited to infect via SolarWinds software, the first phase of command and control
ran over DNS. In the first phase, Sunburst actually infected something like 18,000 computers, and because the first phase was meant to do such a wide infection, they really cared not to generate a lot of traffic, especially suspicious traffic, so they decided to use DNS. DNS was used first to register the infected computer with the C2 server, to say: hey, the infection was successful, I'm in, what's next? Over DNS queries they also sent information about the internal corporate domain name of the victim, because they didn't want to infect all these 18,000 computers, 18,000 companies. They actually targeted the most important organizations, such as some ministries, government
institutions, and some other important companies. So they needed a way to find out which company they were in. They chose to check the internal corporate domain name, and this name was exfiltrated to the C2 server over DNS queries. They decided to send just 14 letters of the domain name in a query, and if these 14 letters were not enough, they would ask for another 14 letters, and if that were not enough, another 14 letters, and so on. Then they get the corporate domain name and they check: OK, is it the Ministry of Justice? Looks like it. OK, so we move to the second stage of the
infection, for example. Those queries were built like this: first, in the hostname part, there was encoded data, then some subdomain names suggesting it is just some public cloud, like a Europe West 1 region, and then the main domain name, which was avsvmcloud.com, which could be treated as a look-alike for, for example, awsvmcloud.com. So the query was supposed to look unsuspicious. The delay between queries here was extremely long: they used something like a few minutes between one query and the next, sometimes it could even be two hours, and in some cases, if there was some failure in the communication, they would wait even six or nine hours.
So really, really big delays between the queries. In the encoded data, apart from the internal domain name, there could be information about the status of some security solutions: they check whether the endpoint is running CrowdStrike or FireEye or something like that, and they get the status of these solutions. Also, in every query there was this unique identifier, so there was the MAC of this PC, of this host (basically, sorry, the name of it) and a globally unique identifier. Altogether this encoded data part had 30 characters, so quite similar to the previous advanced case: they try to build queries which are not too long. So if you are looking for extremely long queries in your security
system, that's good, but here it would not work. All of this data part was encoded using a Base32 algorithm with a custom alphabet, so they didn't use the standard Base32 encoding algorithm. The DNS replies were also quite an interesting case. The answer from the C2 server, or let's say the task assignment from the C2 server, like "give me another 14 characters", "give me the security solution status", or "move to the next C2 stage", was sent in the form of IP addresses. So the infected host asks for an A record, and in the reply it gets an IP address. This IP address meant one of those specific tasks, such as send me the rest of the domain name, send me security statuses, send me
or move to another stage. Those IP addresses were chosen from the network ranges of AWS, Microsoft, and Google. So if someone was observing this DNS traffic, he saw queries that looked like public cloud queries, with IP addresses as the answer, and those IP addresses were not suspicious at all, because they were legitimate IP addresses of Amazon, Google, or Microsoft. So it was really quite a clever approach.

Another way DNS is used by attackers is to download information from the C2 server to the infected host; for example, malware could download additional malicious modules. This is the case with InvisiMole malware, which was detected by ESET in some diplomatic and military
sector organizations in our region of Europe, and here DNS was used to do exactly that. When they infected a host in such an organization, they knew that this host might not have direct internet connectivity; it might be in some isolated network segment. But very often defenders forget that just blocking communication between this host and the internet is not enough. If this host can send a query to the local DNS server, this local DNS server will act as a proxy: it will build a new query based on that one and send it to the internet, so finally it can reach the attacker's name server. Then the attacker's name server sends a DNS
reply to the DNS resolver of the victim, and the resolver passes this reply to the infected endpoint. So there's actually bidirectional communication thanks to DNS capabilities. In those DNS records they encoded new malicious modules. How do you do that? You basically take a malware module, you compress it, and you encode it with some algorithm if needed. For example, if you want to store data in a TXT record, you use Base64, for example. If you use NULL records, you actually don't need any encoding: you can just take the executable, cut it into pieces, and put them in different records in the DNS zone. Then the endpoint asks for each and every record,
concatenating them and decompressing the result, and it can then run the new malicious module. So that was the case here.

Let's move to other techniques for how malware can actually get the domain name of the C2 server. There are several options. It can be fixed in the malware, but the more interesting techniques use some dynamic resolution. Here we have an example, Glupteba, and how it gets the domain name for C2. Glupteba actually queries legitimate cryptocurrency servers such as blockchain.info, asking for details of a cryptocurrency transaction, and in the OP_RETURN field of that transaction the C2 domain name is stored, encrypted with AES-256. So this is the encrypted data, and after decryption you can see the C2 domain name. So later,
when it contacts the C2 server, it knows which domain name to use. And here we have some example C2-over-DNS communication as executed by Glupteba.

Another interesting example is this case, where the attacker was first using DNS over HTTPS to bypass the local DNS system. So it sends DoH queries directly to dns.google.com, asking to resolve a specific domain name. When we look at this domain name, dmarc.jqueryupdatejs.com, we notice that the reply is a TXT record, and it looks like a DKIM record with an RSA key, right? That's the first impression. But if you know the DMARC and DKIM protocols, you know that it's not the correct name for a DMARC domain name,
and it's definitely not the correct content; it's not the correct name for a DomainKeys domain name either. So it just looks quite similar, but it's not. If we take this data here and go into CyberChef again, we can notice that this data is actually not a key, but that each slash is actually a delimiter. If you take, for example, this string here, you should decode it with Base64 once, and then decode it with Base64 a second time, and then something interesting will show up. Let me just show you what it looks like. So, OK.
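
The decoding demonstrated in this part of the talk (split on slashes, Base64-decode twice, read the result as a 32-bit integer, print it as an IPv4 address) can be sketched in Python. The record layout is simplified and partly assumed: the function names are hypothetical, the integer is assumed to be a decimal string in network byte order, and the sample values are documentation addresses generated by the helper, not data from the real TXT record.

```python
import base64
import ipaddress


def b64url_decode(text: str) -> bytes:
    """URL-safe Base64 decode that tolerates missing '=' padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def decode_c2_addresses(record: str) -> list[str]:
    """Split on '/', Base64-decode each part twice, read it as an IPv4 integer."""
    addresses = []
    for part in record.split("/"):
        if not part:
            continue
        inner = b64url_decode(part).decode("ascii")        # first pass
        number = int(b64url_decode(inner).decode("ascii"))  # second pass
        addresses.append(str(ipaddress.IPv4Address(number)))
    return addresses


def encode_c2_addresses(ips: list[str]) -> str:
    """Build a sample record in the same assumed layout, for testing only."""
    parts = []
    for ip in ips:
        number = str(int(ipaddress.IPv4Address(ip))).encode("ascii")
        once = base64.urlsafe_b64encode(number)
        parts.append(base64.urlsafe_b64encode(once).decode("ascii"))
    return "/".join(parts)


if __name__ == "__main__":
    record = encode_c2_addresses(["192.0.2.1", "198.51.100.7"])
    print(record)
    print(decode_c2_addresses(record))
```

Storing the address as a plain integer means a casual reader of the record never sees anything that looks like dotted-quad notation, while tools such as `ping` still accept the number directly.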
Let's clean that. So this is the record. Let's do Base64 decoding, URL-safe as usual, and then let's remove
that. So here we can see that this is actually another Base64-encoded text, so let's apply Base64 again, and this is actually a number, a 32-bit integer. If we just use ping to decode it, you can see that this is actually an IP address, simply stored in 32-bit integer format. So in this record we actually have several IP addresses of the command and control server: here is one, another one, another one. Everywhere you see this 9PQ, which is a double equals sign encoded in Base64, there is an IPv4 address. So a very interesting case as well.

Another technique which attackers are
using is DGAs, domain generation algorithms. Instead of fixing a domain name for C2, the malware contains a small piece of code which generates a lot of random domain names; it can generate hundreds of thousands of domain names per day, and the attackers register just one of these domain names for a specific date. So when the malware successfully infects a computer, it generates the first domain name and tries to resolve it. If it doesn't exist (here we have a non-existent domain name), it generates another one; not existing; it generates another one. If it is lucky enough to hit the domain name registered by the attacker, then the bot will receive the IP address of
the command and control server, and then C2 communication can start. If this domain name gets blocked by someone, like a firewall, a DNS firewall, whatever, then the malware will generate another domain name, and another, and another, until it hits the one which the attackers have registered. So it makes blocking domain names much harder, because every time you block one, another one is generated. Here the attackers use random characters, digits and letters, to build such domain names. In more advanced attacks they use whole words to build the domain name, because such random domain names are actually quite easy to detect. You use different algorithms to do that: you check
n-gram distribution, vowel ratio, entropy, and so on, and you can detect such strange domain names. But if the attacker builds the C2 domain name out of words, it's much harder. For example, here the malware has two dictionaries built in, and whenever it tries to generate a domain name, it randomly selects a word from this dictionary and a word from the other dictionary, connects them, and builds domains such as these ones. So they look completely normal, I would say, right? So this technique is much more difficult to detect, but detection is actually still possible.

Let's move to some simpler things. How about look-alike domain names? Why are they used? Basically
they are usually used in phishing campaigns, but not only there, and the attackers try to build a domain name which will look familiar to the user. When they target a brand, such as here, where we have phishing from Poland targeting Netflix, we can see that the malicious domain name, the whole domain name here, partially contains a good domain name, netflix.pl. So some users might not notice that it's not netflix.pl but actually some other domain, right? So that's a simple look-alike domain name. It could be more advanced; for example, some letters could be used which look like other letters. For example, here the targeted brand was InPost, which is a
logistics and delivery company, and here the attackers registered the domain name lnpost.pl, because a lowercase L looks like a capital I, exactly like in the brand name, right, where there is a capital I first. But actually this website on the left is malicious. It looks similar, the domain name looks familiar, but when you look in the HTML code you can see that all the credentials you enter are actually exfiltrated to a malicious server. Last year in Poland, for example, we saw a lot of campaigns which were using look-alike domain names in combination with advertisements on
Google or Facebook, and here you have some examples. Why do they do that? Because some users, when they go to their bank's web page, don't type the bank's address in the URL bar; instead they type the bank's name in the Google search bar and then click the first link, because usually it's their bank. But here the attackers bought ad campaigns, and if you enter the targeted bank's name, like Getin Bank or SGB, the first link is actually the malicious one. The other one is a good one, but the first one is malicious. Here it can easily be seen, because the URL is not correct, but in the other campaign they actually
got smarter, and the ad showed the good URL but was linked to the malicious URL. Here we have a similar campaign, but on Facebook: Pekao bank was targeted, and here is the look-alike domain name and here the advertisement. A special variation of look-alikes uses special characters. We call this technique IDN homographs, from internationalized domain names: domain names which look quite similar because they use characters that are quite similar to the Latin characters, but different. Here we have an example, adobe.com with a special b sign with a dot under it. Here we have SMS phishing targeting LOT Polish Airlines, saying that you can get two free tickets, just click here,
and we have lot.com, but the o here has a dot under it. In some messengers, such as WhatsApp, the link is also underlined, so this dot is barely visible. There are also cases like this one. Look carefully: do you see any difference between these domain names? Because if yes, then basically you need to visit a doctor. There is no graphical difference, but these are actually two different domain names. The first one is built using the Cyrillic character set. In the Cyrillic character set, some Latin-looking characters are repeated, but under different codes, so you can build a domain name which will look exactly like the legitimate one but will be a completely
different name. In this case, this one here would be seen in the DNS system as this one. Also, in browsers right now, if you put a domain name like this, or this, or this in the web browser, the browser will show you this Punycode name, so a specially encoded name. In Outlook, unfortunately, it's different: if you get an email with such domain names, they will be shown exactly as the attacker wants. The only good thing is that if you click such a link in the newest versions of Outlook, Outlook will show you a warning that you are possibly going to a different place than you wanted. Not a very clear message, but at least
there is some warning. And finally a simpler case: SMS phishing with a look-alike domain name. We see this domain name here; it was targeting Czech Post. An SMS was sent asking you to pay some additional money for a package, and to pay it via this link. This domain name was registered two days before the attack. As you see, the creation date of the domain name was September 5th, and the first query was seen just 10 minutes after registration, but the phishing campaign actually started on September 7th. It was quite interesting, because when you click, you see this
page to enter credit card data, and it's in the Czech language. When you click pay, another page is shown, and here we notice an interesting mistake made by the attackers, because the text here is actually not written in Czech. Here on the left you have a Google translation of this sentence, "We have now sent a one-time code to your mobile phone": the first one is in Czech, the second is in Slovak, the third is in Slovenian, and you can see that this one is exactly the same as the text on the page. So here the attackers made a mistake: they were targeting the Czech Republic, and the first page was in Czech, but the second one they forgot to translate. It was probably reused from previous phishing
campaigns in Slovenia, so they left it as it was. On VirusTotal this domain name did not have a big number of detections: during the campaign it was actually zero, and in the evening after the campaign there were six or seven detections. But one really powerful technique to block such campaigns is to block newly observed domain names in your networks. So basically you can block your users from accessing domain names which were registered in the last, let's say, three days, or which were first seen on the internet in the last three days. We did some measurements with my friend from France, Nicolas, and he checked how much time it takes for a newly observed
domain name which is malicious to get into a threat feed. For different vendors it was more or less between 30 and 50 hours; the median time across the many IOCs and vendors we checked was 43 hours. So basically, when you block fresh domain names, you gain roughly 43 additional hours of protection. So a very good technique on the defense side.

So, key takeaways for today. First, please remember that DNS for the attacker is just a method of sending and receiving some data. That could be anything: it could be a C2 server address, but it could just as well be a list of tasks to execute, it could be malicious code sent to the infected
endpoint, or it could be sensitive data exfiltrated out of the network. Second, DNS resolvers function as a proxy, so wherever you have an isolated segment in your network, make sure you also check that DNS queries cannot go out from this segment. And the third thing to remember is that DNS traffic is rarely well monitored, not to mention protected, and that's why attackers use it. So it's good to have a look at it.

So thank you for today. I'm available in the chat, so if you have any questions, please don't hesitate to put them in the chat. Thank you and goodbye.