
[Applause] So I work at [inaudible], where we're building an awesome [inaudible] platform, and I work on the network security aspects of it. Like a lot of research in infosec, I really owe a lot of this work to some of these guys: especially Joe, who is our in-house exploit kit expert and also the author of Maxwell; Shawn, who gave this project a lot of DevOps love; and finally Brad from Malware Traffic and Kafeine, for working tirelessly to track malware and giving us data to work with.

So last year I was working on domain reputation, you know, domain scoring and stuff, and exploit kits were becoming a big challenge. None of the traditional ways or features
of classifying domains would work for exploit kits, so I started digging deeper, and it came down to this multistage attack chain that exploit kits use. Malware spread through an exploit kit has three different stages: redirect, exploit, and infect. What's interesting is that these stages are not managed by a single entity. There are three different components at play, and they are very loosely coupled. So if you look at any report from Malware Traffic or Kafeine, you'll see something like pseudo-Darkleech redirecting to Neutrino dropping Locky. These are three different components, three different players, working together to make this happen.

At the network level it looks like this. A user visits the
website. It may already be compromised, or it may have a malicious ad that redirects the user to a series of gates. Now, gates are like bouncers: they filter out unwanted traffic to the exploit kit, and they are intentionally hosted on throwaway domains like dynamic DNS, URL shorteners, and stuff like that, so they get replaced very often. Once you get past the gates you end up on an exploit kit landing page. Again, this landing page is rotated really often by the exploit kit; quite often the same user may not see the same landing page ever again. Once you're on a landing page, the browser gets exploited and the binary is dropped, and again the binary might
change over time. So what I want to highlight in this slide is that there is a lot of dynamism in play. There are a lot of moving parts: domains and content get swapped in and out really frequently.

Given this challenge, I started looking at what tools and techniques we have at hand to look at exploit kits, and they broadly fall into two categories. One is to detonate a URL in a sandbox and then observe the artifacts from the URL. That's a really thorough approach, definitely, but the challenge is the cost of VM management, you know, starting up a VM, saving the state, coming back to it, so you can't really scale that
really well. At the other end of the spectrum we have tools like JSDetox, we have Revelo, we have JavaScript engines that let you walk through the code and deobfuscate it on the run by applying breakpoints at specific DOM manipulation functions. That's another good approach to understand what happens in the browser, but you have to manually go through the code and work out what's going on.

That gave me the idea that there is a gap between the two approaches, and maybe we can find a midway. Could we find a way to understand what happens in a browser when an exploit kit executes, without having to manually go through the code? So enter
headless browsers. Many of you might already know what headless browsers are: these are frameworks that let you interact with the browser without having to load up the GUI. PhantomJS is the oldest and the most widely used one. It's usually used for QA automation, stuff like UI testing, and it's built on top of WebKit. SlimerJS is another one, which is built on top of Firefox but has the same PhantomJS API.

These frameworks have some interesting APIs that let us do what we want to do. They let us inject JavaScript into a page before the page is loaded. So that gave me an idea:
instead of doing breakpoints, could we hook onto the JavaScript functions? Could we just apply hooks to all the functions that, say, Revelo lets you breakpoint into, and intercept the call? In addition to that, these frameworks have a lot of event handlers that let you intercept HTTP requests and so on: you can abort a request, drop it, modify the request, and such. And finally, you can change the user agent to look like IE and stuff like that.

So that was the first idea, and I thought, all right, I'm going to do some JavaScript hooking on all the functions that Revelo lets you hook, so document.write, eval,
and all the stuff like that. This is how function hooking looks. Let me warn you guys, I'm not a JavaScript programmer, so this may look ugly, but it works. It's basically replacing the original function with a wrapper that logs the request and the response, optionally calls the original function, and sends the response back to the caller. That's the skeleton of it.

Armed with this code, I loaded up a particular exploit kit, and this is, I think, pseudo-Darkleech going to Neutrino. So, first phase, redirect: there is not much code here. A regular website may have a small iframe that gets injected somewhere, or a small JavaScript that
injects an iframe, but basically not much obfuscation really: two or three lines of code that's hidden somewhere. So that's easy to get past.

Then we enter the series of gates. Like I mentioned earlier, gates are meant to filter out traffic that the exploit kit doesn't want to target. Think about it: it's an entire infrastructure that you don't want exposed to any random person, any random HTTP request. So there are gates in place to make sure that unwanted traffic is redirected out of the exploit kit's path. The way the gates do this is, firstly, they use client-side checks. There is going to be browser fingerprinting and OS fingerprinting, it's going to look for AVs, it's going to
look for sandbox environments; it even looks for headless browsers in some cases. In addition to client-side checks, it has some server-side checks. There are exploit kits that are known to target specific countries, and there are exploit kits that exclude a certain set of countries where they know they have a risk of getting prosecuted. I would imagine a lot of IP ranges are excluded, so you find that if you try to reach out to these websites from AWS, you won't get real data. So there are server-side checks in place as well.

And this is where you start seeing a lot of code obfuscation and encryption. Let's
take an example. This is again the pseudo-Darkleech code. As you can see, there is a whole span element which is hidden and has some content that looks really different from the rest of the page. And then this is JavaScript that's highly obfuscated. I don't think you can look at this script and find out what's going on, but one thing I can highlight here is that it defines a lot of variables and assigns character values to them. Towards the end it starts appending these variables in a particular manner to make up a string of code, and basically evals it.

After two rounds of deobfuscation, as I remember, the code starts to look like this. It's
browser fingerprinting, as you would imagine. It's going to look at the user agent and look for an IE sort of substring in there. But what I want to highlight here is the first line, which initializes a variable using the window.sidebar and window.chrome objects. Now, window.sidebar is only available in Firefox and window.chrome is only in Chrome, so if any of these objects are not undefined, the value is not zero and subsequently the code fails to decrypt.

Once we get past the browser fingerprinting, we get to the decryption phase, and this is the actual code. The first line has the key; the second line, as you can see, is pulling the code from the span element. The
decryption routine is very typical string manipulation, where you have alternating calls to String.fromCharCode and charCodeAt. Finally, once you get the decrypted string, it's passed on to a function constructor. The Function constructor in JavaScript basically takes a string argument and returns a function that evals that code. So that's how the payload gets executed, and that code again looks very similar to this, and it redirects to the exploit kit's landing page.

Let's look at one more example. This is RIG-V, the VIP version that came out after RIG, a newer exploit kit. This exploit kit does something more: it looks for specific
browser idiosyncrasies and assigns a probability of the browser being Firefox or Chrome or IE, and then towards the end it calculates whether it's really IE or not, and based on that it makes a decision on whether to redirect to the exploit kit or not. So our first attempt has some drawbacks; it wouldn't work here.

So what have we learned? Firstly, we need to replicate IE's window object. We want to look like IE, so I literally enumerated all the properties in IE's window object and replicated them in Firefox; the next slide has some code on how to do that. Secondly, our function hooking code should go further: we should be hooking onto more functions. Let it go
wild; it may make the output noisier, but you catch more stuff. So function prototypes, DOM manipulation, string manipulation, event handlers: I just added a lot more function hooking, at least for the functions we saw in the code earlier. And lastly, we saw the use of the window.navigator object; that's another object that is still being used heavily to detect the browser. navigator has the OS, navigator has plugins and MIME types, which are often used to see what sort of browser the code is running in.

Speaking of modifying an object to look like something else, this is how attribute hooking looks. You can define a
getter function on an object for a particular attribute, and that function gets called any time that attribute is accessed. This gives you a hook into attribute access: you can choose to return a different value, you can choose to hide it, do whatever you want with those particular attributes.

Once we have this code in place and the original code runs through it, the JavaScript that looked like that earlier now looks like this. I actually removed arguments and return values to make it fit in a single slide, but think about this: in a single second of execution you can extract this sort of call stack that you can make sense
of to understand what's happening here. So that's the first win.

Some exploit kits go a step further: they use plugins and ActiveX objects to look for AV. This particular code example is from Magnitude, I think, and it's looking for Kaspersky ActiveX objects. Similarly, there is a vulnerability in XMLDOM, for IE 11 and below, where you could load an XML with a res:// path and look for file paths; based on how the document loads, you can find out whether a particular file path exists or not. This was used heavily to look for drivers, to look for sandbox environments,
to look for AV artifacts. So with such objects we need to introspect a little harder. That's where Proxy objects come into the picture. Proxy objects, as the name suggests, are a man-in-the-middle proxy to a particular object: you can define handlers that are called any time a particular operation is performed on the object. If you are a Python programmer, this is like descriptors.

Here's a simple example of a generic ActiveX object. You can define get and apply handlers to look for attribute access and function calls. But you can take this idea further, specifically for the XMLDOM object, and implement the exact interface of an ActiveX object. So for XMLDOM I implemented the loadXML attribute, and you
return a function proxy for that particular attribute access, and I was able to capture all the file names that the exploit kit was looking for.

So that gives us the ability... oh, sorry. For good measure, I've added emulated human behavior: you could regularly insert some mouse moves, insert keyboard events, and make it look like there's a human element there.

So with all that code, and again this is, I would say, [inaudible] lines of hacky code, you have a call stack of whatever happens in a browser, you have the attribute accesses overlaid on the call stack, and we get to see what plugins and what ActiveX objects are being looked for.
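The three hooking ideas described so far (function wrappers, attribute getters, and Proxy stand-ins for ActiveX objects) can be sketched together roughly as follows. This is a minimal illustration, not the speaker's code: `callLog`, `hookFunction`, `hookAttribute`, `fakeActiveX`, the `fakeWindow` stand-in object, the example user agent, and the file path are all assumptions made for the sketch, and it runs under Node rather than inside PhantomJS or SlimerJS.

```javascript
// Shared log of everything the inspected code touches (illustrative name).
const callLog = [];

// Function hooking: replace obj[name] with a wrapper that logs the
// arguments and return value, then hands the result back to the caller.
function hookFunction(obj, name, label) {
  const original = obj[name];
  obj[name] = function (...args) {
    const result = original.apply(this, args);
    callLog.push({ kind: "call", name: label, args, result });
    return result;
  };
}

// Attribute hooking: a getter that logs each read and returns a value
// of our choosing (for example, to look like a different browser).
function hookAttribute(obj, name, label, value) {
  Object.defineProperty(obj, name, {
    configurable: true,
    get() {
      callLog.push({ kind: "get", name: label });
      return value;
    },
  });
}

// Proxy-based stand-in for an ActiveX object: every attribute read is
// logged, and every attribute is callable so method calls are logged too.
function fakeActiveX(progId) {
  return new Proxy({}, {
    get(_target, prop) {
      const label = `${progId}.${String(prop)}`;
      callLog.push({ kind: "get", name: label });
      return new Proxy(function () {}, {
        apply(_fn, _thisArg, args) {
          callLog.push({ kind: "call", name: label, args });
          return false; // assumed default: pretend the load failed
        },
      });
    },
  });
}

// Usage with a stand-in "window" object (not a real browser window).
const fakeWindow = {
  eval: (code) => code.length, // dummy body; a real hook would wrap window.eval
  navigator: {},
};
hookFunction(fakeWindow, "eval", "eval");
hookAttribute(fakeWindow.navigator, "userAgent", "navigator.userAgent",
  "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)");

fakeWindow.eval("var a = 1;");
fakeWindow.navigator.userAgent.indexOf("MSIE");
const xml = fakeActiveX("Microsoft.XMLDOM");
xml.loadXML("res://C:\\hypothetical\\path.dll"); // hypothetical probe path

console.log(callLog);
```

In a headless browser the same wrappers would be injected before the page's own scripts run, and `callLog` would be shipped out for analysis; how that is wired up depends on the framework and is not shown here.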
If we have defined a proxy for a particular ActiveX object with the proper interface, we see the exact interaction with that plugin. We get access to cookies, and we get access to the DOM, so we can extract iframes and objects, you know, extract the artifacts of the exploit and pass them on for out-of-band scanning. And finally, we get access to the HTTP requests, which we can man-in-the-middle and modify on the way.

So what can we do with this output? At the least, we can write signatures, and these are all over the place. The signatures here are from Maxwell, which we use in this project, and you can literally search the text for these signatures and find existing exploit kits. But this
framework lets you take this idea further: you can write signatures on arguments to document.write, or on arguments to a function prototype, so post-decryption data is available to write signatures on. I haven't tried this, but I really want to try attribution: figuring out what decryption keys are used, and whether decryption keys are reused across exploit kits.

But, you know, signatures are brittle. It's an arms race: we write a signature, the authors will change the code. So we could do better than that. Here are some other features that we started collecting from the data, to be able to classify whether a website is definitely bad or there's something suspicious going on, and if it looks
suspicious, maybe we should run it through a sandbox. So: an invisible, off-screen text box with very high entropy, like we saw earlier, is a telltale sign of encrypted code. Alternating calls to string manipulation functions like these are heavily used for decryption. Function constructors seem quite fishy. ActiveX instantiation for AV: definitely bad. XMLDOM looking for file paths, like XMLDOM loading XML that starts with res://: definitely bad. And lastly, you could look at the JavaScript code and find out whether it is obfuscated, minified, or plain simple JavaScript. I did a simple n-gram analysis and was able to classify really well, but if you have your machine
learning chops, a convolutional neural network would do this really, really well.

So what do we have here? This is a new approach that you can use to build a fast, low-interaction honeypot that at the least does everything that JavaScript deobfuscators do: it lets you understand what's happening in a browser while an exploit kit is executing. You can spoof your way through gates, you can extract the exploit artifacts, like Flash files and Adobe exploits, for out-of-band scanning with VirusTotal and such, and you can look for signatures of known exploit kits. And that's it. [Applause]
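The "high entropy hidden text" feature mentioned in the talk could be computed with a plain Shannon entropy function over the text content of a hidden element. The sketch below is an assumption about how such a feature might be implemented, not the speaker's code; `shannonEntropy` is an illustrative name, and any cut-off for "high" would have to be tuned on real data.

```javascript
// Shannon entropy of a string, in bits per character.
// Higher values mean the characters are closer to uniformly distributed,
// which is typical of encoded or encrypted blobs rather than readable text.
function shannonEntropy(text) {
  const counts = new Map();
  let total = 0;
  for (const ch of text) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
    total += 1;
  }
  if (total === 0) return 0; // empty string carries no information

  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / total;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Usage: compare readable text against an arbitrary mixed-character string.
console.log(shannonEntropy("this is a plain sentence of page text"));
console.log(shannonEntropy("q7Zx!p0L#vB2m@Yt9K$e"));
```

On its own this is only one signal; the talk combines it with the other features (hidden element, string-manipulation call patterns, ActiveX and XMLDOM probes) before deciding whether a page deserves a full sandbox run.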
Any questions?
Oh, I haven't yet put the slides on the [inaudible], but I will put them up there for discussion, because there's a lot of code in there. So I will put them up there.

So I bet you know what I'm going to say next. On behalf of BSides and [inaudible], we'd like to thank Andrew. [Applause]