
introduce myself. I am currently attached to a blue team. In the past, well, I call myself a reformed red teamer, because I spent most of my time doing red team work. I've done pen testing for years, mostly in the public sector for the U.S. government's Department of Defense. I've done a fair amount of training; I taught a lot of courses for the U.S. Navy and for the Department of Defense. I did some time at the NSA; that's where I went after school. A lot of people have asked me what my history is. I did the typical MIS degree from Utah State, then I went on to Idaho State, and from there joined
the government, worked for NSA for a number of years, and then moved on to government contracting. From there I did a bunch of different things: training, pen testing, that sort of thing. Now I work for Stage 2 Security. We do a lot of fun things there, anything from pen testing to defense to security engineering, so you name it, I probably have my hands in it in some way.

This is my own talk. Just a disclaimer: it doesn't necessarily reflect the views of my employer, but they definitely support it.

So the question is, why JavaScript malware? It's not all that exciting. You look at
people like Waylon, who do reverse engineering of real-life malware. So why JavaScript? This came from a two-pronged approach. The first is that JavaScript malware is everywhere, so it makes threat hunting really easy when you have so many targets to look at. You can take a look at a bunch of sites, and on any given day a quarter of them have some sort of JavaScript malware on them, especially when you're talking about e-commerce sites. The second is that reverse engineering isn't necessary. I haven't broken out IDA Pro in a number of years now, and I've gotten a little rusty on it, and so, you
know, reverse engineering JavaScript malware is a thousand times easier. They try to obfuscate it, but there are a lot of tools out there that can help you with that. If these slides are shared, I've got a bunch of links here talking about JavaScript malware and battles between rival gangs trying to get malware onto systems. A company that we'll talk about a little bit later, Sansec, is a leader in this.

So, e-commerce: what brings this up? This time of year there's a lot of e-commerce malware. E-commerce is big business this time of year. Everybody's shopping online, everybody's looking for Christmas gifts, so this means that the
attackers are there as well, because they know they can gather that information. So we have these people shopping online, and we have the attackers in there. Traditionally we've been a reactive group towards this: we detect malware, the merchant gets notified, and then you respond. You do IR, you reverse engineer the malware, you figure out what happened. I wanted to be a little more proactive about it, so that's what started this whole project.

If anybody's worked for a merchant before, you've probably seen this kind of letter. It goes along the lines of saying your account has a lot
of fraudulent transactions, you need to do something about it, or we're going to shut you down. So let's first talk about what a zero dollar verification is. Zero dollar is commonplace now; you used to see it as a one dollar transaction, but they've gone away from that and done zero dollar. What does this mean, and why does it matter? If it's zero dollars, what do you care as a merchant? What this is: carders are out there, and when you're selling on the black market you can have what are known as verified or unverified cards. If you have a list of verified cards, you can sell those for way more, you
know, ten times the amount per card over unverified. So what the carders do is take the list they've stolen, find a site that has a weakness, and go through and authorize transactions. They'll go to buy something with the intention of immediately canceling; they just want to know the card works. Then they take those lists and sell them as verified. Usually there's some sort of buyback policy, so if the cards aren't valid, they have to buy them back for a certain amount of money. So it's in their best interest as
black market sellers to have verified cards, because they can sell them for more and they don't have to deal with returns.

So where do these come from? They don't necessarily indicate a breach on your site. If you're a merchant that has this sort of thing happen, a huge amount or a spike of zero dollar verifications, it probably just means that your payment process is weak. You're missing something in the steps that would prevent bot activity from verifying these cards, because they're not going in there manually; they wrote a script to go through, test the card, and add it to their list. And it's not zero cost for you as
a merchant either. A lot of payment processors will charge a fee for verification transactions that don't complete. If they complete, they charge you their normal rates, which is a flat fee plus a percentage of the sale. But for verifications, especially if you're a repeat offender, they can charge you up to something like 25 cents a transaction. So imagine a carder targets your site and tests 10,000 cards at 25 cents each: you could be on the hook for up to $2,500, which may be your revenue if you're a small merchant.

So, countermeasures for this: you're talking about the typical anti-bot techniques, things that
would prevent bots from carrying out an action but not keep your human customers from doing it: CAPTCHAs, rate limiting, watching for things like shopping cart reuse. A lot of carders will build a shopping cart and then reuse it over and over again for these 10,000 transactions. Watch your trends on checkouts. If you suddenly have a lot of checkouts that don't complete, or don't go all the way through the shipping stage, or the orders get cancelled immediately afterward, that's typically a sign of this sort of activity. Typically a zero dollar verification means that you're authorizing the card but you're not going to charge until a
later date. This is common when you charge the card when the product ships. So if you can avoid zero dollar verifications altogether, if you can just charge immediately, that's all the better, but of course that depends on your site and how you operate.

The second one we're going to talk about is the one that we're actually interested in today. In this one, the payment processor or a bank has aggregated a list of stolen cards. All these banks get together and share this stolen card data, and then they start looking at transactions, and when they find transactions that
match between all of these stolen cards, they decide that that merchant must be the culprit. They're usually fairly accurate, because we're talking about huge data sets here. So all these payment processors get together and figure out who has the common transaction: where were all these cards used before they were exposed? Then it's the same sort of thing: they talk to the processing bank, and the processing bank turns to the merchant and says, hey, you need to fix this, otherwise we're cancelling your account. So this is just a summary of the difference here: we're looking at stolen cards versus somebody trying to verify stolen cards.
These are almost always an indication of some sort of compromise, whether they've compromised the payment processor, they've compromised your site, or they've compromised some sort of database on the back end. It's almost always an indication of some sort of breach in the chain, and whether that's your fault or the payment processor's fault, somebody's on the hook for it.
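
The matching step described above, where banks pool stolen cards and look for the merchant they have in common, can be sketched roughly as follows. This is a minimal illustration and not how any particular bank or processor implements it: the function name, the input shape (a mapping from card identifier to the merchants where that card was used before it was reported stolen), the `min_share` threshold, and the `.example` merchant names are all assumptions made up for the example.

```python
from collections import Counter


def common_points_of_purchase(card_histories, min_share=0.5):
    """Rank merchants by the share of stolen cards that were used there.

    card_histories: hypothetical mapping of card id -> iterable of merchants
    where the card transacted before it was reported stolen.
    Returns (merchant, share) pairs with share >= min_share, highest first.
    """
    if not card_histories:
        return []
    counts = Counter()
    for merchants in card_histories.values():
        counts.update(set(merchants))  # count each merchant once per card
    total = len(card_histories)
    ranked = [(merchant, n / total) for merchant, n in counts.most_common()]
    return [(merchant, share) for merchant, share in ranked if share >= min_share]


# Made-up data: every stolen card was used at shop-a.example.
stolen = {
    "card-1": {"shop-a.example", "grocer.example"},
    "card-2": {"shop-a.example", "fuel.example"},
    "card-3": {"shop-a.example", "grocer.example"},
    "card-4": {"shop-a.example"},
}
suspects = common_points_of_purchase(stolen)
print(suspects)
```

With real data sets the interesting merchants are the ones whose share is far above what their normal transaction volume would explain, so a production version would likely normalize against baseline popularity; that step is left out here.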
Pretty common targets, especially when we're talking about e-commerce, are the major players out there. We're talking about things like WooCommerce, which is a WordPress plugin. It's very common to see multiple CVEs, multiple vulnerability announcements, come out for WooCommerce after the holidays, because that's when they're discovered: after the breach. Other targets are the platforms themselves, so lots of the software-as-a-service type platforms: Shopify, Magento, Demandware. If they can compromise the platform, whether it's compromising the credentials or compromising the platform's back end, then they can get this information. Zero-day exploits are fairly common too. They'll discover them and hold on to
them until the holidays or other high-volume e-commerce times. Think about Christmas, or other holidays where a lot of purchasing takes place, such as Valentine's Day. These are the times when these will pop up. They may have sat on them for months before finally using them, or they may sit on that access for months: a site gets compromised, and then they just sit on it until the right time to enact their plan. They're pretty smart about it. These are pretty sophisticated groups; they know what they're doing.

And then, to dovetail right in with the previous
talk: password stuffing, credential reuse. This is extremely common, especially with the software-as-a-service platforms, where it's some mom-and-pop shop that decided they wanted to be online, so they set up an account and used the same password they always use. Then that's found in a breach, and sure enough, the attackers find it, test it, and then they have access to the site themselves.

Another one that's popped up more recently is marketplaces for stolen tokens. This is kind of an interesting one. They get tokens a whole bunch of different ways, whether it's compromised endpoints, stolen data, that sort of thing.
There are actually marketplaces that take stolen session cookies and sell them to other hackers to use in their attacks. Basically it's keys to accounts, and they're being marketed on these forums to be used later in enacting these attacks.

So our traditional approach to this, the approach I walked into, was: first you get a notification that there's some sort of breach and we need to react to it. So you go into IR mode. You hunt down where the breach is and figure out what happened. You clean off the site as best you can, which means removing whatever malware is there. You try to return it to a secure
state, which means you have to figure out how they got in in the first place, change passwords, patch, whatever it is. And then you return to business as usual. The problem with this is that it's entirely reactionary, and you have no idea how much time there is between number five here and returning back to number one and getting a new breach notification. That didn't sit well with me, especially because of the steps involved.

So how do you find malicious code on your site, especially if you're relying on third-party providers for a lot of this stuff? Do you know every line of code on your site? Do you know exactly what every line of code does
and how it affects your site? What about third-party plugins? What about tracking? What if you use an analytics platform? What if you use open source dependencies? That was a common one that we saw just recently: there was an npm package that was compromised. I forget the numbers off the top of my head, but there were thousands of sites using this single npm package, which installed a crypto miner. So you have so many different points where your site can change, and maybe you don't know about it.

Just some examples of code we've seen before. So who can tell me what that does? I
have no idea. This is just a snippet, but this JavaScript code is actually part of a compromise, and they've gotten really good at obfuscating its purpose. Now this one uses a little more plain English, but even still it's fairly difficult to spot. It was inserted into a jQuery library, so that means there was a library file that was several megabytes in size (jQuery is not small by any means), and then this little snippet, and this is just part of it, was inserted into that library. Here's another example. This one used a bunch of strings, which it then reassembled into code later on. So it had
a whole bunch of these static values here, and then it reassembled them into code later on. So to the casual observer, figuring out what that's doing and why it's there is extremely difficult. This one here was another skimmer. It was a little more obvious: you can see right in the middle there that push to a URL. It was actually harvesting data from the site, and then it would just push it to this URL. Down at the bottom it actually used a Math.random function to generate a slightly random URL to send that data to. Just another example there. This was an interesting one as well.
It was inserted into Google Analytics code. A Google Analytics library was pulled down from Google and inserted into the page, and then they added their own code to it. So right at the top of the page you saw "Copyright 2012 Google", and then further on down there was a function within there. Who knows what Google Analytics should look like? Who knows whether Google Analytics code should be on your site versus pulled from Google themselves?

So it's fairly easy on a lot of these to get an idea of what's going on. Things like deobfuscate.io can help you reverse engineer what's going on. It's not 100 percent in every case, but we don't really need it to be.
When you're looking at these types of code here, all you really need to know is that something looks fishy. Once you see a few samples of these, they'll start to stick out to you, because the JavaScript code on your site is going to look well formed. It's going to have defined functions; it's going to be short little lines of code rather than these big long strings. So once you know what to look for, it's pretty easy to spot those. You don't necessarily need to reverse engineer the whole thing just to figure out what it's doing. What you really need to do is
figure out your indicators. So you're looking at what domains it's contacting, what it's doing with the data it's harvesting, that sort of thing. We don't really care exactly how it's assembling that data to be shipped off; we just care about where it's going and how we can stop it. JS Beautify in CyberChef is a good tool, especially if you're dealing with minified code. You toss it in there, and it formats it nicely for you so it's much easier to read. So once again, we just need our IOCs; we don't need to understand exactly what happens.

But that led me to: how can I be proactive? How can I move
forward? How can I detect these before we get a letter from the payment processors? How can we figure out how to stop this before the compromise, or at least minimize the compromise? Because after the fact leads to things like loss of reputation or loss of your relationship with the payment processor. You may have to move on to a different payment processor if you have too many of these breaches.

So we started brainstorming: what can we do? We looked at tracking the domains that are called. A lot of these JavaScript malware samples are pushing data to some relatively obscure domain, so if we can track those, if we track
those DNS requests, then maybe we can figure out when the site's compromised, because it's making strange DNS calls on a regular basis. We thought about using Selenium to obtain network statistics: we visit the page, and we track every network call that happens as we process the page. We thought about even just using curl and regex, looking for specific keywords. That one didn't work very well, because there aren't very many keywords we can use. We looked at hashing the DOM and investigating every time the DOM changes on the site, but as with any file integrity monitoring, that means lots of false positives,
especially when you're dealing with third parties that may update underneath you.

An interesting one was entropy analysis. I thought, well, maybe I can look at entropy, because this malware that we're looking at looks pretty random to me, as opposed to regular JavaScript code. But once I started putting the samples I had into some entropy analysis, it actually turned out that the entropy was higher on regular JavaScript code from third-party sites than on the malware. The malware repeated enough that it didn't look very random. So it was actually the opposite of what I expected. I expected that an entropy analysis would take a
malicious script and immediately flag it as high entropy and very suspicious, but it turned out to be exactly the opposite of what I thought.

Threat intel feeds turn out to be not great, usually because of the way these people operate: they bide their time, they wait for the right time to drop their malware, their skimmer code, and then they operate using basically a unique domain per engagement. Once that's burned, they don't go back to it. So usually the threat intel feeds are several steps behind the attackers.

So we started down the path of Selenium at first, and
pointing Selenium at the site, analyzing the results, and generating statistical analysis on that. But it turns out, why reinvent the wheel? And this isn't really an endorsement of urlscan; it's just the tool we ended up using. There are a bunch out there. But urlscan.io has an API where you can do exactly what we were talking about doing with Selenium, which is: we load the site, we parse it, we store data about it, and then we use that for statistical analysis. It has everything we need there. When you load up a site, it shows you all the domains that are called, all the scripts that are loaded, all the IP addresses that are touched,
all the various supporting components of the site. So if we run urlscan, it gives us back JSON with everything we need to know. It gives you hashes of all the files it downloads, and names; you can even grab contents if you want. But I was interested in a few key fields to begin with, so we started with that. I just used Python and parsed out the JSON, grabbed the dates and the pages, and obtained a page hash. At this point I wasn't quite ready for full analysis of it; I just wanted to gather statistical data to see if I could prove my point. And it also makes it really easy if you
do find something: there's an indicators list that you can pull, either from the API or from the urlscan site, that has every possible indicator related to this site. Then you can just go through and pick out the ones that you're interested in. And there's a bonus: a company called Sansec (sansec.io), who are pretty popular in the e-commerce scanning field, have actually collaborated with urlscan to provide classification of sites and scripts that are pulled in, so there's a little bit of threat intel in there. So far I haven't found it super valuable, but it's a bonus you get for the cost of an API key, which, if you're small volume, is free.
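
The "parse the JSON, keep a few fields, compare over time" idea described above might look something like the sketch below. It is not the speaker's actual script. It assumes the scan result has already been downloaded and parsed into a dictionary, and it assumes the result carries `task.url`, `task.time`, and a `lists` object with `domains`, `ips`, and `hashes`; those field names are from memory of the urlscan.io result format and should be checked against the current API documentation. The function names and all sample values are hypothetical.

```python
def summarize_scan(result):
    """Pull a few key fields out of one parsed scan result (a dict).

    Field names (task.url, task.time, lists.domains/ips/hashes) are
    assumptions about the result layout; missing keys fall back to empty.
    """
    lists = result.get("lists", {})
    task = result.get("task", {})
    return {
        "url": task.get("url"),
        "time": task.get("time"),
        "domains": set(lists.get("domains", [])),
        "ips": set(lists.get("ips", [])),
        "hashes": set(lists.get("hashes", [])),
    }


def new_indicators(baseline, current):
    """Return indicators present in the current scan but not the baseline."""
    diff = {}
    for key in ("domains", "ips", "hashes"):
        added = current[key] - baseline[key]
        if added:
            diff[key] = sorted(added)
    return diff


# Made-up results for two scans of the same storefront.
last_week = summarize_scan({
    "task": {"url": "https://shop.example/", "time": "2022-12-01T00:00:00Z"},
    "lists": {
        "domains": ["shop.example", "cdn.example"],
        "ips": ["192.0.2.10"],
        "hashes": ["aaa111", "bbb222"],
    },
})
today = summarize_scan({
    "task": {"url": "https://shop.example/", "time": "2022-12-08T00:00:00Z"},
    "lists": {
        "domains": ["shop.example", "cdn.example", "stats-collect.example"],
        "ips": ["192.0.2.10"],
        "hashes": ["aaa111", "ccc333"],
    },
})
changes = new_indicators(last_week, today)
print(changes)
```

A new domain together with a changed script hash is the kind of change worth a closer look; a changed hash alone is often just a third party shipping an update, which is where the tuning effort goes.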
And then from there, what we did is we actually turned it into a Splunk SOAR, or Phantom, project. We use Phantom to query Splunk and figure out the domain names of our sites, so we don't have to keep that list up to date, and then kick off automated scans from urlscan, pull back the results, and ingest them back into Splunk for our own data analysis. The original backup plan was to use Lambda functions to kick off the API calls, pull back the data, and store it in S3, and then we'd figure out how to ingest it from there.

About midway through this project there was an announcement from, I don't know if you're familiar with him,
his name's Scott Helme. He runs report-uri.com. He made an announcement that they were offering an e-commerce solution, and that included a couple of key tools, one called Script Watch and one built on Content Security Policy. Basically it was designed to do exactly what I was intending to do with this project. So great minds think alike, maybe, hopefully. But it's a turnkey solution. If you don't want to implement this yourself, there are definitely companies out there who are doing it, and you can employ them to do it for you. That leaves you time to deal with the false positives and tuning rather than building infrastructure and that
sort of thing.

So like I said before, this isn't necessarily an endorsement of urlscan. It's just what we found works well for us, and the intent of this talk was more to demonstrate the techniques involved than a dependency on a specific tool. So that's what I have so far. We haven't completed this project, so if any of you are working on something similar, definitely find me on the BSides SLC Slack. If you'd like to collaborate, I'd be happy to share what I've been working on, and we can figure out how to make things better together. Any questions?
All right, thank you. [Applause]