
The BSides DC 2017 videos are brought to you by ThreatQuotient, introducing the industry's first threat intelligence platform designed to enable threat operations and management, and DataTribe, a new kind of startup studio co-building the next generation of commercial cybersecurity, analytics, and big data product companies.

Hello everyone, my name is Andrew Jones. Thank you for coming here and seeing me today. So a little over a year and a half ago I started working for a new company, and what we're focused on is security against automation targeting web applications. This is kind of a new area for me. I've done network forensics, host-based forensics, malware reversing, I mean all sorts of different
stuff, but this particular aspect of web development and testing frameworks and things like that I'd never dealt with before. So it's been an interesting learning experience for me to get familiar with the different ways that automation can take advantage of web apps and pages, and that's what I'm going to talk about a little bit today.

They Live. Has anybody seen the movie They Live? You know what I'm talking about, right? John Carpenter, love him to death. This is a great film, holds up really well. Made in 1988, it's basically about aliens that have infiltrated society and taken over, and this Joe Schmo, average blue-collar guy figures this out and goes on a rampage. It's awesome. But the concept of
aliens masquerading as humans I thought was very relevant for the type of automation attacks that I'm talking about today.

So, disclaimer: nothing I'm presenting here is world-changing or really that unknown, but don't use it for bad stuff. You know, the old Google motto, "don't be evil," that type of thing. Also, I work for a company called Shape Security, and this is relevant to what I do, so it's not a vendor demo, but it is self-aggrandizement for myself anyway.

What I'm gonna cover today is the attack vector. I'm gonna talk about what I mean when I say automation, and I'm gonna go over some of the tools that are out there
being used. I'm gonna experiment, hopefully, with some code I wrote and watch it work today, and talk about some defensive measures. Because that's mostly what we do, we're on the defensive side, but for me at least I think it always benefits you to know how to attack in order to know how to defend. So that's what we're addressing here.

So what do I mean when I say automation? Well, let me start off with what I don't mean. I'm not talking about SQL injection, I'm not talking about cross-site scripting or CSRF or any vulnerability scanning, because frankly, with those things you don't look like a human. If you're inputting script tags
and semicolons and other special characters into, we'll say, a username field, that's not what a normal user does. So we're not talking about that; that's not the automation I care about. The automation I care about is when you go after things like the username and password fields, when you're doing real form entry using real standard characters, you're clicking buttons, you're having to worry about defeating IP filtering and getting around WAF-style filtering and looking at some header values. That's the automation that I'm talking about, and those are the kind of defenses that I typically have to overcome when we go after this type of stuff. And of
course CAPTCHA: that's one of the most common ones we see out there, and I'll talk about ways to get around that a little bit later in the presentation. The motivation here could be consumers getting their video streaming services, it could be hacktivists going after petitions online, or crimeware groups going after banks, going after PII, stuff like that. It's all relevant, it's all accessible.

So, the tools. What kind of tools are we talking about here? There's a lot: there's scripts, headless browsers, web drivers, and our favorite guy over there on the right. Scripts are great for this type of stuff.
They're really fast, they're lightweight, and if we're talking about really doing a large-scale attack, you can distribute these scripts really easily.
So, jumping back just a minute: scripts. They're great, they're easy, they're lightweight. The problem is you don't look like a real user when you're using curl to interact with the website. Plus, curl is not gonna be able to render JavaScript, and that's a big thing. A lot of modern websites have JavaScript components that you can only target if you have an automation solution sophisticated enough to look for JavaScript tags and elements and things like that. So for certain cases, yeah, perfect; for a lot, not so much.

Headless browsers. This is a browser designed for web testing and development, and it looks like a real
browser, it runs like a real browser, it supports JavaScript, but it doesn't have a GUI. It's designed just for testing, basically. The problem with that is there are some elements that just do not look like real human interaction. You can look at things like font rendering, screen resolution, and other aspects, and if you're an advanced defender with some good tools you can pick up on that type of stuff and see that it's not somebody real. Again, for most cases that's probably good enough for you to script automation against a target, but in our case, if we really want to look human, that's not a great solution. Web
drivers, and that's what I'm gonna be demonstrating today. A tool like Selenium is not a malicious tool; again, it's a testing and development tool. But it's very sophisticated. It drives a real instance of Chrome or Firefox or Edge or whatever, so it gives you a really realistic presence when you're using automation against a website. If you're doing deep packet inspection, if you're looking at user agent strings, it's a real browser; there's no difference at all from a normal user's browser on their machine. So that's the benefit. But it is kind of heavyweight; it requires a lot of libraries, things like that. I'm not saying it's impossible to push
out, but that's a consideration, maybe.

One other tool I'm gonna focus on just for a minute, because it's really widely used out there, is Sentry MBA. This is purely an attack tool; there's no good use for it. It's designed for credential stuffing: testing very, very large sets of username/password combinations to try and break into a website and see what's valid or not. Now, the thing with Sentry MBA is it's actually really simple. It's complex in terms of its capabilities, but in terms of its functionality it's really simple; it's no better than curl, honestly. However, it has a nice GUI, if you look at this, with lots of options in here you can
configure, and there's lots of support out there for it. Jumping back: if you go search Sentry MBA on YouTube, there are thousands of videos and tutorials teaching you how to use this and how to get all the information you need to run it. All these kids out there are doing this and incriminating themselves really stupidly; it's kind of funny. But yeah, it's all out there. So I'm not gonna spend a lot of time on it, but basically what you're gonna see in these videos is they're all using the same kind of resources. They're going to get configs, which are specific to websites. A config tells the tool basically where
to post data to, what elements to use, maybe timing and timeout values and retry values, et cetera. So you can grow your own with that, or you can go download them from a forum like Nulled, which you can just find by Google searching. Combo lists: these are username/password combinations, and you can get low-fidelity ones by using a tool like paste bin DB 3 Rd 3 V spider, which is just going out to Pastebin and scraping any username/password combination it sees posted in the last X period of time: last day, last week, whatever. Like I said, low-fidelity data. If you go out to some of the underground
forums and the black hat forums, they have higher-fidelity lists that you can purchase from vendors on those sites, so that's another source where a lot of these guys get stuff from. And then lastly, proxies. Gather Proxy is one you see used over and over again; it's just grabbing open proxies you can send your traffic through to mask your source. Pretty typical stuff, so you don't incriminate yourself. So if you think about it, Sentry MBA for these guys is pretty much running from just a single computer; instead of distributing the load to a bunch of different assets you control, you're just masking your targeting using these proxies. Really simple procedure, but really easy
for script kiddies. That's what this is: script kiddies.

So, the action. Let's see what we can do. What I'm gonna try and do is play around on Twitter. Twitter bots are all the rage these days; everybody's talking about them. One great use case for automation is generating brand new accounts: creating new accounts on the site and then using those accounts to go do stuff. In this case, what I'm gonna do with Twitter is retweet, say, an existing tweet out there and add a hashtag to it so I can get something trending,
potentially. I'm not gonna go to that level today, but that's theoretically what you could do. More malicious stuff: you could log into a bank and transfer funds, you could get rewards points on different websites, hotels and airlines, things like that. There's lots of different things you can do with automation. The power of using a tool like Selenium, which I'm going to demonstrate here, is that it allows all these follow-up actions. It's not just about getting a username and a password and testing to see if they work; it's about follow-up interaction after that login process has been completed.

Just random thoughts here. Selenium is nice for us just because it
opens up a real browser, as you'll see, and we can visually verify what's happening, so it's nice for us to test and validate the functionality. Using a proxy: you can easily incorporate that. I do it really sloppily, you'll see, but you can do it much smarter than I can. And, like I said, post-authentication actions.

Now, Twitter account creation. I only started playing around with this this week, because I procrastinate really, really badly, but a couple of headaches and roadblocks I ran into on Twitter were screens like this. Phone number validation, on the left: this is one thing Twitter employs when it detects you're coming from, basically, an IP
space it knows is bad, or it detects obvious signs of automation. So they're not blind to what we're trying to do here, they're not stupid, and they have some tests to figure this out. If you get this pop-up during new account creation, it's really hard to pass in an automated way. There are ways around it, but even then Twitter is accounting for services where you can create burner phone numbers and things like that. They know about a lot of those free ones out there, so that phone number is not gonna work. Plus, this is an actual voice call, not just an SMS text, so this is actually a
really, really good layer of defense here. It's a pain in the butt for your normal user if he runs into this, but at the same time, against a bad guy like me, that's a good defensive layer. And then on the right side there you see another message. This will sometimes happen if you get past that first phase and generate an account on Twitter. You don't have to use a real email address, you don't have to use any real data on there; you can generate just a garbage Twitter account. But at the same time, if it detects that garbage Twitter account doing automation or unsavory things, it's
gonna lock it until you validate it with the CAPTCHA or with that voice check right there, or both, really. So those are a couple of things I ran into. I got it to work once or twice but kept running into this, and it's gonna be a project left for the future to really get this working consistently. However, login was a little bit easier to accomplish, so let's see if I can actually get this working here.
So here's my Selenium code, and you can see there's a whole bunch of different libraries I've imported. I've got legacy stuff in here to run ChromeDriver, the Firefox driver, the Safari driver; any browser you want, Selenium can support. And it's easy to go find this stuff, download it, and get it running, at least in a basic fashion, so you can do this on your own in a couple days, if not 15 minutes. So that's really nice. I've incorporated a proxy here. One thing to note is Firefox seems to be the best to use in terms of a proxy, because you can set these Firefox options and build a profile that's built
with a proxy address. I've done just a one-off right here instead of going through a giant list of thousands of proxies. At the same time, at least in my initial interaction with this, it seems Firefox is gonna be one of the more commonly used browsers, and user agent strings, to look for in terms of automation coming at you, because it's more flexible when it comes to integrating with proxies. So just something to keep in mind. I'm not gonna go through all my code here, but one of the nice things with Selenium, and this is the Java version of Selenium, is using
XPath to find references. I don't know if you guys read code and look at source code on websites, but a lot of times you can't identify dynamic elements just by looking at the source code. So using Chrome developer tools and looking at the Inspect view on there to see where the XPath is is super-duper helpful when you're putting together something like this. But let's just run it and see if it actually works or not. I've got three accounts here which I'm gonna try to log in with. I'm going through the proxy right now to connect to Twitter. I've instrumented some technique in here to delay typing a little
bit, so it doesn't just blast out a complete string pattern into the username and password fields right away. It's going to delay that a little bit to hopefully allow us to look more like a human.
There we go. So you can see I've inserted basically a random value between 1 and 300 milliseconds between each character. Now go ahead and log in, and boom, we've logged in: automated, no human interaction at all. And now what's supposed to happen is it's gonna bring up a specific tweet that I created.
You could target this using whatever you wanted: look for a username, look for a hashtag, whatever value you want to go to, to bring up the next step here. Or you can simply start tweeting away. In the case of a credential stuffing attack, what you're doing is not triggering your own accounts; you're taking advantage of tons and tons of different accounts out there and using them to start tweeting and get trending going. And boom, there's a retweet with a special hashtag I just created; we can look for that later. And now it's going through account number two. So I've got three
different accounts, like I said, that it will go through, and you can just go through as many of these as possible. Now, in this case I'm only doing this on one computer, but I could easily distribute this out to as many machines as I really wanted to. If I had hosting services, if I'm using AWS, something like that, I can get a whole bunch of machines spun up doing this simultaneously, all going through proxies. So it's flexible in terms of how wide a range you want to use for your targeting. And that's the gist of this. Once again, I just want you to see how easy this is. This really did not take me that
much time to put together, and I am very much a novice at actually doing this stuff by hand and coding like this. So think about me figuring this out in less than a week versus somebody professional, especially a crimeware group that has teams of folks dedicated to building this; that's going to be much more effective for them. I'm gonna go ahead and quit out of that.
So one other aspect that I really wanted to touch on, and unfortunately didn't have time to actually put together code for, is CAPTCHA, because we run into this all the time. Of course, this is the automation defense, and it's pretty useless now. Google reCAPTCHA version 2 is one of the best out there and the most commonly used. It's really easy to integrate, it's free, and there's really no reason not to use it unless you simply don't have CAPTCHA on your site or you have something better. ProtonMail is what I used to generate the accounts that I was using to attack Twitter, and I thought, hey, look, it's awesome, let's automatically
generate some new mail accounts. There are much easier targets I could go after for free mail, but in this case I wanted to try Proton. They actually do stuff really well in terms of design and security on their site, so they do proper CAPTCHA here. But what could I do about that? I found this as I was googling this week, and I thought it was kind of funny: a random guy on Stack Overflow asking, "How can I beat CAPTCHA with automation?" And, just like everybody's a dick on the internet, of course they all come back and say, "Yeah, you can't do that, CAPTCHA's intended to stop that, blah blah blah." Now, for every
problem you have out there, there are solutions. These services, 2Captcha and Death By Captcha, have been around for a while, and they're really, really effective at beating CAPTCHA regardless of what type it is, whether it's the one where you rotate the picture or you select different squares. Whatever type of CAPTCHA it is doesn't matter. Basically there's an API you can integrate into your tool set to beat any of these: you serve up the CAPTCHA to the service, and there are humans on the other side. Now, from a moral perspective, what these guys are actually doing is employing sweatshop-style labor. These guys on the other end, in
places like Vietnam and India, they're getting paid a fraction of a penny to solve a CAPTCHA, and they'll sit there for 15 hours straight solving CAPTCHAs one right after another, getting paid a pittance for it. At the same time, for me it's really effective: I've got a human solving my CAPTCHA for me, they feed that information back to me within a matter of seconds, and I can progress and do whatever nefarious things I want to. So it solves a problem that may be hard to solve programmatically. And you can see here basically how easy it is. There's a reliable indicator about where you would look for
basically the key, the Google key. It's in the iframe here; it's "k=". So you can look for that string, get it imported into your code, and then serve it to 2Captcha in a request. Then you get a response from this request, and boom, that's the value you serve back to the website to bypass the CAPTCHA. It's just that easy. So hey, you can defeat this; it's not a problem.

There are some other things that I'm looking at doing as I develop this into a more reliable solution. Inserting random mouse movements: right now I have, basically, mouse move incorporated into my code, so it does
move the cursor visibly from here to here. But a real user doesn't move in a linear pattern from coordinate A to coordinate B. No, there's random swirls, and I click here and there, and I'm ugly about how I interact with a website; it throws off the marketers. In my case I really want to emulate that so it looks more like a real human for anybody that's tracking that kind of thing. Phone verification: that was a big pain in my ass. There are solutions, and Twilio might be a good option for me. I haven't looked into it with
that much depth, but it has an API I could build on to basically field those phone calls, maybe run them through voice-to-text translation, get those responses back in real time, integrate that into code, and boom, we're golden. Creating more portable code: like I said, this is running from just my machine right now. I'd like to be able to dump this pretty much anywhere and have it execute reliably, so that might be fun to look into. And lastly, a more reliable proxy network. I'm using open proxies as I do this testing, and anybody can find those open proxies, of course; that's what an open proxy is, it's a free
list. So if you have compromised assets, or you have servers in bulletproof hosting facilities, things like that, those are gonna be more reliable and more hidden from the parties that you're trying to target, especially large organizations that have more sophisticated security teams that understand ASNs that are untrustworthy and things like that.

So on the defensive side, what can you do to basically prevent this type of automation that looks like a real human? One of the first things is prioritizing your security team working with your web development team and working with your mobile development team.
They've got to be in lockstep, they really do, and I see this over and over again in the real world. In fact, I didn't touch on it today, but the mobile side is one of the most heavily targeted out there, because it's just as easy to go after mobile as it is web, by taking advantage of the API that they have on the mobile side. And in fact, when you do that, there's usually fewer defenses on the mobile side than there are on the website. It's easier to incorporate CAPTCHA on the website, and people are more willing to do CAPTCHA on a website than they are on a mobile device. So it's a lighter-weight
or it's an easier target, so definitely keep that in mind.

CAPTCHA: yeah, I make fun of it, and it is a joke, but at the same time it will filter out a lot of the noise that's coming at you. The bad guys out there that are automating are gonna use the simplest method possible to start off with; they're gonna use curl to hit every single website in existence. Put some CAPTCHA out there and it'll defeat curl, and maybe some other simple tools, from being able to interact with your site. So that's something to keep in mind. JavaScript: again, it's a simple solution, but it will filter out a lot of the noise. It's an easy thing
you can implement, and none of these cost anything, really; you can do it pretty easily.

Requiring a little bit more investment, but highly, highly useful, is enhanced monitoring: standing up big data systems on your side. I was at FS-ISAC earlier this week, and one of the guys from the banks was talking about what an effective solution he put together by looking at all the data coming in from browsers hitting his website and from his mobile site, and integrating that data with upticks and authentication patterns. He was looking at things like the referrer field on his mobile traffic. He understood his mobile traffic, understood that when somebody launches a mobile
app to log in and authenticate, there's no referrer in the HTTP header. But some of the traffic coming in did have a referrer there, so it was some attacker that didn't understand how that app really worked and thought he was doing good by masking his presence by inserting something in the referrer. So: understanding your tools, and having the data analytics to actually review them and bubble that up in an effective amount of time so you can actually react to it.

Step-up authentication: we saw that earlier with the phone verification. There's lots of others, asking where you lived ten years ago and things like that, and those
can be effective. They can also be very irritating for your users; user friction is basically what that's adding. So it's a cost-benefit ratio for you there about what your tolerance is. And then there are specialized anti-fraud tools out there, like what I do. These all exist, they're out there, and you can take advantage of them. It all depends on what your risk level is and how badly you want to fight this type of threat.

So, wrapping it up, what I want you to take away from this: on the attacking side, there's lots of ways to do this automation. There's lots of tools out there, and they support all sorts of
languages, whatever you want to use. It's all capable in Selenium: I was using Java, but there's Python Selenium, C# Selenium, Ruby, I mean whatever you want to use, it's out there. And don't think that dynamic code or polymorphic content is going to prevent anybody from targeting your website or your web applications, because I can find that stuff just as easily by using Selenium, rendering the page, and interacting with it. CAPTCHA's a joke. Watch your login failures; this is a big thing. If you have an internet-facing application, you're gonna see login failures with it. You just accept that, we all know this, but you've got to keep an eye on them. Is it
normal that you have an uptick in ninety percent login failures on your website for every hundred thousand accounts that do succeed? Understand your behavior and what's common for your platform. And lastly, track your users post-logon. This is one of the biggest things I could say. If you have the capability to use something like Tealeaf or other tools to look at what users are doing after they log in, this is gonna be a differentiator, because the attackers, especially those guys using Sentry MBA, all they're doing is trying to see if an account is valid or not. They don't do anything after; they don't
follow up. In fact, most attack campaigns do that as well: there's a stage one, reconnaissance, where they investigate your site; a stage two, where they test out credentials; and then a stage three, which could be a month later, where they actually come in and interact with those accounts and do things. So if you start flagging them early on, by spotting users logging in and not doing anything, that's gonna save you so much time and potentially money. And, as I mentioned earlier, checking the referrer field: little things like that that are unique to your traffic, your mobile app, your website. If you understand that, you can get ahead of the
attackers, who don't understand how your website works. And that's about it. I appreciate your time. There are some resources down here, if you can read this, it's kind of small. For Selenium, Guru99: that was a lifesaver. This guy has step-by-step walkthroughs on how to install and get the stuff running, and it's a fantastic resource. jQuery UI has a lot of testing and interaction stuff you can play with to make sure your code is working as you expect it to. And just a shout-out to this guy; I don't really know who he is, but he had CAPTCHAs on his page for testing, so it was nice. And again, thanks
to John Cochran and some other sources that supported this presentation. Any questions? Yes.

Yeah, absolutely. So if we jump back a couple of steps here, you've got three kinds of steps you could take, as well as tools that you can purchase and implement. Some of the simple things are CAPTCHA, which, even though I make fun of it, filters out the noise, as well as enforcing JavaScript. That's a really simple thing you can do: enforce JavaScript before they're allowed to log on or interact with your website. That's gonna filter out a lot of that noise, a lot of that simple automation that's not going to be able to render the JavaScript and interact with it. So, you
know, step number one: do that, because it doesn't cost you anything. Once you've done some of those simple steps, taking advantage of the free ones that mostly cost you time, you can start looking at things like data analytics. Are you getting more data about how your applications work and the responses you get from your users, and can you bubble that up to an interface that you can look at easily? I'm not just talking about a SIEM, an ArcSight or Splunk or whatever, where all data is dumped in. I mean something a little bit more effective, where it's pulling out stuff specific to your web application that looks anomalous.
That's gonna be a manpower investment on your side, it's gonna be a tools investment, but at the same time it can be really effective in reducing the time to respond when you see someone attacking your site, especially when you see someone getting through successfully. And if you tie this enhanced monitoring back to that use case I was talking about earlier, where we're looking for users who log in and don't do anything afterwards, that's gonna allow you to flag accounts really, really quickly when they're being targeted. Then you can respond to that and start either shutting them down directly, forcing password resets, or handing off to your fraud
department to follow up with later. And then the other tool, step-up authentication: everybody's looking at this these days. There's different kinds of step-up auth, but honestly, most executives I talk to on a regular basis want to avoid that type of stuff at all costs, because it adds so much user friction to the experience of your average consumer logging in. Two-factor is the same thing, multi-factor auth: if they don't have to use multi-factor, if they can figure out a way around that and still have security, that's what they want to do, and they'll pay lots of money to find a solution that does that. Does that answer your question? Oh, yes.
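The monitoring idea from this answer, flagging accounts that log in successfully and then do nothing, can be sketched in a few lines of Python. This is only an illustration under assumptions: the event tuples, the event names (`login_success`, `login_failure`, `view_timeline`), the 30-minute window, and the function name `flag_idle_logins` are all hypothetical and do not come from the talk or from any particular product.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_idle_logins(events, now, window=timedelta(minutes=30)):
    """Return accounts that logged in successfully but took no action within `window`.

    `events` is an iterable of (account, kind, timestamp) tuples (hypothetical format).
    Logins whose window has not fully elapsed by `now` are skipped, not flagged.
    """
    logins = defaultdict(list)   # account -> successful login timestamps
    actions = defaultdict(list)  # account -> timestamps of any other activity
    for account, kind, ts in events:
        if kind == "login_success":
            logins[account].append(ts)
        elif kind != "login_failure":
            actions[account].append(ts)

    flagged = set()
    for account, login_times in logins.items():
        for login_ts in login_times:
            if login_ts + window > now:
                continue  # too recent to judge
            followed_up = any(
                login_ts < action_ts <= login_ts + window
                for action_ts in actions[account]
            )
            if not followed_up:
                flagged.add(account)
                break
    return sorted(flagged)


# Made-up sample data for illustration only.
t0 = datetime(2017, 10, 1, 12, 0)
events = [
    ("alice", "login_success", t0),
    ("alice", "view_timeline", t0 + timedelta(minutes=2)),
    ("bob", "login_success", t0),
    ("carol", "login_failure", t0),
    ("dave", "login_success", t0 + timedelta(minutes=50)),
]
suspects = flag_idle_logins(events, now=t0 + timedelta(hours=1))
print(suspects)  # ['bob']
```

A flagged account here is only a candidate for review. As the answer says, what happens next (shutting it down, forcing a password reset, or handing off to a fraud team) is a policy decision, and the right window depends on what normal behavior looks like on your platform.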
your web services gokane create accounts they just bought a ship
So yeah, the question was why we haven't seen, basically, account exhaustion on sites if it's so easy to generate random accounts. People are more interested in targeting login than they are account creation, for most cases. If you're gonna target account creation, you have to figure out whether there's value behind that, and account creation is usually where you see CAPTCHA. Let me bring up a funny website real quick, if this loads for me. It's called Phantombuster. I don't know if you guys have heard of this, but this is a new company that's basically selling this as a business: this automation, the scraping. If you look at their pricing here, they
talk about how many pages you want to scrape a month, whether you require CAPTCHAs to be solved, all that stuff. So in terms of facilitating this account creation, this automation, there are companies out here doing this for you. If you don't have the skill or the interest in doing it yourself, you can outsource. So yeah, it exists, and it's a constantly improving barrier, both on the defensive side and on the attacking side. The attackers are always one step ahead, of course. So if you start off from scratch, like I kind of did this weekend developing code, you're gonna run into a lot of
barriers, and that describes 99% of the folks that are trying to use automation to do bad stuff like this. So it takes them a while, or more dedication, to figure out how to get past some of those basic defenses like step-up authentication. Any other questions? Oh, yes.
It's always "wait until," right? So there's a couple of ways to do that; there's good ways and there's bad ways. The bad way, if I scroll down here, is to use Thread.sleep. Mm-hmm, oh, sorry, sorry. So using Selenium, one of the problems you run into, and you kind of saw it when I was using the proxy today, is the page loading really, really slowly. So if you're writing a script, how do you account for the delays you might run into when you're using random proxies out there that just aren't gonna respond very quickly? There's a couple of different methods in Selenium and Java to do that. One is Thread.sleep, which is just a specific "sleep for this many milliseconds." This is a bad way to do it, right? For us it's not as bad, because we're not doing development testing; in those cases you want stuff to run as fast as possible. For us, we want to insert specific, purposeful delays in interacting with the site so it looks more like a real human, so this is not bad. But in a case like you're talking about, dealing with latency, it's not an ideal scenario. So there's other solutions in here: they have explicit and implicit wait times built in, and if you go to that Guru99 site I mentioned earlier, it details all this stuff out
there. But this is the line that you're interested in: wait.until with ExpectedConditions, in this case visibility of an element. So, wait for a visible element to load on the site before I do any interaction with it. You can also wait for an element to be clickable, like a button, before I interact with it. There's a whole bunch of different options here, whether they're hidden values or hidden visual elements or they're supposed to be rendered. So yeah, there's lots of ways; just search Guru99 and implicit and explicit waits. Any other questions? Oh, great. I appreciate your time, I hope this was interesting for you, and like I said, my name is
Andrew Jones, and if you ever need me, my contact info is here: a Jones 85 at gmail.com. Thank you very much.
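The difference between a fixed sleep and an explicit wait, from the last answer, can be shown with a small generic polling helper. This is a sketch in plain Python and not Selenium's API: `wait_until`, `SlowPage`, and `find_button` are hypothetical names invented for the illustration, and the simulated page only stands in for a slow-loading site.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.25):
    """Poll `condition` until it returns a truthy value, or raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class SlowPage:
    """Stand-in for a page whose button only appears after a delay (hypothetical)."""

    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    def find_button(self):
        # Returns None until the simulated element has "loaded".
        return "login-button" if time.monotonic() >= self._ready_at else None


page = SlowPage(ready_after=0.3)
button = wait_until(page.find_button, timeout=2.0, poll_interval=0.05)
print(button)
```

A fixed sleep always costs the full delay and still fails if the page is slower than expected. Polling for a condition returns as soon as the element is there and fails clearly at the timeout, which is the behavior the explicit waits mentioned in the answer are built around.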