
Hello, all right, we're going to get started. Hello everyone, welcome to BSides, and welcome to Ground 1234. You're here for the talk "Giving the Dog a Bone: Exploring OSINT Capabilities of Pen Testing Tools," and this is our speaker, John Brun. First I'll talk a little bit about the sponsors, because without them we wouldn't be able to get this conference off the ground. We've got Critical Stack, Valimail, Secure Code Warrior, Paranoids, and Robinhood, along with many others, and again, it's without them and our donors and volunteers that we would not be able to make this possible. Just a couple of quick notes: please keep your cell phones on silent or turned off, because these talks are being live-streamed and we don't want any disturbances. At the end of the talk we're going to do a Q&A session, so if you have a question, just raise your hand and I'll come around the audience with a mic. Without further ado, I'll hand it over to John and get this started.

Okay, all right. So yes, thank you for coming. My name is John Braun. Quick background on me: I'm currently the head of security at CMD. I live in San Francisco; I've been there for about 20 years, working in various information security roles there,
and my contact information is up here; if you want to watch me not tweet, you can follow me there.

Quick caveat slide: this research and the opinions you're about to hear are my own, not those of my employer. If you decide to take this information and use it because you think it's a good idea, you're on your own and using it at your own risk.

At my previous gig I spent a lot of time in a number of security roles, but what was really interesting was some security awareness training we were doing. When we say awareness training, it was really phishing and spear phishing our employees and our executives. That's a lot of fun if you get to do it, especially with executives. I was really surprised by how efficient the spear phishing was. We understand this to be true, but the contrast between the two was really interesting. So I started thinking about the comparison between targeted attacks on operating systems, open ports on the internet, and software, versus a spread attack, where someone is scanning the entire internet hoping to find vulnerable servers. Thinking about the efficiency you get, one question stuck in my mind: as people are moving into the cloud, has automated scanning kept up with the tooling the cloud gives people? Essentially, as pets become cattle, how have the automated scanning tools kept up?

I went to an interesting talk at BSides San Francisco this year. Someone was talking about compromising containers: they put a WordPress container up, fully patched, and they were surprised that after a number of weeks it had never been compromised. That tied back into the idea I'd been having for a while. You kind of expect a fully patched system not to be compromised, but because it wasn't, I started thinking: were people trying to brute-force it? Did they not care? Were they looking at the system, seeing that it was fully patched, and just walking away? It really got my brain moving.

The other thing they have at BSides San Francisco is drink tickets. So a few of us got together and talked about the talks, and we really started hashing this idea out: this particular WordPress installation, and the expected variables of whether or not it should have
been compromised. A few drink tickets more, and I started making some bold statements, like "I bet you could put the username and password somewhere in your WordPress fields and no one would ever notice." A few more drink tickets, and my friends told me, "This sounds like a great research project; please stop talking about this." So that's really the genesis of this talk.

Here are my general thoughts, almost like a hypothesis. With the advent of the cloud methodology that everyone is taking on, which we're going to dive into, I think the reconnaissance tools have to work smarter, not harder, but I don't think they actually are. I don't think they're able to take in OSINT information. If you give them breadcrumbs about what you're doing, either on purpose or by accident, are the tools smart enough to take that information, turn it into an input, and continue scanning your environment, or at least flag it as an alert or a vulnerability: this is abnormal, you should look at it? To go back to the WordPress example, could we literally say "here's the username, here's the password" and have someone actually compromise the system or use it against us? And in the back of my mind: is brute-forcing logins, especially via SSH, really still a thing anymore?

I wanted to focus on SSH. I was trying to keep the scope small, but specifically because if you compromise a system via SSH, you have a shell, potentially with a good chance of root privileges. With that you can do a number of things: you can look at the information on it, you have a foothold in someone's network, and maybe you can get metadata from the system and look at other systems, depending on how it's configured in the cloud. So I thought SSH was a good place to start.
So why did I think the cloud methodology would make tools have to work smarter? It's not really just cloud migration. In my mind, many companies have done the lift and shift, and it's not just moving your data center to the cloud anymore. A lot of us are now using cloud methodologies: auto scaling groups, immutable infrastructure, infrastructure as code. Essentially, you don't have long-lasting systems anymore. They're not being patched; you're redeploying new systems. For attackers that becomes difficult, because what was a Redis server could be a MySQL server ten minutes later and an nginx server ten minutes after that, as it relates to a specific IP. So their information is being invalidated very quickly.

Another interesting aspect: in the cloud it's very difficult to do reverse DNS, to the point that I don't think a lot of people are doing it anymore. So you might have a potentially vulnerable system and not know who owns it; it just resolves to Google or Amazon. It's not as easy to even understand whether or not you have a system that's worth attacking.

Then there's the defensive side, IDS and so on. The defense tooling has really matured, and the barrier to entry on cost has dropped significantly: it's very inexpensive now to roll out these tools. When it comes to monitoring and alerting, people are using SIEMs now, and they've really got a feedback loop. When you have something like a brute-force attack via SSH, your systems can understand that and alert on it. You can tell your API servers that this is a bad IP, you can tell a threat feed that this is a bad IP, and you can find out from other threat feeds that another company found a malicious IP address. So the tooling is much improved. Specifically for SSH there are two very cool and very free products: Fail2ban and SSHGuard are fantastic. They let you ban systems based on a threshold, say two to five failed attempts in a number of minutes; it just bans them, and then they're not allowed into your system. SSHGuard is very interesting: it actually comes by default with GCP on Ubuntu.

This leads to what, in my mind, I call the Heartbreaker theory. Tom Petty and the Heartbreakers had a very simple mantra when it came to songwriting: "Don't bore us, get to the chorus." Essentially they said: we know what our fans want. They want the hook, they don't want the slow build. They're not doing "Stairway to Heaven,"
right? They're getting right to the chorus. In my mind, with the cloud methodology adoption we're seeing, scanning tools can't spend a lot of time hammering at a server. They need to get in, get the information they want, and get out, preferably without being detected. At some point I think there are diminishing returns, especially with IP addresses changing very quickly. I also don't think adversaries, if they're smart enough to write tooling that looks at the internet for this OSINT and does this research, are going to do it against the internet itself (and again, these are all my theories). Things like Censys, which are made to scan the entire internet and which we'll go into a little later, are probably the tools to work off of. My guess is they're writing tools against Censys, and they aren't writing tools against your servers and my servers.

I call this the attacker's New World Order. It's very straightforward. It's a lot easier to be detected, and therefore it's easier to get blocked. It's very easy to get into that feedback loop where you're hammering an SSH server and then can't use the same IP, or maybe even the same IP block, against API servers, and you might get banned from other companies. And really, if you have one known exploit, it's probably easier to find a thousand servers you can use that exploit against than it is to hammer away and get a thousand servers for your botnet by brute forcing. The last point is just the idea that if you think you have a vulnerable server, you don't know how much time that server is going to be available in the cloud. Your production bastion host could easily become someone else's development nginx server. So attackers need to move quickly and smartly.

So we're going to look at testing
with SSH. To test these theories I wanted to do a number of things, and I wanted a lot of levers to play with. I wanted to register a new domain, launch systems into multiple cloud providers, do some reporting on it, and obviously we all want profit. Quick information and some caveats: I'm using fully patched Ubuntu 16 systems. Sample size concerns obviously apply here. The data is what I got; I'll make some generalizations, but it is what it is. I'm also looking at automated tooling. I'm not looking at Danny Ocean specifically targeting Terry Benedict here. Maybe in the testing, since we're opening these systems up to the world, we'll catch some of that, but it's not specifically what I was going after.

So we launched the Nagini project. I didn't consider this a honeypot. I didn't really care; my honeypots are what our users are doing on the internet. I didn't care what happened if someone brute-forced my machines. But everyone I talked to about the idea said, "That's a nice little honeypot project you have," and after the 15th person told me that, I said, all right, screw it, I have a honeypot project.

I wanted a name for my servers and my domain, and Harry Potter seemed to match. The obvious names were a little too on the nose, so I asked my wife; Nagini apparently is Lord Voldemort's snake. So I registered nagini.co. Why .co? It's inexpensive. I launched systems in three cloud providers. I picked GCP and AWS, and someone suggested I take a look at DigitalOcean. It's a little different, because maybe not as many enterprises are on it; maybe it's more of the Wild West, so it would probably give slightly different data than, say, using Azure as a third cloud provider. Again: fully patched Ubuntu 16.04, port 22 open to the world. It's very
important, if you want brute-force data, to disable the brute-force security software. This was actually a problem in the beginning: GCP enables SSHGuard, for example, which is a great feature but not what I was looking for, so I made sure I disabled that. I also did centralized logging using ELK. If anyone is looking, elastic.io has a very powerful two-week cluster for free, and from there you can adjust it and pay less; it's not too much money. It's a great platform for research projects like this. And when you overwrite your log files and think you've really screwed yourself, but you've been logging in real time using Filebeat, it's fantastic. That may or may not have happened.

I want to go through the testing methodology, but also some of the technical details. I'd been thinking about doing this for a long time, and the BSides talk kicked me into actually standing up and saying I'm going to do this. One of the reasons I hadn't done it was that I was really nervous the technology was going to be way over my head. We'll get into some of it, and I just want to let you know that once I looked at it, it was very simple. So we'll go into the details, but don't let what seems to be too big a wall of technology stop you from actually trying some of these things. When I did the deep dive, they were one-line changes.

So how are we going to test this? The goal is to leave some breadcrumbs on the internet and see what the automated tooling finds: whether it can find them and mark our systems as vulnerable. We're going to increase the logging; that's very simple. A nice safety feature is that password authentication is disabled, so we have to enable that, so
we did; that's very simple in the sshd_config file. Another nice security feature of sshd, which doesn't help me at all, is that by default it does not log the passwords people are entering, so we had to fix that. Now, again, we're talking breadcrumbs: how are we going to publish these to the internet? There are two ways I identified to do this with SSH. One is what's called the pre-auth banner: right before you log in to the machine, you get a banner, and we'll go into that. The other is something that really intrigued me, because I'd been wondering for years what it was for and what I could do with it: it's called the protocol banner.

Part of my apprehension about starting this was that I hadn't done C since my intro to C in college, which was, let's just say, a very long time ago. So I was very nervous about taking this on, but it turned out to be a simple one-line fix to add logging of the username, the password, and the IP address of the person making the attempt. I probably spent eight months in apprehension and 10 minutes on the fix.

The pre-auth banner is pretty straightforward. The CIS benchmark, for example, suggests or kind of requires that you have one, and a number of default SSH installations have it. It's just a standard text file. I didn't test this, but there's probably not a limit to what you can put here; I imagine at some point the SSH client is going to stop returning some of the data. A lot of people put in a notice about unauthorized use of the system. I kind of kept the unauthorized-use notice, but added: if you want to use it, here's a username and password you can test out.

Then the protocol banner: this part at the bottom is what the SSH server gives during the initiation of the session. This always fascinated me, and if we're looking
especially at this piece on the right, I never knew what it was for. Some systems, like Ubuntu, put the OS that's being used; sometimes it's blank; sometimes, like on Raspberry Pi distros, it says Raspberry Pi. I didn't know why. Well, of course there's an RFC for that, and I'm going to save you the time; we don't want to go all the way through the RFC. Let's look at section 4.2, the protocol version exchange. This one line dictates what that protocol exchange line is. Essentially, the first half has to be of a certain format and has to match the protocols in the spec, so I don't want to touch that half. As long as there's a space, I can put anything I want to the right, with a carriage return and a line feed at the end. I've got 255 characters for that entire segment; beyond that, the world's my oyster in that comment section.

So my servers, my Naginis, start with this when they launch; this is what Ubuntu 16 shipped with by default. But I can start doing some fun stuff and see if it triggers anything. I can slowly get a little more bold, or I can start saying, "Hey, please attack me, here I am." And so I did.

How do you enable that? It's simple: in the SSH source it's literally just the end of that line. The only issue, which again felt daunting but turned out to be a simple task, is that to enable the logging work, and every time you change this banner, you have to recompile SSH. To make that simple, since we have to create Debian packages, I used a simple 18-line (with spaces) Dockerfile. It took 15 minutes, sorry, five minutes, to compile each time, and I built 12 to 15 of
these. Very simple: you basically take the existing package, copy in the two files I modified, and rebuild.

So here's the list of Naginis I had. One of the reasons I really wanted to register a domain for this, without going too over the top, was to have another lever to play with: to see what statistics we might get in terms of driving traffic. So some systems I put in DNS and some I did not. I stayed pretty hard on GCP for the simple reason that they give you three hundred dollars of free credits, and I drift towards free. In AWS I had systems I wanted to compare against the GCP hosts, and of course DigitalOcean was a late add. On a few of these I actually installed nginx and registered a Let's Encrypt certificate; I wanted to see if certificate transparency logs might help drive traffic to these systems and whether that made a difference.

So what did we see? The systems ran for varying lengths of time. These are all still up, which may be a bad idea; the oldest ones are at forty-five to fifty days. I'm going to publish the data, because I couldn't get the context right here; raw data doesn't work well in a presentation at twelve-point font. So let me generalize what we saw. I used eleven Naginis, with multiple launch dates and multiple cloud providers. Over 1.2 billion brute-force attempts; sorry, 1.2 million, we'll edit that out. In terms of the data centers, the AWS hosts definitely saw more traffic. Granted, the sample size caveat applies, but significantly more traffic in AWS. However, the GCP hosts saw significantly more unique IPs scanning them.

So, for example, across a couple hundred thousand hits, the unique IPs were very interesting. For Nagini 1 we had 2,500 unique IPs, and one subnet constituted about 95% of those. It was not from North America. Again, I'll publish all this further down the line. The interesting thing was that I found a few IPs from different hosting providers that stopped at exactly a hundred requests. I had about 15 of them, flat out a hundred, a hundred, a hundred. I started doing research on each one, and each was from somewhere different: one from DigitalOcean, one from something in France, one from AWS. So it does seem like
there was some smart control of all these. I'm not going to say a botnet, but it's worth some interesting research. I also ran a lot of these through GreyNoise. Of the top nine or so, and even the ones that only scanned once, nearly all the IPs came back as known to GreyNoise, which is a kind of threat feed, a great tool you can query to see what these systems are. Some came back as known malicious, some came back as probably scanners. But very interesting: ninety-five percent of the hits were from the same subnet.

It seemed that being in DNS (and this is not reverse DNS, just normal DNS) made a difference, but I didn't have enough data to say that.

What I do have enough data to say: the biggest surprise was DigitalOcean. Most of the servers were easily averaging five to ten thousand brute-force login attempts a day, and DigitalOcean was less than eight hundred per day. The systems ramped up very quickly: in AWS, on the first day you're seeing five thousand hits against a host. As for TLS, there were almost no discrepancies between the hosts with TLS and the hosts without. Certificate transparency logs didn't seem to do anything.

So, my hypothesis: we could do all this stuff and nothing would happen. We could say "here's my username, here's my password" (not suggesting this is a good idea), and nothing would happen. So what happened? Nothing happened. No one once attempted to use those breadcrumbs against my systems. I struggled with this, because this was my hypothesis. Granted, it was a drunken hypothesis, but I stood by it. Still, I did not think this was going to happen. I was really hoping it wouldn't. I was hoping we were going to get some attacks, and that we could then start laying more breadcrumbs and see if we could send the attackers to different spots. It didn't really happen. So what's next?
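As an editorial illustration of the analysis described in this part of the talk (attempt counts, unique source IPs, the share coming from a single subnet, and whether anyone ever used the published credentials), here is a minimal Python sketch. It is not the speaker's tooling. The record shape `(source_ip, username, password)`, the function name `summarize`, and the choice of /24 as the subnet size are all assumptions made for the example; the talk does not say how subnets were grouped.

```python
from collections import Counter
from ipaddress import ip_network


def summarize(attempts, breadcrumbs):
    """Summarize logged SSH login attempts.

    attempts:    iterable of (source_ip, username, password) tuples (IPv4 only).
    breadcrumbs: set of (username, password) pairs that were published on purpose.
    Both shapes are assumptions for this sketch, not the talk's log format.
    """
    attempts = list(attempts)
    unique_ips = {ip for ip, _, _ in attempts}

    # Group the unique source IPs by /24 to see whether one subnet dominates.
    per_subnet = Counter(
        str(ip_network(f"{ip}/24", strict=False)) for ip in unique_ips
    )
    top_subnet, top_count = per_subnet.most_common(1)[0] if per_subnet else (None, 0)

    # Attempts that used a published credential pair.
    hits = [a for a in attempts if (a[1], a[2]) in breadcrumbs]

    return {
        "total_attempts": len(attempts),
        "unique_ips": len(unique_ips),
        "top_subnet": top_subnet,
        "top_subnet_share": top_count / len(unique_ips) if unique_ips else 0.0,
        "breadcrumb_hits": hits,
    }
```

Feeding it the parsed sshd log lines for one host would give the kind of per-host summary quoted in the talk; an empty `breadcrumb_hits` list corresponds to the "nothing happened" result.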
I wanted to look at the well-known tools; maybe this was expected. I did not look at them ahead of time, before I made this hypothesis. So let's look at that: let's see the data they're gathering and what they're seeing. I'm going to start with nmap. It's a great tool; I think it just had its 22nd birthday. It's very basic, but a lot of scanning tools either use it or have it as the engine they're built on. In a very basic sense, this is nmap hitting a host where I did not change the banner. You can see it seems to know what it is, and everything looks normal. Now I take a similar system, where everything else is the same except I changed the banner. I'm going to overlay these to make it more apparent: it doesn't know the operating system, it doesn't really know what's going on. It just fingerprinted that comment section at the bottom and basically says, "I don't know what this is; if you want, submit this." There's no real red flag saying this might be vulnerable; it's just, hey, I've never seen this before.

So I started looking at all the other tools, the top 20 OSINT and research tools available. I knew Censys and Shodan; those are the big dogs and they're fantastic tools. But really, all the other tools I looked at, just to make sure I was covering everything, essentially seemed to use Censys and Shodan as their back-end search engines. So that's what I wanted to look at: what did they see? Did they find my systems? Are they suggesting that my systems are vulnerable, taking those breadcrumbs and suggesting this is a vulnerability or something you should look deeper into? Well, this was actually great news: Censys found my systems. If you're not familiar with Censys, it was based on a project out of the
University of Michigan, which I'm legally obligated to say, since they gave me research access and I'm an alum. (Yeah, I saw you out there, Michigan State guy.) What they do is scan the entire internet once a week and grab a lot of information, and in this case they happened to grab all of my SSH banner information. What about Shodan? Yeah, Shodan found it too. Now, I should note that Censys and Shodan are not exactly at feature parity, but for this use case I feel they're pretty much the same thing; I know they have other use cases. The other thing is that Shodan, to my knowledge, samples IPs; they don't claim to scan the entire internet. I actually found that to be accurate as well: it looks like Shodan found about eighty percent of my systems and Censys found 100 percent.

So I started wondering: I can't be the only person putting the word "password" or "username", for example, in SSH banners. Then I looked at Censys and, sure enough, apparently I am the only person dumb enough to put the word "password" there. When you think about that, maybe it's not unusual that we didn't find any data. Maybe this was expected.
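The kind of check the talk finds missing, noticing that the comment part of an SSH protocol banner carries something odd, can be sketched in a few lines of Python. This is an editorial illustration, not a feature of Censys, Shodan, or nmap. It follows the `SSH-protoversion-softwareversion SP comments CR LF` layout and the 255-character limit from RFC 4253 section 4.2; the function names, the keyword list in `SUSPICIOUS`, and the sample banner strings are assumptions made for the example.

```python
import re

# Layout from RFC 4253, section 4.2: SSH-protoversion-softwareversion SP comments CR LF
ID_RE = re.compile(
    r"^SSH-(?P<proto>[^-\s]+)-(?P<software>[^-\s]+)(?: (?P<comments>.*))?$"
)

# Hypothetical keyword list; a real check would need tuning.
SUSPICIOUS = ("password", "username")


def parse_identification(line):
    """Split an SSH identification string into its parts, or return None."""
    if len(line) > 255:  # the RFC limit includes the trailing CR LF
        raise ValueError("identification string longer than 255 characters")
    match = ID_RE.match(line.rstrip("\r\n"))
    return match.groupdict() if match else None


def flag_comment(line):
    """Return the suspicious keywords found in the banner's comment field."""
    parsed = parse_identification(line)
    if not parsed or not parsed["comments"]:
        return []
    lowered = parsed["comments"].lower()
    return [word for word in SUSPICIOUS if word in lowered]


if __name__ == "__main__":
    # Illustrative banners only; the second mimics the talk's breadcrumb idea.
    print(flag_comment("SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.4\r\n"))
    print(flag_comment("SSH-2.0-OpenSSH_7.2p2 username=demo password=demo\r\n"))
```

A simple substring match like this will miss anything phrased differently, which is part of the talk's point: flagging "this looks unusual" is a harder problem than indexing the banner.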
The other thought I had was that maybe people figure, "This is a honeypot, so I'm not going to attack it. I'm going to give you my crap data from my brute-force systems and do something over here to distract you." That's absolutely logical and could have happened. Shodan has a nice feature for this, and it agreed with me that they're not honeypots, so apparently I didn't raise a red flag in any of these scans.

This next one really surprised me. We talked about the two banners, and the one I've harped on and was really interested in was the SSH protocol banner. Remember that other banner? This is the pre-auth banner. Most of your systems are probably using it in some form; hopefully you're not as dumb as I am. Nothing indexed these. Nothing found this. When I started looking at this it was very surprising to me, and I started getting a little nervous, and I started thinking in a different direction from my original statement. Also, this is a working SSH key for the user that I published. So this is not something that's being scanned on the internet; no one has attempted this. Which isn't really surprising once I looked into it. This is Censys's statement in their FAQ: they don't try to authenticate, and you only get this banner if you try to authenticate.

This is where I get scared. If no one is detecting this, how do I know, as a security practitioner, that I don't have engineers who are doing this? Trust me, I've known plenty of engineers who feel stifled by security; if they knew they could share a key using SSH banners, they would do it. I don't know if I have a tool that can detect this. Now I'm starting to get nervous. So where do I go from there? I wasn't originally thinking about how Nessus, Qualys, and Nexpose would fit into this,
because they're expensive and I don't think attackers are using them. But as security practitioners you probably are, even if it's only mandated because your compliance team needs you to use a tool. So let's take a look at that: did they find this OSINT data, and did they say this is something to look at, something misconfigured, a vulnerability? Well, no, of course not. The Qualys report was pretty interesting: this version of SSH apparently is susceptible to user enumeration, and on the same page of the report it actually gives a username, and they don't notice that. The other thing I thought was kind of funny is that for the SSH banner they flat out say there is no exploitable information. So the commercial tools aren't going to help us here either, and it came back the same way for each of the tools I used; I figure they're all about the same.

So I got a little pissed, and I thought, I want someone to hack these systems. So I did a data dump. And what happened? Nothing happened. I'm not saying you should do dumps like this; it's just driving home the fact that I don't think the automation is there to make smart decisions based on information.

So, just to summarize what we found: I don't think it's at that point. I don't think the scanning
tools are equipped to really deal with it, either to mark it as a vulnerability or just to tell you, "Hey, this is different, I've never seen this, you should look at it." Another interesting aspect: I'm not positive any smart scanning is happening (again: one data point, sample size). To give you an example, if you launch an Ubuntu server on AWS, then unless you spend a lot of time reconfiguring things, there's going to be an ubuntu user on that machine. It launches with it; they want you to use a credential that you have to download, or you can automate. Therefore, if I give you an Ubuntu banner, because I didn't muck with the banner, you know it's an Ubuntu system. And even without reverse DNS, AWS publishes its IP ranges, so you know it's an AWS host. You know it's an AWS host and you know it's an Ubuntu banner, so there's something like a ninety-eight percent chance there's an ubuntu user on that host. On the system I had that was seeing 5,000 brute-force attempts a day, there were eight attempts using the ubuntu user. That seems like a pretty easy marriage for a logical attack, and no one's even doing that.

These pre-auth banners are not being indexed, and I'll get to that in a minute; that's really interesting to me for a number of
reasons. You do wonder if Censys and Shodan are getting so much information that it's just signal-to-noise, a needle in a haystack, and it's hard to dive into this. And last: do not do this at home. This really should be for research purposes only. But, and this is not where I started, I am now nervous, because I know engineers who will do this and I'm nervous that I can't detect them.

So what's next: where am I going to take the research, or where am I hoping people will take it? I launched a couple of blogs, going back to the WordPress example. I'm going to leave these live for a while. I registered Buckbeak as well, another Harry Potter reference, and I've left some breadcrumbs on there. I'm interested to see how long it takes for them to get picked up. What I can tell, and anyone who has ever done web SEO is probably always complaining about this, is that the web is a long game. It takes a long time to get indexed. I don't think even two to three months is a good sample size to know who's picking you up or who's scanning things. But I'm going to leave these up for a while to see. I had to hack PHP to get the username and password on this; that was awful, by the way. Hacking C after 20-something years was much easier than hacking PHP. Jesus. But it's done.

I really want to do more of an SSH breakdown, an IP address breakdown. I was surprised how much data there was, so I didn't have a chance to analyze it as much as I wanted. Were the systems coming in and hitting every five minutes to check that they weren't going to be blocked, and then ramping things up? Was it a bunch of bad IP addresses that they just flood with, where if they're blocked they don't care, and when they're not blocked they send in
the cavalry and start looking with the real IP addresses? I'm interested in that. The one thing I didn't do, and could have done, was launch a system with SSHGuard, with the security controls in place, to see what that data looked like and match those IP addresses up. I'm probably going to take that on.

The pre-auth banner indexing, for completely different reasons, is extremely interesting. In a world that lacks reverse DNS, you don't know what hosts are. But per the CIS benchmark, which almost any company dealing with any sort of compliance more or less has to follow, you have to have a banner, and your organization's banner is most likely going to be somewhat unique to your organization. It's going to be expensive, but if I could fingerprint all the banners in the world, I could probably start picking out which IP addresses are related to each other. That's not exactly what I was thinking of when I started this, but it got my mind rolling. It's very interesting that Censys does not pick up this information.

I'm also interested in some other things. I do want to look at some more advanced tooling. I don't have any hope that the commercial tooling we talked about is going to pick this kind of stuff up, so I don't have any hope that we're going to get coverage for our own co-workers who might be screwing things up with the same dumb idea I had. Someone, I don't know if the person is here, emailed me about something called Intrigue Core. It looks like a very interesting platform that back-ends to Censys and a number of OSINT sources, and it seems you can make it very extensible. I'm very interested in using that for something, especially around the banner fingerprinting.

So that's all I have. That's my GitHub; I'll probably post this there, as well as the IP address data once I've analyzed it a little further.

So if anyone has any questions, you can just raise your hand and I'll come over and bring the mic.
Hi, very interesting talk. I think if you look at it from the perspective of expected return, though, I'm not surprised at your results at all, because it's probability of success times the reward, and where's the reward?

Yeah. I really thought that if something was going to happen, it would be a person stumbling upon it, not an automation tool. That's one of the reasons I increased the verbosity of the logging: with increased verbosity you can start seeing in the error logs when someone is just doing a protocol exchange versus when a person is actually trying to authenticate. I wanted that information collected because, to that point, my assumption was that it would not be automated; someone would stumble by and say, "This is stupid. All right, I'm going to try a few things," and then I'd have that data to say: did that person scan it, are they using something else? That might then lead me in a different direction.
Anyone else have a question? All right, I guess not. All right, thank you. [Applause]